Federated Reasoning Distillation Framework with Model Learnability-Aware Data Allocation

• Addresses the bidirectional model learnability gap in federated LLM-SLM reasoning collaboration.
• Introduces the LaDa framework, whose learnability-aware data filter selects high-reward samples.
• Enables SLMs to identify samples that match their learning constraints for effective knowledge transfer.
• Enables LLMs to select samples that add novel knowledge beyond existing data.
• Adds domain-adaptive reasoning distillation that aligns the joint probabilities of reasoning paths via contrastive learning.
• Functions as a plug-in that integrates with existing federated collaboration systems.

Article Summaries:

A new federated learning framework, LaDa, tackles two key gaps in collaboration between large language models (LLMs) and small language models (SLMs). Existing methods suffer from a bidirectional learnability mismatch: SLMs cannot pinpoint high-reward samples that fit their learning limits, while LLMs cannot easily select data that adds novel knowledge. LaDa introduces a learnability-aware data filter that allocates samples according to the specific gap between each LLM-SLM pair, enabling more effective knowledge transfer. It also adds a domain-adaptive reasoning distillation step that aligns the joint probabilities of reasoning paths through contrastive learning, helping SLMs acquire step-by-step reasoning tailored to local data. The system functions as a plug-in for existing federated frameworks.
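
The two mechanisms described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the summary gives no formulas, so every name and threshold here is a hypothetical stand-in. The sketch assumes that "learnability" can be approximated by a model's mean per-token negative log-likelihood (NLL) on a reasoning trace, that "reward" is a quality score in [0, 1], and that the contrastive objective is an InfoNCE-style loss over the joint log-probabilities of candidate reasoning paths.

```python
import math
from dataclasses import dataclass


@dataclass
class Sample:
    """Hypothetical record for one reasoning trace (all fields are assumptions)."""
    sample_id: str
    reward: float   # assumed quality score of the trace, in [0, 1]
    slm_nll: float  # SLM's mean per-token negative log-likelihood on the trace
    llm_nll: float  # LLM's mean per-token negative log-likelihood on the trace


def select_for_slm(samples, min_reward=0.7, max_nll=2.0):
    """Keep high-reward samples that are not too hard for the SLM (assumed rule)."""
    return [s for s in samples if s.reward >= min_reward and s.slm_nll <= max_nll]


def select_for_llm(samples, min_reward=0.7, min_nll=1.0):
    """Keep high-reward samples the LLM finds surprising, used as a novelty proxy."""
    return [s for s in samples if s.reward >= min_reward and s.llm_nll >= min_nll]


def path_log_prob(token_log_probs):
    """Joint log-probability of a reasoning path: the sum of its token log-probs."""
    return sum(token_log_probs)


def contrastive_path_loss(positive_lp, negative_lps, temperature=1.0):
    """InfoNCE-style loss: negative log-softmax of the positive path's score."""
    scores = [positive_lp / temperature] + [lp / temperature for lp in negative_lps]
    top = max(scores)  # subtract the max before exponentiating, for stability
    log_z = top + math.log(sum(math.exp(x - top) for x in scores))
    return log_z - scores[0]
```

In this sketch, a sample with high reward but a very high SLM NLL is withheld from the SLM yet may still be routed to the LLM, which mirrors the idea of allocating data according to the gap between each LLM-SLM pair. The loss shrinks as the preferred reasoning path becomes more probable than the alternatives; a real system would compute it on model outputs with an automatic-differentiation library rather than on plain floats.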

Sources:

https://arxiv.org/abs/2602.18749