Latent Context Compilation: Distilling Long Context into Compact Portable Memory

Computer Science > Machine Learning. Submitted on 31 Jan 2026.

Abstract: Efficient long-context LLM deployment is stalled by a dichotomy between amortized compression, which struggles with out-of-distribution generalization, and Test-Time Training, which incurs prohibitive synthetic data costs and requires modifying model weights, creating stateful parameters that complicate concurrent serving. We propose Latent Context Compilation, a framework that fundamentally shifts context processing from adaptation to compilation. By utilizing a disposable LoRA module as a compiler, we distill long contexts into compact buffer tokens: stateless, portable memory artifacts that are plug-and-play compatible with frozen base models. Crucially, we introduce a self-aligned optimization strategy that eliminates the need for synthetic context-relevant QA pairs. By regularizing the context reconstruction task with context-agnostic random queries, we force compressed tokens to reside within the model's existing instruction-following manifold. Experiments with Llama-3
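The abstract describes the training objective only in words: a context reconstruction task regularized by context-agnostic random queries. The sketch below shows one way such an objective could be written, and it is not the paper's actual formulation. It assumes that reconstruction is a token-level cross-entropy, that the regularizer is a KL divergence between the frozen model's answers with the full context (teacher) and with the buffer tokens (student), and that the two terms are combined with a weight. The names `compilation_loss` and `align_weight` are hypothetical.

```python
import math
from typing import Sequence


def log_softmax(logits: Sequence[float]) -> list:
    """Numerically stable log-softmax over one vocabulary-sized logit vector."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def cross_entropy(logits: Sequence[float], target: int) -> float:
    """Negative log-likelihood of the target token id."""
    return -log_softmax(logits)[target]


def kl_divergence(teacher_logits: Sequence[float],
                  student_logits: Sequence[float]) -> float:
    """KL(teacher || student) between two next-token distributions."""
    lp = log_softmax(teacher_logits)
    lq = log_softmax(student_logits)
    return sum(math.exp(a) * (a - b) for a, b in zip(lp, lq))


def compilation_loss(recon_logits, context_ids,
                     teacher_query_logits, student_query_logits,
                     align_weight: float = 1.0) -> float:
    """Assumed objective: reconstruction + weighted alignment on random queries.

    recon_logits: per-position logits for re-generating the context from
        the buffer tokens (assumption: plain cross-entropy reconstruction).
    teacher_query_logits / student_query_logits: per-position logits for a
        random, context-agnostic query, with the full context versus with
        the buffer tokens only (assumption: KL as the regularizer).
    """
    recon = sum(cross_entropy(l, t)
                for l, t in zip(recon_logits, context_ids)) / len(context_ids)
    align = sum(kl_divergence(t, s)
                for t, s in zip(teacher_query_logits, student_query_logits)
                ) / len(teacher_query_logits)
    return recon + align_weight * align
```

Under these assumptions, the alignment term vanishes when the buffer tokens make the frozen model respond to a random query exactly as it would with the full context, so only the reconstruction term remains. No synthetic context-relevant QA pairs appear anywhere in the loss, which is the property the abstract emphasizes.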


Sources:

https://arxiv.org/abs/2602.21221 (Latest source article published: 2026-02-26 05:00 UTC)