FlowPrefill: Decoupling Preemption from Prefill Scheduling Granularity to Mitigate Head-of-Line Blocking in LLM Serving
Computer Science > Distributed, Parallel, and Cluster Computing [Submitted on 18 Feb 2026]