Semantic Parallelism: Redefining Efficient MoE Inference via Model-Data Co-Scheduling

Computer Science > Machine Learning [Submitted on 6 Mar 2025 (v1), last revised 24 Feb 2026 (this version, v4)]