Scaling State-Space Models on Multiple GPUs with Tensor Parallelism
• Computer Science > Distributed, Parallel, and Cluster Computing [Submitted on 24 Feb 2026] Title: Scaling State-Space Models on Multiple GPUs with Tensor Parallelism
• Computer Science > Machine Learning [Submitted on 6 Mar 2025 (v1), last revised 24 Feb 2026 (this version, v4)] Title: Semantic Parallelism: Redefining Efficient MoE Inference via
• Computer Science > Machine Learning [Submitted on 24 Feb 2026] Title: Untied Ulysses: Memory-Efficient Context Parallelism via Headwise Chunking
• This post introduces Dynamic Context Parallelism (Dynamic-CP), a scheduling approach in NVIDIA Megatron Core used for LLM post-training or DiT pre-training.
• It dynamically sele