• Introduces Ringleader ASGD, the first asynchronous SGD method to achieve optimal time complexity under data heterogeneity.
• Removes the unrealistic similarity assumptions on workers' data distributions, so the guarantees cover genuinely heterogeneous federated learning.
• Matches the theoretical lower bounds for parallel first-order methods in the smooth nonconvex regime.
• Remains optimal under arbitrary, time-varying worker computation speeds, closing a long-standing gap in the theory.
• Targets federated learning on smartphones and other heterogeneous edge devices, where computation speeds differ across workers.
• Provides a rigorous convergence analysis along with algorithmic insights relevant to practical deployments.
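
To make the heterogeneity problem concrete, the sketch below simulates plain (vanilla) asynchronous SGD, not Ringleader ASGD itself, whose details are not given in this summary. Everything in it is an illustrative assumption: the function name async_sgd, the one-dimensional quadratic local losses, the two workers, their targets, and their fixed computation times. It only shows why unequal speeds combined with unequal data can pull a naive asynchronous method away from the true minimizer.

```python
import heapq


def async_sgd(targets, durations, lr=0.05, horizon=2000.0):
    """Simulate vanilla asynchronous SGD (hypothetical toy setup).

    Worker i holds the local loss f_i(x) = 0.5 * (x - targets[i])**2 and needs
    durations[i] time units per gradient. The global objective is the plain
    average of the f_i, so its minimizer is the mean of the targets.
    """
    x = 0.0
    # Each event is (finish_time, worker_id, model snapshot the gradient uses).
    events = [(durations[i], i, x) for i in range(len(targets))]
    heapq.heapify(events)
    while events:
        t, i, snapshot = heapq.heappop(events)
        if t > horizon:
            break
        grad = snapshot - targets[i]  # gradient of f_i at the (possibly stale) snapshot
        x -= lr * grad                # server applies the update as soon as it arrives
        # The worker immediately starts its next gradient from the current model.
        heapq.heappush(events, (t + durations[i], i, x))
    return x


targets = [0.0, 10.0]    # heterogeneous data: the local minimizers differ
durations = [1.0, 4.0]   # heterogeneous speeds: worker 0 is four times faster
x_final = async_sgd(targets, durations)
x_star = sum(targets) / len(targets)
print(f"final iterate: {x_final:.2f}, true minimizer: {x_star:.2f}")
```

In this toy run the fast worker contributes four updates for every one from the slow worker, so the iterate settles near the update-frequency-weighted average (4·0 + 1·10)/5 = 2 rather than at the true minimizer 5. When both workers share the same target, the same loop converges to it regardless of speed, which is why analyses that assume similar data across workers do not see this bias.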

Article Summaries:

  • Mathematics > Optimization and Control
    [Submitted on 26 Sep 2025 (v1), last revised 19 Feb 2026 (this version, v3)]
    Title: Ringleader ASGD: The First Asynchronous SGD with Optimal Time Complexity under Data Heterogeneity
    Abstract: Asynchronous stochastic gradient methods are central to scalable distributed optimization, particularly when devices differ in computational capabilities. Such settings arise naturally in federated learning, where training takes place on smartphones and other heterogeneous edge devices. In addition to varying computation speeds, these device

Sources: