• We’re sharing details of the role backend aggregation (BAG) plays in building Meta’s gigawatt-scale AI clusters like Prometheus.
• BAG allows us to seamlessly connect thousands of GPUs across multiple data centers and regions.
• Our BAG implementation is connecting two different network fabrics: Disaggregated Schedule Fabric (DSF) and Non-Scheduled Fabric (NSF).
• Once it’s complete, our AI cluster, Prometheus, will deliver 1 gigawatt of capacity to enhance and enable new and existing AI experiences across Meta products.
• Prometheus’ infrastructure will span several data center buildings in a single larger region, interconnecting tens of thousands of GPUs.
• A key piece of scaling and connecting this infrastructure is backend aggregation (BAG), which we use to seamlessly connect GPUs and data centers with robust, high-capacity networking.
Article Summaries:
- Meta has unveiled its Backend Aggregation (BAG) system, a central Ethernet‑based super‑spine network layer that will underpin the company’s gigawatt‑scale AI cluster, Prometheus. BAG connects thousands of GPUs across multiple data centers and regions, linking Meta’s Disaggregated Schedule Fabric (DSF) and Non‑Scheduled Fabric (NSF) networks. Distributed BAG layers use planar or spread topologies to balance management simplicity with path diversity, and support petabit‑range inter‑BAG bandwidth (16‑48 Pbps per region pair). The modular chassis, powered by Jericho‑3 ASICs, provides high‑capacity routing and oversubscription ratios around 4.5:1, enabling Prometheus to deliver 1 gigawatt of AI compute across several buildings.
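To make the roughly 4.5:1 oversubscription figure concrete, the sketch below shows how such a ratio is usually computed: the capacity offered to a network layer divided by the capacity leaving it. The function names and the 216 Tbps and 48 Tbps values are hypothetical, chosen only so the arithmetic lands on 4.5; they are not figures from Meta's article.

```python
def oversubscription_ratio(offered_tbps: float, available_tbps: float) -> float:
    """Return offered capacity divided by available capacity for one layer."""
    if offered_tbps < 0 or available_tbps <= 0:
        raise ValueError("offered must be >= 0 and available must be > 0")
    return offered_tbps / available_tbps


def worst_case_share(ratio: float) -> float:
    """Fraction of line rate each sender gets if every sender transmits at once."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    # A layer that is not oversubscribed (ratio <= 1) can carry full line rate.
    return min(1.0, 1.0 / ratio)


# Hypothetical capacities, not taken from the source article.
offered = 216.0    # Tbps arriving at the aggregation layer
available = 48.0   # Tbps leaving the aggregation layer

ratio = oversubscription_ratio(offered, available)
print(f"oversubscription = {ratio}:1")
print(f"worst-case share of line rate = {worst_case_share(ratio):.1%}")
```

Under these assumed numbers, a 4.5:1 ratio means that if every sender transmitted at full rate simultaneously, each would sustain a little over a fifth of its line rate across the oversubscribed layer.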
Sources:
- https://engineering.fb.com/2026/02/09/data-center-engineering/building-prometheus-how-backend-aggregation-enables-gigawatt-scale-ai-clusters/ (Latest source article published: 2026-02-09 17:00 UTC)