Computer Science > Networking and Internet Architecture
[Submitted on 17 Jul 2025 (v1), last revised 19 Feb 2026 (this version, v3)]

Title: NetForge: A Programmable Substrate for Bottleneck-Centric Network Data Generation

Abstract: The behavior of Internet applications is shaped by congestion dynamics at bottleneck links, yet data capturing application behavior across diverse bottleneck regimes remains scarce. Bridging this gap requires a data-generation substrate that simultaneously provides controllability, composability, fidelity, and replicability, capabilities that existing approaches struggle to achieve simultaneously. This paper introduces NetForge, a programmable substrate for bottleneck-centric data generation guided by progressive disaggregation: NetForge (i) decouples bottleneck intent from execution, (ii) separates static bottleneck attributes from dynamic congestion pressure, and (iii) disaggregates observed demand dynamics from their original trace context via Cross-Traffic Profiles (CTPs). CTPs transform passive packet traces into reusable, composable pressure signals that can be selected and transformed to specify dynamic bottleneck behavior. Our evaluation shows that NetForge satisfies the four requirements and, in an ABR case study, generates data that remains realistic, expands coverage into underrepresented regimes, and, in turn, improves model performance by up to 47% by reducing transmission-time prediction error of the Fugu model.
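
To make the idea of separating static bottleneck attributes from dynamic congestion pressure more concrete, here is a minimal Python sketch. It is not NetForge's API: the names `CrossTrafficProfile`, `BottleneckSpec`, `scaled`, `combined`, and `residual_capacity` are all hypothetical, and modeling a CTP as a per-interval demand series in Mbit/s is an assumption made purely for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class CrossTrafficProfile:
    """Hypothetical CTP: cross-traffic demand in Mbit/s, one sample per interval."""
    interval_s: float
    demand_mbps: List[float]

    def scaled(self, factor: float) -> "CrossTrafficProfile":
        """Transform: multiply the pressure signal by a constant factor."""
        return CrossTrafficProfile(
            self.interval_s, [d * factor for d in self.demand_mbps]
        )

    def combined(self, other: "CrossTrafficProfile") -> "CrossTrafficProfile":
        """Compose: sum two profiles sample by sample, truncating to the shorter."""
        if self.interval_s != other.interval_s:
            raise ValueError("profiles must share the same sampling interval")
        n = min(len(self.demand_mbps), len(other.demand_mbps))
        return CrossTrafficProfile(
            self.interval_s,
            [self.demand_mbps[i] + other.demand_mbps[i] for i in range(n)],
        )


@dataclass(frozen=True)
class BottleneckSpec:
    """Hypothetical intent: static attributes plus a dynamic pressure signal."""
    capacity_mbps: float           # static attribute (assumed)
    base_rtt_ms: float             # static attribute (assumed)
    pressure: CrossTrafficProfile  # dynamic congestion pressure

    def residual_capacity(self) -> List[float]:
        """Capacity left for the application in each interval, floored at zero."""
        return [max(0.0, self.capacity_mbps - d) for d in self.pressure.demand_mbps]


# Two illustrative profiles (made-up numbers, not taken from any real trace).
web = CrossTrafficProfile(1.0, [2.0, 4.0, 6.0])
bulk = CrossTrafficProfile(1.0, [1.0, 1.0, 8.0])

# Select, transform, and compose pressure, then attach it to a static spec.
spec = BottleneckSpec(
    capacity_mbps=10.0,
    base_rtt_ms=40.0,
    pressure=web.scaled(0.5).combined(bulk),
)
residual = spec.residual_capacity()
```

In this sketch the same static specification can be reused with different pressure signals, which is one way to picture the composability the abstract describes. How NetForge actually represents and executes these specifications is not stated in the abstract.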

Article Summaries:

  • NetForge is a programmable substrate for generating network data centered on bottleneck links. It decouples bottleneck intent from execution, separates a bottleneck's static attributes from its dynamic congestion pressure, and introduces Cross-Traffic Profiles (CTPs) that convert passive packet traces into reusable, composable pressure signals. In an adaptive bitrate (ABR) case study, the authors report that NetForge's data remains realistic, expands coverage into underrepresented bottleneck regimes, and improves model performance by up to 47% by reducing the transmission-time prediction error of the Fugu model. The evaluation also reports that NetForge meets its four stated requirements: controllability, composability, fidelity, and replicability.

Sources: