Nayef Ghattas

In Part 1: How we tracked down a Go 1.24 memory regression across hundreds of pods, we shared how upgrading to Go 1.24 introduced a subtle runtime regression that increased physical memory usage (RSS) across Datadog services. We worked with the Go community to identify the issue, validate the fix, and plan a safe rollout.

But while tracking the rollout, we noticed something surprising: in our highest-traffic environments, memory usage didn't just recover; it dropped significantly. That raised new questions: What changed in Go 1.24 to make some workloads more memory efficient? And why wasn't the improvement consistent across all environments?

In this post, we'll explore how Go's new Swiss Tables implementation helped reduce memory usage in a large in-memory map, show how we profiled and sized the change, and share the struct-level optimizations that led to even larger fleet-wide savings.
Article Summaries:
- Datadog’s engineering team uncovered a memory regression in Go 1.24 that increased physical memory usage across hundreds of pods. After collaborating with the Go community to patch the issue, they observed an unexpected drop in memory consumption in high‑traffic environments. The improvement stems from Go 1.24’s new Swiss Tables map implementation, which reduces the size of large in‑memory maps such as the shardRoutingCache. Profiling revealed roughly 500 MiB of live heap savings per instance, translating to about 1 GiB of RSS reduction fleet‑wide. The post details the map’s structure, the optimization’s impact, and the resulting fleet‑wide savings.