Khayyam Guliyev, Duarte Nunes, Ming Chen, Justin Jaffray
• As Datadog continues to scale, the volume, complexity, and cardinality of the metrics we ingest and store steadily grow by orders of magnitude.
• This growth pushes the boundaries of our core timeseries database, the internal system responsible for storing raw metric data and serving it to customer queries in real time.
• As with any system facing growing traffic, over time, we encounter new performance challenges, especially under high-cardinality workloads, increasingly complex queries, and bursty traffic patterns.
• Designing a new storage engine is never a decision to take lightly.
• It's a deeply complex undertaking with far-reaching implications for performance, reliability, and operational risk.
• So we always push the existing system as far as we can while building the next generation.
Article Summaries:
- Datadog has unveiled the sixth-generation real-time timeseries storage engine, built from scratch in Rust to meet escalating metric volumes. The new system splits real-time storage into two services (an RTDB for raw data and aggregates, and an index database for metric identifiers and tags), allowing the ingestion router to balance load across nodes. Engineers report a 60-fold boost in ingestion throughput and queries that are five times faster at peak scale, achieved through aggressive performance optimizations and a redesigned architecture that supports high-cardinality workloads and bursty traffic. The upgrade reflects Datadog's ongoing effort to scale its metrics platform while maintaining low latency and reliability.
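
The split described in the summary can be pictured with a small Rust sketch. It is only an illustration under stated assumptions, not Datadog's implementation: the type names (`Point`, `IndexDb`, `RtdbNode`, `Router`) are hypothetical, both stores are in-memory maps, and hashing the series key to pick a node is an assumed placement rule, since the summary does not say how the ingestion router balances load.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::{BTreeSet, HashMap};
use std::hash::{Hash, Hasher};

/// One incoming metric point (hypothetical shape).
pub struct Point {
    pub metric: String,
    pub tags: Vec<String>,
    pub timestamp: u64,
    pub value: f64,
}

/// Stand-in for the index database: series id -> metric name and tags.
#[derive(Default)]
pub struct IndexDb {
    pub series: HashMap<u64, (String, BTreeSet<String>)>,
}

/// Stand-in for one RTDB node: series id -> raw (timestamp, value) samples.
#[derive(Default)]
pub struct RtdbNode {
    pub samples: HashMap<u64, Vec<(u64, f64)>>,
}

/// Hypothetical ingestion router that owns one index and several RTDB nodes.
pub struct Router {
    pub index: IndexDb,
    pub nodes: Vec<RtdbNode>,
}

impl Router {
    pub fn new(node_count: usize) -> Self {
        assert!(node_count > 0, "at least one RTDB node is required");
        Router {
            index: IndexDb::default(),
            nodes: (0..node_count).map(|_| RtdbNode::default()).collect(),
        }
    }

    /// Derive a series id from the metric name and the sorted tag set,
    /// so the order in which tags arrive does not change the id.
    fn series_id(metric: &str, tags: &BTreeSet<String>) -> u64 {
        let mut hasher = DefaultHasher::new();
        metric.hash(&mut hasher);
        tags.hash(&mut hasher);
        hasher.finish()
    }

    /// Record identifiers and tags in the index, store the raw sample on
    /// one RTDB node, and return the index of the node that was chosen.
    pub fn ingest(&mut self, point: Point) -> usize {
        let Point { metric, tags, timestamp, value } = point;
        let tags: BTreeSet<String> = tags.into_iter().collect();
        let id = Self::series_id(&metric, &tags);

        // Assumed placement rule: series id modulo the number of nodes.
        let node = (id % self.nodes.len() as u64) as usize;

        self.index.series.entry(id).or_insert_with(|| (metric, tags));
        self.nodes[node]
            .samples
            .entry(id)
            .or_default()
            .push((timestamp, value));
        node
    }
}
```

In this sketch, two points for the same metric and tag set always reach the same node and share a single index entry, while different series spread across nodes according to their hash. A production router would also need to handle rebalancing and node failure, which the summary does not cover.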
Sources: