Sami Tabet

At Datadog, we process more than 100 trillion events and billions of queries every day, across logs, traces, network data, and more. To support that scale, we built Husky, our third-generation event store. We detailed its architecture in a series of posts on exactly-once ingestion and multi-tenancy, and on massively parallel compaction.

But all of that engineering (efficient storage, compaction, reliability under bursty traffic) was in service of a single goal: interactive querying at scale.

Storing the data is just the beginning. The real challenge is making that data queryable quickly, cheaply, and reliably, even when:

- There's no fixed schema or column types
- Data shape and volume vary across tenants
- Queries span millions of files (called fragments) in object storage and petabytes of data

In this post, we'll explore how Husky's query engine tackles these problems head-on, and how its architecture enables interactive performance, even under extreme workloads.

Article Summaries:

  • Datadog has unveiled Husky, its third-generation event store, built to support interactive querying at a scale of more than 100 trillion events and billions of queries per day. Husky's architecture focuses on efficient, reliable storage and massively parallel compaction, enabling interactive performance across logs, traces, and network data. The system handles heterogeneous schemas, variable tenant workloads, and petabyte-scale data by dividing the query path into four services (planner, orchestrator, metadata, and reader), each multi-tenant and distributed across regions. Husky targets two main query patterns: highly selective “needle-in-a-haystack” searches and broader analytics-style aggregations, delivering low-latency results even under extreme traffic bursts.
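
To make the four-service split more concrete, here is a minimal, single-process sketch in Python. It is not Husky's implementation: the division of responsibilities is an assumption inferred from the service names alone, and every name in it (`Fragment`, `Query`, `metadata_lookup`, `reader_scan`, `orchestrate`, `plan_and_run`) is hypothetical. It only illustrates the general idea of pruning fragments by tenant and time range before scanning schemaless events.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    """Hypothetical stand-in for one file of events in object storage."""
    tenant: str
    start: int     # inclusive timestamp bound
    end: int       # exclusive timestamp bound
    events: tuple  # schemaless events, modeled here as plain dicts


@dataclass(frozen=True)
class Query:
    """Hypothetical equality filter over a tenant and a time range."""
    tenant: str
    start: int
    end: int
    field: str
    value: object


def metadata_lookup(catalog, query):
    """Metadata role (assumed): keep only fragments that could match."""
    return [
        f for f in catalog
        if f.tenant == query.tenant and f.start < query.end and query.start < f.end
    ]


def reader_scan(fragment, query):
    """Reader role (assumed): filter the events of a single fragment."""
    return [
        e for e in fragment.events
        if query.start <= e["ts"] < query.end and e.get(query.field) == query.value
    ]


def orchestrate(fragments, query):
    """Orchestrator role (assumed): fan out to readers, then merge by time."""
    hits = []
    for fragment in fragments:
        hits.extend(reader_scan(fragment, query))
    return sorted(hits, key=lambda e: e["ts"])


def plan_and_run(catalog, query):
    """Planner role (assumed): validate the query, then drive the other stages."""
    if query.start >= query.end:
        raise ValueError("empty time range")
    return orchestrate(metadata_lookup(catalog, query), query)


catalog = [
    Fragment("acme", 0, 100, ({"ts": 10, "status": 500}, {"ts": 20, "status": 200})),
    Fragment("acme", 100, 200, ({"ts": 150, "status": 500, "host": "web-1"},)),
    Fragment("other", 0, 100, ({"ts": 15, "status": 500},)),
]

results = plan_and_run(catalog, Query("acme", 0, 200, "status", 500))
```

Using `e.get(query.field)` rather than direct indexing reflects the no-fixed-schema constraint: an event that lacks the field simply does not match. A real system would run the reader stage in parallel across many machines; the sequential loop here keeps the sketch short.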
