Yoann Ghigoff, Jonathan Ribas, Sylvain Afchain, Sylvain Baubeau, Guillaume Fournier

File integrity monitoring (FIM) helps teams detect unauthorized changes to sensitive files and is a critical part of any security posture. Yet building an FIM system that works reliably across modern, large-scale infrastructure is harder than it looks. When we set out to build an FIM system that could handle the realities of Datadog’s product environments, we quickly realized that existing approaches weren’t going to cut it. Periodic filesystem scans, for example, seem simple and reliable on paper. In practice, they miss the kinds of events we care about most: if an attacker tampers with a file and reverts the change before the next scan, it’s as if nothing ever happened. Even when a scan does catch something, it only tells us that a file changed, not how it changed, why it changed, or who changed it.
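To make the scan blind spot concrete, here is a minimal sketch of a hash-based periodic scan. It is an illustration only, not the approach described in the article: the helper names (`sha256_of`, `scan`, `changed_paths`) and the file `sensitive.conf` are hypothetical. A tamper-and-revert between two scans leaves identical digests, so the scan reports nothing. A change that persists is reported, but only as "this path changed".

```python
import hashlib
import os
import tempfile


def sha256_of(path: str) -> str:
    """Return the SHA-256 hex digest of the file at `path`."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


def scan(paths):
    """One 'periodic scan': snapshot a digest for every monitored path."""
    return {p: sha256_of(p) for p in paths}


def changed_paths(before, after):
    """Return the paths whose digest differs between two snapshots."""
    return [p for p in before if before[p] != after.get(p)]


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical sensitive file, created only for this demonstration.
    target = os.path.join(tmp, "sensitive.conf")
    with open(target, "w") as f:
        f.write("allow_root = false\n")

    baseline = scan([target])

    # Between two scans: tamper with the file, then revert it.
    with open(target, "w") as f:
        f.write("allow_root = true\n")
    with open(target, "w") as f:
        f.write("allow_root = false\n")

    # The digests match again, so the change goes unnoticed.
    missed = changed_paths(baseline, scan([target]))

    # A change that persists until the next scan is noticed, but the
    # snapshot carries no information about who made it or how.
    with open(target, "w") as f:
        f.write("allow_root = true\n")
    detected = changed_paths(baseline, scan([target]))

print("reverted change reported:", missed)
print("persistent change reported:", detected == [target])
```

Even in the second case, the snapshot holds only a path and a digest, which is why the scan cannot say how, why, or by whom the file was changed.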

Article Summaries:

  • Datadog’s engineers built a file integrity monitoring (FIM) system that can handle the scale of modern cloud infrastructure. Traditional periodic scans and Linux tools such as inotify and auditd were insufficient because they missed rapid changes, lacked process and container context, and suffered from high overhead. By leveraging eBPF, the team gained real-time, kernel-level visibility into file events, including the process and container that triggered each one. The main challenge was volume: more than 10 billion file events per minute. To cope, the team implemented kernel-side pre-filtering that discards 94% of events before they reach the agent, preserving critical signals while keeping CPU, memory, and network usage manageable.
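The pre-filtering idea can be sketched in user space. This is a simplified Python simulation under stated assumptions, not Datadog's eBPF implementation: the `FileEvent` fields, the `WATCHED_PREFIXES` rule, and the sample events are all hypothetical, and the summary does not say which criteria the real in-kernel filter uses. The sketch shows only the general shape: events that match no rule are dropped at the source and never forwarded to the agent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileEvent:
    """Hypothetical event shape: what happened, and which process and container did it."""
    path: str
    op: str
    pid: int
    container_id: str


# Assumed rule set for illustration: only these path prefixes are of interest.
WATCHED_PREFIXES = ("/etc/", "/usr/bin/")


def prefilter(events, watched=WATCHED_PREFIXES):
    """Split events into those forwarded to the agent and a count of those dropped."""
    forwarded = []
    dropped = 0
    for event in events:
        # str.startswith accepts a tuple and matches any of its prefixes.
        if event.path.startswith(watched):
            forwarded.append(event)
        else:
            dropped += 1
    return forwarded, dropped


# Synthetic events, invented for this example.
events = [
    FileEvent("/tmp/build/a.o", "write", 101, "c1"),
    FileEvent("/etc/passwd", "write", 202, "c2"),
    FileEvent("/var/log/app.log", "write", 303, "c1"),
    FileEvent("/usr/bin/ssh", "chmod", 404, "c3"),
    FileEvent("/tmp/cache/x", "open", 101, "c1"),
]

forwarded, dropped = prefilter(events)
print([e.path for e in forwarded])
print(f"dropped {dropped} of {len(events)} events")
```

The point of filtering this early is cost: an event dropped at the source is never copied, serialized, or evaluated by the agent, which is where the CPU, memory, and network savings come from.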
