• Reddit’s Flaky Test Quarantine Service (FTQS) evolved from static configuration to dynamic, sequence‑based quarantine.
• Static configuration became a bottleneck as the test suite grew and led to “time‑travel” errors when branches fell out of date.
• Sequence numbers track quarantine state, so a test runs only once it is truly fixed in the current branch.
• The new system keeps the benefits of Configuration‑as‑Code while scaling with the engineering team.
• Dynamic quarantine improves CI stability, reduces on‑call firefighting, and improves developer experience.
• This shift‑left approach shows how automated test hygiene can grow with product complexity.

Article Summaries:

  • Reddit’s Flaky Test Quarantine Service (FTQS) has evolved from a static, repository‑based quarantine file to a dynamic, sequence‑based system. The original approach, described in a prior post, kept known flaky tests from failing builds by committing a quarantine list alongside the code, which improved CI stability and developer experience. As test volumes grew, however, the static file became a bottleneck: feature branches fell out of date, so developers had to rebase to pick up new quarantine entries, which increased merge conflicts, build times, and cognitive load. The new dynamic system aims to eliminate this “rebase‑for‑update” friction by updating quarantine status automatically in real time.
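
As a rough illustration of how sequence numbers could keep a test from running on a branch that does not yet contain its fix, here is a minimal Python sketch. It is not FTQS's actual implementation: the `QuarantineLog` class, its methods, the `branch_seq` parameter, and the test names are all hypothetical, and the idea that each branch carries the sequence number at which it last synced with the main branch is an assumption made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class QuarantineLog:
    """Hypothetical append-only log of quarantine state changes."""

    # Each event is (sequence_number, test_id, quarantined).
    _events: list = field(default_factory=list)

    def record(self, test_id: str, quarantined: bool) -> int:
        """Append a state change and return its sequence number."""
        seq = len(self._events) + 1
        self._events.append((seq, test_id, quarantined))
        return seq

    @property
    def head(self) -> int:
        """Sequence number of the most recent event (0 if the log is empty)."""
        return len(self._events)

    def should_skip(self, test_id: str, branch_seq: int) -> bool:
        """Decide whether a branch synced at `branch_seq` should skip a test."""
        latest = None
        for seq, tid, quarantined in self._events:
            if tid == test_id:
                latest = (seq, quarantined)
        if latest is None:
            return False  # never quarantined: always run
        seq, quarantined = latest
        if quarantined:
            return True  # still quarantined: skip on every branch
        # Marked fixed at `seq`. Assumption: a branch contains the fix only
        # if it synced at or after that point, so older branches keep skipping.
        return branch_seq < seq


log = QuarantineLog()
log.record("test_login", quarantined=True)   # flaky test is quarantined
old_branch_seq = log.head                    # a branch syncs here
log.record("test_login", quarantined=False)  # fix lands on main later

print(log.should_skip("test_login", old_branch_seq))  # old branch lacks the fix
print(log.should_skip("test_login", log.head))        # up-to-date branch has it
```

In this sketch the quarantine decision is made at CI time from the log, so an out-of-date branch gets new quarantine entries without a rebase, while a test marked as fixed is re-enabled only on branches that have caught up to the fix.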

Sources: