Written by Vignesh Raja and Jerry Chu.

Background and Motivation

In a previous post, we introduced Signals-Joiner, a Flink application that enriches input for our real-time, anti-abuse rules-engine, Rule-Executor V2 (REV2), with complex ML signals. Since then, the application has been widely adopted to enrich more safety signals, powering Reddit’s real-time actioning needs.

Recall the high-level architecture of Signals-Joiner below:

https://preview.redd.it/ib96zxso7qtf1.png?width=1354&format=png&auto=webp&s=ee4e862e176b381ed0a23cfc2fc859f90c05a954

As is often the case, running a system in production uncovers opportunities for improvement. For Signals-Joiner, we observed that there was room to improve signal enrichment rates, the primary metric we track to measure system efficacy. Enrichment rate is defined as the percentage of messages that are successfully enriched with a relevant signal, measured independently for each signal stream flowing into Signals-Joiner.
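
To make the enrichment rate definition concrete, here is a minimal Python sketch of the per-stream calculation. It is illustrative only: the `(stream, was_enriched)` record shape, the function name `enrichment_rates`, and the stream names are assumptions, not part of Signals-Joiner.

```python
from collections import Counter


def enrichment_rates(messages):
    """Return {stream: percent of messages enriched}, one entry per signal stream.

    `messages` is an iterable of (stream_name, was_enriched) pairs.
    This record shape is hypothetical and chosen only for illustration.
    """
    total = Counter()
    enriched = Counter()
    for stream, was_enriched in messages:
        total[stream] += 1
        if was_enriched:
            enriched[stream] += 1
    # Each stream is measured independently of the others.
    return {stream: 100.0 * enriched[stream] / total[stream] for stream in total}


# Hypothetical sample: 3 of 4 messages enriched on one stream, 1 of 2 on another.
sample = [
    ("signal_a", True),
    ("signal_a", False),
    ("signal_a", True),
    ("signal_a", True),
    ("signal_b", False),
    ("signal_b", True),
]
rates = enrichment_rates(sample)
```

With this sample, `signal_a` has a rate of 75.0 and `signal_b` a rate of 50.0, showing how one stream can lag without affecting the number reported for another.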
Article Summaries:
- Reddit’s Signals‑Joiner, a Flink‑based enrichment engine for its real‑time anti‑abuse rules, was re‑architected to boost signal enrichment rates. The original design used chained tumbling windows aligned to the Unix epoch, which caused missed joins when content arrived near a window boundary. To address this, engineers replaced the built‑in tumbling windows with custom join logic that creates a separate, key‑aligned window for each content item. This change eliminates boundary‑related misses, allows precise measurement of signal delay, and improves overall enrichment performance while maintaining throughput and reliability.
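
The boundary problem described in the summary can be sketched in a few lines of Python. This is a simplified model, not the Flink implementation: it uses a single tumbling window rather than chained ones, assumes the signal arrives after the content, and the 10-second window size and all function names are hypothetical.

```python
WINDOW_MS = 10_000  # hypothetical window size, for illustration only


def tumbling_window_start(ts_ms, size_ms=WINDOW_MS):
    """Start of the epoch-aligned tumbling window that contains ts_ms."""
    return ts_ms - (ts_ms % size_ms)


def joins_in_tumbling_window(content_ts, signal_ts, size_ms=WINDOW_MS):
    """Content and signal join only if they land in the same epoch-aligned window."""
    return tumbling_window_start(content_ts, size_ms) == tumbling_window_start(signal_ts, size_ms)


def joins_in_per_content_window(content_ts, signal_ts, size_ms=WINDOW_MS):
    """Assumed per-content model: the window opens when the content arrives."""
    return content_ts <= signal_ts < content_ts + size_ms


# Content arrives 500 ms before a window boundary; its signal arrives 700 ms later.
content_ts = 19_500
signal_ts = 20_200

missed_by_tumbling = not joins_in_tumbling_window(content_ts, signal_ts)
caught_by_per_content = joins_in_per_content_window(content_ts, signal_ts)
signal_delay_ms = signal_ts - content_ts
```

Here the content falls in the window starting at 10,000 ms and the signal in the one starting at 20,000 ms, so the epoch-aligned join misses even though the signal is only 700 ms late. A window anchored to the content's own arrival time catches it, and the difference between the two timestamps gives the signal delay directly.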
Sources: