• Authors: Andrew Garrett, Md Mansurul Bhuiyan
• With tens of thousands of new posts on Reddit each day, identifying content that is simultaneously timely, newsworthy, and engaging is a significant challenge.
• Our standard notification recommendation system, which focuses on what you already like and what’s popular, often misses fast-moving, important events.
• To address this, we developed a new system that combines the predictions of machine learning with the language understanding of LLMs to pinpoint and deliver breaking stories.
• Here’s how it works: we have a three-step scoring system.
• First, an XGBoost model produces an “Engagement Score” from how people react to a post early on, predicting how much attention it will receive in 24 hours.
• Second, an LLM applies a detailed editorial guide to produce a “Breakingness Score.” This score rates how urgent the content is, how trustworthy the source is, and how newsworthy it is overall, while filtering out anything sensitive or inappropriate.
Article Summaries:
- Reddit researchers Andrew Garrett and Md Mansurul Bhuiyan unveiled a hybrid machine‑learning and large‑language‑model framework to spot timely, high‑impact posts. The system first uses an XGBoost model to predict a post’s 24‑hour engagement; then an LLM, guided by an editorial rubric, assigns a “breakingness” score that evaluates urgency, source credibility, and newsworthiness while filtering sensitive content. The two scores are multiplied, and only posts above the 99.8th percentile are selected for push notifications. By shifting from a user‑first to a content‑first recommendation strategy, the approach aims to deliver breaking news to users immediately while remaining computationally efficient.
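
The combination step described above (multiply the two scores, keep only posts above the 99.8th percentile) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the `ScoredPost` type, the function names, the nearest-rank percentile method, and the idea of computing the cutoff from a history of previously scored posts are all assumptions made for the example. The engagement and breakingness values stand in for the outputs of the XGBoost model and the LLM, which are not reproduced here.

```python
import math
from dataclasses import dataclass


@dataclass
class ScoredPost:
    post_id: str
    engagement: float    # stand-in for the XGBoost 24-hour engagement prediction
    breakingness: float  # stand-in for the LLM rubric score


def combined_score(post: ScoredPost) -> float:
    """Multiply the two scores, as the summary describes."""
    return post.engagement * post.breakingness


def percentile_threshold(values, pct):
    """Nearest-rank percentile (an assumed method; the source does not name one)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def select_for_notification(history, candidates, pct=99.8):
    """Keep candidates whose combined score is strictly above the cutoff.

    Assumption: the cutoff is taken from a history of previously scored posts.
    """
    cutoff = percentile_threshold([combined_score(p) for p in history], pct)
    return [p for p in candidates if combined_score(p) > cutoff]


# Hypothetical data: 1000 past posts with combined scores 1..1000.
history = [ScoredPost(f"h{i}", float(i), 1.0) for i in range(1, 1001)]
candidates = [
    ScoredPost("breaking", 2500.0, 2.0),  # high on both scores
    ScoredPost("popular", 2500.0, 0.2),   # engaging, but not breaking
]
selected = select_for_notification(history, candidates)
print([p.post_id for p in selected])
```

Because the scores are multiplied, a post needs to do well on both to clear the cutoff. If filtered content were given a breakingness of zero (again an assumption about how the filter is applied), its combined score would be zero and it could never be selected.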
Sources: