Han Wang | Machine Learning Engineer; Alex Whitworth | Staff Data Scientist; Pak Ming Cheung | Sr. Staff Machine Learning Engineer; Zhenjie Zhang | Sr. Staff Machine Learning Engineer

Introduction

Search relevance measures how well search results align with a user’s search query. For personalized search systems, it’s important to ensure that displayed content is pertinent to the user’s information needs, rather than over-relying on the user’s past engagement. At Pinterest Search, we track whole-page relevance in online A/B experiments to evaluate new ranking models and ensure a high-quality user experience. Relevance measurement typically relies on human annotations, but is limited by the low availability of human labels and the high marginal cost of generating them.

Article Summaries:

  • Pinterest Search has introduced a new relevance‑assessment system that leverages large language models (LLMs) to replace costly human annotations. By fine‑tuning open‑source multilingual LLMs on a 5‑level relevance scale (Highly Relevant to Highly Irrelevant), the team builds a cross‑encoder model that scores how well a Pin matches a user query. The model incorporates text from Pin titles, descriptions, image captions, linked pages, and board titles, and is used to evaluate ranking results in online A/B tests. This approach cuts labeling costs, speeds up evaluation, and enables finer‑grained metric analysis across larger query sets.
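To make the cross-encoder setup in the summary more concrete, here is a minimal sketch of how a query and a Pin's text features might be packed into one input pair, and how five class scores might be turned into a relevance level. Everything named here is an assumption for illustration: `PinText`, `build_pair_text`, `predict_level`, the field names, and the `[SEP]` separator string are hypothetical and not taken from the Pinterest system. The sketch also assumes level 5 corresponds to Highly Relevant and level 1 to Highly Irrelevant, and it uses hand-written logits in place of a real fine-tuned LLM.

```python
import math
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class PinText:
    """Hypothetical container for the Pin text features named in the summary."""
    title: str = ""
    description: str = ""
    image_caption: str = ""
    linked_page_text: str = ""
    board_titles: List[str] = field(default_factory=list)


def build_pair_text(query: str, pin: PinText) -> Tuple[str, str]:
    """Return (query, pin_side) as a text pair for a cross-encoder.

    Empty features are skipped. The " [SEP] " separator is a placeholder
    string, not a claim about any particular tokenizer.
    """
    parts = [
        ("title", pin.title),
        ("description", pin.description),
        ("caption", pin.image_caption),
        ("link", pin.linked_page_text),
        ("boards", "; ".join(pin.board_titles)),
    ]
    pin_side = " [SEP] ".join(f"{name}: {value}" for name, value in parts if value)
    return query, pin_side


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a list of class scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_level(logits: List[float]) -> Tuple[int, List[float]]:
    """Map five class logits to a level in 1..5 plus class probabilities.

    Assumed ordering: index 0 is level 1 (Highly Irrelevant),
    index 4 is level 5 (Highly Relevant).
    """
    probs = softmax(logits)
    best_index = max(range(len(probs)), key=probs.__getitem__)
    return best_index + 1, probs
```

In a real pipeline the logits would come from the fine-tuned model's classification head applied to the encoded pair; the helper functions above only show the shape of the inputs and outputs.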

Sources: