Detecting backdoored language models at scale

• Today, we are releasing new research on detecting backdoors in open-weight language models.
• Our research highlights several key properties of language model backdoors, laying t

Cybersecurity · February 4, 2026 (updated February 24, 2026)