DeepContext: Stateful Real-Time Detection of Multi-Turn Adversarial Intent Drift in LLMs

• DeepContext introduces stateful monitoring for LLM safety, tracking user intent across conversation turns.
• Uses a recurrent neural network (RNN) to process a sequence of fine-tuned turn-level embeddings, preserving conversation context.
• Detects multi-turn adversarial intent drift, the gradual escalation used by attacks such as Crescendo, which stateless guardrails can miss.
• Achieves an F1 score of 0.84, surpassing Llama-Prompt-Guard-2 and Granite-Guardian.
• Maintains sub-20 ms inference latency on a T4 GPU, enabling real-time deployment.
• Demonstrates that sequential intent modeling is more efficient than deploying massive stateless models.

Article Summaries:

DeepContext: Stateful Real‑Time Detection of Multi‑Turn Adversarial Intent Drift in LLMs

A new research paper introduces DeepContext, a stateful monitoring framework that tracks user intent across multi-turn dialogues with large language models (LLMs). Unlike existing stateless safety guardrails, which evaluate each turn in isolation, DeepContext feeds a sequence of fine-tuned turn-level embeddings into a recurrent neural network, capturing risk that builds up incrementally and can slip past single-turn filters. In benchmark tests, the system achieves a state-of-the-art F1 score of 0.84 for jailbreak detection, surpassing leading open-weight models (≈0.67) and hyperscaler guardrails. It adds less than 20 ms of latency on a T4 GPU, which makes real-time deployment feasible. The work highlights the importance of temporal awareness in LLM safety.
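
To make the stateful idea concrete, here is a minimal sketch of a recurrent monitor that carries a hidden state from one turn to the next. It is not the paper's implementation: the summary does not describe the architecture in detail, so this uses a plain Elman-style recurrence, and every name (StatefulIntentMonitor, build_toy_monitor), the hand-picked weights, and the 2-dimensional toy "embeddings" are assumptions made purely for illustration. A real system would use learned weights and embeddings from a fine-tuned encoder.

```python
import math
from typing import List, Sequence


def matvec(matrix: Sequence[Sequence[float]], vector: Sequence[float]) -> List[float]:
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


class StatefulIntentMonitor:
    """Hypothetical Elman-style recurrent scorer; illustrative only."""

    def __init__(self, w_in, w_rec, bias, w_out, b_out):
        self.w_in, self.w_rec, self.bias = w_in, w_rec, bias
        self.w_out, self.b_out = w_out, b_out
        self.hidden = [0.0] * len(bias)  # conversation state, starts empty

    def observe(self, turn_embedding: Sequence[float]) -> float:
        """Fold one turn into the hidden state and return a risk score in (0, 1)."""
        from_input = matvec(self.w_in, turn_embedding)
        from_state = matvec(self.w_rec, self.hidden)
        self.hidden = [
            math.tanh(i + s + b)
            for i, s, b in zip(from_input, from_state, self.bias)
        ]
        logit = sum(w * h for w, h in zip(self.w_out, self.hidden)) + self.b_out
        return sigmoid(logit)


def build_toy_monitor() -> StatefulIntentMonitor:
    # Assumed, hand-picked weights: dimension 0 of the toy embedding acts as
    # a "risk direction"; dimension 1 is ignored by the output layer.
    return StatefulIntentMonitor(
        w_in=[[1.0, 0.0], [0.0, 1.0]],
        w_rec=[[0.9, 0.0], [0.0, 0.9]],
        bias=[0.0, 0.0],
        w_out=[4.0, 0.0],
        b_out=-2.0,
    )


def score_statefully(turns: Sequence[Sequence[float]]) -> List[float]:
    """One monitor for the whole conversation: state persists across turns."""
    monitor = build_toy_monitor()
    return [monitor.observe(turn) for turn in turns]


def score_statelessly(turns: Sequence[Sequence[float]]) -> List[float]:
    """A fresh monitor per turn: each turn is judged in isolation."""
    return [build_toy_monitor().observe(turn) for turn in turns]


# Toy conversation: every turn carries the same mild risk signal (0.3).
toy_turns = [[0.3, 0.1], [0.3, -0.2], [0.3, 0.0], [0.3, 0.4]]
print("stateless:", [round(s, 3) for s in score_statelessly(toy_turns)])
print("stateful: ", [round(s, 3) for s in score_statefully(toy_turns)])
```

With these toy weights, each turn scored in isolation stays below a 0.5 threshold, while the stateful score rises turn by turn and eventually crosses it. That is the kind of gradual drift a per-turn filter cannot see, though the actual thresholds and dynamics in DeepContext may differ.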

Sources:

https://arxiv.org/abs/2602.16935