Computer Science > Artificial Intelligence
[Submitted on 23 Feb 2026]

Title: Implicit Intelligence – Evaluating Agents on What Users Don’t Say

Abstract: Real-world requests to AI agents are fundamentally underspecified. Natural human communication relies on shared context and unstated constraints that speakers expect listeners to infer. Current agentic benchmarks test explicit instruction-following but fail to evaluate whether agents can reason about implicit requirements spanning accessibility needs, privacy boundaries, catastrophic risks, and contextual constraints. We present Implicit Intelligence, an evaluation framework testing whether AI agents can move beyond prompt-following to become genuine goal-fulfillers, paired with Agent-as-a-World (AaW), a harness where interactive worlds are defined in human-readable YAML files and simulated by language models. Our scenarios feature apparent simplicity in user requests, hidden complexity in correct solutions, and discoverability of constraints through environmental exploration. Evaluating 16 frontier and open-weight models across 205 scenarios, we find that even the best-performing model achieves only 48.3% scenario pass rate, revealing substantial room for improvement in bridging the gap between literal instruction-following and human-like contextual reasoning.
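
As a rough illustration of the headline metric, the sketch below computes a scenario pass rate in Python. It is not the paper's scoring code: the `ScenarioResult` structure, the scenario names, and the rule that a scenario passes only when every implicit-requirement check is satisfied are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class ScenarioResult:
    """Outcome of one scenario (hypothetical structure, not from the paper)."""
    name: str
    # One boolean per implicit requirement the agent was expected to infer.
    checks: list[bool] = field(default_factory=list)

    @property
    def passed(self) -> bool:
        # Assumption: a scenario passes only if every check is satisfied.
        return bool(self.checks) and all(self.checks)


def scenario_pass_rate(results: list[ScenarioResult]) -> float:
    """Return the percentage of scenarios that passed."""
    if not results:
        raise ValueError("no scenario results to score")
    return 100.0 * sum(r.passed for r in results) / len(results)


# Made-up results for four made-up scenarios.
results = [
    ScenarioResult("mute_phone", [True, True]),
    ScenarioResult("share_report", [True, False]),
    ScenarioResult("clean_disk", [False]),
    ScenarioResult("book_table", [True]),
]
print(f"{scenario_pass_rate(results):.1f}%")  # prints 50.0%
```

Under this all-checks-must-pass assumption, partial credit inside a scenario does not count. For scale, 99 passes out of 205 scenarios would round to 48.3%; the abstract reports only the percentage, not the underlying count.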

