TemporalBench: A Benchmark for Evaluating LLM-Based Agents on Contextual and Event-Informed Time Series Tasks
• TemporalBench offers a multi-domain benchmark for temporal reasoning in LLM agents. • Four-tier taxonomy tests historical structure, context-free, contextual, and event-condition