Reinforcement learning

Deep Reinforcement Learning for Optimizing Energy Consumption in Smart Grid Systems

• PINNs replace costly smart grid simulators, reducing sample inefficiency in RL-based OPF solutions.
• RL policy learning converges 50% faster using PINN surrogates versus traditi

FineRef: Fine-Grained Error Reflection and Correction for Long-Form Generation with Citations

• FineRef introduces fine-grained error reflection for citation mismatch and irrelevance in long-form LLM generation.
• Two-stage training: supervised fine-tuning with attempt-refl

Scaling the Scaling Logic: Agentic Meta-Synthesis of Logic Reasoning

• RLVR scaling limited by scarce verifiable training signals, especially for complex logic tasks.
• Logical reasoning offers formal constraints and programmatically checkable answe

UniRG: Scaling medical imaging report generation with multimodal reinforcement learning

• AI-driven radiology report generation boosts provider efficiency and reduces reporting burden.
• Traditional models overfit to institutional phrasing, limiting generalization to

Multimodal reinforcement learning with agentic verifier for AI agents

• Argos trains multimodal RL agents to reward answers grounded in visual and temporal evidence, not just plausibility.
• Automated verification selects specialized tools per answer