Quantifying construct validity in large language model evaluations

• Computer Science > Artificial Intelligence. Submitted on 17 Feb 2026.

Research · February 18, 2026 (updated February 19, 2026) · 2 min · 222 words
ResearchGym: Evaluating Language Model Agents on Real-World AI Research

• Computer Science > Artificial Intelligence. Submitted on 16 Feb 2026.

Research · February 18, 2026 (updated February 19, 2026) · 1 min · 209 words
Secure and Energy-Efficient Wireless Agentic AI Networks

• Computer Science > Artificial Intelligence. Submitted on 16 Feb 2026. Abstract: In this

Research · February 18, 2026 (updated February 19, 2026) · 2 min · 297 words
When Remembering and Planning are Worth it: Navigating under Change

• Computer Science > Artificial Intelligence. Submitted on 17 Feb 2026.

Research · February 18, 2026 (updated February 19, 2026) · 2 min · 293 words
World-Model-Augmented Web Agents with Action Correction

• Computer Science > Artificial Intelligence. Submitted on 17 Feb 2026. Abstract: Web agent

Research · February 18, 2026 (updated February 19, 2026) · 2 min · 276 words
People who switched to cannabis drinks cut their alcohol use nearly in half

• Study shows cannabis-infused drinks can cut alcohol consumption by nearly 50%. • Researchers from University at Buffalo highlight cannabis as a harm‑reduction tool for heavy dri

Science · February 18, 2026 (updated February 24, 2026) · 1 min · 179 words
Why Europe barred China from flagship Horizon research programmes

• Chinese research organizations can no longer take part in most of the research grants funded by Horizon

Science · February 18, 2026 (updated February 24, 2026) · 2 min · 295 words

Author Correction: The genomic landscape of response to EGFR blockade in colorectal cancer

• Nature article on EGFR blockade response in colorectal cancer corrected figure duplication. • Micrograph in Extended Data Fig. 8 mistakenly duplicated from Fig. 10, misrepresenti

Science · February 18, 2026 (updated February 24, 2026) · 1 min · 166 words
Don't deprioritize curiosity-driven research

• UKRI paused medical, biological, and physical science grants amid policy review, unsettling researchers. • Short‑term contract holders risk career setbacks due to funding freeze.

Science · February 18, 2026 (updated February 24, 2026) · 1 min · 167 words

Bioinspired adaptive pupil reflex based on liquid-metal shape-shifters for machine vision

• Science Robotics, Volume 11, Issue 111, February 2026.

Research · February 18, 2026 (updated February 19, 2026) · 1 min · 21 words

Scalable robot collective resilience by sharing resources

• Science Robotics, Volume 11, Issue 111, February 2026.

Research · February 18, 2026 (updated February 19, 2026) · 1 min · 21 words

Horses with over 30 minutes of REM sleep show better persistence in learning tasks

• Horses with ≥30 minutes REM sleep daily outperform peers in field learning tasks. • Short REM periods reduce perseverance and performance during demanding training. • New field‑a

Science · February 17, 2026 (updated February 24, 2026) · 1 min · 148 words

Silenced no more: Why U.S. online reviews turned longer and more negative

• Legal threats historically silenced detailed, negative consumer reviews online. • Removing these threats instantly spurred longer, more candid feedback from U.S. consumers. •

Science · February 17, 2026 (updated February 24, 2026) · 1 min · 122 words

Prehistoric fossil poses puzzles in shark research

• A newly examined prehistoric shark from the age of dinosaurs provides surprising insights into the early evolution of modern sharks. • It cannot be confidently assigned to any sh

Science · February 17, 2026 (updated February 24, 2026) · 1 min · 96 words

Seal pup communication is more similar to that of humans than previously thought, researcher finds

• Seal pups exhibit turn-taking in vocal exchanges, mirroring human conversational patterns. • Their calls converge over time, becoming more similar when pups interact closely. • R

Science · February 17, 2026 (updated February 24, 2026) · 1 min · 164 words
Unify now or pay later: New research exposes the operational cost of a fragmented SOC

Cybersecurity · February 17, 2026 (updated February 25, 2026) · 2 min · 295 words
New research claims wormholes are temporal mirrors, not interstellar tunnels

• The ‘mirror’ framework addresses the black hole information paradox, a conflict between quantum mecha

Science on the Double: How an AI-Powered 'Digital Twin' Accelerates Chemistry and Materials Discoveries

• The post Science on the Double: How an AI-Powered ‘Digital Twin’ Accelerates Chemistry and Materials Discoveries appeared first on Berkeley Lab News Center.

Research · February 17, 2026 (updated February 19, 2026) · 1 min · 203 words
What Is a Digital Twin?

• Digital twins are transforming how scientists study and improve complex systems, reducing the time between discovery and delivery. • Learn wha

Research · February 17, 2026 (updated February 18, 2026) · 1 min · 205 words
SCS Faculty Named 2026 Sloan Research Fellows

• Mallory Lindahl, Tuesday, February 17, 2026. • Three faculty members in Carnegie Mellon University’s School of Computer Science are among five CMU researchers select

Research & Labs · February 17, 2026 (updated February 25, 2026) · 2 min · 214 words

Growing evidence that freshwater wildlife is impacted by microplastics

• Microplastics detected in droppings of freshwater birds across Europe • Study led by University of Glasgow, published in Environmental Research • Findings confirm widespread pres

Science · February 17, 2026 (updated February 24, 2026) · 1 min · 128 words

New research calls for 'heat literacy' in Australia

• James Cook University (JCU) research argues Australians urgently need better education about heat to prepare for longer, hotter and more dangerous heat waves driven by climate ch

Science · February 17, 2026 (updated February 24, 2026) · 1 min · 59 words

Tropical forests generate rainfall worth billions, study finds

• Tropical forests produce 2.4 million liters of rainfall per hectare annually. • Study quantifies rainfall generation, equating to an Olympic-sized pool per hectare. • Findings highl

Science · February 17, 2026 (updated February 24, 2026) · 1 min · 143 words

Quantum sensor research advances the pursuit of dark matter

• Researchers at the Department of Energy’s Oak Ridge National Laboratory are helping to pave a path for the eventual discovery of dark matter. • With new approaches to measurement

Science · February 17, 2026 (updated February 24, 2026) · 1 min · 120 words
Authorization of prognostic AI medical devices

• Subjects: Health policy, Medical humanities. • Less than 2% of artificial intelligence devices authorized by the US Food and Drug Administration are prognostic, with prediction horizons rangi

Research · February 17, 2026 (updated February 19, 2026) · 2 min · 248 words
Research Bits: Feb. 17

A Geometric Taxonomy of Hallucinations in LLMs

• Computer Science > Artificial Intelligence. Submitted on 26 Jan 2026. Abstract: The term ‘hallucin

Research · February 17, 2026 (updated February 19, 2026) · 1 min · 210 words
Agentic AI for Commercial Insurance Underwriting with Adversarial Self-Critique

• Computer Science > Artificial Intelligence. Submitted on 21 Jan 2026.

Research · February 17, 2026 (updated February 19, 2026) · 2 min · 218 words
AST-PAC: AST-guided Membership Inference for Code

• Computer Science > Artificial Intelligence. Submitted on 30 Jan 2026. Abstract: Code Large Lang

Research · February 17, 2026 (updated February 19, 2026) · 2 min · 216 words
BotzoneBench: Scalable LLM Evaluation via Graded AI Anchors

• Computer Science > Artificial Intelligence. Submitted on 22 Jan 2026. Abstract: Large

Research · February 17, 2026 (updated February 19, 2026) · 2 min · 243 words
DPBench: Large Language Models Struggle with Simultaneous Coordination

• Computer Science > Artificial Intelligence. Submitted on 2 Feb 2026.

Research · February 17, 2026 (updated February 19, 2026) · 2 min · 254 words
General learned delegation by clones

• Computer Science > Artificial Intelligence. Submitted on 3 Feb 2026. Abstract: Frontier language models impr

Research · February 17, 2026 (updated February 19, 2026) · 2 min · 286 words
Human-Centered Explainable AI for Security Enhancement: A Deep Intrusion Detection Framework

• Computer Science > Artificial Intelligence. Submitted on 4 Feb 2026.

Research · February 17, 2026 (updated February 19, 2026) · 2 min · 232 words
Intelligence as Trajectory-Dominant Pareto Optimization

• Computer Science > Artificial Intelligence. Submitted on 28 Jan 2026. Abstract: Despite r

Research · February 17, 2026 (updated February 19, 2026) · 2 min · 255 words
Lang2Act: Fine-Grained Visual Reasoning through Self-Emergent Linguistic Toolchains

• Computer Science > Artificial Intelligence. Submitted on 29 Jan 2026.

Research · February 17, 2026 (updated February 19, 2026) · 2 min · 223 words
MAPLE: A Sub-Agent Architecture for Memory, Learning, and Personalization in Agentic AI Systems

• Computer Science > Artificial Intelligence. Submitted on 3 Feb 2026.

Research · February 17, 2026 (updated February 19, 2026) · 2 min · 284 words
NL2LOGIC: AST-Guided Translation of Natural Language into First-Order Logic with Large Language Models

• Computer Science > Artificial Intelligence. Submitted on 29 Jan 2026.

Research · February 17, 2026 (updated February 19, 2026) · 2 min · 250 words
PlotChain: Deterministic Checkpointed Evaluation of Multimodal LLMs on Engineering Plot Reading

• Computer Science > Artificial Intelligence. Submitted on 29 Jan 2026.

Research · February 17, 2026 (updated February 19, 2026) · 2 min · 280 words
ProMoral-Bench: Evaluating Prompting Strategies for Moral Reasoning and Safety in LLMs

• Computer Science > Artificial Intelligence. Submitted on 5 Feb 2026.

Research · February 17, 2026 (updated February 19, 2026) · 2 min · 232 words
Scaling the Scaling Logic: Agentic Meta-Synthesis of Logic Reasoning

• Computer Science > Artificial Intelligence. Submitted on 23 Jan 2026.

Research · February 17, 2026 (updated February 19, 2026) · 2 min · 243 words
Stay in Character, Stay Safe: Dual-Cycle Adversarial Self-Evolution for Safety Role-Playing Agents

• Computer Science > Artificial Intelligence. Submitted on 29 Jan 2026.

Research · February 17, 2026 (updated February 19, 2026) · 2 min · 250 words
TemporalBench: A Benchmark for Evaluating LLM-Based Agents on Contextual and Event-Informed Time Series Tasks

• Computer Science > Artificial Intelligence. Submitted on 5 Feb 2026.

Research · February 17, 2026 (updated February 19, 2026) · 2 min · 267 words
TemporalBench: A Benchmark for Evaluating LLM-Based Agents on Contextual and Event-Informed Time Series Tasks

• TemporalBench offers a multi-domain benchmark for temporal reasoning in LLM agents. • Four-tier taxonomy tests historical structure, context-free, contextual, and event-condition

Research & Labs · February 17, 2026 (updated February 24, 2026) · 1 min · 154 words
Variation is the Key: A Variation-Based Framework for LLM-Generated Text Detection

• Computer Science > Artificial Intelligence. Submitted on 27 Jan 2026.

Research · February 17, 2026 (updated February 19, 2026) · 2 min · 233 words
X-Blocks: Linguistic Building Blocks of Natural Language Explanations for Automated Vehicles

• Computer Science > Artificial Intelligence. Submitted on 2 Feb 2026.

Research · February 17, 2026 (updated February 19, 2026) · 2 min · 269 words
The Open Source Community Continues to Criticize Prusa Research's OCL

• Some continue to criticize Prusa Research’s new Open Community License, but why? • The backstory: Last December, Prusa Research announced a new license for their products, specifi

Basics2Breakthroughs: Studying the Hydraulic Mechanism in Jumping Spiders

• The post Basics2Breakthroughs: Studying the Hydraulic Mechanism in Jumping Spiders appeared first on Berkeley Lab News Center.

Research · February 13, 2026 (updated February 19, 2026) · 1 min · 41 words

Scaling social science research

• GABRIEL is a new open-source toolkit from OpenAI that uses GPT to turn qualitative text and images into quantitative data, helping social scientists analyze research at scale.

Gemini 3 Deep Think: Advancing science, research and engineering

• Feb 12, 2026. Our most specialized reasoning mode is now updated to solve modern science, research and engineering

Crunching Big Data Into 3D Images Accelerates Discovery

• The post Crunching Big Data Into 3D Images Accelerates Discovery appeared first on Berkeley Lab News Center.

Research · February 12, 2026 (updated February 19, 2026) · 1 min · 41 words
Safety review completed at South African research reactor

• The five-day six-person Safety Review Mission on Ageing Management and Continued Safe Operation was at the invitation of

Nuclear & Fusion · February 12, 2026 (updated February 24, 2026) · 2 min · 287 words

A 'Robot Pizza Chef' Serving Up Better Quantum Computers

• The post A ‘Robot Pizza Chef’ Serving Up Better Quantum Computers appeared first on Berkeley Lab News Center.

Research · February 11, 2026 (updated February 19, 2026) · 1 min · 43 words

New AI Sensor 'Sniffs' Out Spectral Targets

• The post New AI Sensor ‘Sniffs’ Out Spectral Targets appeared first on Berkeley Lab News Center.

Research · February 11, 2026 (updated February 19, 2026) · 1 min · 39 words
Jahanian Among National Academy of Engineering's Class of 2026

• Carnegie Mellon’s Cagan, Jahanian, and alumnus Pitel elected to the National Academy of Engineering 2026 class. • Cagan leads Mechanical Engineering, pioneering design automation

Research & Labs · February 11, 2026 (updated February 24, 2026) · 1 min · 177 words
Sven Koenig wins the 2026 ACM/SIGAI Autonomous Agents Research Award

US funding for research into recycling used nuclear fuel

• The Department of Energy (DOE) noted that less than 5% of the potential energy in the USA’s nuclear fuel is extracted aft

Nuclear & Fusion · February 9, 2026 (updated February 24, 2026) · 2 min · 290 words
When space plasmas collide!!

• Space is dominated by plasma, the most common state of matter in the cosmos. • Modeling plasma is tough: fluid dynamics, turbulence, and Maxwell’s equations all intertwine. • The

Group Note Drafts: Cognitive Accessibility Research Modules

• The Accessible Platform Architectures Working Group has published the first draft of the Group Note titled Cognitive Accessibility Research Modules. • This set of modules looks

Berkeley Lab Leads Effort to Build AI Assistant for Energy Materials Discovery

• The post Berkeley Lab Leads Effort to Build AI Assistant for Energy Materials Discovery appeared first on Berkeley Lab News Center.

Research · February 3, 2026 (updated February 19, 2026) · 1 min · 49 words
The Future of the Global Open-Source AI Ecosystem: From DeepSeek to AI+

• Open source is now the dominant strategy for Chinese AI firms, driving large‑scale deployment and integration. • DeepSeek leads Hugging Face followers, while Qwen ranks fourth, s