• Replication crisis threatens empirical research credibility, driven by high costs and low incentives for replication.
• LLMs accelerate scientific output by automating writing, coding, and reviewing, but risk outpacing verification.
• Authors propose an LLM-based system that reproduces statistical analyses and flags discrepancies.
• The prototype iterates text interpretation, code generation, execution, and discrepancy analysis on public datasets.
• Demonstrated by reproducing key results from a seminal sociology paper, validating the approach.
• Potential applications include pre-submission checks, peer-review support, and meta-scientific audits to strengthen integrity.

Article Summaries:

  • Summary

A new prototype uses large language models (LLMs) to automate the replication of statistical analyses in quantitative social science. The system works in a loop: it interprets the paper's text, generates analysis code, executes that code, and compares the output with the published results, flagging discrepancies that may indicate errors or irreproducibility. The approach leverages the field’s reliance on standard statistical models, publicly available datasets, and uniform reporting formats such as regression tables. Demonstrated on a seminal sociology paper, the tool shows promise for pre‑submission checks, peer‑review support, and meta‑scientific audits, positioning AI‑driven verification as assistive infrastructure to strengthen research integrity.
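The comparison step of that loop can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `flag_discrepancies`, the `Discrepancy` record, the `decimals` parameter, and the sample coefficients are all hypothetical, and the matching rule (a reproduced value matches if it would round to the reported value at the table's printed precision) is an assumption chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Discrepancy:
    """One flagged difference between a published table and a reproduction."""
    term: str
    reported: Optional[float]
    reproduced: Optional[float]
    reason: str


def flag_discrepancies(
    reported: Dict[str, float],
    reproduced: Dict[str, float],
    decimals: int = 2,
) -> List[Discrepancy]:
    """Compare reported and reproduced coefficients term by term.

    Assumption: the published table prints `decimals` decimal places, so a
    reproduced value within half a unit of the last printed digit is
    treated as a match.
    """
    tolerance = 0.5 * 10 ** (-decimals) + 1e-9  # small slack for float error
    flags: List[Discrepancy] = []
    for term, reported_value in reported.items():
        if term not in reproduced:
            flags.append(
                Discrepancy(term, reported_value, None, "missing from reproduction")
            )
            continue
        reproduced_value = reproduced[term]
        if abs(reproduced_value - reported_value) > tolerance:
            flags.append(
                Discrepancy(term, reported_value, reproduced_value, "value mismatch")
            )
    return flags


# Hypothetical values standing in for a published regression table and
# for the output of LLM-generated analysis code.
reported = {"education": 0.31, "age": -0.02, "income": 1.10}
reproduced = {"education": 0.3149, "age": -0.05}

flags = flag_discrepancies(reported, reproduced)
for flag in flags:
    print(f"{flag.term}: {flag.reason} "
          f"(reported={flag.reported}, reproduced={flag.reproduced})")
```

In this sketch, `education` passes because 0.3149 would print as 0.31, while `age` and `income` are flagged. A fuller system would presumably feed such flags back into the next iteration of text interpretation and code generation, but how the prototype does that is not described here.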

Sources: