Models That Prove Their Own Correctness

• Models That Prove Their Own Correctness Models That Prove Their Own Correctness AuthorsNoga Amitâ , Shafi Goldwasserâ , Orr Paradiseâ , Guy N. • Rothblum View publication Copy Bibtex How can we trust the correctness of a learned model on a particular input of interest? • Model accuracy is typically measured on average over a distribution of inputs, giving no guarantee for any fixed input. • This paper proposes a theoretically-founded solution to this problem: to train Self-Proving models that prove the correctness of their output to a verification algorithm V via an Interactive Proof. • Self-Proving models satisfy that, with high probability over an input sampled from a given distribution, the model generates a correct output and successfully proves its correctness to V. • The soundness property of V guarantees that, for every input, no model can convince V of the correctness of an incorrect output.

Article Summaries:

Models That Prove Their Own Correctness AuthorsNoga Amitâ , Shafi Goldwasserâ , Orr Paradiseâ , Guy N. Rothblum Models That Prove Their Own Correctness AuthorsNoga Amitâ , Shafi Goldwasserâ , Orr Paradiseâ , Guy N. Rothblum How can we trust the correctness of a learned model on a particular input of interest? Model accuracy is typically measured on average over a distribution of inputs, giving no guarantee for any fixed input. This paper proposes a theoretically-founded solution to this problem: to train Self-Proving models that prove the correctness of their output to a verification algorithm

Sources:

https://machinelearning.apple.com/research/correctness (Latest source article published: 2026-02-17 00:00 UTC)