What exactly does word2vec learn?

• What exactly does word2vec learn, and how? • Answering this question amounts to understanding representation learning in a minimal yet interesting language modeling task. • Despite the fact that word2vec is a well-known precursor to modern language models, for many years, researchers lacked a quantitative and predictive theory describing its learning process. • In our new paper, we finally provide such a theory. • We prove that there are realistic, practical regimes in which the learning problem reduces to unweighted least-squares matrix factorization. • We solve the gradient flow dynamics in closed form; the final learned representations are simply given by PCA.

Article Summaries:

What exactly does word2vec learn, and how? Answering this question amounts to understanding representation learning in a minimal yet interesting language modeling task. Despite the fact that word2vec is a well-known precursor to modern language models, for many years, researchers lacked a quantitative and predictive theory describing its learning process. In our new paper, we finally provide such a theory. We prove that there are realistic, practical regimes in which the learning problem reduces to unweighted least-squares matrix factorization. We solve the gradient flow dynamics in closed for

Sources:

http://bair.berkeley.edu/blog/2025/09/01/qwem-word2vec-theory/