Duality Models: An Embarrassingly Simple One-step Generation Paradigm

Computer Science > Machine Learning. Submitted on 4 Feb 2026.

Abstract: Consistency-based generative models like Shortcut and MeanFlow achieve impressive results via a target-aware design for solving the Probability Flow ODE (PF-ODE). Typically, such methods introduce a target time $r$ alongside the current time $t$ to modulate outputs between a local multi-step derivative ($r = t$) and a global few-step integral ($r = 0$). However, the conventional “one input, one output” paradigm enforces a partition of the training budget, often allocating a significant portion (e.g., 75% in MeanFlow) solely to the multi-step objective for stability. This separation forces a trade-off: allocating sufficient samples to the multi-step objective leaves the few-step generation undertrained, which harms convergence and limits scalability. To this end, we propose Duality Models (DuMo) via a “one input, dual output” paradigm. Using a shared backbone with dual heads, DuMo simultaneously predicts velocity $v_t$ and flow-map $u_t$ from a single input $x_t$.
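The two regimes named in the abstract can be made concrete with a short worked relation. The block below is a sketch under an assumption: it borrows the average-velocity notation used by MeanFlow-style methods, which this abstract does not spell out, so the exact parameterization used by DuMo may differ.

```latex
% Assumed MeanFlow-style notation; not stated in the abstract itself.
% u is the average velocity over [r, t]; v is the instantaneous PF-ODE velocity.
u(x_t, r, t) = \frac{1}{t - r} \int_r^t v(x_\tau, \tau)\, d\tau,
\qquad
\lim_{r \to t} u(x_t, r, t) = v(x_t, t),
\qquad
x_r = x_t - (t - r)\, u(x_t, r, t).
```

Under this convention, letting $r \to t$ recovers the local derivative used for multi-step sampling, while setting $r = 0$ turns the last relation into a single jump from $x_t$ to the end of the trajectory, which is the few-step case.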

Article Summaries:

Duality Models (DuMo) introduce a “one-input, dual-output” framework for consistency-based generative models. Rather than splitting the training budget between a multi-step objective and a few-step objective, DuMo uses a shared backbone with two heads that simultaneously predict the velocity $v_t$ and the flow-map $u_t$ from a single input $x_t$. This design applies geometric constraints from the multi-step objective to every training sample, tightening the few-step estimate without sacrificing stability. On ImageNet 256 × 256, a 679M-parameter Diffusion Transformer with SD-VAE achieves a state-of-the-art FID of 1.79 in just two steps, demonstrating improved efficiency and scalability.
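To illustrate the “one input, dual output” idea, here is a minimal pure-Python sketch of a shared backbone feeding two heads, so that every sample contributes to both objectives. It is not the authors' implementation: the names `DualHeadModel` and `joint_loss`, the tiny tanh backbone, the conditioning on both $t$ and $r$, and the equal weighting of the two loss terms are all assumptions made for illustration. How the two regression targets are constructed is not described in this summary, so they are passed in as placeholders.

```python
import math
import random


class DualHeadModel:
    """Toy shared backbone with a velocity head and a flow-map head (hypothetical)."""

    def __init__(self, dim, hidden, seed=0):
        rng = random.Random(seed)

        def matrix(rows, cols):
            scale = 1.0 / math.sqrt(cols)
            return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]

        # Backbone input is x_t concatenated with the times t and r (an assumption).
        self.w_backbone = matrix(hidden, dim + 2)
        self.w_velocity = matrix(dim, hidden)  # head predicting v_t
        self.w_flow_map = matrix(dim, hidden)  # head predicting u_t

    def forward(self, x_t, t, r):
        inputs = list(x_t) + [t, r]
        # Shared features are computed once and reused by both heads.
        features = [
            math.tanh(sum(w * x for w, x in zip(row, inputs)))
            for row in self.w_backbone
        ]
        v_pred = [sum(w * f for w, f in zip(row, features)) for row in self.w_velocity]
        u_pred = [sum(w * f for w, f in zip(row, features)) for row in self.w_flow_map]
        return v_pred, u_pred


def mse(pred, target):
    """Mean squared error between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(pred, target)) / len(pred)


def joint_loss(model, x_t, t, r, v_target, u_target):
    """Both heads are supervised on the same sample, with no split of the budget.

    v_target and u_target are placeholders; equal weighting is an assumption.
    """
    v_pred, u_pred = model.forward(x_t, t, r)
    return mse(v_pred, v_target) + mse(u_pred, u_target)
```

A call such as `joint_loss(DualHeadModel(dim=3, hidden=8), [0.1, -0.2, 0.3], 0.7, 0.0, v_target, u_target)` evaluates both terms from one forward pass, which contrasts with the “one input, one output” setup where each sample is assigned to only one of the two objectives.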

Sources:

https://arxiv.org/abs/2602.17682