LATMiX: Learnable Affine Transformations for Microscaling Quantization of LLMs

Computer Science > Machine Learning. Submitted on 4 Feb 2026.

Abstract: Post-training quantization (PTQ) is a widely used approach for reducing the memory and compute costs of large language models (LLMs). Recent studies have shown that applying invertible transformations to activations can significantly improve quantization robustness by reducing activation outliers; however, existing approaches are largely restricted to rotation or Hadamard-based transformations. Moreover, most studies focused primarily on traditional quantization schemes, whereas modern hardware increasingly supports the microscaling (MX) data format. Attempts to combine both showed severe performance degradation, leading prior work to introduce assumptions on the transformations. In this work, we take a complementary perspective. First, we provide a theoretical analysis of transformations under MX quantization by deriving a bound on the quantization error.

Article Summaries:

Summary

Researchers have introduced LATMiX, a post-training quantization technique for large language models (LLMs). Whereas earlier methods are largely restricted to rotation or Hadamard-based transformations, LATMiX learns invertible affine transformations tailored to the activation distribution and to the microscaling (MX) format that modern hardware increasingly supports. The authors derive a theoretical bound on the quantization error under MX, which highlights the need to account for both activation statistics and the structure of the quantization format. Experiments show that LATMiX consistently improves average accuracy for low-bit MX quantization across a range of zero-shot benchmarks and model sizes, outperforming strong baselines.
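
To make the interaction between block-wise scaling and an invertible transformation concrete, here is a minimal Python sketch. It is not the LATMiX method. It assumes an MXFP4-style format (blocks of 32 values sharing one power-of-two scale, with 4-bit E2M1 elements) and a hand-picked, per-element affine map y = a*x + b rather than a learned one. All function names are hypothetical and chosen for illustration only.

```python
import math

# Magnitudes representable by a 4-bit E2M1 element (the sign is stored separately).
FP4_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
FP4_EMAX = 2  # exponent of the largest power of two in the grid (4.0 == 2**2)


def quantize_block(block):
    """Quantize and dequantize one block that shares a power-of-two scale."""
    amax = max(abs(v) for v in block)
    if amax == 0.0:
        return [0.0] * len(block)
    # Shared scale: align the block's largest exponent with the grid's largest.
    scale = 2.0 ** (math.floor(math.log2(amax)) - FP4_EMAX)
    out = []
    for v in block:
        mag = min(abs(v) / scale, FP4_GRID[-1])  # saturate at the largest grid value
        q = min(FP4_GRID, key=lambda g: abs(g - mag))  # round to the nearest grid point
        out.append(math.copysign(q, v) * scale)
    return out


def mx_quantize(x, block_size=32):
    """Apply block-wise quantization to a flat list of floats."""
    out = []
    for i in range(0, len(x), block_size):
        out.extend(quantize_block(x[i:i + block_size]))
    return out


def affine(x, a, b):
    """Illustrative invertible affine map with a diagonal matrix: y = a*x + b."""
    return [ai * xi + bi for xi, ai, bi in zip(x, a, b)]


def affine_inverse(y, a, b):
    """Inverse of `affine`; requires every a[i] to be nonzero."""
    return [(yi - bi) / ai for yi, ai, bi in zip(y, a, b)]


def mse(u, v):
    return sum((p - q) ** 2 for p, q in zip(u, v)) / len(u)


# A block whose values cluster around 10: the shared scale is set by the
# offset, so the small variations between elements are lost.
x = [10.0 + 0.01 * i for i in range(32)]
a = [1.0] * 32    # hand-picked, not learned
b = [-10.0] * 32  # hand-picked shift that removes the common offset

err_plain = mse(x, mx_quantize(x))
err_affine = mse(x, affine_inverse(mx_quantize(affine(x, a, b)), a, b))
print(err_plain, err_affine)
```

In this toy case the shift lets the shared scale follow the spread of the values rather than their common offset, so the reconstruction error after inverting the map is smaller. In a real model the inverse is not applied to the activations at run time. For a following linear layer with weights W, W x = (W A^{-1}) y - W A^{-1} b, so the inverse can be folded into the weights and bias.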

Sources:

https://arxiv.org/abs/2602.17681