LiteRT: The Universal Framework for On-Device AI

Since we first introduced LiteRT in 2024, we have focused on evolving our ML tech stack from its TensorFlow Lite (TFLite) foundation into a modern on-device AI framework. While TFLite set the standard for classical ML, our mission is to empower developers to deploy today’s cutting-edge AI on-device just as seamlessly as they integrated classical ML in the past.

At Google I/O ’25, we shared a preview of this evolution: a high-performance runtime designed specifically for advanced hardware acceleration. Today, we are excited to announce that these advanced acceleration capabilities have fully graduated into the LiteRT production stack, available now for all developers.

This milestone solidifies LiteRT as the universal on-device inference framework for the AI era and represents a significant leap over TFLite. LiteRT is:

• Faster: delivers 1.4x faster GPU performance than TFLite, and introduces new, state-of-the-art NPU acceleration.
• Simpler: provides a unified, streamlined workflow for GPU and NPU acceleration across edge platforms.

Article Summaries:

Google has announced that LiteRT, its next‑generation on‑device AI framework, has fully graduated advanced hardware acceleration into its production stack. Building on its TensorFlow Lite roots, LiteRT now offers comprehensive GPU support across Android, iOS, macOS, Windows, Linux, and Web, leveraging OpenCL, OpenGL, Metal, and WebGPU through the ML Drift engine. On Android it automatically prioritizes OpenCL for peak performance. Benchmarks show LiteRT delivers roughly 1.4× faster inference than the legacy TFLite GPU delegate, with up to 2× speed‑ups when using asynchronous execution and zero‑copy buffers. The release also introduces a new CompiledModel API to simplify cross‑platform deployment.
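The idea of automatically prioritizing one GPU backend over others can be illustrated with a toy selection routine. This is not LiteRT code and does not use the LiteRT API: the function `pick_backend`, the list `ANDROID_PRIORITY`, and the fallback order after OpenCL are all hypothetical, chosen only for illustration. The one detail taken from the summary is that OpenCL comes first on Android.

```python
from typing import Iterable, Optional, Sequence

# Hypothetical priority list. The summary only states that OpenCL is
# preferred on Android; placing OpenGL second is an assumption.
ANDROID_PRIORITY = ["opencl", "opengl"]


def pick_backend(priority: Sequence[str], available: Iterable[str]) -> Optional[str]:
    """Return the first backend in `priority` that is also in `available`.

    Returns None when no listed backend is available, so the caller can
    decide how to fall back (for example, to CPU execution).
    """
    available_set = set(available)
    for name in priority:
        if name in available_set:
            return name
    return None


if __name__ == "__main__":
    # A device exposing both backends gets the higher-priority one.
    print(pick_backend(ANDROID_PRIORITY, ["opengl", "opencl"]))
    # A device without OpenCL falls through to the next entry.
    print(pick_backend(ANDROID_PRIORITY, ["opengl"]))
```

In a real runtime this decision would be made inside the framework; the sketch only shows the general "first available in priority order" pattern the summary describes.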

Sources:

https://developers.googleblog.com/litert-the-universal-framework-for-on-device-ai/