Intel Releases OpenVINO 2026 With Improved NPU Handling, Expanded LLM Support

• Intel’s open-source OpenVINO AI toolkit is out with its first major release of 2026.
• Today’s OpenVINO 2026.0 release brings expanded large language model (LLM) support, improved Intel NPU support for Core Ultra systems, and a variety of other enhancements across Intel’s CPU, GPU, and NPU products for AI.
• OpenVINO 2026.0 adds support for CPU and GPU execution of the GPT-OSS-20B, MiniCPM-V-4_5-8B, and MiniCPM-o-2.6 models.
• It’s a bit surprising that it took until now to formally support OpenAI’s GPT-OSS-20B, but in any event it is supported as of OpenVINO 2026.0.
• For NPUs, which target smaller models, there is also now support for MiniCPM-o-2.6, Qwen2.5-1B-Instruct, Qwen3-Embedding-0.6B, and Qwen-2.5-coder-0.5B.
• OpenVINO GenAI, meanwhile, added word-level timestamps for more accurate transcription and subtitling, which better positions it relative to the OpenAI and FasterWhisper implementations.
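The model and device pairings listed above can be written down as a small lookup, which may help when deciding where to try a given model first. The following is only a sketch: the table mirrors what the article lists as newly added and is not an official support matrix, and the `pick_device` helper and the NPU, GPU, CPU preference order are hypothetical. In practice the device list would likely come from `openvino.Core().available_devices`.

```python
# Models the article lists as newly supported in OpenVINO 2026.0, keyed by model name.
# Assumption: this only mirrors the article text; it is not an official support matrix,
# and a model missing for a device here may still run on that device.
NEWLY_SUPPORTED = {
    "GPT-OSS-20B": ("CPU", "GPU"),
    "MiniCPM-V-4_5-8B": ("CPU", "GPU"),
    "MiniCPM-o-2.6": ("CPU", "GPU", "NPU"),
    "Qwen2.5-1B-Instruct": ("NPU",),
    "Qwen3-Embedding-0.6B": ("NPU",),
    "Qwen-2.5-coder-0.5B": ("NPU",),
}

# Hypothetical preference order: try the NPU first, then the GPU, then the CPU.
PREFERENCE = ("NPU", "GPU", "CPU")


def pick_device(model_name, available_devices):
    """Return the preferred listed device for model_name, or None if none matches."""
    listed = NEWLY_SUPPORTED.get(model_name, ())
    # Device names can carry an index suffix such as "GPU.0"; compare the base name only.
    present = {name.split(".")[0] for name in available_devices}
    for device in PREFERENCE:
        if device in listed and device in present:
            return device
    return None


if __name__ == "__main__":
    print(pick_device("GPT-OSS-20B", ["CPU", "GPU", "NPU"]))
```

A `None` result only means the article did not list that pairing, not that the model cannot run on the device.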

Article Summaries:

Intel has released OpenVINO 2026.0, its first major update of 2026, expanding large language model (LLM) support and improving handling of Intel NPUs on Core Ultra systems. The toolkit now supports CPU and GPU execution of GPT-OSS-20B, MiniCPM-V-4_5-8B, and MiniCPM-o-2.6, while NPUs can run smaller models such as Qwen2.5-1B-Instruct and Qwen-2.5-coder-0.5B. New features include word-level timestamps for transcription, int4 weight compression for MoE LLMs, VLM pipeline support, and speculative decoding on NPUs. An integrated NPU compiler enables ahead-of-time and on-device compilation without OEM driver updates, which is intended to reduce integration friction and accelerate deployment.

Sources:

https://www.phoronix.com/news/Intel-OpenVINO-2026.0-Released (Latest source article published: 2026-02-23 18:19 UTC)