Mobile-Agent-v3.5: Multi-platform Fundamental GUI Agents

[Submitted on 15 Feb 2026]

Abstract: The paper introduces GUI-Owl-1.5, the latest native GUI agent model, which comes in instruct and thinking variants at multiple sizes (2B/4B/8B/32B/235B) and supports a range of platforms (desktop, mobile, browser, and more) to enable cloud-edge collaboration and real-time interaction.

GUI-Owl-1.5 achieves state-of-the-art results among open-source models on more than 20 GUI benchmarks: (1) on GUI automation tasks, it obtains 56.5 on OSWorld, 71.6 on AndroidWorld, and 48.4 on WebArena; (2) on grounding tasks, it obtains 80.3 on ScreenSpotPro; (3) on tool-calling tasks, it obtains 47.6 on OSWorld-MCP and 46.8 on MobileWorld; (4) on memory and knowledge tasks, it obtains 75.5 on GUI-Knowledge Bench.

GUI-Owl-1.5 incorporates several key innovations: (1) Hybrid Data Flywheel: we construct the data pipeline for UI understanding and trajectory generation from a combination of simulated environments and cloud-based sandbox environments, in order to improve the efficiency and quality of data collection; (2) Unified Enhancement of Agent Capabilities: we use a unified thought-synthesis pipeline to enhance the model’s reasoning capabilities, while placing particular emphasis on improving key agent abilities, including Tool/MCP use, memory, and multi-agent adaptation; (3) Multi-platform Environment RL Scaling: We propose a ne

Article Summaries:

A new paper introduces GUI-Owl-1.5, a native GUI agent model that runs on desktop, mobile, browser, and other platforms, enabling cloud-edge collaboration and real-time interaction. The model comes in instruct and thinking variants at five sizes (2B, 4B, 8B, 32B, and 235B) and reports state-of-the-art scores among open-source models on more than 20 GUI benchmarks, including 56.5 on OSWorld, 71.6 on AndroidWorld, and 80.3 on ScreenSpotPro. Key technical contributions include a Hybrid Data Flywheel that combines simulated and cloud-sandbox environments for data collection, a unified thought-synthesis pipeline that strengthens reasoning along with tool/MCP use, memory, and multi-agent adaptation, and a multi-platform RL algorithm (MRPO) that improves training efficiency for long-horizon tasks. The models are open-source, with an online sandbox demo available.

Sources:

https://arxiv.org/abs/2602.16855