Graph Representation-based Model Poisoning on the Heterogeneous Internet of Agents

Computer Science > Networking and Internet Architecture
[Submitted on 10 Nov 2025 (v1), last revised 18 Feb 2026 (this version, v2)]

Abstract: Internet of Agents (IoA) envisions a unified, agent-centric paradigm where heterogeneous large language model (LLM) agents can interconnect and collaborate at scale. Within this paradigm, federated fine-tuning (FFT) serves as a key enabler that allows distributed LLM agents to co-train an intelligent global LLM without centralizing local datasets. However, the FFT-enabled IoA systems remain vulnerable to model poisoning attacks, where adversaries can upload malicious updates to the server to degrade the performance of the aggregated global LLM. This paper proposes a graph representation-based model poisoning (GRMP) attack, which exploits overheard benign updates to construct a feature correlation graph and employs a variational graph autoencoder to capture structural dependencies and generate malicious updates. A novel attack algorithm is developed based on augmented Lagrangian and subgradient descent methods to optimize malicious updates that preserve benign-like statistics while embedding adversarial objectives. Experimental results show that the proposed GRMP attack can substantially decrease accuracy across different LLM models while remaining statistically consistent with benign updates, thereby evading detection by
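To make the "feature correlation graph" idea concrete, here is a minimal sketch of one way such a graph could be built from a batch of observed updates. It is not the paper's construction: the function name `feature_correlation_graph`, the flattened `(num_agents, num_features)` layout, the use of Pearson correlation, and the `threshold` cutoff are all assumptions made for illustration.

```python
import numpy as np


def feature_correlation_graph(updates, threshold=0.5):
    """Build a thresholded feature-correlation graph (illustrative only).

    updates:   array of shape (num_agents, num_features), one flattened
               update per row (hypothetical layout, not from the paper).
    threshold: assumed cutoff on |correlation| for drawing an edge.
    Returns (adjacency, correlation), both (num_features, num_features).
    """
    updates = np.asarray(updates, dtype=float)
    # Columns are features, so correlate across agents with rowvar=False.
    corr = np.corrcoef(updates, rowvar=False)
    # A feature that is constant across agents yields NaN; treat it as uncorrelated.
    corr = np.nan_to_num(corr, nan=0.0)
    adjacency = (np.abs(corr) >= threshold).astype(float)
    np.fill_diagonal(adjacency, 0.0)  # no self-loops
    return adjacency, corr


rng = np.random.default_rng(0)
base = rng.normal(size=(8, 1))
# Feature 1 is a scaled copy of feature 0; feature 2 is independent noise.
benign_updates = np.hstack([base, 2.0 * base, rng.normal(size=(8, 1))])
adjacency, corr = feature_correlation_graph(benign_updates)
print(adjacency)
```

In this toy input, features 0 and 1 are perfectly correlated, so they are joined by an edge. In a GRMP-style pipeline, an adjacency matrix of this kind would presumably be the input to the variational graph autoencoder; that step is omitted here.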

Article Summaries:

Researchers have identified a new threat to the Internet of Agents (IoA) framework, in which large language model (LLM) agents collaborate via federated fine-tuning (FFT). The study introduces a graph representation-based model poisoning (GRMP) attack that constructs a feature correlation graph from overheard benign updates and uses a variational graph autoencoder to generate malicious model updates. An optimization routine based on augmented Lagrangian and subgradient descent methods crafts updates that preserve benign-like statistics while embedding adversarial objectives. Experiments show that the attack substantially degrades accuracy across multiple LLMs while remaining statistically consistent with legitimate updates, bypassing existing defenses and highlighting a serious vulnerability in IoA systems.
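The optimization idea (augmented Lagrangian plus subgradient steps) can be illustrated with a toy problem. The sketch below is not the paper's algorithm. It assumes a hypothetical linear adversarial objective, `-direction . w`, and stands in for "benign-like statistics" with a single Euclidean-distance constraint, `||w - benign_mean|| <= eps`. The names `craft_update`, `rho`, `lr`, and the iteration counts are illustrative choices.

```python
import numpy as np


def craft_update(benign_mean, direction, eps, rho=10.0, lr=0.05,
                 outer_iters=5, inner_iters=200):
    """Toy augmented-Lagrangian solver (illustrative only).

    minimize    -direction . w                   (hypothetical objective)
    subject to  ||w - benign_mean||_2 - eps <= 0 (assumed stealth constraint)
    """
    mu = np.asarray(benign_mean, dtype=float)
    d = np.asarray(direction, dtype=float)
    w = mu.copy()
    lam = 0.0  # Lagrange multiplier for the inequality constraint
    for _ in range(outer_iters):
        for _ in range(inner_iters):
            diff = w - mu
            dist = np.linalg.norm(diff)
            g = dist - eps  # constraint value; feasible when g <= 0
            # Subgradient of the norm; 0 is a valid choice at the kink w == mu.
            sub_g = diff / dist if dist > 0 else np.zeros_like(diff)
            # Subgradient of f(w) + (max(0, lam + rho*g)^2 - lam^2) / (2*rho).
            mult = max(0.0, lam + rho * g)
            w = w - lr * (-d + mult * sub_g)
        # Multiplier update, projected onto lam >= 0.
        lam = max(0.0, lam + rho * (np.linalg.norm(w - mu) - eps))
    return w, lam


benign_mean = np.zeros(4)
direction = np.array([1.0, 0.0, 0.0, 0.0])
eps = 0.5
w, lam = craft_update(benign_mean, direction, eps)
print(w, lam)
```

For this toy setup the optimum lies on the constraint boundary, a distance `eps` from `benign_mean` along `direction`, so the crafted vector moves as far toward the adversarial direction as the stealth budget allows. A real attack would replace both the objective and the constraint with model-specific quantities.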

Sources:

https://arxiv.org/abs/2511.07176