WANSpec: Leveraging Global Compute Capacity for LLM Inference

• WANSpec leverages under‑utilized global data centers for LLM inference to reduce latency and cost.
• Uses speculative decoding by moving draft model to low‑demand GPUs, cutting f