WANSpec: Leveraging Global Compute Capacity for LLM Inference
• WANSpec leverages under‑utilized global data centers for LLM inference to reduce latency and cost. • Uses speculative decoding by moving draft model to low‑demand GPUs, cutting f
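For readers unfamiliar with the speculative decoding mentioned above, the following is a minimal sketch of the general greedy variant, not WANSpec's own implementation. The `NextToken` interface, the function `speculative_decode`, and the toy `target` and `draft` models are all hypothetical names introduced for illustration; the sketch also does not model where the draft model runs, which is the part WANSpec is described as changing.

```python
from typing import Callable, List

# Hypothetical model interface for this sketch: context tokens -> next token.
NextToken = Callable[[List[int]], int]


def speculative_decode(target: NextToken, draft: NextToken,
                       prompt: List[int], max_new: int, k: int = 4) -> List[int]:
    """Greedy speculative decoding; output equals plain greedy decoding with `target`."""
    out = list(prompt)
    goal = len(prompt) + max_new
    while len(out) < goal:
        # 1. The cheap draft model proposes up to k tokens, one at a time.
        proposal: List[int] = []
        for _ in range(min(k, goal - len(out))):
            proposal.append(draft(out + proposal))
        # 2. The target model checks each proposed position. A real system
        #    would score all positions in one batched forward pass; the loop
        #    here is only for clarity.
        for i, tok in enumerate(proposal):
            expected = target(out + proposal[:i])
            if expected != tok:
                # First mismatch: keep the accepted prefix, then the target's token.
                out.extend(proposal[:i])
                out.append(expected)
                break
        else:
            # Every drafted token matched the target's choice.
            out.extend(proposal)
    return out


# Toy models (assumptions for the example): the target counts upward modulo 10;
# the draft agrees except that it guesses 0 whenever the right answer is 7.
def target(ctx: List[int]) -> int:
    return (ctx[-1] + 1) % 10


def draft(ctx: List[int]) -> int:
    nxt = (ctx[-1] + 1) % 10
    return 0 if nxt == 7 else nxt


result = speculative_decode(target, draft, prompt=[0], max_new=8)
print(result)
```

Because a drafted token is kept only when it matches what the target model would have produced, the draft model affects speed but not the final output in this greedy setting.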
• Canada signals review of Amazon’s cloud contracts after Quebec warehouse closures. • Amazon shut 4 Quebec warehouses, laying off 1,700 employees. • Government concerned about imp