Introduction

• Uber’s service-oriented architecture processes hundreds of millions of RPCs (remote procedure calls) per second across thousands of services.
• Keeping this system reliable requires strong overload protection, so that no single caller, client, or service can overwhelm another.
• To achieve this, we set out to design a rate-limiting service that lets services configure limits per caller or per procedure without code changes.
• This service, later known as the GRL (Global Rate Limiter), integrates directly into the service mesh, which relays RPC requests for most of Uber’s services.
• Over time, that foundation evolved into a fully automated control system powered by the RLC (Rate Limit Configurator), which keeps limits fresh and accurate based on historical traffic patterns.
• Together, these systems keep Uber’s platform resilient, even as it processes hundreds of millions of requests per second across thousands of microservices.
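
To make the idea of per-caller and per-procedure limits concrete, here is a minimal single-process sketch in Python. It is not Uber's GRL implementation, which the text describes only at a high level; it assumes a token-bucket algorithm, and the `LIMITS` table, the `"*"` wildcard, and the caller and procedure names are all hypothetical. The point it illustrates is that limits live in data rather than in service code, so changing a limit means changing configuration.

```python
import time
from typing import Callable, Dict, Tuple

# Hypothetical limits in requests per second, keyed by (caller, procedure).
# A procedure of "*" applies to any procedure without its own entry.
LIMITS: Dict[Tuple[str, str], float] = {
    ("rider-app", "Trips::Create"): 2.0,
    ("rider-app", "*"): 5.0,
}


class TokenBucket:
    """Token bucket that allows bursts of up to one second of traffic."""

    def __init__(self, rate: float, now: float) -> None:
        self.rate = rate        # tokens refilled per second
        self.capacity = rate    # assumption: burst size equals the rate
        self.tokens = rate      # start with a full bucket
        self.last = now         # time of the last refill

    def allow(self, now: float) -> bool:
        # Refill in proportion to elapsed time, capped at capacity.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class RateLimiter:
    """Looks up the limit for a (caller, procedure) pair and enforces it."""

    def __init__(
        self,
        limits: Dict[Tuple[str, str], float],
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.limits = limits
        self.clock = clock  # injectable so the behavior can be tested
        self.buckets: Dict[Tuple[str, str], TokenBucket] = {}

    def allow(self, caller: str, procedure: str) -> bool:
        key = (caller, procedure)
        if key not in self.limits:
            key = (caller, "*")  # fall back to the caller-wide limit
        rate = self.limits.get(key)
        if rate is None:
            return True  # no limit configured for this caller
        now = self.clock()
        bucket = self.buckets.get(key)
        if bucket is None:
            bucket = self.buckets[key] = TokenBucket(rate, now)
        return bucket.allow(now)
```

A real global limiter also has to share or aggregate counts across many hosts, which this sketch leaves out entirely.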

Article Summaries:

  • Uber has rolled out a unified rate‑limiting framework to protect its sprawling microservice ecosystem, which processes hundreds of millions of RPCs per second. The new Global Rate Limiter (GRL) plugs directly into Uber’s service mesh, allowing teams to set per‑caller or per‑procedure limits without code changes. An automated control layer, the Rate Limit Configurator (RLC), refreshes thresholds based on historical traffic patterns, eliminating the need for manual redeploys. This system replaces fragmented, Redis‑based throttles that caused inconsistent behavior, added latency, and increased operational risk. By centralizing limits, Uber gains consistent protection, lower overhead, and better observability across its global platform.
