• 12 Nov 2024: canva.com was unavailable for about 52 minutes (9:08 AM to ~10:00 AM UTC), affecting users globally.
• Root cause: the API Gateway cluster failed due to a combination of an editor deployment, a locking issue, and Cloudflare network issues.
• The editor is a single-page app deployed multiple times daily, with each deployment shipping 100+ static assets to S3.
• Cloudflare tiered caching serves assets from the local cache, then a regional cache, and falls back to the S3 origin only on a miss.
• The API Gateway, built on Netty, handles auth and rate-limiting, and routes all API traffic.
• Post-incident actions: improved deployment guardrails, lock-free releases, and enhanced CDN monitoring.
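The tiered-cache lookup order in the key points can be illustrated with a minimal sketch. This is a toy model, not Cloudflare's implementation or API: the `CacheTier` class, the `fetch_asset` function, the tier names, and the asset path are all hypothetical, and a plain dictionary stands in for the S3 origin.

```python
from typing import Callable, Dict, List, Tuple


class CacheTier:
    """One cache layer in a toy model (hypothetical, not a Cloudflare API)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.store: Dict[str, bytes] = {}


def fetch_asset(
    path: str,
    tiers: List[CacheTier],
    origin: Callable[[str], bytes],
) -> Tuple[bytes, str]:
    """Check each tier in order; on a full miss, fetch from the origin.

    Every tier that missed is filled on the way back, so the next
    request for the same path is served from the nearest tier.
    Returns the asset body and the name of the layer that served it.
    """
    missed: List[CacheTier] = []
    for tier in tiers:
        body = tier.store.get(path)
        if body is not None:
            for earlier in missed:
                earlier.store[path] = body
            return body, tier.name
        missed.append(tier)

    body = origin(path)  # slowest path: goes all the way to the origin
    for tier in missed:
        tier.store[path] = body
    return body, "origin"


# Assumed stand-in for the S3 bucket holding the editor's static assets.
s3_bucket = {"/editor/main.js": b"console.log('editor');"}
local, regional = CacheTier("local"), CacheTier("regional")

first = fetch_asset("/editor/main.js", [local, regional], s3_bucket.__getitem__)
second = fetch_asset("/editor/main.js", [local, regional], s3_bucket.__getitem__)
```

In this sketch the first request after a deployment misses both tiers and is served by the origin, while the second is served locally. That is why a slow path between cache tiers matters most right after a release, when new assets are not yet cached anywhere.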
Article Summaries:
- Post-incident review. Canva incident report: API Gateway outage. An incident report for the Canva outage on November 12, 2024. High-level summary: On November 12, 2024, Canva experienced a critical outage that affected the availability of canva.com. From 9:08 AM UTC to approximately 10:00 AM UTC, canva.com was unavailable. This was caused by our API Gateway cluster failing due to multiple factors, including a software deployment of Canva’s editor, a locking issue, and network issues in Cloudflare, our CDN provider. This report details the root cause, timeline of ev…
- Canva published a public incident report for a November 12, 2024 outage that left canva.com unavailable from 9:08 AM to ~10:00 AM UTC. The failure stemmed from a combination of a new editor deployment, a locking issue in the API Gateway cluster, and a Cloudflare network problem. A stale Cloudflare traffic-management rule routed IPv6 traffic over a lossy public transit link between Singapore and Ashburn, causing 66% packet loss and a 20-minute fetch for a critical JavaScript asset. The resulting API Gateway failure blocked API requests and left the editor unusable. Canva has removed the rule, applied mitigations, and outlined steps to prevent recurrence.
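As a quick consistency check, the outage window quoted in the summaries (9:08 AM to roughly 10:00 AM UTC) can be compared against the "~52 minutes" figure in the key points. The end time is approximate in the report, so the result is an estimate; the variable names are illustrative.

```python
from datetime import datetime, timezone

# Start of the outage as stated in the report.
outage_start = datetime(2024, 11, 12, 9, 8, tzinfo=timezone.utc)
# End time is only approximate ("~10:00 AM UTC") in the report.
outage_end = datetime(2024, 11, 12, 10, 0, tzinfo=timezone.utc)

outage_minutes = (outage_end - outage_start).total_seconds() / 60
```

With these two timestamps the difference works out to 52 minutes, which matches the duration given in the key points.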
Sources: