Case Study: Cloudflare achieves seconds-level mean-time-to-action with PagerDuty

A PagerDuty Case Study

Preview of the Cloudflare Case Study

Cloudflare Reduces Mean-Time-To-Action to Seconds with PagerDuty

Cloudflare, a global web performance and security provider serving millions of Internet properties, faced growing pains as rapid customer growth produced vast amounts of monitoring data. That volume made incidents harder to categorize and prioritize. In-house communication and escalation processes were manual and fragmented, and because data was decentralized for security reasons, alerts could not be consolidated and automated.

By adopting PagerDuty, Cloudflare gained automated, centralized incident alerts, dynamic triage and escalation, and integrated collaboration tools (Operations Command Console, Major Incidents Application, HipChat integration) that created a single source of truth. The result: streamlined SRE communication, faster prioritization, faster recruitment of the right responders, and a drop in mean-time-to-action from minutes to seconds, improving infrastructure stability and customer reliability.



Cloudflare

Michael Daly

Engineering Manager

