Percona
35 Case Studies
A Percona Case Study
PagerDuty, a service that routes alerts and on-call schedules for IT and DevOps teams, was running its primary MySQL infrastructure on EC2 using MySQL Community Edition with DRBD-based synchronous replication. That setup required manual failovers that caused at least two minutes of downtime (with longer cold-server spin-up times), relied on EBS storage that could suffer network-related outages, and forced slow recoveries—MySQL dumps took ~15 hours to restore.
PagerDuty migrated to Percona’s stack—Percona Server with Percona XtraDB Cluster (three-node cluster on EC2 behind HAProxy), Percona XtraBackup, and Percona Toolkit. The new architecture enables fast automated failover with no customer impact, safe use of EC2 instance-store for higher-performance local disks, and much faster recovery: binary restores in 2–3 hours and node-based resyncs that can get replication running in minutes. Overall, the move eliminated previous downtime pain points and delivered reliable, production-proven high availability.
Doug Barth
Operations Engineer