A Datadog Case Study
Medium, a fast-growing publishing platform, relies on DynamoDB to scale its infrastructure, but it ran into performance problems caused by throttling. Capacity provisioned for the whole table masked the lower limit on each partition, so a “hot key” (for example, a viral post) could exhaust a single partition’s throughput even while the table as a whole stayed under its provisioned capacity. Unless anticipated and managed, this led to high latency and user-facing errors.
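The gap between table-level and partition-level capacity can be illustrated with a little arithmetic. The sketch below is not Medium's code; it assumes the rule of thumb AWS documented for provisioned tables at the time (roughly 3,000 read units, 1,000 write units, and 10 GB per partition) and an even split of provisioned capacity across partitions. The function names and the example figures are hypothetical.

```python
import math

# Assumed per-partition ceilings from AWS's historical guidance for
# provisioned tables; treat them as illustrative, not authoritative.
MAX_RCU_PER_PARTITION = 3000
MAX_WCU_PER_PARTITION = 1000
MAX_GB_PER_PARTITION = 10


def estimate_partitions(read_units, write_units, table_size_gb):
    """Estimate the partition count from throughput and from size, take the larger."""
    by_throughput = math.ceil(
        read_units / MAX_RCU_PER_PARTITION + write_units / MAX_WCU_PER_PARTITION
    )
    by_size = math.ceil(table_size_gb / MAX_GB_PER_PARTITION)
    return max(by_throughput, by_size, 1)


def per_partition_limits(read_units, write_units, table_size_gb):
    """Return (partitions, reads per partition, writes per partition),
    assuming provisioned capacity is divided evenly."""
    partitions = estimate_partitions(read_units, write_units, table_size_gb)
    return partitions, read_units / partitions, write_units / partitions


# Hypothetical table: 6,000 RCU, 2,000 WCU, 80 GB of data.
partitions, rcu_each, wcu_each = per_partition_limits(6000, 2000, 80)
print(partitions, rcu_each, wcu_each)
```

Under these assumptions the example table has 8 partitions, so a single hot key is limited to about 750 reads per second even though 6,000 are provisioned for the table.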
To address this, Medium uses Datadog together with an ELK pipeline. The team estimates the table's partition count to compute per-partition limits, logs requests so that the hottest keys can be surfaced, and reports a custom throttling metric from the application to Datadog for real-time alerts. They also front DynamoDB with Redis, set up alerting for staging and production (email, Slack, and PagerDuty), track backups with a custom metric, and plan to automate capacity tuning. The result is better visibility, fewer user-facing failures, faster incident response, and opportunities to optimize costs.
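As a rough illustration of the hot-key and throttling instrumentation described above, the sketch below tallies requests per key and emits a counter each time a request is throttled. It is a minimal stand-in, not Medium's implementation: the `ThrottleTracker` class, the metric name `dynamodb.throttled_requests`, and the `emit` callback are all hypothetical. In a real deployment `emit` would wrap whatever metrics client the application uses, and the key tally would come from the log pipeline, not from process memory.

```python
from collections import Counter


class ThrottleTracker:
    """Counts requests per key and reports throttled requests as a metric."""

    def __init__(self, emit, table):
        self.emit = emit            # callable(name, value, tags); assumed interface
        self.table = table
        self.key_counts = Counter()

    def record_request(self, key, throttled):
        # Tally every request so the hottest keys can be listed later.
        self.key_counts[key] += 1
        if throttled:
            # Hypothetical metric name; tagged by table so alerts can be scoped.
            self.emit("dynamodb.throttled_requests", 1, [f"table:{self.table}"])

    def hottest_keys(self, n=3):
        """Return the n most frequently requested keys with their counts."""
        return self.key_counts.most_common(n)


# Demo with an in-memory emitter in place of a real metrics client.
emitted = []
tracker = ThrottleTracker(
    emit=lambda name, value, tags: emitted.append((name, value, tags)),
    table="posts",
)
requests = [
    ("post:viral", True),
    ("post:viral", False),
    ("post:viral", True),
    ("post:quiet", False),
]
for key, throttled in requests:
    tracker.record_request(key, throttled)

print(tracker.hottest_keys(1))
print(len(emitted))
```

Passing the emitter in as a callback keeps the sketch independent of any particular metrics library and makes it easy to exercise without a running agent.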