Datadog
A Datadog Case Study
Gremlin, a Chaos Engineering company, needed a way to monitor its own platform and prove reliability across diverse environments before failures reached customers. The challenge was getting real-time, actionable visibility into system health and key user flows while running intentional failure experiments across microservices, VMs, and APIs.
Gremlin addressed this by integrating with Datadog. It uses template variables to build dynamic dashboards, Datadog Synthetic Monitoring to watch critical user journeys, and a Gremlin–Datadog integration that publishes chaos events and annotates graphs in real time. The result is faster detection and troubleshooting during experiments, clear visual correlation between attacks and system behavior, and greater confidence in running controlled failures that uncover issues so they can be fixed before they affect customers.
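To make the idea of publishing chaos events concrete, here is a minimal sketch of how an experiment stage could be sent to Datadog as an event, which can then be overlaid on dashboard graphs. It is not Gremlin's actual integration code. It assumes the public Datadog Events API (v1) on the US1 site, and the function names, tag names, and the `attack_type`, `target`, and `stage` fields are hypothetical, chosen only for illustration.

```python
import json
import time
import urllib.request

# Datadog Events API (v1) endpoint for the US1 site; other sites use a different host.
DATADOG_EVENTS_URL = "https://api.datadoghq.com/api/v1/events"


def build_chaos_event(attack_type, target, stage, timestamp=None):
    """Build an event payload for one stage of a chaos experiment.

    attack_type, target, and stage are hypothetical fields for illustration;
    the payload keys are standard Events API fields.
    """
    return {
        "title": f"Chaos experiment {stage}: {attack_type} on {target}",
        "text": f"{attack_type} attack {stage} against {target}.",
        "date_happened": int(timestamp if timestamp is not None else time.time()),
        "alert_type": "info",
        # Group the start and end events of the same experiment together.
        "aggregation_key": f"chaos-{attack_type}-{target}",
        # Tags let a dashboard filter the event overlay, e.g. by target service.
        "tags": ["source:chaos", f"attack:{attack_type}", f"target:{target}"],
    }


def post_event(payload, api_key):
    """Send the payload to Datadog and return the decoded JSON response."""
    request = urllib.request.Request(
        DATADOG_EVENTS_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "DD-API-KEY": api_key},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Print the payload only; call post_event(event, api_key) with a real key to send it.
    event = build_chaos_event("latency", "checkout-service", "started")
    print(json.dumps(event, indent=2))
```

Posting one event when an attack starts and another when it ends, with the same tags, is what would allow a graph overlay to bracket the window in which the failure was injected.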
Matthew Fornaciari
Co-Founder and CTO