Case Study: Sephora improves performance and availability with Gremlin

A Gremlin Case Study

How Sephora improves performance and availability

Sephora, a leading prestige beauty retailer, was undertaking a complex, multi-year migration from a legacy monolithic system to a Kubernetes-based microservices architecture. Its Performance Engineering team had to keep the migration seamless and verify that the new, more complex system could handle production traffic without failure, especially during critical periods like Black Friday. The long hybrid phase of the migration, during which microservices still depended on legacy systems, added a significant layer of complexity.

Using Gremlin for fault injection testing, the Performance Engineering team built standardized reliability tests based on real failure conditions. They proactively uncovered and resolved P0 and P1 issues in a pre-production environment, validated system scalability, and confirmed that components such as circuit breakers functioned correctly. As a result, Sephora switched to its new microservices platform for the 2024 holiday season and experienced zero major issues or outages despite massive traffic spikes, contributing to a highly successful sales period. Gremlin also helped institutionalize reliability testing across product teams.

