Amazon Web Services
Amazon Web Services Case Study
Fireworks.ai worked with Amazon Web Services to optimize the compute power behind its demanding generative AI inference engine. The company had been using Amazon EC2 P4d instances and needed a more flexible way to meet growing performance demands while keeping latency and costs under control.
Amazon Web Services helped Fireworks.ai move to Amazon EC2 P5 instances, enabling up to 4x higher throughput per instance and 4x lower costs for some customers. The results included a 30–50% latency reduction for one summarization model, more than 2x lower backend latency and a doubling of completion acceptance for Sourcegraph’s Cody, and strong cost-per-performance gains across customer use cases.
Dmytro Dzhulgakov
Co-Founder and Chief Technology Officer