Fireworks AI
7 Case Studies
Cursor, an AI-native IDE, faced slow performance and frequent inaccuracies from existing frontier models such as GPT-4 when applying large code edits, disrupting developer workflows. To overcome this, Cursor partnered with Fireworks AI to build its "Fast Apply" feature.
Fireworks AI deployed Cursor's custom fine-tuned Llama-3-70b model behind its speculative decoding API. By verifying multiple draft tokens in parallel rather than generating one token at a time, the system reached approximately 1,000 tokens per second: a 13x speedup over standard inference and a 9x improvement over Cursor's previous GPT-4 deployment, allowing developers to apply code changes almost instantly.
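To make the parallelism concrete, here is a minimal toy sketch of the general speculative decoding idea. This is not Fireworks' implementation: `draft_model` and `target_model` are hypothetical stand-ins (simple deterministic functions over integer tokens), and real systems run the verification as one batched forward pass on a large model.

```python
# Toy sketch of speculative decoding (hypothetical models, NOT
# Fireworks' actual API). A cheap draft model proposes k tokens;
# the expensive target model verifies them; every token that
# matches is accepted, so one verification pass can emit several
# tokens instead of just one.

def draft_model(last_token, k):
    # Cheap proposer: guesses the next k tokens from the last one.
    return [(last_token + i) % 5 for i in range(1, k + 1)]

def target_model(last_token):
    # Expensive "ground truth": the token the target model would emit.
    return (last_token + 1) % 5

def speculative_step(last_token, k=4):
    proposed = draft_model(last_token, k)
    accepted = []
    cur = last_token
    # In a real system these checks are a single parallel batch on
    # the target model; here we loop for clarity.
    for tok in proposed:
        if target_model(cur) == tok:
            accepted.append(tok)   # draft was right: accept for free
            cur = tok
        else:
            # First mismatch: take the target model's token and stop.
            accepted.append(target_model(cur))
            break
    return accepted

print(speculative_step(0))  # draft agrees here, so all 4 tokens land at once
```

When the draft model's guesses usually match the target model, as with a model fine-tuned for a narrow task like code edits, most proposed tokens are accepted and effective throughput multiplies, which is the mechanism behind the speedups described above.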