Case Study: Duolingo cuts AWS EMR Spark costs by up to 55% with Sync Computing's Spark Cluster Autotuner

A Sync Computing Case Study

Preview of the Duolingo Case Study

How Sync Cut Duolingo’s ETL and ML Costs in Half

Duolingo, the language-learning company serving over 500 million users, needed a better way to control the high cost of its recurring Spark jobs on AWS EMR. As a cloud-native business processing terabytes of data daily, it relied on expensive manual tuning and parameter sweeps to balance cluster cost, runtime, and reliability. Duolingo used Sync Computing’s Spark Cluster Autotuner to address this challenge.

Sync Computing’s Spark Cluster Autotuner analyzed Duolingo’s recent Spark event logs and recommended new cluster and Spark settings for two ETL jobs without any code changes. The optimized configuration reduced the cluster size by 4x, increased runtime only slightly, from 17 to 22 minutes, and lowered costs by 55%, including a nearly 10x reduction in driver cost.


