Databricks
A Databricks Case Study
Edmunds.com, a leading car-shopping site serving about 20 million visitors a month, faced rapidly growing data volumes (tens to hundreds of TB) and widespread missing or inaccurate vehicle details on its listing pages. Engineers spent much of their time maintaining ad hoc MapReduce/Oozie reporting jobs and could not easily quantify data-quality gaps or the ROI of the data sources used to decode VINs and enrich listings.
Edmunds adopted Apache Spark through the Databricks managed service to simplify cluster management, democratize data access, audit APIs, and build Spark SQL workflows for VIN decoding and reporting. The change sped up ad hoc analysis sixfold, cut report-job processing time by about 60% (e.g., queries that took 30–60 minutes now take 5–10 minutes), reduced weekly reporting effort from about 10–15 hours to 3–5 hours, and improved site data quality by roughly 35%, enabling better recommendations and faster, data-driven product decisions.
Shaun Elliott
Technical Lead of Service Engineering