A Databricks Case Study
Scribd, the online reading platform with over 60 million titles, needed a more scalable way to support real-time data processing and personalization. Its legacy Hadoop infrastructure was rigid and hard to maintain, and it struggled with large batch and streaming datasets, which led to performance issues, small-file problems, and collaboration silos.
Databricks helped Scribd move to AWS and adopt Delta Lake as a unified Lakehouse platform. With Databricks, Scribd streamlined batch and streaming pipelines, improved collaboration through interactive notebooks, and simplified infrastructure management. The result was 30–50% better performance for most Spark workloads and an estimated 30–50% reduction in operational costs, while enabling fresher data and more personalized customer experiences.
R Tyler Croy
Director of Platform Engineering