Case Study: Scribd achieves faster, more scalable data pipelines with Databricks on AWS

A Databricks Case Study

Preview of the Scribd Case Study

Moving to the cloud enables reading without limits

Scribd, the online reading platform with over 60 million titles, needed a more scalable way to support real-time data processing and personalization. Its legacy Hadoop infrastructure was rigid, hard to maintain, and struggled with large batch and streaming datasets, creating performance issues, small-file problems, and collaboration silos.

Databricks helped Scribd move to AWS and adopt Delta Lake as a unified Lakehouse platform. With Databricks, Scribd streamlined its batch and streaming pipelines, improved collaboration through interactive notebooks, and simplified infrastructure management. The result was 30–50% better performance for most Spark workloads, an estimated 30–50% reduction in operational costs, fresher data, and more personalized customer experiences.



Scribd

R Tyler Croy

Director of Platform Engineering


Databricks
