Case Study: MediaMath achieves rapid, scalable report production with Databricks

A Databricks Case Study

MediaMath is a demand-side media buying and data management platform that serves over a billion ads and tracks billions of events daily. The team needed to turn a promising proof-of-concept, the Audience Index Report, into a scalable production web service. The report compares observed and expected site visitors by demographic segment, and producing it required heavy ETL, complex joins across user and site data, and aggregation over 30 days of activity at massive scale.

The solution used PySpark on Databricks. Segment and pixel state were stored as sequence files on S3 (UDB), processed as RDDs, and then converted to DataFrames. The DataFrames were joined and aggregated to compute the required counts, and the results were written to a PostgreSQL database on AWS RDS through Spark's JDBC connector. Databricks notebooks and the job scheduler simplified development, orchestration, and monitoring. As a result, the team condensed hundreds of terabytes into a consumable 30-day report that now serves clients reliably, and it can deliver new reports faster.
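
The join-and-aggregate step can be illustrated with a small in-memory sketch. This is plain Python rather than PySpark, and it is not MediaMath's implementation. The summary does not state the index formula, so the sketch assumes a common convention: index = 100 × observed / expected, where expected is the number of distinct site visitors multiplied by the segment's share of all known users. The function name, segment names, and site names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical toy inputs standing in for the segment and pixel data.
user_segments = {  # user_id -> demographic segments the user belongs to
    "u1": {"age_18_24"},
    "u2": {"age_18_24", "sports_fans"},
    "u3": {"sports_fans"},
    "u4": {"age_25_34"},
}
site_visits = [  # (user_id, site) events inside the reporting window
    ("u1", "news.example"),
    ("u2", "news.example"),
    ("u2", "news.example"),  # repeat visit; each user is counted once per site
    ("u4", "shop.example"),
]


def audience_index(user_segments, site_visits):
    """Return {(site, segment): index} under the assumed formula
    index = 100 * observed / expected."""
    total_users = len(user_segments)

    # Baseline: how many users fall in each segment across the population.
    segment_totals = defaultdict(int)
    for segments in user_segments.values():
        for segment in segments:
            segment_totals[segment] += 1

    # "Join" visits to known users and keep distinct visitors per site.
    visitors = defaultdict(set)
    for user, site in site_visits:
        if user in user_segments:
            visitors[site].add(user)

    report = {}
    for site, users in visitors.items():
        # Observed: visitors to this site in each segment.
        observed = defaultdict(int)
        for user in users:
            for segment in user_segments[user]:
                observed[segment] += 1
        for segment, count in observed.items():
            # Expected: visitors scaled by the segment's population share.
            expected = len(users) * segment_totals[segment] / total_users
            report[(site, segment)] = round(100 * count / expected, 1)
    return report


report = audience_index(user_segments, site_visits)
```

Under this assumed convention, an index above 100 means a segment is over-represented among a site's visitors relative to the overall population, and an index below 100 means it is under-represented. At production scale the same grouping and counting would be expressed as DataFrame joins and aggregations rather than Python dictionaries.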

