Case Study: Rockerbox achieves cost-effective scaling of a 120TB-per-day data pipeline with DigitalOcean

A DigitalOcean Case Study

Preview of the Rockerbox Case Study

Building a 120TB-daily data pipeline on DigitalOcean

Rockerbox is a New York–based advertising and analytics company that models user behavior to help advertisers find similar audiences. The team needed a dependable, redundant, low-latency platform that could collect and process massive volumes of data (up to 120 TB on peak days, at hundreds of thousands of requests per second) while keeping operational costs under control.

They built a split architecture on DigitalOcean. Network-sensitive collection (data providers, bidders, analytics) runs on many static-IP Droplets, while processing runs on a Mesos cluster with HDFS across 50+ 8-core Droplets hosting Kafka, Spark, and Dockerized apps. DigitalOcean Spaces provides low-cost long-term storage, accessed through standard S3/Hive connectors. Running about 200 Droplets, Rockerbox met its latency targets and sustained high throughput, simplified billing, and avoided bandwidth and IP penalties, reducing infrastructure costs to roughly 20% of what other cloud providers were charging.
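
To put the quoted figures in perspective, here is a rough back-of-envelope calculation of the average throughput that 120 TB per day implies. It is only a sketch under stated assumptions: decimal units (1 TB = 10^12 bytes), load spread evenly over 24 hours, and work divided evenly across exactly 50 processing Droplets. None of these assumptions come from the case study itself, and the function names are hypothetical.

```python
# Back-of-envelope sizing from the figures quoted in the case study.
# Assumptions (not from the source): decimal terabytes, perfectly even
# load over the day, and an even split across exactly 50 nodes.
TB = 10**12                      # bytes in one decimal terabyte
SECONDS_PER_DAY = 24 * 60 * 60   # 86,400 seconds


def average_throughput_bytes_per_sec(tb_per_day: float) -> float:
    """Average ingest rate in bytes/second for a given daily volume."""
    return tb_per_day * TB / SECONDS_PER_DAY


def per_node_throughput_bytes_per_sec(tb_per_day: float, nodes: int) -> float:
    """Average rate each node handles if the load is split evenly."""
    return average_throughput_bytes_per_sec(tb_per_day) / nodes


peak = average_throughput_bytes_per_sec(120)
per_node = per_node_throughput_bytes_per_sec(120, 50)

print(f"Cluster average: {peak / 1e9:.2f} GB/s")
print(f"Per node (50 nodes): {per_node / 1e6:.1f} MB/s")
```

Real traffic is bursty, so actual peak rates would be higher than these averages; the numbers only indicate the order of magnitude involved.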



Rockerbox

Rick O'Toole

Co-Founder and CTO


DigitalOcean
