Qubole
A Qubole Case Study
Pinterest processes massive amounts of data to power its personalized discovery engine: over 30 billion Pins, about 20 TB of new logs per day, and about 10 PB in S3. As usage grew, Amazon EMR became unstable at scale. Proprietary Hive limitations, dependency management, and the need to onboard non-technical users also made Hadoop hard to operate as a self-serve platform.
Pinterest implemented an executor abstraction and migrated jobs to Qubole Data Service, enabling on-demand, horizontally scalable clusters with strong Hive integration, 100% spot-instance support, and simplified user access. The move delivered stable petabyte-scale performance with 30–60% higher throughput than EMR. It supported more than 100 MapReduce users running over 2,000 jobs per day on six clusters (3,000+ nodes), processing nearly a petabyte daily, while reducing operational overhead and speeding onboarding.
Mohammad Shahangian
Data Engineer, Pinterest