Case Study: Yelp achieves scalable big-data processing and $55,000 in hardware savings with Amazon Web Services

An Amazon Web Services Case Study

Preview of the Yelp Case Study

Yelp - Customer Case Study

Yelp, founded in 2004 to help people discover and review local businesses, grew from a San Francisco startup into an international platform with millions of users and tens of millions of reviews. To protect the user experience and power features such as review highlights, search autocomplete, and sponsored local search, Yelp needed to scale its log storage and Hadoop processing. The site was generating about 1.2 TB of logs per day, and Yelp also had to guard against shill content while keeping mobile and web features responsive.

Yelp migrated from local RAIDs and a single Hadoop instance to Amazon S3 for storage and Amazon Elastic MapReduce (EMR) for processing, using mrjob and boto to run Hadoop streaming jobs. The move enabled about 250 EMR jobs per day processing roughly 30 TB of data, supported key product features, and saved roughly $55,000 in upfront hardware costs. More importantly, it cut deployment time from months to days and freed engineering resources to focus on new capabilities.
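For readers unfamiliar with Hadoop streaming, the map and reduce steps that a log-processing job of this kind runs can be sketched in plain Python. This is a minimal illustration, not Yelp's code: the tab-separated log format, the field order, and the names `mapper`, `reducer`, and `run_local` are all assumptions made for the example, and it runs locally rather than through mrjob or EMR.

```python
from itertools import groupby
from operator import itemgetter


def mapper(line):
    """Emit (business_id, 1) for one log line.

    Assumed (hypothetical) format: timestamp<TAB>business_id<TAB>event
    Lines that do not have exactly three fields are skipped.
    """
    parts = line.rstrip("\n").split("\t")
    if len(parts) == 3:
        yield parts[1], 1


def reducer(key, values):
    """Sum the counts emitted for a single key."""
    yield key, sum(values)


def run_local(lines):
    """Run map, sort (standing in for the shuffle), and reduce in one process."""
    mapped = [pair for line in lines for pair in mapper(line)]
    # Hadoop streaming hands the reducer its input sorted by key;
    # sorting here imitates that step.
    mapped.sort(key=itemgetter(0))
    results = {}
    for key, group in groupby(mapped, key=itemgetter(0)):
        for out_key, total in reducer(key, (value for _, value in group)):
            results[out_key] = total
    return results
```

Calling `run_local` on a handful of log lines returns a dictionary of event counts per business ID. mrjob expresses the same two steps as methods on a job class and can submit them to EMR, which is not shown here.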



Yelp

Dave Marin

Data-Mining Engineer

