Case Study: Uber achieves real-time, low-latency marketplace analytics with Elastic

An Elastic Case Study

Preview of the Uber Case Study

Powering Uber Marketplace’s Real-Time Data Needs with Elasticsearch

Uber faced the challenge of powering real-time marketplace intelligence—answering questions like “How many UberX drivers were available in SF’s Financial District in the last 10 minutes?”—across fine-grained geo-temporal grids, many vehicle types and trip states, and multiple event streams (requested, accepted, completed). The problem required low-latency ingestion, joins across event streams, complex aggregations, and protections against expensive or malicious queries, all while supporting dynamic pricing, routing recommendations, and marketplace health metrics.
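To make the kind of geo-temporal question above concrete, here is a minimal sketch of how such a count could be expressed with standard Elasticsearch filter, geohash-grid, and cardinality aggregations. The index name (`marketplace-supply`), field names (`driver_id`, `vehicle_type`, `state`, `timestamp`, `location`), endpoint, and bounding-box coordinates are assumptions for illustration, not details from the case study.

```python
from datetime import datetime, timedelta, timezone

from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # placeholder endpoint

# Count distinct available UberX drivers per geohash cell, within a bounding
# box roughly covering SF's Financial District, over the last 10 minutes.
ten_minutes_ago = datetime.now(timezone.utc) - timedelta(minutes=10)
body = {
    "size": 0,
    "query": {
        "bool": {
            "filter": [
                {"term": {"vehicle_type": "uberx"}},
                {"term": {"state": "available"}},
                {"range": {"timestamp": {"gte": ten_minutes_ago.isoformat()}}},
                {
                    "geo_bounding_box": {
                        "location": {
                            "top_left": {"lat": 37.7989, "lon": -122.4050},
                            "bottom_right": {"lat": 37.7880, "lon": -122.3920},
                        }
                    }
                },
            ]
        }
    },
    "aggs": {
        "cells": {
            # One bucket per geohash cell of the fine-grained geo grid.
            "geohash_grid": {"field": "location", "precision": 7},
            # Approximate distinct-driver count inside each cell.
            "aggs": {"unique_drivers": {"cardinality": {"field": "driver_id"}}},
        }
    },
}

resp = es.search(index="marketplace-supply", body=body)
for bucket in resp["aggregations"]["cells"]["buckets"]:
    print(bucket["key"], bucket["unique_drivers"]["value"])
```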

Their solution combined Kafka ingestion with stream processing (Samza) and batch processing (Spark) feeding Elasticsearch as the real-time query layer, using an idempotent entity-style data model (upserts keyed by trip ID), tiered Elasticsearch clusters with query routing, and safeguards such as cardinality estimation, query splitting, and circuit-breaker improvements. The stack delivered near-real-time ingestion (seconds), linear scalability and extensibility for large-scale aggregations, and operational improvements (monitoring, slow-node detection, and query limits) that enabled timely marketplace metrics and routing recommendations.
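The idempotent, entity-style model can be sketched as follows, assuming a hypothetical `marketplace-trips` index, an illustrative `apply_trip_event` helper, and made-up event field names; the key idea is that each lifecycle event is merged into one document whose ID is the trip ID via a `doc_as_upsert` update, so replaying an event after a stream-processor restart writes the same values again instead of creating duplicates or requiring joins at query time.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # placeholder endpoint

def apply_trip_event(event: dict) -> None:
    """Merge one trip lifecycle event into the trip's single document.

    The document ID is the trip ID and the update is a doc_as_upsert, so
    events from the requested/accepted/completed streams all land in the
    same document, and replaying an event rewrites the same fields with
    the same values (idempotent ingestion).
    """
    es.update(
        index="marketplace-trips",        # hypothetical index name
        id=event["trip_id"],              # document ID == trip ID
        body={
            "doc": {
                "vehicle_type": event.get("vehicle_type"),
                # Record when this stage of the trip happened,
                # e.g. requested_at / accepted_at / completed_at.
                event["event_type"] + "_at": event["timestamp"],
            },
            "doc_as_upsert": True,        # create the doc if it doesn't exist yet
        },
    )

# Consuming the same event twice leaves the document unchanged.
apply_trip_event(
    {
        "trip_id": "trip-42",
        "event_type": "accepted",
        "timestamp": "2024-01-01T00:00:05Z",
        "vehicle_type": "uberx",
    }
)
```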



Isaac Brodsky, Software Engineer, Uber
