Case Study: Regeneron achieves 600x faster genomic queries and accelerated drug discovery with Databricks

A Databricks Case Study

Preview of the Regeneron Case Study

Regeneron Pharmaceuticals - Customer Case Study

Regeneron, a biopharmaceutical company that works with genomic and clinical data from more than 400,000 people, struggled to turn massive, decentralized datasets into actionable insights. Its legacy stack couldn’t scale to analyze over 80 billion data points across a 10 TB cohort, so data teams spent days or weeks on ETL and could not run the end-to-end analysis needed for drug target discovery.

Deploying the Databricks Lakehouse on AWS, with automated cluster management, interactive workspaces, and performant Spark-powered pipelines, streamlined operations and accelerated analyses. The result: queries across the entire dataset dropped from 30 minutes to 3 seconds (600x), ETL time fell from about 3 weeks to 2 days (~10x), and teams can now support more studies and focus on discovering new therapies.
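
The speedup factors quoted above follow directly from the before-and-after figures. As a quick check, the arithmetic can be sketched as below; the variable names are illustrative, and "about 3 weeks" is assumed to mean 21 days.

```python
# Speedup factors implied by the figures quoted in the case study.
query_before_s = 30 * 60   # 30 minutes, expressed in seconds
query_after_s = 3          # 3 seconds
etl_before_days = 3 * 7    # "about 3 weeks", assumed here to be 21 days
etl_after_days = 2

query_speedup = query_before_s / query_after_s
etl_speedup = etl_before_days / etl_after_days

print(f"query speedup: {query_speedup:.0f}x")
print(f"ETL speedup: {etl_speedup:.1f}x")
```

Under these assumptions the query figure comes out to exactly 600x, and the ETL figure to roughly 10x, matching the rounded values in the summary.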



Regeneron

Lukas Habegger

Associate Director of Bioinformatics


Databricks
