Databricks
A Databricks Case Study
Regeneron, a biopharmaceutical company working with genomic and clinical data from more than 400,000 people, struggled to turn massive, decentralized datasets into actionable insights. Its legacy stack could not scale to analyze more than 80 billion data points across a 10 TB cohort, so data teams spent days or weeks on ETL and could not run the end-to-end analyses needed for drug target discovery.
Deploying the Databricks Lakehouse on AWS, with automated cluster management, interactive workspaces, and performant Spark-powered pipelines, streamlined operations and accelerated analyses. Queries across the entire dataset dropped from 30 minutes to 3 seconds (600x), ETL time fell from about 3 weeks to 2 days (~10x), and teams can now support more studies and focus on discovering new therapies.
Lukas Habegger
Associate Director of Bioinformatics