Case Study: Biogen accelerates petabyte-scale genomic analysis and discovers drug targets with Databricks

A Databricks Case Study

Preview of the Biogen Case Study

Advancing disease therapies through cloud-based AI

Biogen, a biopharmaceutical company focused on neurological diseases, struggled to handle petabytes of genomics data from projects such as the UK Biobank. Its legacy on‑premises infrastructure lacked the storage and network capacity for this workload, which caused processing bottlenecks and even a week‑long HPC outage. As a result, Biogen could not run the large‑scale genotype‑phenotype analyses needed to prioritize drug targets and advance therapies.

Partnering with Databricks and DNAnexus, Biogen migrated to Databricks for Genomics on AWS and adopted Delta Lake, scalable ETL, and ML workflows. Processing times fell sharply: work that had taken two weeks for 700k variants gave way to annotating 2 million variants in about 15 minutes. The move also improved data partitioning and data security, accelerated discovery workflows, and helped identify two new drug targets for neurodegenerative diseases.



Biogen

David Sexton

Senior Director, Genome Technology and Informatics


Databricks
