Case Study: Genome Institute of Singapore achieves 15x faster joint-calling of 10,000 whole genomes with Databricks

A Databricks Case Study

Preview of the Genome Institute of Singapore Case Study

Combining whole genomes to improve patient outcomes

The Genome Institute of Singapore (GIS), part of A*STAR and a partner with PRECISE in the National Precision Medicine (NPM) program, needed to scale genomic variant discovery to support Singapore’s shift toward precision healthcare. Its existing workflows were too slow for joint calling at population scale, and GIS needed a way to process 10,000 whole genomes so that the data could reliably inform prevention and treatment strategies for diverse patient groups.

Using the Databricks Lakehouse Platform, with Delta Lake and open-source tools like Glow, GIS built a scalable, GATK‑compatible analytics pipeline accessed through interactive notebooks. The platform enabled joint calling of 10,000 genomes in under 72 hours, a roughly 15x improvement (from about 10 weeks to 3 days). A prototype was deployed in six weeks instead of six months, positioning GIS to expand to the next phase of sequencing 100,000 genomes.



Genome Institute of Singapore

Nicolas Bertin

Program Manager

