A Databricks Case Study
The Genome Institute of Singapore (GIS), part of A*STAR and a partner with PRECISE in the National Precision Medicine (NPM) program, needed to scale genomic variant discovery to support Singapore’s shift toward precision healthcare. Its existing workflows were too inefficient for joint calling at population scale, and GIS needed a way to process 10,000 whole genomes so that the data could reliably inform prevention and treatment strategies for diverse patient groups.
Using the Databricks Lakehouse Platform, together with Delta Lake and open-source tools such as Glow, GIS built a scalable, GATK-compatible analytics pipeline accessed through interactive notebooks. The platform enabled joint calling of 10,000 genomes in under 72 hours, a roughly 15x improvement (from about 10 weeks to 3 days). A prototype was deployed in six weeks instead of six months, positioning GIS to expand to the next phase: sequencing 100,000 genomes.
Nicolas Bertin
Program Manager