Google Cloud Platform
A Google Cloud Platform Case Study
Autism Speaks’ MSSNG project set out to sequence 10,000 whole genomes to advance autism research, but quickly faced a major data challenge: each genome generates 100–200 GB of raw data, pushing the project into petabyte scale and beyond the capacity of traditional academic partners. The team needed a secure, scalable way to store, process and share large, complex genomic and phenotypic datasets with researchers worldwide.
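As a rough check on the petabyte-scale claim, the arithmetic can be sketched as follows. This is only a back-of-the-envelope estimate: it assumes decimal units (1 PB = 1,000,000 GB) and applies the quoted 100–200 GB range uniformly to all 10,000 genomes. The variable and function names are illustrative and do not come from the MSSNG project.

```python
# Back-of-the-envelope estimate of raw data volume for the sequencing goal.
# Assumption: decimal units, 1 PB = 1,000,000 GB.
GENOMES = 10_000
GB_PER_GENOME_LOW = 100
GB_PER_GENOME_HIGH = 200
GB_PER_PB = 1_000_000


def total_petabytes(genomes: int, gb_per_genome: float) -> float:
    """Return the total raw data volume in petabytes."""
    return genomes * gb_per_genome / GB_PER_PB


low_pb = total_petabytes(GENOMES, GB_PER_GENOME_LOW)
high_pb = total_petabytes(GENOMES, GB_PER_GENOME_HIGH)
print(f"Estimated raw data: {low_pb:.0f}-{high_pb:.0f} PB")
```

Under these assumptions the full cohort comes to roughly 1 to 2 PB of raw data.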
Working with Google Genomics and Google Cloud Platform, MSSNG built a cloud-based pipeline and web portal that securely stores and processes sequencing data, exposes it through GA4GH APIs and BigQuery for interactive analysis, and imports phenotypic data for integrated studies. The team has already uploaded about 100 TB from more than 1,300 genomes, with thousands more queued. The platform has supported published discoveries, given qualified researchers instant global access, and freed staff to focus on science rather than infrastructure.
Stephen Scherer
MSSNG Program Director