Elastic
An Elastic Case Study
Merck, a global healthcare company with a 125‑year history, tasked the Scientific Computing group at Merck Research Laboratories with finding a better way to predict drug efficacy and safety earlier in discovery. Motivated by evidence that human genetic support doubles the probability of success for drug targets (Nelson et al., Nat Genet. 2015), the team needed a scalable, standardized system that could harmonize diverse genetic datasets and surface actionable genotype–phenotype insights for target selection.
The solution was Merck Genetics & Pharmacogenomics’ variant‑centric data platform: a harmonized pipeline that ingests GWAS, eQTL, allele‑frequency, genotype, and expression data from sources such as 1000 Genomes (1000G), GTEx, and ExAC, indexes them in Elasticsearch with optimized mappings, and exposes fast APIs and visualizations. Today the system stores about 3 TB of harmonized observations, spanning billions of documents (including roughly 1 billion GWAS records and 730 million allele‑frequency records). Analyses that once required multi‑day queries now run in one click, enabling genetics‑driven target prioritization that improves the probability of success and can lower drug development costs.
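To make the idea of a variant‑centric, harmonized document more concrete, here is a minimal Python sketch. It is not Merck's actual schema, which the case study does not publish: the field names, the `chrom:pos:ref:alt` key format, and the `variant_key` and `harmonize` helpers are all assumptions made for illustration. The mapping uses standard Elasticsearch field types (`keyword`, `long`, `double`) in the typeless layout of recent Elasticsearch versions.

```python
from typing import Any, Dict

# Hypothetical explicit mapping for a variant-centric index.
# Field names are illustrative; only the field types are real Elasticsearch types.
VARIANT_MAPPING: Dict[str, Any] = {
    "mappings": {
        "properties": {
            "variant_id": {"type": "keyword"},  # exact-match lookups, no text analysis
            "chromosome": {"type": "keyword"},
            "position": {"type": "long"},
            "source": {"type": "keyword"},      # e.g. "GWAS", "eQTL"
            "phenotype": {"type": "keyword"},
            "p_value": {"type": "double"},
        }
    }
}


def variant_key(chrom: Any, pos: Any, ref: str, alt: str) -> str:
    """Build a normalized key so records from different sources line up."""
    chrom = str(chrom).strip()
    if chrom.lower().startswith("chr"):
        chrom = chrom[3:]  # "chr1" and "1" should map to the same variant
    return f"{chrom}:{int(pos)}:{ref.upper()}:{alt.upper()}"


def harmonize(record: Dict[str, Any], source: str) -> Dict[str, Any]:
    """Convert one raw record (assumed keys: chrom, pos, ref, alt) to a document."""
    vid = variant_key(record["chrom"], record["pos"], record["ref"], record["alt"])
    doc: Dict[str, Any] = {
        "variant_id": vid,
        "chromosome": vid.split(":")[0],
        "position": int(record["pos"]),
        "source": source,
    }
    # Optional fields are copied only when the source provides them.
    if "phenotype" in record:
        doc["phenotype"] = record["phenotype"]
    if "p_value" in record:
        doc["p_value"] = float(record["p_value"])
    return doc
```

Under these assumptions, keying every observation by the same normalized variant identifier is what would let GWAS, eQTL, and allele‑frequency records for one variant be retrieved together with a single exact‑match query.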
Daniel Myung
Senior Engineer