Denodo Technologies
A Denodo Technologies Case Study
The National Institutes of Health (NIH), through the National Cancer Institute and the National Human Genome Research Institute, needed to share The Cancer Genome Atlas (TCGA) sequencing data with the International Cancer Genome Consortium (ICGC). Moving and reformatting hundreds of millions of rows from multiple sources (XML, Oracle, MySQL) into ICGC’s required formats with custom Perl scripts proved unscalable, costly to maintain, and error-prone.
NIH implemented data virtualization to connect directly to the source systems, apply TCGA-to-ICGC mappings, produce more than 50 final views (over 100 million rows), and run a quarterly FTP upload of CSV files. This eliminated redundant copies, sped up development, improved accuracy, and created reusable workflows that scaled across 25 cancer types. The approach was later extended to similar projects such as TARGET.
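As a rough illustration of the mapping-and-export step described above, the sketch below renames source columns to target columns and writes the result as CSV. It is not Denodo's API or NIH's actual workflow: the column names in `FIELD_MAP`, the sample rows, and the helper names `map_row` and `export_csv` are all hypothetical, since the case study does not list the real mappings.

```python
import csv
import io

# Hypothetical TCGA -> ICGC column mapping (assumption: the case study
# does not give the real field names).
FIELD_MAP = {
    "bcr_patient_barcode": "donor_id",
    "gender": "donor_sex",
    "tumor_type": "project_code",
}


def map_row(row, field_map=FIELD_MAP):
    """Rename mapped source columns to their target names; drop the rest."""
    return {target: row[source] for source, target in field_map.items()}


def export_csv(rows, field_map=FIELD_MAP):
    """Map each row on the fly and return CSV text with the target header."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(field_map.values()), lineterminator="\n"
    )
    writer.writeheader()
    for row in rows:
        writer.writerow(map_row(row, field_map))
    return buffer.getvalue()


# Made-up sample rows standing in for records read from a source system.
source_rows = [
    {"bcr_patient_barcode": "P-0001", "gender": "female",
     "tumor_type": "BRCA", "internal_flag": "x"},
    {"bcr_patient_barcode": "P-0002", "gender": "male",
     "tumor_type": "GBM", "internal_flag": "y"},
]

csv_text = export_csv(source_rows)
print(csv_text)
```

Keeping the mapping in a single declarative table, rather than scattered through script logic, is one way to get the kind of reuse the case study describes: supporting another cancer type or target format would mean supplying a different mapping, not rewriting the export code.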