Altoros
Altoros Case Study
QIAGEN, a global provider of sample and assay technologies for molecular diagnostics and next‑generation sequencing, needed to replace a legacy pyrosequencing analysis pipeline that could de-duplicate only ~1,000 samples and took hours or days to run. QIAGEN engaged Altoros to design a scalable, production-ready solution using Hadoop technologies (Cloudera CDH 5.2, HDFS, MapReduce) and modern analytics tools.
Altoros installed and configured a distributed Cloudera CDH cluster, built a MapReduce-based mini-framework with custom partitioners, created a converter to the Hadoop SequenceFile format, and proposed a Spark SQL–based reporting architecture. The result is a highly scalable system that processes 10,000+ DNA samples at a time (a 10x improvement over the legacy pipeline's ~1,000) and cuts analysis time from hours to minutes. The open-source reporting stack is also expected to save thousands in BI licensing costs.
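To illustrate why a custom partitioner matters for de-duplication, here is a minimal plain-Python sketch that simulates the shuffle and reduce phases in memory. It is not the Altoros implementation and does not use the Hadoop API. The function names (`partition_for`, `deduplicate`), the record layout (a sample ID paired with a sequence string), and the choice of the sample ID as the partition key are all assumptions made for illustration; the case study does not say how records were keyed.

```python
import zlib
from collections import defaultdict


def partition_for(sample_id: str, num_reducers: int) -> int:
    """Hypothetical partitioner: pick a reducer from the sample ID alone.

    Using a stable hash (CRC32) of the key means every record with the same
    sample ID is routed to the same reducer, on every run.
    """
    return zlib.crc32(sample_id.encode("utf-8")) % num_reducers


def deduplicate(records, num_reducers=4):
    """Simulate shuffle + reduce: drop exact duplicate (sample_id, sequence) pairs."""
    # "Shuffle": group records by the reducer the partitioner assigns them to.
    partitions = defaultdict(list)
    for sample_id, sequence in records:
        reducer = partition_for(sample_id, num_reducers)
        partitions[reducer].append((sample_id, sequence))

    # "Reduce": each partition de-duplicates independently. This is safe
    # because duplicates share a sample ID and so share a partition.
    unique = []
    for reducer in sorted(partitions):
        seen = set()
        for record in partitions[reducer]:
            if record not in seen:
                seen.add(record)
                unique.append(record)
    return unique


# Assumed toy input: one exact duplicate of ("s1", "ACGT").
records = [
    ("s1", "ACGT"),
    ("s2", "TTGA"),
    ("s1", "ACGT"),
    ("s1", "ACGA"),
]
result = deduplicate(records)
```

Because duplicates can only ever meet inside one partition, no reducer needs to see the whole data set, which is the property that lets this kind of job scale out by adding reducers.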