Case Study: Ancestry speeds up historical document processing with Comet

A Comet Case Study

Preview of the Ancestry Case Study

Leveraging MLOps to Speed Up Historical Document Processing

Ancestry, a global leader in family history and consumer genomics, faced the challenge of making the 1950 US Census searchable as quickly as possible. The release comprised 6.6 million scanned document images, which had to be processed rapidly and accurately to extract structured data. The stakes were high: failure would have meant falling back on slow, expensive manual transcription by vendors. The Ancestry data science team partnered with Comet because it needed a robust MLOps solution to manage this high-risk project, support collaboration, and maintain transparency across teams.

Using Comet's platform for experiment tracking and visualization, the team built a multi-model AI pipeline. The pipeline used deep learning for document layout extraction and specialized transformer models for handwriting recognition, with Comet used for logging, monitoring confidence scores, and collaborating with non-technical partners. Ancestry processed the entire census in just nine days, compared with nine months for a previous census, and extracted data for 171 million individuals, a massive improvement in efficiency and operational capability.



Ancestry

Stanley Fujimoto

Senior Data Scientist

