Case Study: V7 Labs and CloudFactory release an open, unbiased annotated X-ray dataset to accelerate COVID-19 lung research

A CloudFactory Case Study

Preview of the V7 Labs Case Study

V7 Labs & CloudFactory Release Annotated X-Ray Dataset to Aid in COVID-19 Research

V7 Labs wanted to study COVID-19 lung damage, but the available imaging data was biased or insufficiently detailed. The company collected 6,000 chest X-rays from multiple open-source datasets and wanted annotations that isolate lung tissue (removing ribs, heart, and diaphragm) so that models would not learn shortcuts tied to age, source, or machine, a problem that can mislead COVID-19 classification. V7 Labs engaged CloudFactory to help create a high-quality, unbiased annotated dataset using V7’s Darwin annotation tool.

CloudFactory trained its managed workforce in Nepal to combine AI-driven auto-labeling with precise human-led segmentation, producing lung-only masks and annotations that “greatly improve” classification performance. The annotated 6,000-image dataset was released for free on GitHub and can be imported directly into PyTorch and TensorFlow. Preliminary tests show that models can identify COVID-19 and other lung ailments, and CloudFactory’s work is expected to help clinicians triage severity and reduce bias in future lung-imaging research.
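As a rough illustration of what a lung-only mask does, the sketch below zeroes out every pixel outside a binary mask so that only lung tissue remains visible to a model. It is a minimal example using NumPy on synthetic arrays. The function name `apply_lung_mask` and the toy data are hypothetical, and it does not reflect the dataset's actual file format or V7's Darwin export.

```python
import numpy as np


def apply_lung_mask(xray: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Keep pixels where the mask is nonzero; set everything else to 0."""
    if xray.shape != mask.shape:
        raise ValueError("image and mask must have the same shape")
    return np.where(mask.astype(bool), xray, 0)


# Synthetic 4x4 "X-ray" (assumption: real images would be loaded from disk).
xray = np.arange(16, dtype=np.uint8).reshape(4, 4)

# Hypothetical mask marking the central 2x2 block as lung tissue.
mask = np.zeros((4, 4), dtype=np.uint8)
mask[1:3, 1:3] = 1

lung_only = apply_lung_mask(xray, mask)
```

Masking of this kind removes regions such as ribs, heart, and diaphragm from the input, so a classifier cannot lean on cues outside the lungs.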



V7 Labs

Alberto Rizzoli

Co-Founder

