Snorkel AI
A Snorkel AI Case Study
Georgetown University’s CSET, a policy research organization within Georgetown’s Walsh School of Foreign Service, needed a faster way to build NLP applications for classifying complex research documents and surfacing scientific articles of analytic interest. Manual labeling and a fragmented workflow spread across spreadsheets, Slack, and scripts made collaboration between data scientists and subject-matter experts slow and inefficient. CSET used Snorkel AI’s Snorkel Flow to accelerate programmatic labeling and improve that collaboration.
With Snorkel Flow, CSET created 60+ labeling functions (LFs) to programmatically label 107K data points, used advanced techniques like auto-suggest and cluster LFs, and improved model quality through active learning and guided error analysis. The team reached 85% precision on the positive class in just a few days, an eight-percentage-point improvement over the earlier open-source approach, while significantly reducing labeling time and speeding model development.
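To make the idea of programmatic labeling concrete, here is a minimal sketch in plain Python. It is not Snorkel Flow's API and not CSET's actual labeling functions: the keyword rules, the function names, and the simple majority-vote aggregation are all hypothetical stand-ins chosen for illustration. It shows how several small heuristics can each vote on a document, how their votes can be combined into one label, and how precision on the positive class is computed.

```python
from collections import Counter

# Label values used throughout this sketch (an assumed convention).
ABSTAIN, NEGATIVE, POSITIVE = -1, 0, 1

# Hypothetical labeling functions: each inspects a document and either
# votes for a class or abstains. The keywords are illustrative only.
def lf_mentions_neural_network(text):
    return POSITIVE if "neural network" in text.lower() else ABSTAIN

def lf_mentions_machine_learning(text):
    return POSITIVE if "machine learning" in text.lower() else ABSTAIN

def lf_mentions_clinical_trial(text):
    return NEGATIVE if "clinical trial" in text.lower() else ABSTAIN

LABELING_FUNCTIONS = [
    lf_mentions_neural_network,
    lf_mentions_machine_learning,
    lf_mentions_clinical_trial,
]

def majority_vote(text, lfs=LABELING_FUNCTIONS):
    """Combine labeling-function votes; abstain on no votes or a tie.

    Majority vote is a simple stand-in for a learned aggregation step.
    """
    votes = [v for v in (lf(text) for lf in lfs) if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    ranked = Counter(votes).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN
    return ranked[0][0]

def positive_precision(predicted, gold):
    """Precision on the positive class: TP / (TP + FP)."""
    pairs = list(zip(predicted, gold))
    tp = sum(1 for p, g in pairs if p == POSITIVE and g == POSITIVE)
    fp = sum(1 for p, g in pairs if p == POSITIVE and g != POSITIVE)
    return tp / (tp + fp) if (tp + fp) else 0.0
```

In a workflow like the one described, the labels produced this way would serve as training data for a downstream classifier, and positive-class precision would be measured against a small hand-labeled evaluation set.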
Catherine Aiken
Director of Data Science and Research