A Kaggle Case Study
Merck ran the Merck Molecular Activity Challenge, a $40,000 Kaggle competition to improve QSAR-style (quantitative structure–activity relationship) prediction of molecule–target activity. Competitors worked with 15 diverse datasets, each containing chemical structures and activity measurements for thousands of compounds. The goal was to identify molecules likely to be active against therapeutic targets while avoiding activity on targets that cause side effects. This was a hard problem because each dataset had different characteristics and units.
A team led by graduate student George Dahl applied a deep learning model adapted from speech recognition. The approach required little domain-specific feature engineering and achieved a 17% improvement over an industry benchmark. The team's entry won against a field of nearly 3,000 submissions made over 60 days and was the first Kaggle victory for deep learning, showing that neural networks can significantly accelerate computer-aided drug discovery.