Case Study: Carlos III University of Madrid analyzes bots across website categories with WhoisXML API's Website Categorization API

A WhoisXML API Case Study

Preview of the Carlos III University of Madrid Case Study

Carlos III University of Madrid and WhoisXML API: Analyzing the Presence of Bots across Website Categories

Carlos III University of Madrid researcher Sergio Diaz, a Master in Cybersecurity student, set out to analyze robots.txt files from the Tranco top 100,000 domains to understand which crawlers and bots were blocked or allowed across different website categories. Before he could correlate domain categories with bot behavior, he needed a fast way to obtain accurate website classifications and WHOIS data. His initial AI-based categorization approach proved too slow to be practical at that scale, so he used WhoisXML API’s Website Categorization API for the study.

With WhoisXML API’s Website Categorization API, Sergio Diaz quickly classified all 100,000 domains into clear categories and saved the results in JSON for analysis. WhoisXML API’s accurate, well-parsed outputs helped him identify which website categories were more likely to allow or disallow certain crawlers, making the research more comprehensive and reliable. The measurable impact was the successful classification of 100,000 domains and a detailed correlation between domain categories and bot activity.
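The correlation step described above can be illustrated with a minimal sketch. It is not the researcher's actual code: it assumes the robots.txt contents and the domain-to-category mapping (for example, loaded from the saved JSON results) are already in memory, and it uses Python's standard `urllib.robotparser` module to decide whether a given crawler is blocked at the site root. The function names, the example domains, the category labels, and the `GPTBot` user agent are all hypothetical placeholders.

```python
from collections import defaultdict
from urllib.robotparser import RobotFileParser


def is_blocked(robots_txt: str, user_agent: str) -> bool:
    """Return True if this robots.txt disallows user_agent from fetching "/"."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, "/")


def block_rate_by_category(robots_by_domain, category_by_domain, user_agent):
    """Fraction of domains in each category that block user_agent at the root.

    robots_by_domain:   {domain: robots.txt text}   (assumed already downloaded)
    category_by_domain: {domain: category label}    (assumed already classified)
    """
    blocked = defaultdict(int)
    total = defaultdict(int)
    for domain, robots_txt in robots_by_domain.items():
        category = category_by_domain.get(domain)
        if category is None:
            continue  # skip domains with no classification
        total[category] += 1
        if is_blocked(robots_txt, user_agent):
            blocked[category] += 1
    return {category: blocked[category] / total[category] for category in total}


# Hypothetical sample data, for illustration only.
sample_robots = {
    "news.example": "User-agent: GPTBot\nDisallow: /\n",
    "daily.example": "User-agent: *\nAllow: /\n",
    "shop.example": "User-agent: *\nAllow: /\n",
}
sample_categories = {
    "news.example": "News",
    "daily.example": "News",
    "shop.example": "Shopping",
}
rates = block_rate_by_category(sample_robots, sample_categories, "GPTBot")
```

Checking only the root path is a simplification; a fuller analysis could also record partial disallow rules per crawler.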


