AI web scrapers: A data point
In other words, a lot of these bots are checking for a robots.txt file. When they see one, they jumble up their user-agent and keep going. Yesterday's user-agent file had 641 unique user-agents; today's had almost 18,000. It would be hilarious if it weren't assholes destroying the Internet for speculative profit.
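If you want to run the same tally on your own logs, here's a rough sketch (not my actual script) that counts unique user-agents per day, assuming a combined-format access log; the path is a placeholder:

    from collections import defaultdict

    # Rough sketch: count unique user-agents per day from a combined-format
    # access log. The path is a placeholder; point it at your own log.
    LOG_PATH = "/var/log/nginx/access.log"

    agents_by_day = defaultdict(set)
    with open(LOG_PATH, encoding="utf-8", errors="replace") as f:
        for line in f:
            parts = line.split('"')
            if len(parts) < 6:
                continue  # not a combined-format line
            # parts[0] looks like: 1.2.3.4 - - [18/Jun/2025:12:34:56 +0000]
            day = parts[0].split("[", 1)[-1].split(":", 1)[0]
            agents_by_day[day].add(parts[5])

    for day, agents in sorted(agents_by_day.items()):
        print(f"{day}: {len(agents)} unique user-agents")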
The numbers also imply that a lot of bots don't do this -- they ignore robots.txt entirely. (Which is what I expected.) Scrapy and Amazonbot look most prone to ignoring it. In contrast, appearances of GPTBot and ClaudeBot dropped way off.
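A companion sketch, under the same assumptions as above, for spotting the ignorers: tally requests per user-agent and flag the busy ones that never asked for robots.txt at all.

    from collections import defaultdict

    # Same assumptions: combined-format log, placeholder path. For each
    # user-agent, count requests and note whether it ever fetched /robots.txt.
    LOG_PATH = "/var/log/nginx/access.log"

    requests_by_agent = defaultdict(int)
    fetched_robots = set()

    with open(LOG_PATH, encoding="utf-8", errors="replace") as f:
        for line in f:
            parts = line.split('"')
            if len(parts) < 6:
                continue
            request, user_agent = parts[1], parts[5]
            requests_by_agent[user_agent] += 1
            if " /robots.txt" in request:
                fetched_robots.add(user_agent)

    # The busiest user-agents that never requested robots.txt.
    ignorers = {ua: n for ua, n in requests_by_agent.items() if ua not in fetched_robots}
    for ua, n in sorted(ignorers.items(), key=lambda kv: -kv[1])[:20]:
        print(f"{n:8d}  {ua}")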