What is Cloudflare?
Cloudflare is an American company that offers a range of internet services. These include content delivery network (CDN) services, cybersecurity services, DDoS (distributed denial-of-service) mitigation and domain name services. Its global network is designed to make websites faster and safer.
Recently the company announced that it will block AI crawlers by default on any new customer site that uses its services. Since last year, it has offered customers the ability to block AI bots from crawling their sites completely, or to pick and choose which ones are allowed. Blocking by default means that blocking becomes something site owners opt out of, rather than something they opt in to.
What is Cloudflare pay-per-crawl?
Cloudflare pay-per-crawl is a new option that content publishers can enable to stop AI crawlers from freely scraping their sites for information. With pay-per-crawl enabled, a crawler that attempts to access the site is shown pricing, informing it that a charge is required to access the content.
Cloudflare explains it like this:
Pay per crawl integrates with existing web infrastructure, leveraging HTTP status codes and established authentication mechanisms to create a framework for paid content access. Each time an AI crawler requests content, they either present payment intent via request headers for successful access (HTTP response code 200), or receive a 402 Payment Required response with pricing. Cloudflare acts as the Merchant of Record for pay per crawl and also provides the underlying technical infrastructure.
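To make the flow in that description concrete, here is a minimal sketch of how a crawler might react to the two outcomes. It is illustrative only: the header name `crawler-price`, the `USD 0.01` price format and the `decide` function are assumptions made for this example, not details confirmed by the passage quoted above.

```python
from decimal import Decimal

# Assumed header name and value format ("USD 0.01"), for illustration only.
PRICE_HEADER = "crawler-price"


def parse_price(value: str) -> Decimal:
    """Parse a price such as 'USD 0.01' and return the amount."""
    currency, amount = value.split()
    if currency != "USD":
        raise ValueError(f"unsupported currency: {currency}")
    return Decimal(amount)


def decide(status: int, headers: dict[str, str], max_price: Decimal) -> str:
    """Decide what a crawler does next, given a response and its budget."""
    if status == 200:
        # Access was granted, so the content can be used.
        return "fetched"
    if status == 402:
        quoted = headers.get(PRICE_HEADER)
        if quoted is None:
            # Payment is required but no price was quoted.
            return "skip"
        # Retry with payment intent only if the price is within budget.
        if parse_price(quoted) <= max_price:
            return "retry_with_payment"
        return "skip"
    # Any other status is left alone in this sketch.
    return "skip"
```

The crawler's budget is passed in as `max_price`, so the decision to pay stays with the crawler operator while the publisher sets the price.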
What does Cloudflare pay-per-crawl mean for businesses?
This option from Cloudflare gives control of content back to publishers, allowing them to vet who can and can’t access their content and who should have to pay to access it. Publishers regain the power they lost when AI systems could freely access what they published, in a bid to protect authentic, reliable content.
The system is now in beta, with a small number of businesses trialling the scheme. Cloudflare will give content publishers more options over how their content is accessed. By partnering with AI companies, Cloudflare will be able to determine what AI bots are crawling for. Through this verification process, content publishers will have the option to enforce a charge on all AI bots, or only on some of them, for example those crawling for content generation. AI bots crawling for training, learning and general search purposes may be allowed access.
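As a rough sketch of what such a per-purpose choice could look like, the example below maps a crawler's declared purpose to an action. The purpose labels, the `POLICY` table and the `action_for` function are hypothetical names invented for illustration, and blocking unverified or unknown crawlers is an assumption rather than something Cloudflare has described.

```python
from enum import Enum


class Action(Enum):
    ALLOW = "allow"
    CHARGE = "charge"
    BLOCK = "block"


# Hypothetical publisher policy: charge only for content generation.
POLICY = {
    "training": Action.ALLOW,
    "search": Action.ALLOW,
    "content_generation": Action.CHARGE,
}


def action_for(purpose: str, verified: bool) -> Action:
    """Return the action for a crawler with the given declared purpose."""
    if not verified:
        # Assumption: a crawler whose purpose cannot be verified is blocked.
        return Action.BLOCK
    # Assumption: purposes missing from the policy are blocked as well.
    return POLICY.get(purpose, Action.BLOCK)
```

A publisher that wanted to charge every AI bot would simply set every entry in the table to `Action.CHARGE`.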
What does Cloudflare pay-per-crawl mean for AI crawlers?
With 20% of the internet using Cloudflare, the impact on AI crawlers that rely on the web for training data could be significant. If enough of those sites enabled the pay-per-crawl option, crawlers could lose free access to as much as 20% of the information they rely on to train and learn from.
Whilst there are already some methods in place to block crawlers, such as robots.txt rules, they aren’t always followed. The robots.txt method works by placing a text file at the root of your website that lists which crawlers are disallowed from which parts of the site. In principle, this lets crawlers see what they can and can’t access on that specific site, but following the rules is voluntary.
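The short sketch below shows how a well-behaved crawler might check robots.txt rules using Python's standard `urllib.robotparser` module. The crawler name `ExampleAIBot`, the `example.com` URLs and the rules themselves are made up for illustration. The check runs inside the crawler, which is why a crawler that chooses to ignore the file is not stopped by it.

```python
from urllib.robotparser import RobotFileParser

# Example rules: block one (hypothetical) AI crawler from the whole site,
# and keep every other crawler out of /private/ only.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
# Parse the rules from memory instead of fetching them over the network.
parser.parse(ROBOTS_TXT.splitlines())

# The named AI crawler is disallowed everywhere.
ai_allowed = parser.can_fetch("ExampleAIBot", "https://example.com/articles/post")

# Other crawlers may read public pages but not the private section.
other_public = parser.can_fetch("OtherBot", "https://example.com/articles/post")
other_private = parser.can_fetch("OtherBot", "https://example.com/private/report")

print(ai_allowed, other_public, other_private)
```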
Conclusion
This action from Cloudflare, blocking AI crawlers by default, is a bold move. Websites using Cloudflare will have to seriously consider whether they want to allow AI crawlers to access their sites going forward and the implications either way.
The pay-per-crawl option gives power back to content publishers through paid access and essentially provides a new revenue stream. It seems like it could be one answer to ensuring the longevity of these platforms in the new AI world.
If you would like more information on our services, or have any thoughts on this article you would like to share, please send us a message.