Should my business block AI crawlers?
For most businesses that sell something, no: blocking AI crawlers makes you invisible to the exact systems your buyers now ask for recommendations. For publishers whose content IS the product, the calculus genuinely differs, and blocking can be rational. The mistake is treating this as one decision. It is several: different bots do different jobs, and robots.txt lets you choose per bot. Here is the honest trade, both sides, and a framework instead of a slogan.
- Blocking = invisible to AI recommendations. That is the price.
- For businesses selling products/services: the price is too high.
- For publishers monetizing content itself: sometimes rational.
- Decide per bot: search/citation bots vs training-only bots differ.
Why so many sites blocked, and what they were buying
When AI crawlers appeared, thousands of sites disallowed them on principle: content scraped for training, no permission, no payment, no clicks back. For news organizations and paywalled publishers, that position is coherent: their content is the product, and a system that ingests it and answers readers directly is a competitor wearing their work. If that is you, blocking (or negotiating licensing) is a legitimate strategy, and nothing in this article argues otherwise.
What blocking costs everyone else
Now the other side, the one that applies if you sell services, products, or expertise rather than content itself. Your buyers ask ChatGPT and Perplexity who to hire. Those systems answer from what their crawlers can read. Block GPTBot and OAI-SearchBot and you have protected nothing a service business needs to protect (your marketing pages were written to be read), but you have removed yourself from the answer. Your competitor who stayed readable inherits your citations. It is the only marketing decision I know where the downside is total, invisible, and free for your rival.
Is your content the product you sell, or the marketing for the product you sell? Product: blocking can be rational. Marketing: blocking is self-erasure.
Deciding per bot, the part everyone skips
Robots.txt is per-agent, and the agents differ in what your blocking actually withholds:
- Search and citation bots (OAI-SearchBot, ChatGPT-User, PerplexityBot) power live answers that name and link sources. Blocking these removes you from recommendations, the core GEO surface. For any business that wants buyers, leave them open.
- Training-oriented agents (GPTBot, ClaudeBot, and Google-Extended, which governs AI training use rather than Google Search) mainly affect future model knowledge. Blocking them is the philosophically cleaner protest, but even here a business wanting AI visibility benefits from being in the models' baseline picture of its category.
- The nuance that bites: some access paths overlap, and the ecosystem shifts. A middle position exists (open the citation bots, restrict training-only agents), and it is the reasonable compromise for the genuinely conflicted.
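As a rough illustration of that middle position, here is a minimal Python sketch that writes a per-bot robots.txt and then checks it with the standard library's urllib.robotparser. The bot grouping is copied from the list above and should be treated as an assumption to re-verify against each vendor's documentation; the helper names build_robots_txt and readable_by are made up for this example.

```python
from urllib.robotparser import RobotFileParser

# Grouping taken from the list above. Bot roles can change, so these two
# lists are assumptions to re-check, not a definitive classification.
CITATION_BOTS = ["OAI-SearchBot", "ChatGPT-User", "PerplexityBot"]
TRAINING_BOTS = ["GPTBot", "ClaudeBot", "Google-Extended"]


def build_robots_txt(allow, block):
    """Return robots.txt text with one group per named bot (hypothetical helper)."""
    lines = []
    for bot in allow:
        lines += [f"User-agent: {bot}", "Allow: /", ""]
    for bot in block:
        lines += [f"User-agent: {bot}", "Disallow: /", ""]
    # Every agent without its own group stays unrestricted.
    lines += ["User-agent: *", "Allow: /"]
    return "\n".join(lines) + "\n"


def readable_by(robots_txt, bot, path="/"):
    """Ask Python's robots.txt parser whether `bot` may fetch `path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(bot, path)


# The middle position: citation bots open, training-oriented agents restricted.
middle = build_robots_txt(allow=CITATION_BOTS, block=TRAINING_BOTS)
print(middle)
for bot in CITATION_BOTS + TRAINING_BOTS:
    print(f"{bot:16} readable: {readable_by(middle, bot)}")
```

Keeping one group per bot makes it easy to move an agent from one list to the other when its role changes. The "open them all" position is the same call with every bot in the allow list and an empty block list.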
My position, stated plainly
For service businesses, consultancies, local businesses, product companies, anyone whose site exists to win customers: open them all. The training question is real but abstract for you; the recommendation question is concrete and immediate, and being absent from AI answers costs actual pipeline today. My own robots.txt welcomes every AI crawler by name, which you can verify by running the scan or by reading the file. I practice the position because the trade is not close for a business like mine, or, probably, like yours. And whichever way you decide, decide on purpose: check what the AI currently says about you first, because plenty of sites started blocking years ago through a template default and never knew.
Common questions
Does blocking GPTBot remove my business from ChatGPT?
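Not by itself. GPTBot is the training-oriented crawler, so blocking it alone mainly withholds your pages from future model knowledge. Block OAI-SearchBot and ChatGPT-User as well and ChatGPT loses the ability to read your site, which guts accurate, current citations of you. The model may retain older knowledge and can still encounter third-party pages about you, so you degrade into being described secondhand, stalely, or not at all. For a business wanting buyers, all three outcomes are losses.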
How do I check if my site is blocking AI crawlers?
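Open yourdomain.com/robots.txt and look for Disallow rules under User-agent lines naming GPTBot, PerplexityBot, ClaudeBot, OAI-SearchBot or Google-Extended. Check the User-agent: * group too, since a blanket Disallow there applies to any bot that has no group of its own. Or run my free scan, which checks exactly this bot by bot. Many sites block via old templates without ever having decided to.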
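The manual check can also be scripted. This is a minimal sketch using Python's standard urllib.robotparser; the audit function and the SAMPLE file are hypothetical, the bot list simply repeats the names used in this article, and the check covers only the homepage path.

```python
from urllib.robotparser import RobotFileParser

# Bot names as used in this article; extend or trim as the ecosystem shifts.
AI_BOTS = [
    "GPTBot", "OAI-SearchBot", "ChatGPT-User",
    "PerplexityBot", "ClaudeBot", "Google-Extended",
]


def audit(robots_txt, bots=AI_BOTS, path="/"):
    """Map each bot name to True (may fetch `path`) or False (blocked)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {bot: parser.can_fetch(bot, path) for bot in bots}


# Hypothetical file of the kind an old template might have shipped.
SAMPLE = """\
User-agent: GPTBot
User-agent: ClaudeBot
Disallow: /

User-agent: *
Disallow: /admin/
"""

report = audit(SAMPLE)
for bot, allowed in report.items():
    print(f"{bot:16} {'allowed' if allowed else 'BLOCKED'}")
```

To audit a live site, paste the contents of its robots.txt in place of SAMPLE, or construct RobotFileParser with the file's URL and call read() before can_fetch(). Python's parser uses simple prefix matching, so treat the result as a first pass; it also says nothing about blocks applied outside robots.txt.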
Can I block AI training but stay in AI search results?
Approximately, and it is the reasonable middle: allow citation-oriented agents like OAI-SearchBot, ChatGPT-User and PerplexityBot while disallowing training-oriented ones like GPTBot and Google-Extended. The separation is not perfectly clean and the ecosystem shifts, so revisit the file periodically.
Not sure what your robots.txt is doing right now?
Scan your site free. It checks every AI crawler by name and tells you exactly who can and cannot read you today.