| OAI-SearchBot | OpenAI | Answer engine | OAI-SearchBot | Yes | Surfaces pages in ChatGPT Search results and citations. Allow it if you want your pages to be eligible for ChatGPT citations. |
| ChatGPT-User | OpenAI | User action | ChatGPT-User | Yes | Fetches a page when a ChatGPT user opens or asks about a specific link. |
| Claude-SearchBot | Anthropic | Answer engine | Claude-SearchBot | Yes | Indexes content Claude can cite when answering with web search. |
| Claude-User | Anthropic | User action | Claude-User | Yes | Fetches a page on behalf of a Claude user request. |
| PerplexityBot | Perplexity | Search index | PerplexityBot | Disputed | Builds Perplexity’s index. Independent tests have reported crawling despite disallow rules — verify in your logs. |
| Perplexity-User | Perplexity | User action | Perplexity-User | No (user-initiated) | Fetches a page when a user asks; Perplexity states user-initiated requests are not bound by robots.txt. |
| Googlebot | Google | Search index | Googlebot | Yes | Powers Google Search and AI Overviews. AI Overviews draw from the normal Google index — blocking Googlebot removes you from both. |
| Google-Extended | Google | Model training | Google-Extended | Yes (control token) | Opt-out token for Gemini/Vertex training. Disallowing it does NOT affect Search or AI Overviews inclusion. |
| Bingbot | Microsoft | Search index | Bingbot | Yes | Bing’s index feeds Microsoft Copilot and is reportedly a major source for ChatGPT Search, so Bing presence is part of the GEO pipeline. |
| DuckAssistBot | DuckDuckGo | Answer engine | DuckAssistBot | Yes | Powers DuckDuckGo’s DuckAssist AI answers. |
| Applebot | Apple | Search index | Applebot | Yes | Powers Siri and Spotlight suggestions. |
| Applebot-Extended | Apple | Model training | Applebot-Extended | Yes (control token) | Opt-out token for Apple AI training; does not affect Siri/Spotlight search inclusion. |
| GPTBot | OpenAI | Model training | GPTBot | Yes | Collects data to train OpenAI models. Separate from the citation/search bots above. |
| ClaudeBot | Anthropic | Model training | ClaudeBot | Yes | Collects data to train Anthropic models. |
| CCBot | Common Crawl | Model training | CCBot | Yes | Open web archive widely used as a training corpus by many AI labs. |
| Amazonbot | Amazon | Model training | Amazonbot | Yes | Crawls for Amazon AI products and Alexa answers. |
| Meta-ExternalAgent | Meta | Model training | Meta-ExternalAgent | Yes | Meta’s AI training crawler (Llama and Meta AI). |
| Bytespider | ByteDance | Model training | Bytespider | No (reported) | Aggressive crawler for ByteDance AI; frequently reported ignoring robots.txt — block at the edge if needed. |
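
The table's split between training tokens and search or answer-engine tokens can be turned into a robots.txt policy and checked before deployment. The following is a minimal sketch, assuming a hypothetical policy of "stay citable, opt out of training"; the helper names `build_robots_txt` and `allowed_agents` and the `https://example.com/guide` URL are illustrative placeholders, not part of any vendor's tooling. It uses Python's standard `urllib.robotparser`, whose matching rules may differ from how an individual crawler interprets the file, and it says nothing about crawlers the table marks as disputed or non-compliant.

```python
from urllib.robotparser import RobotFileParser

# Assumption: tokens copied from the table's "Model training" rows.
TRAINING_TOKENS = [
    "GPTBot", "ClaudeBot", "CCBot", "Google-Extended",
    "Applebot-Extended", "Amazonbot", "Meta-ExternalAgent", "Bytespider",
]

# Assumption: tokens from the search-index and answer-engine rows we want to keep.
CITATION_TOKENS = [
    "OAI-SearchBot", "Claude-SearchBot", "PerplexityBot",
    "Googlebot", "Bingbot", "DuckAssistBot", "Applebot",
]


def build_robots_txt(blocked_tokens):
    """Return robots.txt text that disallows each blocked token site-wide."""
    groups = [f"User-agent: {token}\nDisallow: /" for token in blocked_tokens]
    # Every agent not named above falls through to this catch-all group.
    groups.append("User-agent: *\nAllow: /")
    return "\n\n".join(groups) + "\n"


def allowed_agents(robots_txt, agents, url="https://example.com/guide"):
    """Map each agent token to whether the parsed rules permit fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


if __name__ == "__main__":
    robots_txt = build_robots_txt(TRAINING_TOKENS)
    print(robots_txt)
    report = allowed_agents(robots_txt, CITATION_TOKENS + TRAINING_TOKENS)
    for agent, allowed in report.items():
        print(f"{agent}: {'allowed' in str(allowed) or allowed}")
```

A check like this only confirms what the file says. For rows marked "Disputed" or "No (reported)", such as PerplexityBot and Bytespider, server logs or edge rules remain the way to verify and enforce actual behavior.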