IP Ranges & User Agents
Complete reference of AI bot identifiers used by AI Search Index for detection and verification.
AI Training Crawlers
Crawlers used to collect data for training AI models:
| Bot | Provider | User Agent Pattern |
|---|---|---|
| GPTBot | OpenAI | GPTBot/1.0 |
| ClaudeBot | Anthropic | ClaudeBot/1.0 |
| Google-Extended | Google | Google-Extended |
| DeepSeekBot | DeepSeek | DeepSeekBot |
| xAI-Grok | xAI | xAI-Grok |
| CCBot | Common Crawl | CCBot/2.0 |
| Bytespider | ByteDance | Bytespider |
| cohere-ai | Cohere | cohere-ai |
| Applebot-Extended | Apple | Applebot-Extended |
| QwenBot | Alibaba | QwenBot |
| HuggingFaceBot | Hugging Face | HuggingFaceBot |
| AI2Bot | Allen AI | AI2Bot |
AI Search & Chat Bots
Crawlers that fetch content in real-time to answer user queries:
| Bot | Provider | User Agent Pattern |
|---|---|---|
| ChatGPT-User | OpenAI | ChatGPT-User/1.0 |
| ChatGPT-Agent | OpenAI | ChatGPT-Agent (+ Signature-Agent header) |
| OAI-SearchBot | OpenAI | OAI-SearchBot/1.0 |
| OpenAI-Operator | OpenAI | OpenAI-Operator |
| Claude-Web | Anthropic | Claude-Web/1.0 |
| PerplexityBot | Perplexity | PerplexityBot |
| MistralAI-User | Mistral | MistralAI-User |
| Gemini-Deep-Research | Google | Gemini-Deep-Research |
| Meta-ExternalAgent | Meta | meta-externalagent |
| Copilot | Microsoft | Copilot |
| Genspark-Webagent | Genspark | Genspark-Webagent/1.0 |
| ArcSearch | The Browser Company | ArcSearch/1.0 |
| YouBot | You.com | YouBot |
| KagiBot | Kagi | KagiBot |
| Brave-Leo | Brave | Brave-Leo |
| PhindBot | Phind | PhindBot |
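As a rough illustration of how the patterns in the two tables above might be applied, the sketch below matches a request's User-Agent header against a small subset of the listed tokens. The function name `classify_user_agent`, the `training`/`search` labels, and the choice of case-insensitive substring matching are assumptions made for this example, not a description of how AI Search Index matches internally.

```python
from typing import Optional

# A subset of the tokens from the tables above, mapped to (provider, category).
# Case-insensitive substring matching is an assumption of this sketch.
AI_BOT_TOKENS = {
    "GPTBot": ("OpenAI", "training"),
    "ClaudeBot": ("Anthropic", "training"),
    "CCBot": ("Common Crawl", "training"),
    "ChatGPT-User": ("OpenAI", "search"),
    "OAI-SearchBot": ("OpenAI", "search"),
    "PerplexityBot": ("Perplexity", "search"),
    "meta-externalagent": ("Meta", "search"),
}


def classify_user_agent(user_agent: str) -> Optional[tuple[str, str, str]]:
    """Return (bot, provider, category) for the first matching token, else None."""
    ua = user_agent.lower()
    for token, (provider, category) in AI_BOT_TOKENS.items():
        if token.lower() in ua:
            return token, provider, category
    return None


if __name__ == "__main__":
    print(classify_user_agent("Mozilla/5.0 (compatible; GPTBot/1.0; +https://openai.com/gptbot)"))
```

A match here only says what the client claims to be. User agents are trivially spoofed, which is why a UA-only match is rated lower than IP or signature verification.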
Official IP Range Sources
We automatically fetch and validate IP ranges from these official sources:
OpenAI
Official JSON endpoints for GPTBot, OAI-SearchBot, and ChatGPT-User:
Perplexity
Official JSON endpoints for PerplexityBot and Perplexity-User:
Anthropic (Claude)
Documented IP ranges from Anthropic's official documentation:
docs.anthropic.com/en/api/ip-addresses
IPv4: 160.79.104.0/23, 160.79.104.0/21
IPv6: 2607:6bc0::/48
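Checking a client address against published ranges can be sketched with Python's standard `ipaddress` module. The ranges below are the ones listed for Anthropic above; the helper name `ip_in_ranges` is hypothetical, and a real deployment would load ranges from the official sources rather than hard-coding them.

```python
import ipaddress

# Ranges copied from the Anthropic entry above (hard-coded for illustration only).
ANTHROPIC_RANGES = [
    ipaddress.ip_network("160.79.104.0/23"),
    ipaddress.ip_network("160.79.104.0/21"),
    ipaddress.ip_network("2607:6bc0::/48"),
]


def ip_in_ranges(ip: str, networks) -> bool:
    """Return True if `ip` parses as an address and falls inside any network."""
    try:
        addr = ipaddress.ip_address(ip)
    except ValueError:
        # Not a valid IPv4 or IPv6 address.
        return False
    return any(addr.version == net.version and addr in net for net in networks)


if __name__ == "__main__":
    print(ip_in_ranges("160.79.105.10", ANTHROPIC_RANGES))
```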
Google
Google crawler verification documentation:
developers.google.com/search/docs/crawling-indexing/verifying-googlebot
HTTP Message Signatures
Some AI providers use cryptographic signatures (RFC 9421) for high-assurance bot verification:
ChatGPT-Agent (OpenAI)
ChatGPT's agentic features send signed requests with these headers:
Signature-Agent: "https://chatgpt.com"
Signature-Input: sig1=("@method" "@path"...);keyid="..."
Signature: sig1=:base64-encoded-signature:
Public key for verification: platform.openai.com/.well-known/agent-signing-key
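Before any cryptographic check, a server can test whether a request even carries the three headers shown above and pull out the signature label, covered components, and `keyid`. The sketch below does only that. It does not verify the signature, its regular expressions are a simplification rather than a full structured-field parser, and the function name `parse_signed_request` and the sample values in the usage are made up for illustration.

```python
import re
from typing import Optional

EXPECTED_AGENT = "https://chatgpt.com"

# Simplified pattern for: label=("comp1" "comp2");param=...
# It does not handle component parameters or multiple signatures.
_SIG_INPUT_RE = re.compile(
    r'^\s*(?P<label>[a-z*][a-z0-9_.*-]*)=\((?P<components>[^)]*)\)(?P<params>.*)$'
)
_KEYID_RE = re.compile(r';\s*keyid="(?P<keyid>[^"]*)"')


def parse_signed_request(headers: dict[str, str]) -> Optional[dict]:
    """Extract label, covered components, and keyid from a request that claims
    to be signed by ChatGPT-Agent. Returns None if the headers are missing or
    malformed. This does NOT verify the signature itself."""
    h = {k.lower(): v for k, v in headers.items()}
    agent = h.get("signature-agent", "").strip().strip('"')
    sig_input = h.get("signature-input")
    signature = h.get("signature")
    if agent != EXPECTED_AGENT or not sig_input or not signature:
        return None
    m = _SIG_INPUT_RE.match(sig_input)
    if m is None:
        return None
    label = m.group("label")
    # The Signature header must carry a byte sequence under the same label.
    if not signature.strip().startswith(label + "=:"):
        return None
    keyid = _KEYID_RE.search(m.group("params"))
    return {
        "label": label,
        "components": re.findall(r'"([^"]+)"', m.group("components")),
        "keyid": keyid.group("keyid") if keyid else None,
    }


if __name__ == "__main__":
    # Hypothetical header values, for illustration only.
    print(parse_signed_request({
        "Signature-Agent": '"https://chatgpt.com"',
        "Signature-Input": 'sig1=("@method" "@path");keyid="example-key"',
        "Signature": "sig1=:ZXhhbXBsZQ==:",
    }))
```

A request that passes this check has only claimed to be signed. It earns the highest confidence level only after the signature is verified against the provider's published public key, as RFC 9421 describes.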
Detection Confidence Levels
We assign confidence levels based on how bots are identified:
| Method | Confidence | Description |
|---|---|---|
| HTTP Signature | Highest | Cryptographically verified (RFC 9421) |
| IP Range Match | High | IP matches official published ranges |
| Reverse DNS | High | Reverse DNS lookup of the IP resolves to a hostname on the provider's domain |
| User Agent Only | Medium | UA matches pattern, IP unverified |
| Client Fingerprint | Medium | Browser automation indicators detected |
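One way to read the table above is as a precedence order: the strongest signal present determines the reported confidence. The sketch below encodes that reading. The `DetectionSignals` type, the `confidence_level` function, and the `"None"` fallback are hypothetical, and the table does not say how multiple signals are combined, so the precedence rule is an assumption.

```python
from dataclasses import dataclass


@dataclass
class DetectionSignals:
    """Which detection methods matched for a single request (hypothetical type)."""
    signature_verified: bool = False
    ip_in_official_range: bool = False
    reverse_dns_matches: bool = False
    user_agent_matches: bool = False
    client_fingerprint_matches: bool = False


def confidence_level(s: DetectionSignals) -> str:
    """Return the confidence of the strongest matching method.

    Assumption: the strongest signal wins; weaker signals do not raise
    or lower the result.
    """
    if s.signature_verified:
        return "Highest"
    if s.ip_in_official_range or s.reverse_dns_matches:
        return "High"
    if s.user_agent_matches or s.client_fingerprint_matches:
        return "Medium"
    return "None"  # no method matched; this label is not part of the table


if __name__ == "__main__":
    print(confidence_level(DetectionSignals(user_agent_matches=True)))
```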
Automatic Updates
Our detection database is continuously updated:
- IP ranges refreshed hourly from official JSON endpoints
- New bot user agents added as they emerge
- Detection patterns improved based on real traffic analysis
- Agentic AI detection updated for new browser-based agents
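The hourly refresh in the first bullet could look roughly like the time-based cache below. This is a minimal sketch, not the service's actual updater: `RangeCache` is a hypothetical class, the fetch function is injected by the caller (the endpoint URLs and JSON layout are deliberately left out), and serving stale data when a refresh fails is a design choice assumed here.

```python
import time
from typing import Callable, Optional

REFRESH_INTERVAL_SECONDS = 3600  # "hourly", per the list above


class RangeCache:
    """Caches a list of CIDR strings and refetches once the data is an hour old."""

    def __init__(
        self,
        fetch: Callable[[], list[str]],
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch          # caller-supplied function that returns CIDR strings
        self._clock = clock          # injectable so the refresh logic can be tested
        self._ranges: list[str] = []
        self._fetched_at: Optional[float] = None

    def get(self) -> list[str]:
        now = self._clock()
        stale = (
            self._fetched_at is None
            or now - self._fetched_at >= REFRESH_INTERVAL_SECONDS
        )
        if stale:
            try:
                self._ranges = self._fetch()
                self._fetched_at = now
            except Exception:
                # Keep serving the last known ranges if a refresh fails;
                # re-raise only when nothing has been cached yet.
                if self._fetched_at is None:
                    raise
        return self._ranges
```

Injecting the clock and the fetch function keeps the refresh logic independent of any particular endpoint or HTTP client.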
Found a bot we're missing?
If you notice an AI bot that we're not detecting, please let us know! Contact us at support with the user agent and any other details you have.