# namestrace robots policy
# Source-of-truth for crawler access. See docs/site-protections.md §4 for
# the rationale behind blocking AI-training crawlers while keeping search
# and ad crawlers fully open.

# ────────────────────────────────────────────────────────────────────
# 1. Google's search + ad crawlers: fully allowed.
#    Explicit rules so they're never accidentally caught by future
#    wildcard changes.
# ────────────────────────────────────────────────────────────────────
User-agent: Googlebot
Allow: /

User-agent: Googlebot-Image
Allow: /

User-agent: Googlebot-News
Allow: /

User-agent: AdsBot-Google
Allow: /

User-agent: AdsBot-Google-Mobile
Allow: /

User-agent: Mediapartners-Google
Allow: /

# ────────────────────────────────────────────────────────────────────
# 2. All other search engines: fully allowed by default.
# ────────────────────────────────────────────────────────────────────
User-agent: *
Allow: /

# ────────────────────────────────────────────────────────────────────
# 3. AI / generative-model training crawlers: blocked.
#    Per /terms clause 4, site content may not be used to train,
#    fine-tune, benchmark, or evaluate machine-learning or generative
#    AI models. These specific User-agent groups override the wildcard
#    group above, because a crawler obeys the most specific group that
#    matches its name.
#    Note: Google-Extended is Google's AI-training control token; it is
#    distinct from Googlebot (search), AdsBot-Google (Google Ads), and
#    Mediapartners-Google (AdSense), which remain allowed.
# ────────────────────────────────────────────────────────────────────
User-agent: GPTBot
Disallow: /

User-agent: ChatGPT-User
Disallow: /

User-agent: OAI-SearchBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: Claude-Web
Disallow: /

User-agent: anthropic-ai
Disallow: /

User-agent: PerplexityBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: Bytespider
Disallow: /

User-agent: Amazonbot
Disallow: /

User-agent: FacebookBot
Disallow: /

User-agent: meta-externalagent
Disallow: /

User-agent: Meta-ExternalFetcher
Disallow: /

User-agent: Applebot-Extended
Disallow: /

User-agent: cohere-ai
Disallow: /

User-agent: cohere-training-data-crawler
Disallow: /

User-agent: Diffbot
Disallow: /

User-agent: ImagesiftBot
Disallow: /

User-agent: omgili
Disallow: /

Sitemap: https://namestrace.com/sitemap-index.xml