Free · No signup

Robots.txt Generator

Robots.txt is the file at your domain root that tells search engines and AI crawlers which pages they may and may not crawl. In 2026 it is also where you signal whether GPTBot, ClaudeBot, and Google-Extended may train on your content. The file is advisory, so it only affects crawlers that choose to honor it. Our generator ships 5 presets (Allow all, Block all, Block AI crawlers, WordPress, Custom), per-bot toggles for 24+ crawlers across 5 categories, a path-rules editor, and a live URL tester.

100% private: Your robots.txt is generated entirely in your browser. Nothing is sent to any server.

Live preview

User-agent: Googlebot
Allow: /

User-agent: Bingbot
Allow: /

User-agent: Slurp
Allow: /

User-agent: DuckDuckBot
Allow: /

User-agent: Baiduspider
Allow: /

User-agent: YandexBot
Allow: /

User-agent: GPTBot
Allow: /

User-agent: ChatGPT-User
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: anthropic-ai
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: CCBot
Allow: /

User-agent: Google-Extended
Allow: /

User-agent: OAI-SearchBot
Allow: /

User-agent: Perplexity-User
Allow: /

User-agent: Claude-SearchBot
Allow: /

User-agent: FacebookBot
Allow: /

User-agent: Twitterbot
Allow: /

User-agent: LinkedInBot
Allow: /

User-agent: AhrefsBot
Allow: /

User-agent: SemrushBot
Allow: /

User-agent: MJ12bot
Allow: /

User-agent: DotBot
Allow: /

How to use it

1. Enter your site URL

Optional. We auto-fill the Sitemap line from it.

2. Pick a preset

Allow all, Block all, Block AI crawlers, WordPress, or Custom.

3. Toggle per-bot visibility

In Custom mode, expand any category and click to Allow or Block.

4. Copy the generated robots.txt

Click Copy. Upload as robots.txt to your domain's top-level directory.
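
For readers who want to script the same steps, here is a minimal Python sketch of what a generator like this produces. It is not the tool's actual code: the function name build_robots_txt and the bot_rules mapping are hypothetical, and the Sitemap path /sitemap.xml is an assumption, since your sitemap may live elsewhere.

```python
def build_robots_txt(bot_rules, site_url=""):
    """Build robots.txt text from per-bot choices.

    bot_rules maps a user-agent token to True (allow) or False (block).
    Hypothetical helper for illustration, not the generator's own code.
    """
    parts = []
    for agent, allowed in bot_rules.items():
        directive = "Allow: /" if allowed else "Disallow: /"
        parts.append(f"User-agent: {agent}\n{directive}")
    if site_url:
        # Assumption: the sitemap sits at /sitemap.xml under the site root.
        parts.append(f"Sitemap: {site_url.rstrip('/')}/sitemap.xml")
    # Groups are separated by a blank line, as in the live preview.
    return "\n\n".join(parts) + "\n"


if __name__ == "__main__":
    print(build_robots_txt({"Googlebot": True, "GPTBot": False},
                           "https://example.com"))
```

Save the returned text as robots.txt and upload it to the top-level directory of your domain.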

Frequently Asked Questions

Everything you need to know about robots.txt, AI crawlers, and controlling who can access your site in 2026.

What is robots.txt?

robots.txt is a plain text file at the root of your domain (e.g. https://yoursite.com/robots.txt) that tells search engines and AI crawlers which pages they're allowed to crawl. It is part of the Robots Exclusion Protocol (REP), standardized as RFC 9309. Google's crawlers download and parse it before requesting any other URL. The file should be UTF-8 plain text; Google enforces a 500 KiB size limit and ignores any content beyond it.
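
The encoding and size constraints can be checked before you upload. The following is a rough Python sketch under stated assumptions: the helper name check_robots_file is hypothetical, and the 500 KiB figure is the Google limit quoted in the answer above.

```python
MAX_BYTES = 500 * 1024  # 500 KiB, the Google limit quoted above


def check_robots_file(data: bytes) -> list[str]:
    """Return a list of problems found in raw robots.txt bytes.

    Hypothetical helper: checks only encoding and size, not syntax.
    """
    problems = []
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        problems.append("not valid UTF-8")
    if len(data) > MAX_BYTES:
        problems.append(f"{len(data)} bytes exceeds the {MAX_BYTES}-byte limit")
    return problems
```

An empty list means the file passed both checks. Rules past the size limit would be ignored by Google, so a long file is worth trimming rather than truncating blindly.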

What's the difference between robots.txt and meta robots?

robots.txt is a file at your domain root that controls crawling: whether a bot is allowed to fetch a page at all. Meta robots is an HTML <meta name="robots"> tag inside a page's <head> that controls indexing and snippet display, typically with values like noindex, nofollow, noarchive, or nosnippet. A common mistake is using robots.txt to hide a page from Google: if another site links to that page, Google can still index the URL without crawling it. To actually prevent indexing, use a noindex meta tag and leave the page crawlable, because a crawler that is blocked by robots.txt never sees the tag.
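
A minimal Python sketch of how the two controls interact, using only the standard library. The class MetaRobots and the function noindex_is_visible are hypothetical names for illustration. Python's urllib.robotparser applies simple first-match prefix rules, which is an approximation of how any particular search engine evaluates robots.txt.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class MetaRobots(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta" and (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            self.directives.update(
                d.strip().lower() for d in content.split(",") if d.strip()
            )


def noindex_is_visible(robots_txt, agent, url, html):
    """True only if the crawler may fetch the page AND the page says noindex."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    if not rules.can_fetch(agent, url):
        # Blocked from crawling: the bot never downloads the HTML,
        # so it never reads the meta tag.
        return False
    parser = MetaRobots()
    parser.feed(html)
    return "noindex" in parser.directives
```

With a rule such as Disallow: /private/, the function returns False for any URL under /private/ even when the page carries noindex, which is the situation in which a linked URL can still end up indexed.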

How do I block AI crawlers from training on my content?

Add a User-agent: group for each AI training crawler you want to opt out of, followed by Disallow: /. The current 2026 tokens are GPTBot (OpenAI), ClaudeBot (Anthropic), Google-Extended (Google's Gemini training token), CCBot (Common Crawl), PerplexityBot, and Bytespider (ByteDance). Per Cloudflare's 2025-2026 traffic data, GPTBot and ClaudeBot are among the top 4 most-blocked crawlers. The 2026 nuance: blocking a vendor's training bot does NOT block its search bots, which use separate user-agent tokens (for example OAI-SearchBot and Claude-SearchBot), so most marketing sites use a 'block training, allow search' strategy.
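
The 'block training, allow search' strategy can be sketched and sanity-checked in Python. This is an illustration under assumptions, not the generator's own output: the function name is hypothetical, the list of training tokens is the one given in the answer above, and the final User-agent: * group that allows everyone else is one possible design choice. The check uses urllib.robotparser from the standard library.

```python
from urllib.robotparser import RobotFileParser

# Training crawler tokens as listed in the answer above.
TRAINING_BOTS = [
    "GPTBot", "ClaudeBot", "Google-Extended",
    "CCBot", "PerplexityBot", "Bytespider",
]


def block_training_allow_search():
    """Return robots.txt text that blocks training bots and allows the rest."""
    groups = [f"User-agent: {bot}\nDisallow: /" for bot in TRAINING_BOTS]
    # Assumption: every other crawler, search bots included, stays allowed.
    groups.append("User-agent: *\nAllow: /")
    return "\n\n".join(groups) + "\n"


def is_allowed(robots_txt, agent, url):
    """Check one agent and URL against the generated rules."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return rules.can_fetch(agent, url)


if __name__ == "__main__":
    text = block_training_allow_search()
    print(text)
    for agent in ("GPTBot", "OAI-SearchBot", "Googlebot"):
        print(agent, is_allowed(text, agent, "https://example.com/post"))
```

Each training token gets its own group, matching the layout of the live preview, so a single bot can be flipped back to Allow without touching the others.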

Is this tool safe to use?

Yes, completely safe. The entire tool runs in your browser as client-side JavaScript: your site URL, your bot choices, your path rules, and the final robots.txt are never sent to our servers, never logged, never stored, and never used to train any AI. You can verify this yourself in DevTools → Network: no outbound request carries your inputs.
