Understanding robots.txt syntax is essential for proper SEO. This reference covers the core directives (User-agent, Disallow, Allow, Sitemap, Crawl-delay), wildcard patterns, and the common mistakes that can accidentally block your entire site.
A single misplaced rule, such as Disallow: / under User-agent: *, tells every compliant crawler, including Googlebot, to stay away from the whole site. Knowing the format prevents that kind of costly mistake and helps you write precise crawl rules.
How to Write Robots.txt Syntax: Complete Reference Guide
- Start with a User-agent: line (a bot name, or * for all bots)
- Add Disallow: lines (paths to block)
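- Add Allow: lines (exceptions to a Disallow; for Google the order of lines does not matter, because the most specific, i.e. longest, matching rule is applied)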
- Use the * wildcard to match any sequence of characters
- Use $ to anchor a pattern to the end of the URL
- Add Sitemap: with full URL to your XML sitemap
- Separate rule groups with blank lines
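The matching behaviour behind these steps can be sketched in a few lines of Python. This is a minimal illustration, not a full parser: it assumes the longest-match precedence described in RFC 9309 and in Google's documentation, and the helper names pattern_to_regex and is_allowed, as well as the sample paths, are made up for the example.

```python
import re

def pattern_to_regex(pattern):
    """Translate one robots.txt path pattern into a compiled regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then join the pieces with '.*' for every '*'.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    # Without '$' the pattern is a prefix match; with '$' it must reach the end.
    return re.compile(body + (r"\Z" if anchored else ""))

def is_allowed(rules, path):
    """rules: (directive, pattern) pairs from one User-agent group."""
    best_len, allowed = -1, True  # no matching rule means the path is allowed
    for directive, pattern in rules:
        if not pattern:  # a blank Disallow: restricts nothing
            continue
        if pattern_to_regex(pattern).match(path):
            is_allow = directive.lower() == "allow"
            # The longest pattern wins; on a tie, Allow wins.
            if len(pattern) > best_len or (len(pattern) == best_len and is_allow):
                best_len, allowed = len(pattern), is_allow
    return allowed

# Hypothetical rule group: block /private/ except the press folder, and block PDFs.
rules = [
    ("Disallow", "/private/"),
    ("Allow", "/private/press/"),
    ("Disallow", "/*.pdf$"),
]

print(is_allowed(rules, "/private/notes.html"))         # False: /private/ matches
print(is_allowed(rules, "/private/press/launch.html"))  # True: the longer Allow wins
print(is_allowed(rules, "/docs/guide.pdf"))             # False: URL ends in .pdf
print(is_allowed(rules, "/docs/guide.pdf?download=1"))  # True: $ requires .pdf at the end
```

The last two calls show why $ matters: without it, the pattern /*.pdf would also match URLs that merely contain .pdf somewhere in the path or query string.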
Pro Tips
- When an Allow and a Disallow pattern of the same length both match a URL, Allow takes priority (Google applies the least restrictive rule)
- Blank Disallow: means allow everything (not block everything)
- Comments start with # and are ignored by bots
- Each User-agent group must have at least one Disallow or Allow
- Sitemap: can appear anywhere in the file (not tied to a User-agent)
- Maximum file size: 500 KiB (Google ignores content beyond this limit)
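Several of these tips are easier to see in a small parser. The following Python sketch is an illustration under simplifying assumptions, not how any particular crawler is implemented: the function name parse_robots, the bot name ExampleBot, and the example.com URLs are invented for the example, and anything after a # is treated as a comment.

```python
def parse_robots(text):
    """Return (groups, sitemaps) from the text of a robots.txt file.

    groups maps a lowercased user-agent token to a list of
    (directive, path) rules.
    """
    groups, sitemaps = {}, []
    current_agents, seen_rule = [], False
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # comments are ignored
        if not line or ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "sitemap":
            sitemaps.append(value)  # not tied to any User-agent group
        elif field == "user-agent":
            if seen_rule:  # a rule line closed the previous group
                current_agents, seen_rule = [], False
            current_agents.append(value.lower())
            groups.setdefault(value.lower(), [])
        elif field in ("allow", "disallow"):
            seen_rule = True
            if value:  # a blank Disallow: adds no restriction
                for agent in current_agents:
                    groups[agent].append((field, value))
    return groups, sitemaps

# Hypothetical file contents used only for this demonstration.
SAMPLE = """\
# Keep crawlers out of the admin area
Sitemap: https://example.com/sitemap.xml

User-agent: *
Disallow: /admin/   # trailing comment
Allow: /admin/help/

User-agent: ExampleBot
Disallow:
"""

groups, sitemaps = parse_robots(SAMPLE)
print(groups)
print(sitemaps)
```

In this sample the ExampleBot group ends up with an empty rule list, which is the "blank Disallow means allow everything" case, and the Sitemap line is picked up even though it appears before any User-agent line.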