Crawl Budget
The limited number of URLs a search engine bot will crawl on your site within a given timeframe, determined by crawl rate limit and crawl demand.
Crawl budget is the finite number of URLs that search engines (primarily Google) will crawl on your website within a given timeframe. It is determined by two main factors: crawl rate limit (how much crawling Googlebot can do without overloading your server; Google's current documentation calls this the crawl capacity limit) and crawl demand (how much Google wants to crawl your pages, based on their popularity and on how stale Google's stored copy is likely to be).
For most small to medium websites (under a few thousand URLs), crawl budget is rarely an issue, because Googlebot can comfortably crawl every page. The problem emerges on large sites: e-commerce sites with 100,000+ product pages, news sites with millions of articles, or catalogs with faceted navigation. When crawl budget is exhausted before all important pages are crawled, new pages take longer to get indexed, updates to existing pages are picked up slowly, and deep or orphaned pages may never be crawled.
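To get a feel for the scale involved, the arithmetic can be sketched in Python. This is a rough back-of-the-envelope model, not how Google schedules crawling: it assumes a constant number of requests per day and that every URL is fetched once before any is revisited. The function name and the figures (100,000 URLs, 5,000 requests per day, half of them wasted) are illustrative assumptions.

```python
def days_to_full_crawl(total_urls, requests_per_day, wasted_share=0.0):
    """Estimate days needed to fetch every important URL once.

    wasted_share is the fraction of daily requests spent on URLs you
    do not care about (parameter URLs, redirects, duplicates).
    All inputs are hypothetical; real crawl scheduling is not uniform.
    """
    if not 0.0 <= wasted_share < 1.0:
        raise ValueError("wasted_share must be in [0, 1)")
    if requests_per_day <= 0:
        raise ValueError("requests_per_day must be positive")
    useful_requests = requests_per_day * (1.0 - wasted_share)
    return total_urls / useful_requests


# 100,000 product pages at 5,000 useful requests per day
print(days_to_full_crawl(100_000, 5_000))
# Same site when half of the requests go to low-value URLs
print(days_to_full_crawl(100_000, 5_000, wasted_share=0.5))
```

Under these assumptions, halving the wasted share of requests has the same effect on coverage time as doubling the crawl rate, which is why reducing waste is usually the first lever to pull.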
Common causes of crawl budget waste: low-quality or thin pages (filtered/parameter URLs, session IDs, internal search results), redirect chains and loops, soft 404 pages, duplicate content with no canonical, pages that are blocked by robots.txt but still linked internally, and slow server response times (for example, TTFB consistently over 500ms). The fix is technical: use robots.txt to block crawling of non-value URLs, set canonicals on near-duplicates, fix redirect chains, improve server response times, and maintain a clean XML sitemap that lists only the URLs you want indexed. Note that a meta robots noindex tag controls indexing, not crawling: the page still has to be fetched for the tag to be seen, so noindex alone does not save crawl budget, and a noindex on a page blocked by robots.txt is never read.
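Two of these fixes can be checked offline before deploying. The following is a minimal Python sketch, not a definitive implementation: blocked_urls uses the standard library's urllib.robotparser to report which URLs a draft robots.txt would block, and resolve_redirect walks a redirect map to report chain length and loops. The example.com URLs, the rules, the redirect map, and both function names are hypothetical. urllib.robotparser does plain prefix matching and does not implement Googlebot's wildcard extensions, so the sketch sticks to simple path prefixes.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical draft rules: block internal search and cart URLs.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search
Disallow: /cart
"""


def blocked_urls(robots_txt, urls, user_agent="Googlebot"):
    """Return the URLs that the given robots.txt text disallows."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [u for u in urls if not parser.can_fetch(user_agent, u)]


def resolve_redirect(start, redirects, max_hops=10):
    """Follow a {source: target} redirect map starting at `start`.

    Returns (final_url, hops, is_loop). Stops after max_hops hops.
    """
    seen = [start]
    current = start
    while current in redirects and len(seen) <= max_hops:
        current = redirects[current]
        if current in seen:
            # We came back to a URL already visited: a redirect loop.
            return current, len(seen), True
        seen.append(current)
    return current, len(seen) - 1, False


urls = [
    "https://example.com/search?q=shoes",
    "https://example.com/products/blue-shoe",
    "https://example.com/cart",
]
print(blocked_urls(ROBOTS_TXT, urls))

redirects = {"/old": "/older", "/older": "/new"}
print(resolve_redirect("/old", redirects))
```

Any source whose hop count is greater than 1 is a chain; pointing it directly at the final URL saves one crawl request per extra hop.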
To monitor your crawl stats, use Google Search Console's Settings → Crawl Stats report. It charts total crawl requests, total download size, and average response time by day. If your daily requests stay flat while your page count keeps growing, you may have hit a crawl budget limit.
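The "flat requests, growing page count" check can be sketched as a small Python function. This is one possible heuristic under stated assumptions, not a Search Console feature: it compares the average of the first and last few data points of each series, and the function names, the window size, and the 5% and 10% thresholds are all arbitrary illustrative choices. The input lists are assumed to be numbers you have copied out of the Crawl Stats report and your own page inventory.

```python
from statistics import mean


def relative_change(series, window=2):
    """Relative change between the mean of the first and last `window` points."""
    if len(series) < 2 * window:
        raise ValueError("series too short for the chosen window")
    start = mean(series[:window])
    end = mean(series[-window:])
    return (end - start) / start


def crawl_is_lagging(daily_requests, page_counts,
                     flat_tolerance=0.05, growth_threshold=0.10):
    """True if crawl requests are roughly flat while the page count grows.

    flat_tolerance and growth_threshold are hypothetical cut-offs.
    """
    crawl_change = relative_change(daily_requests)
    page_change = relative_change(page_counts)
    return abs(crawl_change) <= flat_tolerance and page_change >= growth_threshold


# Hypothetical figures: requests hover around 1,000/day, pages grow ~20%.
requests = [1000, 1020, 990, 1010]
pages = [50_000, 55_000, 60_000, 66_000]
print(crawl_is_lagging(requests, pages))
```

A True result is only a prompt to investigate further, for example by checking which URL types are consuming the requests, since flat crawl activity can have other causes.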