Crawlability

SEO

Also: Crawl Access · Spider Access · Bot Accessibility

Whether search engines can access your pages

robots.txt controls what bots can visit

Blocked pages cannot rank

XML sitemap guides crawler prioritisation

Quick definition

Crawlability refers to how easily search engine bots (such as Google's Googlebot and Bing's Bingbot) can access and navigate the pages on your website. If a page cannot be crawled, its content cannot be indexed. If its content is not indexed, it cannot rank for that content. Crawlability problems often go unnoticed because the page looks fine in a browser while search engines cannot reach it.

How it varies across Australia

Crawlability issues are more common than most businesses realise. Sites rebuilt or migrated without careful technical planning frequently block critical pages through misconfigured robots.txt files, or keep them out of the index through noindex tags that were meant to be temporary.

robots.txt

A text file at your root domain (yoursite.com.au/robots.txt) that instructs crawlers which pages or sections to visit or skip. Misconfigured robots.txt files are one of the most common sources of accidental crawl blocking.
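
As a rough illustration of how a crawler reads these rules, the sketch below uses Python's standard-library robots.txt parser. The file contents, paths and the yoursite.com.au URLs are made up for the example. Python's parser applies the first matching rule, which may differ from Google's own matching when Allow and Disallow rules overlap, so treat it as an approximation rather than a replica of Googlebot.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the blocked paths are illustrative only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /cart/
"""

# The classic launch mistake: a single rule that blocks the whole site.
BLOCK_EVERYTHING = """\
User-agent: *
Disallow: /
"""


def allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "https://yoursite.com.au/services/",
        "https://yoursite.com.au/admin/users",
    ):
        print(url, allowed(ROBOTS_TXT, "Googlebot", url))
    print("homepage under blanket block:",
          allowed(BLOCK_EVERYTHING, "Googlebot", "https://yoursite.com.au/"))
```

Running the same check against a list of your most important URLs before launch is a cheap way to catch an accidental blanket block.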

Noindex Tag

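An HTML meta tag (a robots meta tag with the value noindex) or HTTP response header (X-Robots-Tag: noindex) that tells search engines not to include a page in their index. Useful for admin pages, duplicate content and search result pages. Dangerous if applied to pages you want to rank. Noindex controls indexing rather than crawling: a crawler has to fetch the page to see the directive, so it is ignored on pages that are also blocked in robots.txt.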
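
A minimal sketch of how you might detect a noindex directive in a page you have already downloaded, using only Python's standard library. The function and class names are hypothetical. It checks the robots and googlebot meta tags plus a plain X-Robots-Tag header, and it deliberately ignores the header form that is prefixed with a specific bot name.

```python
from html.parser import HTMLParser
from typing import Mapping, Optional


class _RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> and <meta name="googlebot">."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {key: (value or "") for key, value in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            content = attr.get("content", "")
            self.directives += [d.strip().lower() for d in content.split(",")]


def has_noindex(html: str, headers: Optional[Mapping[str, str]] = None) -> bool:
    """Return True if the HTML or the response headers carry a noindex directive."""
    parser = _RobotsMetaParser()
    parser.feed(html)
    parser.close()
    directives = list(parser.directives)
    for name, value in (headers or {}).items():
        if name.lower() == "x-robots-tag":
            directives += [d.strip().lower() for d in value.split(",")]
    # "none" is shorthand for "noindex, nofollow".
    return "noindex" in directives or "none" in directives


if __name__ == "__main__":
    staging_page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
    print(has_noindex(staging_page))
    print(has_noindex("<html></html>", {"X-Robots-Tag": "noindex"}))
```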

XML Sitemap

A file listing the URLs on your site that you want search engines to crawl and index. Acts as a roadmap for crawlers such as Googlebot. It is a hint rather than a guarantee, but a well-structured sitemap helps search engines discover and index new and updated content sooner.
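
For a sense of what the file contains, here is a small sketch that builds a sitemap in the standard sitemaps.org format with Python's standard library. The URLs and the date are placeholders, and the function name is hypothetical. Most CMS platforms and SEO plugins generate this file for you.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries) -> str:
    """Build sitemap XML from (url, lastmod) pairs.

    lastmod is a 'YYYY-MM-DD' string, or None to leave the element out.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        if lastmod:
            ET.SubElement(url, "lastmod").text = lastmod
    xml_bytes = ET.tostring(urlset, encoding="utf-8", xml_declaration=True)
    return xml_bytes.decode("utf-8")


if __name__ == "__main__":
    # Placeholder URLs and date, for illustration only.
    print(build_sitemap([
        ("https://yoursite.com.au/", "2024-01-15"),
        ("https://yoursite.com.au/services/", None),
    ]))
```

Only list URLs you actually want indexed. A sitemap full of redirected, blocked or deleted URLs sends crawlers to dead ends.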

Crawl Budget

The number of pages Googlebot is able and willing to crawl on your site within a given timeframe. For small sites, this is rarely a limiting factor. For large sites with tens of thousands of pages or more, wasting crawl budget on low-value pages can slow down the indexation of important ones.

What it actually means

Search engines work by sending automated programs called crawlers or spiders to visit web pages, read their content and follow links to other pages. Crawlability is the measure of how successfully those crawlers can access your site. Several things can break this: a robots.txt file that accidentally blocks important sections, pages behind login walls, JavaScript that hides content from bots, or broken internal links that prevent crawlers from reaching certain sections. Noindex meta tags left in place after a migration cause a closely related problem: the page can still be crawled, but it is kept out of the index. The dangerous thing about these problems is that you cannot see them in the browser. Your page looks perfectly normal. A customer can visit it. But if Googlebot cannot fetch or index it, the page will not appear in search results for its content.
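
To make the link-following idea concrete, the sketch below walks a tiny, made-up internal link graph the way a crawler would, starting from the homepage. Any page that is never reached has no crawl path from the homepage. The page paths and the function name are hypothetical, and a real crawler also discovers URLs through sitemaps and external links.

```python
from collections import deque


def reachable_pages(links: dict, start: str) -> set:
    """Breadth-first walk of internal links, returning every page reached from start."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


# Hypothetical site: each key is a page, each value the pages it links to.
SITE_LINKS = {
    "/": ["/services/", "/about/"],
    "/services/": ["/services/seo/"],
    "/about/": ["/"],
    "/services/seo/": [],
    "/old-landing-page/": ["/"],  # links out, but nothing links to it
}

if __name__ == "__main__":
    unreachable = set(SITE_LINKS) - reachable_pages(SITE_LINKS, "/")
    print(sorted(unreachable))  # ['/old-landing-page/']
```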

If Google cannot reach your page, every dollar you spend on SEO for that page is wasted.

The Australian context

Many Australian businesses go through site migrations or platform changes without a technical SEO checklist. The result is frequently a mix of orphaned pages (not linked to from anywhere), accidentally blocked sections and sitemap files pointing to old URLs that no longer exist. Australian web development agencies often handle design and development well but leave the technical SEO migration component to chance. Always request a pre- and post-migration crawl comparison before signing off on a site rebuild.
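
A crawl comparison can be as simple as diffing two URL lists. The sketch below assumes you have exported the crawled URLs before and after the migration (for example from a crawling tool) and have a map of intentional redirects. The function name, paths and redirect map are all hypothetical.

```python
from typing import Dict, Optional, Set


def compare_crawls(before: Set[str], after: Set[str],
                   redirects: Optional[Dict[str, str]] = None) -> dict:
    """Compare pre- and post-migration URL sets.

    A URL counts as lost if neither it nor its redirect target exists afterwards.
    """
    redirects = redirects or {}
    lost = [url for url in sorted(before)
            if redirects.get(url, url) not in after]
    added = sorted(after - before - set(redirects.values()))
    return {"lost": lost, "added": added}


if __name__ == "__main__":
    before = {"/", "/services/", "/contact-us/", "/blog/old-post/"}
    after = {"/", "/services/", "/contact/"}
    redirects = {"/contact-us/": "/contact/"}  # old URL -> new URL
    print(compare_crawls(before, after, redirects))
    # {'lost': ['/blog/old-post/'], 'added': []}
```

Every URL in the lost list needs a decision: restore it, redirect it, or confirm it was retired on purpose.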

Where people get this wrong

The most serious mistake is blocking the entire site in robots.txt during development and forgetting to update it before launch. The second is applying noindex tags to pages during a build for legitimate staging reasons and not removing them at go-live. Both are entirely preventable with a launch checklist that includes a Search Console check within 24 hours of any site change.

Common questions

How do I check if Google can crawl my pages?

Google Search Console is the primary tool. The URL Inspection tool shows whether a specific URL is indexed and what Googlebot last saw when it crawled it. The Pages report (formerly Coverage) shows site-wide indexation status. You can also check your robots.txt file directly at yoursite.com.au/robots.txt.

Does crawlability affect ranking speed?

Yes. If Googlebot can easily navigate your site through clear internal links and a well-structured sitemap, it tends to crawl and index new or updated content faster. Sites with poor internal linking, large numbers of broken links or slow server response times tend to be crawled less frequently.

Should I block any pages from being crawled?

Yes. Admin pages, user dashboards, cart and checkout pages, search result pages and duplicate URL variants (with parameters) are generally good candidates for blocking or noindexing. These pages rarely provide ranking value, and spending crawl budget on them is wasteful. Pick one method per URL: a page blocked in robots.txt is never fetched, so a noindex tag on it will not be seen.

Do JavaScript-heavy sites have crawlability issues?

They can. Google can render JavaScript, but rendering happens as a separate step after the initial fetch, so it can take longer than processing static HTML. Content that only loads after a user interaction (clicks, scrolls) may never be seen by Googlebot. Server-side rendering or static generation of important content avoids these issues.
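
One rough way to spot content that depends on JavaScript is to check whether a key phrase appears in the HTML your server sends, before any script runs. The sketch below does that on HTML you have already downloaded, ignoring anything inside script and style tags. The function name and sample markup are hypothetical, and a missing phrase is a prompt to investigate with Search Console's URL Inspection tool, not proof that Google cannot see the content.

```python
from html.parser import HTMLParser


class _VisibleText(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def in_initial_html(html: str, phrase: str) -> bool:
    """Return True if phrase appears in the page text delivered without running JavaScript."""
    parser = _VisibleText()
    parser.feed(html)
    parser.close()
    text = " ".join(" ".join(parser.chunks).split())
    return phrase.lower() in text.lower()


if __name__ == "__main__":
    static_page = "<html><body><h1>Emergency plumber Sydney</h1></body></html>"
    js_only_page = ('<html><body><div id="app"></div>'
                    '<script>render("Emergency plumber Sydney")</script></body></html>')
    print(in_initial_html(static_page, "emergency plumber"))   # True
    print(in_initial_html(js_only_page, "emergency plumber"))  # False
```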

About New Rebellion

New Rebellion is a marketing intelligence consultancy. We build tools, score Australian businesses on how their marketing actually performs, and publish Debrief every day. This dictionary is part of how we work in the open.
