February 9, 2026

Crawling is a crucial element of SEO and search engine visibility. Crawling is when one of Google’s bots visits your website to discover and read pages so they can be indexed and shown in the search results. It’s the very first step to ranking: if one of your pages isn’t crawled, it won’t appear in the SERPs (Search Engine Results Pages), even if it’s the best piece of content ever written.
However, Google doesn’t have unlimited resources and has to budget its crawling across millions of websites. For sites with hundreds of thousands of pages, Google will only allocate a limited amount of crawl resources, and technical issues can prevent pages from being discovered altogether. When Googlebot gets stuck, confused, or overwhelmed, it can delay or block the indexing of new or updated content, which can impact your rankings.
This is why making it clear which URLs you want crawled and which ones you want bots to avoid is crucial for improving indexability.
Recently, Google released a new episode of their ‘Search Off the Record’ podcast, exploring the topic and detailing the biggest challenges they faced with crawling in 2025. The findings aren’t necessarily groundbreaking, but they are interesting and important to know from a technical SEO perspective. So, here’s a breakdown of Google’s main crawling difficulties.
Faceted navigation was Google’s biggest crawling issue by a mile, which isn’t all that surprising.
Faceted navigation (when a site lets users filter products by colour, size, price etc.) is great for helping users navigate a site and find the products that best suit their needs. However, from a technical SEO perspective, it can become problematic, especially when it's not managed properly.
Every combination has a different URL, but the actual contents of the page remain nearly identical. For example, if you sort products by price, a new page with a unique URL is created, but all the products stay the same; they’re just displayed in a different order.
Google sees each new URL as a separate page and will try to crawl all of them. This means you end up with a bot that’s trying to crawl hundreds of nearly identical URLs, wasting the bot’s time and your site’s crawl budget.
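To see how quickly this adds up, here’s a minimal Python sketch that multiplies a handful of hypothetical facets (colour, size, sort order) into unique URLs for a single category page; the domain, path, and facet values are purely illustrative.

```python
# A minimal sketch: show how filter combinations multiply into unique URLs.
# The domain, path, facet names, and values below are hypothetical.
from itertools import product
from urllib.parse import urlencode

facets = {
    "colour": ["red", "blue", "green", "black"],
    "size": ["s", "m", "l", "xl"],
    "sort": ["price-asc", "price-desc", "newest"],
}

base = "https://www.example.com/shop/shoes?"
urls = [base + urlencode(dict(zip(facets, combo)))
        for combo in product(*facets.values())]

print(len(urls))  # 4 * 4 * 3 = 48 crawlable URLs for one category, all showing the same products
print(urls[0])    # e.g. https://www.example.com/shop/shoes?colour=red&size=s&sort=price-asc
```

Add a few more facets, or a realistic number of values per facet, and a single category can easily generate thousands of near-duplicate URLs.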
If filtering creates a significant number of unique URLs, you may want to block the filtered URLs via robots.txt* or use canonical tags to point them back to the main category page. Doing this will help Googlebot focus on the pages that actually matter and disregard the ones that don’t.
*It’s important to note that blocking content via robots.txt needs to be done with care. For example, Googlebot will not see a canonical tag if the page is disallowed in robots.txt.
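If you do go down the robots.txt route, it’s worth sanity-checking which URLs a rule would actually block before rolling it out. Below is a minimal sketch using Python’s built-in robots.txt parser; the /shop/filter/ path and example URLs are hypothetical, and note that this parser only does prefix matching, so it won’t understand Google’s ‘*’ wildcard syntax.

```python
# A minimal sketch: test which URLs a Disallow rule blocks, using Python's
# built-in robots.txt parser. The rule and URLs below are hypothetical.
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt blocking a filter path (simple prefix match only).
ROBOTS_TXT = """\
User-agent: *
Disallow: /shop/filter/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

test_urls = [
    "https://www.example.com/shop/shoes/",               # category page: should stay crawlable
    "https://www.example.com/shop/filter/colour-red/",   # faceted variation: should be blocked
]

for url in test_urls:
    allowed = parser.can_fetch("Googlebot", url)
    print(f"{url} -> {'crawlable' if allowed else 'blocked'}")
```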
Action parameters in URLs refer to the part of a URL after a ‘?’.
For example: ?sort=latest
The page content doesn’t change in any significant way and, to a human user, it looks the same. However, just like with faceted navigation, Googlebot will treat each variation as a separate URL that needs to be crawled.
Tracking parameters, such as session IDs, UTM tracking codes, or campaign tags, make URLs look different but don’t really change the content on a page.
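One way to gauge how much duplication these parameters create is to normalise a list of crawled URLs (for example, an export from Screaming Frog) by stripping the tracking parameters and seeing how many variations collapse into the same page. A rough Python sketch, assuming a small hand-picked set of parameter names (illustrative, not exhaustive):

```python
# A rough sketch: strip common tracking parameters so URLs that differ only by
# tracking codes collapse into the same underlying page. The parameter names
# below are common examples, not a complete list.
from urllib.parse import urlparse, parse_qsl, urlencode, urlunparse

TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "utm_term",
                   "utm_content", "sessionid", "fbclid", "gclid"}

def strip_tracking(url: str) -> str:
    """Return the URL with known tracking parameters removed."""
    parts = urlparse(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
            if k.lower() not in TRACKING_PARAMS]
    return urlunparse(parts._replace(query=urlencode(kept)))

urls = [
    "https://www.example.com/blog/post?utm_source=newsletter&utm_medium=email",
    "https://www.example.com/blog/post?sessionid=abc123",
    "https://www.example.com/blog/post",
]

# All three variations collapse to the same clean URL.
for url in urls:
    print(strip_tracking(url))
```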
Some plugins (especially on platforms like WordPress) create URLs automatically, often with no real content behind them.
Audit your plugin output. If a plugin is creating lots of pointless URLs, disable it or adjust your settings so those URLs are cleaned up or blocked.
Rare technical errors, such as double-encoded URLs (where two rounds of encoding have been applied to a string, e.g. %2520 instead of %20) or other strange formats, can also confuse a crawler.
Keep an eye out for non-ASCII characters during technical audits, and avoid things like capital letters, spaces, and trademark symbols in URLs. You can use tools like Google Search Console or Screaming Frog to spot ‘weird’ URLs.
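To make the %2520 example concrete, here’s a small Python sketch (with a hypothetical URL) showing how a second round of encoding turns %20 into %2520, along with a rough heuristic for flagging suspect URLs in a crawl export:

```python
# A minimal sketch: demonstrate double-encoding and flag URLs that look
# double-encoded. The example URL is hypothetical.
from urllib.parse import quote, unquote

original = "https://www.example.com/new arrivals"  # space in the path
encoded_once = quote(original, safe=":/")           # space -> %20
encoded_twice = quote(encoded_once, safe=":/")      # the '%' itself gets encoded: %20 -> %2520

print(encoded_once)   # .../new%20arrivals
print(encoded_twice)  # .../new%2520arrivals

def looks_double_encoded(url: str) -> bool:
    """Heuristic: decoding once still leaves percent-escapes behind."""
    decoded = unquote(url)
    return decoded != url and "%" in decoded and unquote(decoded) != decoded

for url in (encoded_once, encoded_twice):
    print(url, "->", "possibly double-encoded" if looks_double_encoded(url) else "looks fine")
```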
When a Googlebot isn’t wasting time on duplicate URLs or struggling to understand which pages it should be focusing on, new and updated pages will be discovered faster and you will notice a direct impact on your visibility. If you aren’t focused on efficient crawling, you may lose priority, which could affect how quickly and broadly your content appears in search.
You can use Google Search Console’s Crawl Stats to spot things like crawl rate drops or crawling errors, and it’s important to monitor these regularly so you can catch problems before they affect your site’s indexing.
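Alongside Crawl Stats, some teams also watch Googlebot activity directly in their server access logs. Here’s a rough sketch of that approach in Python; the log path and combined-log format are assumptions, and a proper audit would verify Googlebot hits via reverse DNS rather than trusting the user-agent string alone.

```python
# A rough sketch: count requests with a Googlebot user agent per day from a
# server access log, as a complement to Search Console's Crawl Stats.
# The log path and combined-log timestamp format are assumptions.
import re
from collections import Counter
from datetime import datetime

LOG_PATH = "access.log"  # hypothetical path; point this at your real access log
# Matches the date portion of a combined-log timestamp, e.g. [10/Feb/2026:13:55:36 +0000]
DATE_RE = re.compile(r"\[(\d{2}/\w{3}/\d{4})")

hits_per_day = Counter()
with open(LOG_PATH, encoding="utf-8", errors="replace") as log:
    for line in log:
        # Note: the user-agent string can be spoofed; a thorough audit should
        # confirm genuine Googlebot requests with a reverse-DNS lookup.
        if "Googlebot" not in line:
            continue
        match = DATE_RE.search(line)
        if match:
            day = datetime.strptime(match.group(1), "%d/%b/%Y").date()
            hits_per_day[day] += 1

for day, hits in sorted(hits_per_day.items()):
    print(day, hits)
```

A sudden drop in daily hits, or a spike concentrated on parameterised URLs, is usually worth investigating.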
To avoid common crawl problems, keep faceted navigation under control, handle URL parameters and tracking codes consistently, audit plugin-generated URLs, keep URL formats clean and correctly encoded, and monitor your crawl stats regularly.
Google’s crawling system is smarter than ever but if your site creates confusing or endless variations of the same page, Googlebots will waste time on them instead of your valuable content. Cleaning up URL structure and focusing bots on the pages that matter is one of the most impactful SEO moves you can make in 2026.
If you’d like expert help managing your crawl budget or guidance on any technical SEO issues, then get in touch with us today.