How to Fix Crawl Errors in Google Search Console
Crawl errors keep pages out of Google. How to read the Pages report, what each status means, and how to fix 404s, soft 404s, 5xx errors and not-indexed pages.
Crawl and indexing errors are the reason pages that should be in Google are missing from it — and the authoritative place to find and fix them is the Pages (Indexing) report in Google Search Console. That report lists every URL Google knows about, splits them into indexed and not indexed, and groups the not-indexed URLs by reason: 404 not found, server error, redirect, blocked by robots.txt, excluded by noindex, "crawled - currently not indexed," and more. The fix depends entirely on the reason, so the workflow is always the same: read the status, diagnose the cause, apply the right fix, then validate. This guide walks through each status, what causes it, and how to clear it.
It is a practical companion to the guide "What is technical SEO and how to audit it", focused specifically on the indexing side.
Start with the Pages report
Inside Search Console, the Pages report (under "Indexing") is your command centre. At the top it shows how many of your known URLs are indexed versus not indexed, with a trend over time. Below that, the "Why pages aren't indexed" table lists each reason and the count of URLs affected. Clicking a reason opens the list of specific URLs, and from there you can inspect any one of them in detail.
Two habits make the report far more useful. First, watch the trends, not just the totals: a sudden spike in "not found (404)" or "server error" points to something that broke recently, which is your highest priority. Second, distinguish errors from intentional exclusions. Many "not indexed" reasons cover pages you deliberately kept out (noindexed, canonicalised, blocked) and need no action; the report mixes those in with genuine problems, so read each reason before assuming it is a fault.
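One way to make the trend-watching habit concrete is to compare the latest count for each reason against a recent baseline. The sketch below assumes you have copied weekly counts out of the report by hand into a dictionary; the data shape, the `spike_ratio` threshold, and the function names are illustrative assumptions, not something Search Console exports in this form.

```python
from statistics import mean

# Reasons that are usually deliberate exclusions rather than faults.
INTENTIONAL = {
    "Excluded by 'noindex' tag",
    "Page with redirect",
    "Alternate page with proper canonical tag",
}

def find_spikes(history, spike_ratio=2.0):
    """Return reasons whose latest count is at least spike_ratio times
    the average of the earlier counts, largest jump first."""
    spikes = []
    for reason, counts in history.items():
        if len(counts) < 2:
            continue  # not enough history to call anything a trend
        baseline = mean(counts[:-1])
        latest = counts[-1]
        if baseline == 0:
            if latest > 0:
                spikes.append((reason, float(latest)))
        elif latest / baseline >= spike_ratio:
            spikes.append((reason, latest - baseline))
    spikes.sort(key=lambda item: item[1], reverse=True)
    return [reason for reason, _ in spikes]

def needs_review(reason):
    """True for reasons that are not normally intentional exclusions."""
    return reason not in INTENTIONAL
```

A reason that spikes and also passes `needs_review` is the natural first thing to open. Intentional exclusions still deserve a glance if their counts jump unexpectedly.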
The error statuses, their causes, and their fixes
Here is the reference table for the statuses you will encounter most, followed by detail on each.
| Status | What it means | Likely cause | Fix |
|---|---|---|---|
| Not found (404) | URL returns 404 | Page deleted, broken/changed URL, bad link | Restore the page, fix the link, or 301 to a relevant page |
| Soft 404 | Returns 200 but looks empty/error-like | Thin/empty page, removed product showing "not found" | Return a real 404/410, restore content, or 301 to an alternative |
| Server error (5xx) | Server failed to respond properly | Overload, bug, timeout, misconfiguration | Fix server/app issue; check logs and hosting |
| Blocked by robots.txt | A robots rule disallows crawling | Disallow rule covering the URL | Remove/narrow the rule if the page should be crawled |
| Excluded by 'noindex' | A noindex tag/header excludes it | Intentional or leftover noindex | Remove noindex if the page should be indexed |
| Page with redirect | URL redirects elsewhere | 301/302 in place | Usually fine; check the redirect is correct and not a chain |
| Alternate page with proper canonical tag | Canonical points to another URL | Duplicate consolidated onto canonical | Usually fine if intentional; verify the canonical is right |
| Crawled - currently not indexed | Crawled but not indexed | Quality/value/duplication | Improve content depth, uniqueness, internal links |
| Discovered - currently not indexed | Known but not yet crawled | Crawl budget / low priority | Improve internal linking, sitemap, server speed |
404 (not found)
A 404 means the URL does not exist. Some 404s are expected and fine — genuinely deleted pages with no good replacement should return 404 (or 410 "gone"). The ones to act on are 404s Google reaches via your own links or sitemap, or 404s on pages that should still exist. Diagnose by checking whether the page was deleted deliberately, then either restore it, fix the broken internal link or sitemap entry pointing at it, or 301-redirect it to the closest relevant live page if it has a worthy successor. Do not reflexively redirect every 404 to the homepage — Google may treat mass homepage redirects as soft 404s.
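A quick way to find 404s that Google reaches through your own sitemap is to check every sitemap entry's status code yourself. The following is a minimal sketch using only the Python standard library. It assumes a plain `urlset` sitemap (not a sitemap index), and the function names are hypothetical. Some servers answer `HEAD` requests differently from `GET`, so treat the results as a first pass.

```python
import urllib.error
import urllib.request
import xml.etree.ElementTree as ET

SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

def sitemap_urls(xml_text):
    """Extract every <loc> value from a urlset sitemap."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]

def http_status(url, timeout=10):
    """Return the final HTTP status for url (redirects are followed)."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code

def missing_pages(xml_text, get_status=http_status):
    """List sitemap URLs that answer 404 or 410."""
    return [
        url for url in sitemap_urls(xml_text)
        if get_status(url) in (404, 410)
    ]
```

Anything this returns should either come out of the sitemap, be restored, or be redirected to a relevant successor. The `get_status` parameter exists so the check can be run against recorded data instead of live requests.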
Soft 404
A soft 404 is sneaky: the page returns 200 OK, so it claims to be fine, but its content looks like an error or an empty page to Google. Classic causes are an empty internal search-results page, a removed product that displays "product not found" while still returning 200, or a near-blank placeholder. The fix depends on intent: if the page is genuinely gone, return a true 404 or 410; if it should have content, restore real content; if there is a good alternative, 301-redirect to it. The key is to make the status code honest about the page's actual state.
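Google does not publish how it classifies soft 404s, so any local check is a rough heuristic at best. The sketch below flags pages that return 200 but have very little visible text or contain a typical "not found" phrase. The phrase list, the `min_chars` threshold, and the function names are assumptions to tune for your own templates.

```python
import re

# Illustrative phrases; adjust to the wording your own templates use.
ERROR_PHRASES = ("not found", "no longer available", "no results", "0 results")

def visible_text(html):
    """Strip scripts, styles and tags, then collapse whitespace."""
    html = re.sub(r"(?is)<(script|style)\b.*?</\1>", " ", html)
    text = re.sub(r"(?s)<[^>]+>", " ", html)
    return re.sub(r"\s+", " ", text).strip()

def looks_like_soft_404(status, html, min_chars=200):
    """True when a 200 response is near-empty or reads like an error page."""
    if status != 200:
        return False  # an honest 404/410/5xx is not a soft 404
    text = visible_text(html).lower()
    return len(text) < min_chars or any(p in text for p in ERROR_PHRASES)
```

Running this across one URL per template is usually enough to find the template that returns 200 on empty states.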
Server error (5xx)
A 5xx means the server failed to deliver the page — through overload, a bug, a timeout, or a misconfiguration. These are urgent, because a server error during crawling can stop pages being indexed and, if widespread, signals an unhealthy site. Diagnose with your server logs and hosting dashboard: look for spikes in errors, slow responses, or resource limits being hit. If errors cluster when Googlebot crawls, your server may be struggling under crawl load, which ties directly into crawl budget. Fix the underlying application or infrastructure problem, then validate.
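To see whether errors cluster when Googlebot crawls, you can count 5xx responses per hour in your access log. This sketch assumes the common "combined" log format used by Apache and nginx; other formats need a different pattern. It matches on the user-agent string alone, which can be spoofed, so treat the counts as indicative rather than verified Googlebot traffic.

```python
import re
from collections import Counter

# Matches the timestamp, request, status and user agent of a combined-format line.
LOG_LINE = re.compile(
    r'\[(?P<day>[^:]+):(?P<hour>\d{2}):[^\]]*\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

def googlebot_5xx_by_hour(lines):
    """Count 5xx responses served to Googlebot, keyed by 'day hour:00'."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if not match:
            continue  # skip lines in an unexpected format
        if "Googlebot" in match["agent"] and match["status"].startswith("5"):
            counts[f'{match["day"]} {match["hour"]}:00'] += 1
    return counts
```

An hour with a high count that lines up with a spike in the Pages report is a good place to start reading the application logs.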
Blocked by robots.txt
This status means a rule in your robots.txt file is disallowing Google from crawling the URL. If the page should be indexed, the fix is to remove or narrow the disallow rule so Googlebot can reach it. If the block is intentional, no action is needed, but remember the crucial distinction: robots.txt blocks crawling, not indexing. A blocked URL can still appear in results (without a description) if it is linked from elsewhere, and Google cannot see a noindex tag on a page it is blocked from crawling. To reliably keep a reachable page out of the index, allow crawling and use a noindex tag instead. The full rules are covered in the guide "How to write a robots.txt file".
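Before editing a rule, it can help to test which of your important URLs the current file blocks. Here is a minimal sketch using Python's standard `urllib.robotparser`; the function name is hypothetical. One caveat: this parser applies rules in file order and does not support the `*` and `$` wildcards, whereas Google uses the most specific matching rule, so results can differ on files that use wildcards or mix overlapping Allow and Disallow rules.

```python
from urllib.robotparser import RobotFileParser

def blocked_urls(robots_txt, urls, user_agent="Googlebot"):
    """Return the URLs that the given robots.txt text disallows for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]
```

For example, with a file containing `Disallow: /private/` under `User-agent: *`, a URL under `/private/` is reported as blocked while a blog post is not.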
Excluded by 'noindex'
The page carries a noindex directive (a meta tag or HTTP header) telling Google not to index it. Often this is intentional — thank-you pages, internal search results, staging content. The problem case is a leftover or accidental noindex, sometimes left over from a site launch or copied across a template, silently keeping important pages out of the index. If the page should be indexed, remove the noindex and validate. It is worth periodically auditing for stray noindex tags, because they are a common and easily overlooked cause of missing pages.
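A periodic audit for stray noindex directives can be as simple as checking both places the directive can live: the robots meta tag and the `X-Robots-Tag` response header. The sketch below uses only the standard library; the class and function names are hypothetical. It does not handle user-agent-prefixed header values such as `googlebot: noindex`.

```python
from html.parser import HTMLParser

class RobotsMetaParser(HTMLParser):
    """Collects the content of <meta name="robots"> and <meta name="googlebot">."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = {name: (value or "") for name, value in attrs}
        if attributes.get("name", "").lower() in ("robots", "googlebot"):
            self.directives.append(attributes.get("content", "").lower())

def has_noindex(html, headers=None):
    """True if the page's meta tags or X-Robots-Tag header ask not to be indexed."""
    parser = RobotsMetaParser()
    parser.feed(html)
    values = list(parser.directives)
    for name, value in (headers or {}).items():
        if name.lower() == "x-robots-tag":
            values.append(value.lower())
    tokens = {token.strip() for value in values for token in value.split(",")}
    # "none" is shorthand for "noindex, nofollow".
    return "noindex" in tokens or "none" in tokens
```

Run it against the pages you most want indexed; any `True` there is worth investigating immediately.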
Page with redirect and alternate page with canonical
These two are usually not errors at all. "Page with redirect" means the URL 301s or 302s elsewhere, which is expected for moved pages. Just confirm the redirect is correct and not part of a chain (see "What is a 301 redirect and when to use it"). "Alternate page with proper canonical tag" means Google consolidated a duplicate onto its canonical, which is the system working as intended. Act only if the redirect target or canonical is wrong.
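Checking for chains is easy to sketch once you have your redirects as a source-to-target mapping, for instance exported from your server configuration or a crawl. The mapping format and function names below are assumptions for illustration.

```python
def redirect_path(start, redirects, max_hops=10):
    """Follow redirects from start and return every URL visited, in order.
    Stops at a loop or after max_hops hops."""
    path = [start]
    while path[-1] in redirects and len(path) <= max_hops:
        target = redirects[path[-1]]
        looped = target in path
        path.append(target)
        if looped:
            break  # the target was already visited, so this is a redirect loop
    return path

def is_chain(start, redirects):
    """True when reaching the final URL takes more than one hop."""
    return len(redirect_path(start, redirects)) > 2
```

Where `is_chain` returns true, pointing the first URL straight at the final destination removes the intermediate hops.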
Crawled - currently not indexed
This one frustrates people because there is no obvious technical fault. It means Google crawled the page, read it, and chose not to index it. It is a quality and value signal, not an error. Common causes are thin or duplicate content, low perceived usefulness, or simply that on a large site Google is selective about what it indexes. The fix is editorial as much as technical: improve the page's depth and uniqueness, make sure it offers something the rest of the web does not, strengthen internal links pointing to it so its importance is clear, and remove or consolidate genuinely low-value pages rather than hoping they get indexed. If a page truly deserves to be indexed and is not, the honest question is usually whether it is good enough yet.
Discovered - currently not indexed
Here Google knows the URL exists but has not crawled it yet. It is often a crawl-budget and prioritisation signal: Google has queued the page but judged it low priority, or is throttling crawl on a large or slow site. The levers are mostly about making the page easier to reach and more clearly worth crawling: improve internal linking so it is well-connected, ensure it is in your sitemap, speed up your server so Google can crawl more in less time, and reduce low-value URLs competing for attention. On small, healthy sites this status usually resolves on its own; on large sites it points to crawl-budget work.
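Internal linking is the lever you can measure most directly. Given a link graph from a crawl (each page mapped to the pages it links to), the sketch below counts inbound internal links and lists the URLs with too few. The graph format, the `min_inlinks` threshold, and the function names are illustrative assumptions.

```python
from collections import Counter

def inlink_counts(link_graph):
    """Count how many distinct pages link to each URL (self-links ignored)."""
    counts = Counter()
    for source, targets in link_graph.items():
        for target in set(targets):
            if target != source:
                counts[target] += 1
    return counts

def weakly_linked(candidate_urls, link_graph, min_inlinks=2):
    """Return candidate URLs with fewer than min_inlinks inbound links,
    least-linked first."""
    counts = inlink_counts(link_graph)
    weak = [url for url in candidate_urls if counts[url] < min_inlinks]
    return sorted(weak, key=lambda url: (counts[url], url))
```

Feeding in the URLs listed under "Discovered - currently not indexed" as `candidate_urls` shows which of them are orphaned or nearly so.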
Diagnosing a single URL: the URL Inspection tool
When you need detail on one specific URL, the URL Inspection tool (the search bar at the top of Search Console) is indispensable. Paste in a URL and it tells you whether the page is indexed, when Google last crawled it, whether crawling and indexing were allowed, which canonical Google selected versus the one you declared, and whether any indexing issues apply. Crucially, it shows you Google's view rather than yours, which often reveals the real problem: for example, that Google selected a different canonical, or fetched a different version of the page than you expected.
The tool also offers "Test Live URL," which fetches the page in real time so you can confirm a fix is live before asking Google to re-index, and a "Request Indexing" button that submits the URL to Google's crawl queue. Request indexing sparingly: it is subject to a quota and is meant for individual important URLs after a fix. For large numbers of pages, rely on your sitemap and internal links instead.
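The same inspection data is also available programmatically through the Search Console URL Inspection API. The sketch below does not call the API; it only summarises a response that has already been fetched and parsed into a dictionary. The field names (`indexStatusResult`, `coverageState`, `lastCrawlTime`, `userCanonical`, `googleCanonical`) follow the API's response format as I understand it, so verify them against the current reference before relying on this.

```python
def summarise_inspection(result):
    """Pull the indexing essentials out of a URL Inspection API response dict."""
    status = result.get("inspectionResult", {}).get("indexStatusResult", {})
    declared = status.get("userCanonical")
    selected = status.get("googleCanonical")
    return {
        "coverage": status.get("coverageState"),
        "last_crawl": status.get("lastCrawlTime"),
        # A mismatch means Google chose a different canonical than you declared.
        "canonical_mismatch": bool(declared and selected and declared != selected),
    }
```

A `canonical_mismatch` of `True` is the programmatic equivalent of spotting two different canonicals side by side in the tool.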
Fixing at scale and validating
For errors affecting many URLs, fix the root cause once rather than URL by URL. A spike of 404s usually traces back to a single broken link pattern, a botched migration, or a removed section; a wave of soft 404s often comes from one template returning 200 on empty states; a surge of 5xx errors points at one server or app issue. Find the common cause, fix it everywhere, and the whole group clears together.
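Finding the common cause often starts with grouping the affected URLs by path. A minimal sketch, assuming you have exported the affected URLs from the report as a list; the function name and the `depth` parameter are hypothetical.

```python
from collections import Counter
from urllib.parse import urlsplit

def common_prefixes(urls, depth=1):
    """Group URLs by their first `depth` path segments, most frequent first."""
    counts = Counter()
    for url in urls:
        segments = [s for s in urlsplit(url).path.split("/") if s]
        prefix = "/" + "/".join(segments[:depth])
        counts[prefix] += 1
    return counts.most_common()
```

If most of a 404 spike falls under one prefix, that section or template is the likely root cause.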
Then use the report's "Validate Fix" button. Clicking it asks Google to re-crawl and re-check the affected URLs over the following days. Google marks URLs as passed as it confirms the fix and updates the report when the validation completes, giving you a clear pass or fail. This is the correct way to signal that an issue is resolved, rather than waiting for the next routine crawl. While validation runs, avoid changing the affected pages again, which can reset the process.
An SEO crawler or a broad site audit — StackOptic among them — complements Search Console by catching many of these issues before Google does: broken links, redirect chains, pages returning the wrong status code, stray noindex tags, and orphaned pages. Used together, the crawler finds problems proactively and Search Console confirms how Google actually treats your URLs.
A crawl-error fix checklist
- Open the Pages report and sort reasons by URL count and recent trend.
- Separate genuine errors from intentional exclusions (noindex, canonical, redirect).
- Fix 404s by restoring, re-linking, or 301-redirecting to a relevant page.
- Make soft 404s honest: real 404/410, restored content, or a redirect.
- Treat 5xx errors as urgent; diagnose with server logs and hosting.
- Remove accidental robots.txt blocks and stray noindex tags.
- For "crawled/discovered - not indexed," improve content quality and internal links.
- Use URL Inspection to diagnose individual URLs and confirm live fixes.
- Fix root causes once, then click "Validate Fix" and let Google re-check.
Where to start
Begin with whatever is both large and recent: a sudden spike in 404s or 5xx errors almost always means something broke, and fixing it recovers pages fast. Trace each spike to its root cause and fix it once. Next, clear the honest mistakes — accidental noindex tags and robots.txt blocks on pages that should be indexed, and soft 404s returning the wrong status. Only then turn to the slower, editorial work behind "crawled - currently not indexed," which is about making pages genuinely worth indexing and well-linked. Validate each fix in the report as you go. That order — recent breakages first, accidental exclusions next, quality work last — recovers the most visibility in the least time.
Go deeper
- The full technical picture: what is technical SEO and how to audit it.
- The crawl-budget angle behind "discovered - not indexed": what is crawl budget and how to optimize it.
- Get redirects right: what is a 301 redirect and when to use it.
- Control crawling correctly: how to write a robots.txt file.
Want broken links, bad status codes and indexing blockers flagged before Google finds them? Analyse any URL with StackOptic — one report covering technical SEO, performance and more, free, no sign-up.
Frequently asked questions
Where do I find crawl errors in Google Search Console?
Open the Pages report under the Indexing section in Google Search Console. It splits your URLs into indexed and not-indexed groups and lists the reasons pages are not indexed — such as 404 not found, server error, redirect, blocked by robots.txt, excluded by noindex, and 'crawled - currently not indexed.' Click any reason to see the affected URLs and start diagnosing. For a single URL, use the URL Inspection tool at the top.
What does 'Crawled - currently not indexed' mean?
It means Google crawled the page and read its content but chose not to index it, at least for now. It is not a technical error; it is usually a quality or value signal. Common causes are thin or duplicate content, low perceived value, or a large site where Google is selective. The fix is to improve the page's depth and uniqueness, strengthen internal links to it, and make sure it genuinely deserves a place in the index.
What is a soft 404 and how do I fix it?
A soft 404 is when a page returns a '200 OK' status but its content looks like an error or empty page to Google — for example an empty search-results page, a removed product showing 'not found,' or a near-blank page. Fix it by either returning a proper 404 or 410 status if the page really is gone, restoring real content if the page should exist, or 301-redirecting to a relevant live page when there is a good alternative.
How do I fix 'blocked by robots.txt' in Search Console?
This status means a rule in your robots.txt file is disallowing Google from crawling the URL. If the page should be indexed, remove or narrow the blocking rule so Googlebot can reach it, then validate the fix. If the block is intentional, no action is needed, but remember that robots.txt blocks crawling, not indexing, so use a noindex tag instead when you want to keep a reachable page out of the index.
What does the 'Validate Fix' button do?
After you address the cause of an error, clicking 'Validate Fix' in the Pages report asks Google to re-crawl and re-check the affected URLs. Google runs the validation over days, marking URLs as passed as it confirms the fix, and updates the report when done. It is the proper way to tell Google you have resolved an issue, rather than waiting for the next routine crawl, and it gives you a clear pass or fail result.