May 13, 2022
by Sofia

Why are URLs being displayed as non-indexable in crawl results?

Indexability is one of the most important characteristics to check when analyzing crawl results. We’ve created a separate segment to make working with indexable pages easier. You can use this segment in all JetOctopus reports.


Why the page is not displayed in the segment “Indexable”

An indexable page returns a 200 response code and can be crawled and indexed by a search engine. All pages that meet these requirements are displayed in the segment “Indexable”.

The page will not be displayed in the segment “Indexable” if:

  • it does not respond with a 200 status code (3xx, 4xx and 5xx URLs are all non-indexable pages);
  • the page is redirected;
  • the page is blocked by the robots.txt file;
  • the page contains a canonical tag pointing to another page;
  • the HTML page contains a “noindex” meta tag;
  • the page returns “none” or “noindex” in the X-Robots-Tag HTTP header.

Conversely, a page that returns a 200 status code, is not prohibited from indexing and does not contain any other indexation rules is treated as indexable.
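
The conditions above can be sketched as a single check. This is only an illustration of the rules as described in this article, not JetOctopus's actual implementation; the `CrawledPage` record, its field names and the reason strings are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CrawledPage:
    # Hypothetical record of one crawled URL; field names are illustrative only.
    url: str
    status_code: int
    blocked_by_robots: bool = False
    canonical_url: Optional[str] = None  # None means no canonical tag was found
    meta_robots: str = ""                # content of <meta name="robots">, if any
    x_robots_tag: str = ""               # value of the X-Robots-Tag header, if any


def _directives(value: str) -> set:
    """Split a comma-separated robots value into lowercase directives."""
    return {part.strip().lower() for part in value.split(",") if part.strip()}


def non_indexable_reasons(page: CrawledPage) -> List[str]:
    """Return every failed check; an empty list means the page counts as indexable."""
    reasons = []
    if page.status_code != 200:
        reasons.append(f"status code {page.status_code}")
    if page.blocked_by_robots:
        reasons.append("blocked by robots.txt")
    if page.canonical_url is not None and page.canonical_url != page.url:
        reasons.append("canonical points to another page")
    if "noindex" in _directives(page.meta_robots):
        reasons.append("noindex meta tag")
    if _directives(page.x_robots_tag) & {"noindex", "none"}:
        reasons.append("noindex or none in X-Robots-Tag")
    return reasons
```

A page can fail several checks at once, so the sketch returns a list rather than stopping at the first reason.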

In each of the cases listed above, search engines either cannot crawl the page, cannot read its indexability status, or find a noindex directive in the HTML or the HTTP header. To find out why a page is non-indexable, go to the “Indexation” report, which includes a chart showing the number of non-indexable pages by reason.


We also recommend working with the data tables to see the reason the page is non-indexable.

What does “Is indexable” mean in Data Tables

The “Is indexable” filter in Data Tables is the same as the “Indexable” segment. This filter excludes pages with non-200 status codes, pages with a canonical tag pointing to another page, pages with a noindex meta tag or a noindex/none directive in the X-Robots-Tag, and pages blocked by the robots.txt file.


To find the reasons for a non-indexable status, use the additional filters in the “Indexation” block:

  • Is Robots.txt indexable – pages that are not blocked from crawling by search engines in the robots.txt file;
  • Is meta tag indexable – pages that contain the meta tag “index” in HTML;
  • X-Robots Header Index – pages without an X-Robots-Tag that prohibits indexing.
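
To make the three signals behind these filters concrete, here is a minimal sketch that evaluates each one separately using only the Python standard library. The function names are hypothetical, and the sketch treats a missing meta tag as "indexing allowed", which is an assumption about how such a filter might behave rather than a description of JetOctopus itself.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


def robots_allows(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """True if the given robots.txt text does not block the URL for this user agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


class MetaRobotsParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta" and (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            self.directives |= {d.strip().lower() for d in content.split(",") if d.strip()}


def meta_allows_indexing(html: str) -> bool:
    """True unless a meta robots tag contains "noindex"."""
    parser = MetaRobotsParser()
    parser.feed(html)
    return "noindex" not in parser.directives


def header_allows_indexing(x_robots_tag: str) -> bool:
    """True unless the header value contains "noindex" or "none".

    Simplification: user-agent prefixes such as "googlebot: noindex" are not handled.
    """
    directives = {d.strip().lower() for d in x_robots_tag.split(",")}
    return not (directives & {"noindex", "none"})
```

Keeping the checks separate mirrors the filters: each one answers a single question, so a page that fails can be traced to the specific signal that blocks it.
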

To find out whether the canonical tag is the reason for non-indexability, use the filters in the “Canonicals” block.

  • Is Canonical Page – select “No” to see pages that contain canonicals pointing to other pages.
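
As a rough illustration of what a canonical check involves, the sketch below extracts the canonical link from HTML, resolves it against the page URL and compares the two. The names `CanonicalParser` and `is_canonical_page` are hypothetical, and treating a page without a canonical tag as canonical is an assumption of this example, not a documented JetOctopus rule.

```python
from html.parser import HTMLParser
from typing import Optional
from urllib.parse import urldefrag, urljoin


class CanonicalParser(HTMLParser):
    """Remember the href of the first <link rel="canonical"> tag."""

    def __init__(self):
        super().__init__()
        self.href: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rels and self.href is None:
            self.href = attr.get("href")


def is_canonical_page(page_url: str, html: str) -> bool:
    """True when there is no canonical tag or the tag points back to the page itself."""
    parser = CanonicalParser()
    parser.feed(html)
    if not parser.href:
        return True
    target = urljoin(page_url, parser.href)  # resolve relative hrefs
    # Compare without fragments, which do not identify a separate page.
    return urldefrag(target).url == urldefrag(page_url).url
```

Real crawlers usually normalize URLs further (trailing slashes, letter case, parameter order); this sketch compares them almost literally.
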

To analyze pages that are non-indexable due to a non-200 status, select the needed value in the status code filter.


You can also add the “Non-Indexable Reason” column to the data tables. JetOctopus shows the reason for non-indexability in this column.


Of course, you can export your data to CSV, Excel or Google Spreadsheets.
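
Once the data is exported, a few lines of Python are enough to tally pages by reason. This is a sketch under assumptions: the column header `Non-Indexable Reason` is taken from the column name mentioned above, but the exact header and the reason values in a real export may differ, so check your own file first.

```python
import csv
import io
from collections import Counter


def count_reasons(csv_text: str, column: str = "Non-Indexable Reason") -> Counter:
    """Count rows per non-indexability reason; rows with an empty value are skipped."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row[column] for row in reader if row.get(column))


# Hypothetical export content, used only to show the shape of the input.
sample = (
    "URL,Non-Indexable Reason\n"
    "https://example.com/a,noindex\n"
    "https://example.com/b,\n"
    "https://example.com/c,noindex\n"
    "https://example.com/d,robots.txt\n"
)
counts = count_reasons(sample)
```

For a file on disk, read its text first (for example with `open(path, encoding="utf-8").read()`) and pass it to `count_reasons`.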

About Sofia
Technical SEO specialist. Sofia has almost 10 years of experience, including the last 5 years in JavaScript SEO. She is convinced that SEO is a very technical part of digital marketing, and that you can't do effective SEO without logs and in-depth data analysis.
