May 23, 2022
by Sofia

How to find orphan pages with JetOctopus

Orphan pages are URLs that a crawler cannot reach by following links on your website. They are not part of the internal linking structure, so users browsing the site will not be able to find them.

Search engines, however, can still reach orphan pages through external links or from their own index. For example, if an orphan page is outdated but still returns a 200 response code, search engines will keep scanning it regularly. Orphan pages can also be submitted in sitemaps, which is a common problem with sitemaps generated automatically by a CMS.
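The definition above can be sketched as a set difference: every URL known from other sources (sitemaps, search data, external links) minus every URL reachable through internal links. The snippet below is only an illustration with made-up URLs and a hypothetical `links` dictionary; it does not show how JetOctopus works internally.

```python
from collections import deque


def reachable_urls(start_url, links):
    """Return every URL reachable from start_url by following internal links."""
    seen = {start_url}
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        for target in links.get(url, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


def find_orphans(start_url, links, known_urls):
    """URLs known from other sources that internal links never lead to."""
    return set(known_urls) - reachable_urls(start_url, links)


# Hypothetical internal link graph: page -> pages it links to.
links = {
    "https://example.com/": ["https://example.com/shop", "https://example.com/blog"],
    "https://example.com/shop": ["https://example.com/"],
    # This page links out, but no other page links to it.
    "https://example.com/old-promo": ["https://example.com/"],
}

# Hypothetical list of URLs taken from a sitemap.
sitemap_urls = [
    "https://example.com/",
    "https://example.com/shop",
    "https://example.com/blog",
    "https://example.com/old-promo",
]

orphans = find_orphans("https://example.com/", links, sitemap_urls)
print(sorted(orphans))
```

Note that a page can link to other pages and still be an orphan: what matters is whether any internal link points to it.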

Orphan pages can be indexed and displayed in search results. However, they are not good for your website. First, users will not be able to reach these pages while browsing your website. Second, a large number of orphan pages hurts your crawl budget, because search engines scan these pages regularly even though they receive no internal PageRank. The lack of internal PageRank also reduces the chances of orphaned URLs ranking high in search results.

To harness the potential of orphan pages and get organic traffic, you need to add internal links to them and include these URLs in the site structure.

If orphan pages are useless, you should block them and remove them from your sitemaps. We also recommend redirecting such orphan URLs.

More information: SEO case study. How TemplateMonster found 3 million orphaned pages that were regularly visited by Google search bot.

With JetOctopus, you can easily find orphan pages.

Start a crawl. To find orphan pages, you need to give the JetOctopus crawler additional URL sources.

The most important source of orphan pages is sitemaps. Activate the “Process sitemaps” checkbox in the basic settings.

Then add a list of sitemaps in the advanced settings.
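To get a feel for what a crawler reads from a sitemap, here is a minimal sketch that extracts the `<loc>` entries from a standard XML sitemap using Python's standard library. The sample XML and the function name `urls_from_sitemap` are assumptions made for illustration; JetOctopus does this for you once the sitemaps are added in the settings.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def urls_from_sitemap(xml_text):
    """Return the URLs listed in the <loc> elements of a sitemap."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


# Hypothetical sitemap content.
sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/old-promo</loc></url>
</urlset>"""

sitemap_urls = urls_from_sitemap(sample)
print(sitemap_urls)
```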

We recommend using one more source of orphan pages: data from Google Search Console. Go to the “Google Search Console” menu and click the “+ Add Google Account” button.

The next step is to select the account with permission to access GSC. Then give JetOctopus permission to access your data. 

In the list of websites, select the ones you want to crawl. If you select a domain, JetOctopus will scan all URLs from all subdomains that are connected to Search Console.

Start the crawl. After it is finished, you can analyze the orphan pages in a dedicated report. As you can see, it is very simple to find orphan pages with JetOctopus.

To analyze which pages are orphaned, go to the “Sitemap” report. Here you can see convenient charts that allow you to compare the number of orphan pages with interlinked URLs.

Go to the “Orphan Pages” report to see the full list of orphan URLs.

Here you can filter orphan pages by source. For example, you can show only the pages that JetOctopus found in the sitemaps.
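If you export the list and want to reproduce this kind of source filter yourself, the idea can be sketched as follows. The record layout (`url`, `sources`) and the source names are hypothetical and do not reflect the actual JetOctopus export format.

```python
# Hypothetical orphan page records, each tagged with where the URL was found.
orphan_pages = [
    {"url": "https://example.com/old-promo", "sources": {"sitemap"}},
    {"url": "https://example.com/landing-2019", "sources": {"gsc"}},
    {"url": "https://example.com/archive", "sources": {"sitemap", "gsc"}},
]


def filter_by_source(pages, source):
    """Return the URLs of orphan pages that were found in the given source."""
    return [page["url"] for page in pages if source in page["sources"]]


from_sitemaps = filter_by_source(orphan_pages, "sitemap")
print(from_sitemaps)
```

A page can come from more than one source, which is why each record keeps a set of sources rather than a single value.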

By the way, you can also filter orphan pages in “Data Tables”. Just select the “Is orphaned” filter and click “Apply”.

Please note that with a partial crawl the list of orphan pages may be inaccurate: pages that are linked only from parts of the site the crawler did not reach can be reported as orphans. We recommend scanning the whole website.

More information is in the article “Why Is Partial Crawling Bad for Big Websites? How Does It Impact SEO?”.
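The effect of a partial crawl can be illustrated with a small sketch. It uses a toy link graph with made-up paths and a hypothetical `max_pages` limit; real crawlers apply limits differently, so treat it only as a demonstration of the principle.

```python
from collections import deque


def crawl(start_url, links, max_pages=None):
    """Breadth-first crawl that stops once max_pages URLs are discovered."""
    seen = {start_url}
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        for target in links.get(url, []):
            if max_pages is not None and len(seen) >= max_pages:
                return seen  # page limit reached: the crawl is partial
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


# Hypothetical site: "/c" is linked only from "/a"; "/orphan" has no inlinks.
links = {"/": ["/a", "/b"], "/a": ["/c"], "/b": []}
sitemap_urls = {"/", "/a", "/b", "/c", "/orphan"}

full_orphans = sitemap_urls - crawl("/", links)
partial_orphans = sitemap_urls - crawl("/", links, max_pages=3)

print(sorted(full_orphans))
print(sorted(partial_orphans))
```

With the full crawl only `/orphan` is reported. With the limit of three pages, `/c` is reported as well, although it is properly linked from `/a`.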

About Sofia
Technical SEO specialist. Sofia has almost 10 years of experience, including the last 5 years in JavaScript SEO. She is convinced that SEO is a very technical part of digital marketing, and that you can't do effective SEO without logs and in-depth data analysis.


