Orphan pages are URLs that a crawler cannot reach by following links on your website, because they are not included in the internal linking structure. As a consequence, users browsing the site will not be able to find these pages.
However, search engines can still reach orphan pages through external links or through their own index: for example, if an outdated orphan page still returns a 200 response code, search engines will keep scanning it regularly. Orphan pages can also be submitted in sitemaps, which is a common problem with sitemaps generated automatically by a CMS.
Orphan pages can be indexed and displayed in search results. However, they are not good for your website. First, users cannot reach these pages while browsing your website. Second, a large number of orphan pages wastes your crawl budget: these pages are scanned regularly even though they receive no internal PageRank. The lack of internal PageRank also reduces the chances of orphaned URLs ranking high in search results.
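To make the definition concrete, here is a minimal Python sketch of the idea: a URL is treated as orphaned when it is known from some other source (a sitemap, Google Search Console) but cannot be reached from the home page by following internal links. The function name, the toy link graph and the example.com URLs are illustrative assumptions; this is not how JetOctopus itself is implemented.

```python
from collections import deque


def find_orphans(start_url, links, known_urls):
    """Return known URLs that cannot be reached from start_url via internal links.

    links: dict mapping a URL to the list of internal URLs it links to.
    known_urls: URLs learned from other sources (sitemaps, GSC, and so on).
    """
    reachable = {start_url}
    queue = deque([start_url])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in reachable:
                reachable.add(target)
                queue.append(target)
    # Anything known but never reached through links is an orphan.
    return sorted(set(known_urls) - reachable)


# Toy link graph (hypothetical data): nothing links to /old.
links = {
    "https://example.com/": ["https://example.com/a", "https://example.com/b"],
    "https://example.com/a": ["https://example.com/b"],
    "https://example.com/old": ["https://example.com/"],
}
known = [
    "https://example.com/",
    "https://example.com/a",
    "https://example.com/old",
]

orphans = find_orphans("https://example.com/", links, known)
print(orphans)
```

Note that /old links out to the home page, but no page links to it, so it still counts as an orphan.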
To harness the potential of orphan pages and get organic traffic from them, you need to add internal links to these URLs and include them in the site structure.
If orphan pages are useless, you should block them and remove orphans from sitemaps. Also, it is recommended to redirect orphan URLs.
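Removing useless orphans from a sitemap can be scripted. The sketch below assumes a standard sitemaps.org XML file that has already been loaded as a string, and a list of orphan URLs that you have decided to drop; the function name and the sample URLs are hypothetical.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def remove_urls_from_sitemap(sitemap_xml, urls_to_remove):
    """Return the sitemap XML without the <url> entries listed in urls_to_remove."""
    # Keep the default namespace un-prefixed when the tree is serialized again.
    ET.register_namespace("", SITEMAP_NS)
    root = ET.fromstring(sitemap_xml)
    drop = set(urls_to_remove)
    # Iterate over a copy of the list because we remove elements while looping.
    for url_el in list(root.findall(f"{{{SITEMAP_NS}}}url")):
        loc = url_el.find(f"{{{SITEMAP_NS}}}loc")
        if loc is not None and loc.text and loc.text.strip() in drop:
            root.remove(url_el)
    return ET.tostring(root, encoding="unicode")


# Hypothetical sitemap with one useless orphan URL.
sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/old-promo</loc></url>
</urlset>"""

cleaned = remove_urls_from_sitemap(sample, ["https://example.com/old-promo"])
print(cleaned)
```

If the sitemap is generated by a CMS, it is usually better to fix the generation rules than to edit the file by hand, since the next regeneration would bring the orphans back.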
More information: SEO case study. How TemplateMonster found 3 million orphaned pages that were regularly visited by Google search bot.
With JetOctopus, you can easily find orphan pages.
Start a crawl. To find orphan pages, you need to give the JetOctopus crawler additional URL sources.
The most important source of orphan pages is sitemaps. Activate the “Process sitemaps” checkbox in the basic settings.
Then add a list of sitemaps in the advanced settings.
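If your site publishes a sitemap index, you may want to list the child sitemaps it references before adding them to the settings. A minimal sketch, assuming the index follows the sitemaps.org format and has already been downloaded into a string; the function name and URLs are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace used by sitemaps.org sitemap and sitemap index files.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def list_child_sitemaps(index_xml):
    """Return the sitemap URLs referenced by a sitemap index document."""
    root = ET.fromstring(index_xml)
    return [
        loc.text.strip()
        for loc in root.findall("sm:sitemap/sm:loc", NS)
        if loc.text
    ]


# Hypothetical sitemap index content.
index_xml = """<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>https://example.com/sitemap-products.xml</loc></sitemap>
  <sitemap><loc>https://example.com/sitemap-blog.xml</loc></sitemap>
</sitemapindex>"""

child_sitemaps = list_child_sitemaps(index_xml)
for url in child_sitemaps:
    print(url)
```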
We recommend using one more source of orphan pages: data from Google Search Console. Go to the “Google Search Console” menu and click the “+ Add Google Account” button.
The next step is to select the account with permission to access GSC. Then give JetOctopus permission to access your data.
In the list of websites, select the ones you want to crawl. If you select a domain, JetOctopus will scan all URLs from all subdomains that are connected to Search Console.
Start a crawl. After the crawl is finished, you can analyze the orphan pages with a special report. As you can see, it is very simple to find orphan pages with JetOctopus.
To analyze which pages are orphaned, go to the “Sitemap” report. Here you can see convenient charts that allow you to compare the number of orphan pages with interlinked URLs.
Go to the “Orphan Pages” report to see the full list of orphan URLs.
Here you can filter the source of an orphan page. For example, you can filter the pages found by JetOctopus in the sitemaps.
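The idea of filtering orphans by source can be sketched outside the tool as well. The example below assumes you have the set of URLs found by the crawl plus one set of URLs per additional source; the source names and URLs are hypothetical and do not reflect a JetOctopus export format.

```python
def orphans_by_source(crawled_urls, sources):
    """Map each source name to the URLs it reported that the crawl never reached."""
    crawled = set(crawled_urls)
    return {
        name: sorted(set(urls) - crawled)
        for name, urls in sources.items()
    }


# Hypothetical data: URLs reached by following internal links...
crawled = {"/", "/shop", "/blog"}
# ...and URLs reported by each additional source.
sources = {
    "sitemap": {"/", "/shop", "/landing-2019"},
    "gsc": {"/blog", "/landing-2019", "/old-category"},
}

report = orphans_by_source(crawled, sources)
for name, urls in report.items():
    print(name, urls)
```

A URL that shows up under several sources (here /landing-2019) is known to search engines from more than one place, which may make it a higher priority to fix.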
By the way, you can also filter orphan pages in “Data Tables”. Just select the “Is orphaned” filter and click “Apply”.
Please note that if you have a partial crawl, some orphan pages may not be displayed correctly. We recommend scanning the whole website.
More information is in the article “Why Is Partial Crawling Bad for Big Websites? How Does It Impact SEO?”.
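The effect of a partial crawl on the orphan list can be illustrated with a toy example. In this sketch the crawl limit is modeled as a simple page cap on a breadth-first crawl, which is an assumption made for illustration; the link graph and URLs are invented.

```python
from collections import deque


def crawl(start_url, links, max_pages=None):
    """Breadth-first crawl that stops after max_pages pages (None means no limit)."""
    crawled = set()
    queued = {start_url}
    queue = deque([start_url])
    while queue and (max_pages is None or len(crawled) < max_pages):
        page = queue.popleft()
        crawled.add(page)
        for target in links.get(page, []):
            if target not in queued:
                queued.add(target)
                queue.append(target)
    return crawled


# Hypothetical site: every page except /landing-2019 is linked internally.
links = {
    "/": ["/shop", "/blog"],
    "/shop": ["/shop/item-1"],
    "/blog": [],
    "/shop/item-1": [],
}
sitemap_urls = {"/", "/shop", "/blog", "/shop/item-1", "/landing-2019"}

# Sitemap URLs that the crawl did not reach are reported as orphans.
full = sitemap_urls - crawl("/", links)
partial = sitemap_urls - crawl("/", links, max_pages=2)

print(sorted(full))
print(sorted(partial))
```

With the full crawl only /landing-2019 is reported. With the two-page cap, /blog and /shop/item-1 are reported too, even though they are linked internally, because the crawl stopped before reaching them.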