July 7, 2023
by Sofia

How to find 404 pages in XML sitemaps

Efficiently identifying 404 Not Found errors in your sitemaps is crucial for giving search bots a smooth crawl and for improving your website’s performance. In this guide, we’ll explain how to use JetOctopus to detect and manage these errors.

How to detect 404 Not Found pages in sitemaps

Using JetOctopus, you can find 404 pages in sitemaps with the “Sitemaps Only” crawl mode. To start it, click the “New crawl” button, then select the “Sitemaps Only” mode.

In the “URL” field, enter the URL of the home page, and in the “Sitemaps” field, enter the sitemaps you want to analyze. You can enter as many sitemaps as you want; there is no limit. However, all of these sitemaps must belong to the domain you entered in the “URL” field.

Then configure any additional crawl settings and start the crawl. If you want more information about the URLs in your sitemaps, for example which pages have internal links pointing to the 404 pages found in your sitemaps, run a full crawl of your website. This will help you remove 404 pages not only from the sitemaps but also from the site itself.

After the crawl is complete, go to the crawl results and select the “Sitemap URLs” data table.

Next, add the “Crawl Pages” dataset, click the “+Add Filter” button, and select the filter “Status Code” – “Equal” – “404”. Then click “Apply”.

After that, you will get a detailed list of pages that returned a 404 status code and that were found in your sitemaps.
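
For a quick spot check outside JetOctopus, the same idea can be sketched in Python with only the standard library. This is a minimal sketch under stated assumptions, not a description of how JetOctopus works internally: the function names `extract_sitemap_urls`, `http_status`, and `find_404s` are hypothetical, and the sketch assumes a plain `<urlset>` sitemap (not a sitemap index) whose server answers HEAD requests.

```python
import urllib.error
import urllib.request
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def extract_sitemap_urls(sitemap_xml):
    """Return every <loc> value from a <urlset> sitemap document."""
    root = ET.fromstring(sitemap_xml)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


def http_status(url, timeout=10):
    """Return the final HTTP status code for a URL (hypothetical helper).

    urllib follows redirects, so a 301 that ends in a 404 is reported as 404.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises for 4xx/5xx responses; the code is still available.
        return error.code


def find_404s(urls, get_status=http_status):
    """Return the URLs whose status code is exactly 404."""
    return [url for url in urls if get_status(url) == 404]
```

The status lookup is passed in as `get_status` so that you can swap in another HTTP client or test the logic without touching the network.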

What to do with 404 pages found in sitemaps?

First, analyze the 404 pages. If these pages were important to you, for example if they received clicks and impressions in SERPs, we recommend setting up 301 redirects to new pages.

Second, all non-200 pages in the sitemaps must be removed or replaced; otherwise, search robots will spend crawl budget on scanning 404 pages.
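
The two rules above can be sketched as a small triage step. This is only an illustration: `triage_sitemap_urls` is a hypothetical helper, and the two input dictionaries are assumed shapes for data you would export yourself (status codes from a crawl, clicks from a search performance report).

```python
def triage_sitemap_urls(status_by_url, clicks_by_url):
    """Split non-200 sitemap URLs into redirect candidates and plain removals.

    status_by_url: {url: status code}, e.g. from a crawl export (assumed shape).
    clicks_by_url: {url: clicks}, e.g. from a search report (assumed shape).
    """
    redirect_candidates = []
    remove_only = []
    for url, status in status_by_url.items():
        if status == 200:
            continue  # healthy URL, it stays in the sitemap
        if status == 404 and clicks_by_url.get(url, 0) > 0:
            # The page earned clicks, so it is worth a 301 to a new page.
            redirect_candidates.append(url)
        else:
            remove_only.append(url)
    return redirect_candidates, remove_only
```

URLs in both returned lists should leave the sitemap; the first list is simply the subset that also deserves a 301 redirect.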

Once you’ve replaced the broken URLs and 404 pages in your sitemaps, crawling and crawl budget usage should improve. You can also use JetOctopus to generate a new sitemap that does not contain 404 pages. To do this, filter the URLs that you want to add to the sitemap using the “Pages” or “Sitemap URLs” data table, click “Export” and select “to Sitemap”.
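
If you prefer to build the file yourself from an exported list of healthy URLs, a minimal sketch in Python could look like this. It does not reproduce the JetOctopus export; `build_sitemap` is a hypothetical helper that writes only the required `<loc>` element for each URL.

```python
import xml.etree.ElementTree as ET

# Namespace required by the sitemaps.org protocol.
SITEMAP_XMLNS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return sitemap XML (as a string) listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_XMLNS)
    for url in urls:
        # ElementTree escapes characters such as "&" when serializing.
        ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body
```

Keep in mind that the sitemap protocol limits a single file to 50,000 URLs, so a larger list would need to be split across several sitemaps.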

Then upload this sitemap to the root of the website and submit it to Google. The old sitemap that contains 404 links should be removed.

In conclusion, staying vigilant about 404 errors in your sitemaps is integral to maintaining a well-optimized website. By using JetOctopus to pinpoint and fix these errors, you can improve crawling efficiency and, ultimately, SEO performance. Use these strategies to uphold the integrity of your site and provide a seamless browsing experience for your users.

About Sofia
Technical SEO specialist. Sofia has almost 10 years of experience, the last 5 of them in JavaScript SEO. She is convinced that SEO is a very technical part of digital marketing and that you can't do effective SEO without logs and in-depth data analysis.
