March 10, 2023
by Sofia

Product Update. Introducing New Datatables in Sitemaps Dataset

We are excited to announce the latest JetOctopus update, which adds new data tables to the sitemaps dataset. With this update, you can identify the main issues with your sitemaps and sitemap URLs in just one click, so you can monitor the health of your sitemaps faster and more effectively.

Sitemap files datatable: identifying problematic XML files

At JetOctopus, we have always focused on technical SEO and on developing tools that help our customers track various issues effectively. Previously, our customers could apply filters in the “Sitemap files” report to find sitemaps that were too large (over 50,000 URLs) or had other issues. However, we wanted to make it even more convenient to view sitemaps that were unavailable during the crawl. That’s why we added a separate data table. Go to Data Tables – Sitemap Files – Unable to load.


The “Unable to load” data table includes sitemaps that did not return a 200 status code or were unavailable for some other reason. It is important for bots that sitemaps return a 200 status code; otherwise, they cannot read the file and discover the URLs listed in it.
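To illustrate the idea outside of JetOctopus, here is a minimal Python sketch of the same check. It is not how the JetOctopus crawler works internally; the function names `fetch_status` and `is_unable_to_load` are hypothetical, and the sketch assumes that any result other than a plain 200 counts as "unable to load".

```python
import http.client
import urllib.error
import urllib.request


def fetch_status(url, timeout=10.0):
    """Return the HTTP status code for url, or None if no response was received.

    Note: urllib follows redirects automatically, so a sitemap that redirects
    to a working file reports the final status, not the 3xx code.
    """
    try:
        request = urllib.request.Request(
            url, headers={"User-Agent": "sitemap-check-sketch"}
        )
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # The server answered, but with an error status such as 404 or 503.
        return error.code
    except (OSError, ValueError, http.client.HTTPException):
        # DNS failure, timeout, refused connection, malformed URL, and so on.
        return None


def is_unable_to_load(status):
    """Assumption: anything other than a 200 response means the sitemap failed to load."""
    return status != 200
```

For example, `is_unable_to_load(fetch_status("https://example.com/sitemap.xml"))` would flag a sitemap at that placeholder address if it returned 404 or timed out.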

Sitemap URLs Datatable

Whether you are focusing on on-page optimization or internal linking, sitemaps play a crucial role. Googlebot and other search engine bots scan your sitemaps to discover new pages and update existing ones. We believe that sitemaps are extremely important and should be reviewed regularly. In our experience, many sitemaps contain errors, such as non-200 or non-indexable URLs. If your sitemap contains non-indexable, non-200 or canonicalized pages, search engines will waste their crawl budget on them. We have described a case where erroneous sitemaps negatively impacted the SEO of a large website in the case study “Sitemap as a True Damager”. By regularly checking the health of your sitemaps, you can ensure that they are working for you.

To identify such issues in sitemaps quickly, simply go to the data table and select the desired dataset.


With just one click, you can identify all URLs from all sitemaps that returned a status code other than 200: just click on the “Non 200 status pages” dataset. It includes pages with redirects, 404 Not Found and 5xx pages. By clicking on “Non-indexable pages,” you can view non-indexable URLs found in the sitemaps. According to the rules of Google and other search engines, all pages in sitemaps must be indexable.
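As a rough illustration of what a "non-200" grouping looks like, the Python sketch below buckets status codes into the categories mentioned above (redirects, 404 Not Found, 5xx). The function names and labels are hypothetical and are not taken from JetOctopus; the input is assumed to be a mapping of URL to status code that you collected yourself.

```python
from collections import defaultdict


def classify_status(status):
    """Map an HTTP status code (or None for no response) to a coarse label."""
    if status is None:
        return "no response"
    if status == 200:
        return "ok"
    if 300 <= status < 400:
        return "redirect"
    if status == 404:
        return "not found"
    if 500 <= status < 600:
        return "server error"
    return "other non-200"


def group_non_200(statuses):
    """Group URLs whose status is not 200 by label; 200 responses are skipped."""
    groups = defaultdict(list)
    for url, status in statuses.items():
        label = classify_status(status)
        if label != "ok":
            groups[label].append(url)
    return dict(groups)
```

Calling `group_non_200({"/a": 200, "/b": 301, "/c": 404})` returns `{"redirect": ["/b"], "not found": ["/c"]}`.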

You can also view the “Non-canonical pages” and “Non-crawled pages” datasets. In each of these datasets, you will see every URL affected by the problem, along with the sitemap where it was found and the date of its last modification.
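The URL, source sitemap and last modification date all come from the sitemap file itself. The Python sketch below shows one way to extract them from a standard XML sitemap and to check the 50,000 URL limit of the sitemaps protocol. The names `parse_sitemap` and `exceeds_url_limit` are hypothetical, and the sketch assumes a plain `urlset` file rather than a sitemap index.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"
MAX_URLS_PER_SITEMAP = 50_000


def parse_sitemap(xml_text, sitemap_url):
    """Return one row per <url> entry: the URL, its source sitemap and lastmod."""
    root = ET.fromstring(xml_text)
    rows = []
    for url_element in root.iter(SITEMAP_NS + "url"):
        loc = url_element.findtext(SITEMAP_NS + "loc")
        if not loc or not loc.strip():
            continue  # skip entries without a usable <loc>
        lastmod = url_element.findtext(SITEMAP_NS + "lastmod")
        rows.append(
            {
                "url": loc.strip(),
                "sitemap": sitemap_url,
                "lastmod": lastmod.strip() if lastmod else None,
            }
        )
    return rows


def exceeds_url_limit(rows, limit=MAX_URLS_PER_SITEMAP):
    """True if a single sitemap lists more URLs than the protocol allows."""
    return len(rows) > limit
```

Rows produced this way can be joined with status codes you collect separately, which gives a table similar in shape to the one described above.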

You can configure additional columns with information or add necessary filters to analyze the data. You can also export the information in a convenient format to work with your own tables and analytical reports.

JetOctopus offers many ways to check sitemaps, so we are confident that this update will be a useful addition.

By the way, JetOctopus also allows you to create a sitemap yourself. After creating the sitemap, simply upload it to the root directory of your site and submit it to Google. You can learn more about creating effective sitemaps using JetOctopus in our article “How to use JetOctopus to create effective sitemaps”.

About Sofia
Technical SEO specialist. Sofia has almost 10 years of experience, the last 5 of them in JavaScript SEO. She is convinced that SEO is a very technical part of digital marketing, and that you can't do effective SEO without logs and in-depth data analysis.


