February 28, 2023
by Sofia

What “Page Loaded After Error” means in crawl results

When analyzing the technical data in the crawl results, you may find that some pages are marked as “Page Loaded After Error”. This article explains what that means and why you should pay attention to such pages.

What is “Page Loaded After Error”

When JetOctopus crawls a website, it performs the same actions as Googlebot. First, JetOctopus sends a GET request to the starting URL and waits for a response from the web server. If the response is 200 OK and the page is not blocked by the robots.txt file, JetOctopus parses the HTML code of the page and looks for <a href> elements to identify internal links that should be crawled next.
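The link-discovery step described above can be sketched in a few lines of Python. This is only an illustration built on the standard library, not JetOctopus code: the names LinkCollector and internal_links are hypothetical, and "internal" is assumed here to mean "same host as the page being parsed".

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> element in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def internal_links(page_url, html):
    """Return absolute, de-duplicated URLs that stay on page_url's host."""
    collector = LinkCollector()
    collector.feed(html)
    host = urlparse(page_url).netloc
    links = []
    for href in collector.hrefs:
        # Resolve relative hrefs against the URL of the page itself.
        absolute = urljoin(page_url, href)
        if urlparse(absolute).netloc == host and absolute not in links:
            links.append(absolute)
    return links
```

A real crawler would also check robots.txt and normalize URLs before queuing them; those steps are left out to keep the sketch short.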

However, there are situations when the URL does not return a 200 status code during crawling. In such cases, JetOctopus repeats the GET request to the page. This can happen several times if the page continues to return a non-200 response code. If the page still does not return a 200 status code after the repeated requests, JetOctopus records this result in the crawl results.

If the page does not return a 200 status code during the first GET request, but returns it during repeated GET requests, then JetOctopus categorizes this page under the pages with a 200 status code. However, you can find such pages in the “Page Loaded After Error” list.
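To make the retry behavior concrete, here is a minimal sketch of the logic in Python. It is not the JetOctopus implementation: the function name load_with_retries, the max_tries limit of 3, and the injected fetch callable are all assumptions made for illustration. The article does not say how many times JetOctopus retries.

```python
def load_with_retries(url, fetch, max_tries=3):
    """Request url until it returns 200 or max_tries is reached.

    fetch(url) is a caller-supplied function (hypothetical) that returns
    an integer status code or raises TimeoutError if the server is silent.
    """
    first_error = None
    status = None
    for load_try in range(1, max_tries + 1):
        try:
            status = fetch(url)
        except TimeoutError:
            status = "timeout"
        if status == 200:
            return {
                "url": url,
                "status": 200,
                "load_try": load_try,
                "first_error": first_error,
                # True only when an earlier attempt failed.
                "loaded_after_error": load_try > 1,
            }
        if first_error is None:
            first_error = status
    # Every attempt failed: report the last non-200 result.
    return {
        "url": url,
        "status": status,
        "load_try": max_tries,
        "first_error": first_error,
        "loaded_after_error": False,
    }
```

A page whose first request returns 503 and whose second returns 200 ends up with status 200, load_try 2, and loaded_after_error set to True. That is the kind of page listed under “Page Loaded After Error”.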

Why is it important to analyze pages that did not load on the first try?

Pages that fail to return a 200 status code on the first try can potentially create problems for your website. This is because search bots, like Googlebot, may also encounter the same issue during their crawl. While Googlebot may retry the page, it is not guaranteed to do so every time.

This means that if a page fails to return a 200 status code on the first try, the search bot may leave without crawling it. As a result, you may end up with fewer pages in the index or outdated information in the search engine results pages (SERP).

Furthermore, when search bots retry non-200 pages, the retries consume crawl budget. As a result, search engines may spend less time crawling available pages while they attempt to get a response from non-200 pages.

It’s important to pay close attention to pages that timed out during the first request. The standard timeout period is 10 seconds, and if the client browser doesn’t receive a response from the web server within that time, it will leave the page. 10 seconds may seem like a long time, but remember that the crawl budget includes not only the number of pages but also the time that bots spend on your website. Such a long wait without a result therefore reduces the bot’s resources for crawling other pages on your website. You can read more about timed-out pages in this article: How to recrawl pages unable to load by timeout.

Similarly, 5xx pages can also cause problems. If your website returns a 5xx status code, search engines will reduce their crawl frequency to avoid overloading your website. Check your server logs for additional 5xx errors.
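If you have raw access logs at hand, a short script can show which URLs returned 5xx codes to a search bot. The sketch below assumes log lines in the common "combined" format and identifies the bot by a plain substring match on the user agent. Both are assumptions: your log format may differ, and a user-agent string can be spoofed. The function name count_5xx is hypothetical.

```python
import re
from collections import Counter

# Matches the request and status part of a combined-format log line,
# e.g.: "GET /path HTTP/1.1" 503 0
LOG_LINE = re.compile(
    r'"(?:GET|HEAD|POST) (?P<path>\S+) [^"]*" (?P<status>\d{3}) '
)


def count_5xx(lines, user_agent_hint="Googlebot"):
    """Count 5xx responses per path for lines mentioning user_agent_hint."""
    counts = Counter()
    for line in lines:
        if user_agent_hint not in line:
            continue
        match = LOG_LINE.search(line)
        if match and match.group("status").startswith("5"):
            counts[match.group("path")] += 1
    return counts
```

Calling count_5xx(open("access.log")) on a log file (the file name is a placeholder) returns a Counter, so .most_common(20) lists the paths with the most server errors first.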

It’s essential to analyze pages that needed more than one load try. Doing so helps identify pages that may not be accessible to bots. Users may encounter the same error when attempting to load a page, which can lead them to leave your website.

How to find “Page Loaded After Error”

To see if there were pages that JetOctopus sent repeated requests to, go to the crawl results – “Technical” – “Statuses”.

At the bottom of the dashboard is a “Page Loaded After Error” chart. This chart shows the number of pages with this problem and what happened when the first request was made: either a timeout or a non-200 status code that indicates a problem with the server.

To find a detailed list of pages with a problem, go to the data table – “Pages”.

Select the filter “Load Try” – “Greater than” – “1”.

The list shows the pages that had errors during the first GET request. The “Load Try” column shows how many times the crawler tried to load each page.

You can add additional datasets to see if search engines are getting similar errors. In the logs, pay attention to blinking status codes (URLs whose status code changes from one request to the next), which may be caused by the same problem.
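As a rough illustration of that log check, the sketch below groups log entries by URL and keeps the URLs that returned 200 at least once and some other code at least once. Treating that as the working definition of a blinking status code is an assumption made for this example. The function name blinking_urls and the (url, status) input shape are hypothetical.

```python
from collections import defaultdict


def blinking_urls(entries):
    """Find URLs that answered both 200 and at least one other code.

    entries is an iterable of (url, status_code) pairs, for example
    parsed from bot requests in an access log.
    """
    seen = defaultdict(set)
    for url, status in entries:
        seen[url].add(status)
    return {
        url: sorted(codes)
        for url, codes in seen.items()
        if 200 in codes and len(codes) > 1
    }
```

For instance, a URL logged with 200, 503, and 200 again is reported with the codes [200, 503], while a URL that always returned 404 is ignored because it never loaded successfully.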

You can export all data to a convenient format: CSV, Google Sheets or any other.
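Once the data is exported as CSV, the same “Load Try” filter can be reproduced offline. The following sketch reads an export with Python's csv module. The column names "URL" and "Load Try" are assumptions about the export header and may need adjusting to match your file. The function name pages_loaded_after_error is hypothetical.

```python
import csv
import io


def pages_loaded_after_error(csv_file, url_column="URL", tries_column="Load Try"):
    """Return (url, tries) pairs with more than one load try, worst first."""
    rows = []
    for row in csv.DictReader(csv_file):
        tries = int(row[tries_column])
        if tries > 1:
            rows.append((row[url_column], tries))
    return sorted(rows, key=lambda pair: pair[1], reverse=True)


# Usage with an in-memory sample; pass an open file for a real export.
sample = io.StringIO(
    "URL,Load Try\n"
    "https://example.com/a,1\n"
    "https://example.com/b,3\n"
    "https://example.com/c,2\n"
)
for url, tries in pages_loaded_after_error(sample):
    print(url, tries)
```

Sorting by the number of tries puts the least reliable pages at the top, which is usually where the investigation should start.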

When analyzing the technical condition of the website, we recommend paying attention not only to pages loaded after an error, but also to the following cases:

How do I check for broken links (404 Errors)?

How do I bulk export all inlinks to 3XX, 4XX (404 error etc) or 5XX pages?

Why there are 5xx pages in the crawl results and why they are not reproduced with the manual checking

How To Find Orphan Pages

About Sofia
Technical SEO specialist. Sofia has almost 10 years of experience, the last 5 of them in JavaScript SEO. She is convinced that SEO is a very technical part of digital marketing and that you can't do effective SEO without logs and in-depth data analysis.
