July 13, 2022
by Sofia

How to check pages with HTML full duplication

Pages with HTML full duplication are URLs that return identical content, including the same titles, headings, metadata, and other HTML elements. If these pages are open for indexing, search engines cannot reliably choose which page to display in the SERP. As a result, a page that should not be in the SERP may appear there.
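
To make the idea of full duplication concrete, here is a minimal sketch that groups URLs whose HTML is byte-for-byte identical by comparing content hashes. It is only an illustration, not how JetOctopus works internally; the `pages` dictionary, the example URLs, and the function name `find_full_duplicates` are assumptions made for this example.

```python
import hashlib
from collections import defaultdict


def find_full_duplicates(pages):
    """Group URLs whose HTML is byte-for-byte identical.

    `pages` maps URL -> raw HTML string (hypothetical input; in practice
    the HTML would come from a crawler).
    """
    groups = defaultdict(list)
    for url, html in pages.items():
        # Identical HTML always produces the same SHA-256 digest.
        digest = hashlib.sha256(html.encode("utf-8")).hexdigest()
        groups[digest].append(url)
    # Keep only digests shared by more than one URL.
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]


# Hypothetical sample data: the first two URLs return the same HTML.
SHOES_HTML = "<html><head><title>Shoes</title></head><body><h1>Shoes</h1></body></html>"
pages = {
    "https://example.com/shoes": SHOES_HTML,
    "https://example.com/shoes?sort=price": SHOES_HTML,
    "https://example.com/hats": "<html><head><title>Hats</title></head><body><h1>Hats</h1></body></html>",
}

for group in find_full_duplicates(pages):
    print(group)
```

An exact hash only catches full duplication. Pages that differ by even one character end up in different groups, so near-duplicates need a different approach.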

Analyzing pages with HTML full duplication also helps you discover pages that are generated automatically by your CMS.

Using JetOctopus, you can find pages with the same titles, headings, main content, HTML, etc. in two clicks.

Step 1. Click the “New crawl” button and configure a crawl.

You can also select the desired crawl from the list if it was performed recently.

Step 2. Go to the crawl results and select the “Content” report.

Step 3. Analyze the found pages in detail.

Clicking the number next to the problem will take you to the data table, where you can find all pages with HTML full duplication. A detailed analysis of these URLs will help you understand where they come from.

You can configure any filters and columns you need.

Step 4. Export data to CSV, Excel, or Google Sheets.

Click the “Export” button and select the desired format.

What to pay attention to when analyzing pages with HTML full duplication

Pages with HTML full duplication are a critical problem for your website. During the analysis, pay attention to the following points:

  • whether those URLs are important;
  • where they come from and how they are generated;
  • how many of them there are.
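
The second and third points can be explored with a short script once you have a list of duplicate URLs. The sketch below counts URLs by the set of query parameter names they carry, which often hints at what generates them (sorting, tracking parameters, and so on). The sample URLs and the function name `summarize_sources` are hypothetical, chosen only for illustration.

```python
from collections import Counter
from urllib.parse import parse_qsl, urlsplit


def summarize_sources(urls):
    """Count URLs by the sorted set of query parameter names they contain."""
    counts = Counter()
    for url in urls:
        query = urlsplit(url).query
        # Collect unique parameter names, ignoring their values.
        names = sorted({name for name, _ in parse_qsl(query, keep_blank_values=True)})
        key = ",".join(names) if names else "(no query parameters)"
        counts[key] += 1
    return counts


# Hypothetical duplicate URLs, e.g. taken from an exported report.
duplicate_urls = [
    "https://example.com/shoes?sort=price",
    "https://example.com/shoes?sort=name",
    "https://example.com/shoes?utm_source=mail&sort=price",
    "https://example.com/shoes/",
]

for pattern, count in summarize_sources(duplicate_urls).most_common():
    print(pattern, count)
```

A pattern that accounts for most of the duplicates is usually the best place to start the investigation.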

Your next steps depend on the situation. If these are useful URLs that should be in the SERP, make their content unique. If the content is rendered on the client side with JavaScript, additionally analyze how search engines rank these pages and for which queries. You may need to use SSR (server-side rendering) or dynamic rendering.

If these are duplicate URLs and they should not be in the search results, check the following points:

  • whether these URLs are open for indexing: choose one correct URL and close the others from indexing;
  • how often search engines crawl URLs with HTML full duplication: if these pages are not needed but search engines still crawl them, block them in the robots.txt file.
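
Before deploying a robots.txt change, it can help to check that the new rule blocks the duplicate URLs and leaves the useful ones crawlable. Below is a minimal sketch using Python's standard `urllib.robotparser` module. The `/print/` path, the example URLs, and the helper name `is_crawl_blocked` are assumptions for illustration; your own rules will differ.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: /print/ is an assumed path that
# produces duplicate pages on an example site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /print/
"""


def is_crawl_blocked(robots_txt, url, user_agent="*"):
    """Return True if the given robots.txt text disallows crawling `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


print(is_crawl_blocked(ROBOTS_TXT, "https://example.com/print/shoes"))
print(is_crawl_blocked(ROBOTS_TXT, "https://example.com/shoes"))
```

Note that the standard-library parser matches rules by simple path prefix, so its result may differ from how a particular search engine interprets more complex patterns. Treat it as a quick sanity check, not a final verdict.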

About Sofia
Technical SEO specialist. Sofia has almost 10 years of experience, the last 5 of them in JavaScript SEO. She is convinced that SEO is a very technical part of digital marketing, and that you can't do effective SEO without logs and in-depth data analysis.