Duplicate content is one of the reasons a site can have low visibility in the SERP. When a website has many pages with non-unique content, search engines cannot determine which page to index and rank. As a result, Googlebot may choose on its own which page to show in the search results, or it may not show any of them.
Duplicate content is the same information, expressed with the same words, tags and other HTML elements, that is available at different URLs. When JetOctopus analyzes your website, it treats pages as duplicates if all the text elements inside the <body> are the same.
This covers both pages with identical text and completely identical listings with the same headings, product names, etc. All of these types of duplicate content can be found using JetOctopus.
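To make the definition above concrete, here is a minimal sketch of how exact duplicates could be detected: extract the text inside <body>, normalize it, and group URLs that share the same fingerprint. This is an illustration only, not JetOctopus's actual algorithm; the names BodyTextExtractor, body_fingerprint and find_exact_duplicates and the example.com URLs are assumptions made for this example.

```python
import hashlib
from collections import defaultdict
from html.parser import HTMLParser


class BodyTextExtractor(HTMLParser):
    """Collects text found inside <body>, skipping <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.in_body = False
        self.skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "body":
            self.in_body = True
        elif tag in ("script", "style"):
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "body":
            self.in_body = False
        elif tag in ("script", "style") and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        if self.in_body and self.skip_depth == 0:
            self.chunks.append(data)


def body_fingerprint(html):
    """Hash of the lowercased, whitespace-normalized body text (hypothetical helper)."""
    parser = BodyTextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(" ".join(parser.chunks).split()).lower()
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def find_exact_duplicates(pages):
    """Group URLs whose body text is identical; return groups of two or more."""
    groups = defaultdict(list)
    for url, html in pages.items():
        groups[body_fingerprint(html)].append(url)
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]


# Made-up sample pages: the first two differ only in <title> and whitespace.
pages = {
    "https://example.com/shoes": "<html><head><title>Shoes</title></head>"
    "<body><h1>Red shoes</h1><p>Buy now</p></body></html>",
    "https://example.com/shoes?sort=price": "<html><head><title>Shoes sorted</title></head>"
    "<body><h1>Red  shoes</h1>\n<p>Buy now</p></body></html>",
    "https://example.com/hats": "<html><head><title>Hats</title></head>"
    "<body><h1>Blue hats</h1></body></html>",
}
duplicate_groups = find_exact_duplicates(pages)
print(duplicate_groups)
```

Because only the text inside <body> is compared, the two shoe URLs end up in one group even though their <title> tags differ. That also shows why duplicate titles and duplicate content are worth checking separately.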
There are several reasons why you should ensure that your website doesn’t have much duplicate content.
First, it will be difficult for search engines to determine the relevant page, so Googlebot may index a completely different URL than you expected.
Second, link equity will be split across several pages, so internal and external linking will produce weaker results or none at all.
Third, pages with duplicate content are less visible in search results. To learn what visibility is and how it works, read the article What is ranking in GSC reports and how to analyze this metric. In total, all versions of the page will rank less often in Google. As a result, you will get less organic traffic.
1. Start crawling your website. To do this, log in, select the desired project and click the “New crawl” button. You can also select any other completed crawl to analyze.
Pay attention to the crawl settings: choose whether you want to crawl only indexable pages or everything on your website. You can also use the include/exclude option.
More information: How to configure a crawl.
2. Wait for the crawl to complete. In the meantime, you can read the article on how to check for duplicate titles and meta descriptions.
3. Go to the crawl results and open the “Duplication” dashboard. Select the “Content” report.
4. On the “Content duplication overview” chart, you will find a comparison of the content of all scanned pages, including the number of pages with duplicate content.
All sections of the chart are clickable, so you can go to the data table with detailed results. We recommend analyzing each case separately; we explain why below.
5. On the “Duplication by indexability” diagram, you will see the ratio of indexable and non-indexable pages with duplicate content.
By the way, if you need information only about duplicate content on indexable pages, use the built-in indexable segment.
You can also create your own URL segments and analyze duplicate content in each individual segment.
More information: How to use segments.
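If you work with your own list of crawled pages outside the tool, the same two breakdowns (by indexability and by URL segment) can be approximated with a few lines of Python. This is a rough sketch: the row fields url, is_indexable and is_duplicate are hypothetical and do not describe a JetOctopus export format, and a "segment" is simplified here to the first path component of the URL.

```python
from collections import Counter
from urllib.parse import urlparse


def first_path_segment(url):
    """Return the first path component, e.g. '/catalog' (simplified segment rule)."""
    path = urlparse(url).path.strip("/")
    return "/" + path.split("/")[0] if path else "/"


def summarize_duplicates(rows):
    """Count duplicate pages by indexability and by first path segment."""
    by_indexability = Counter()
    by_segment = Counter()
    for row in rows:
        if not row["is_duplicate"]:
            continue
        label = "indexable" if row["is_indexable"] else "non-indexable"
        by_indexability[label] += 1
        by_segment[first_path_segment(row["url"])] += 1
    return by_indexability, by_segment


# Hypothetical rows; the field names are assumptions for this sketch.
rows = [
    {"url": "https://example.com/catalog/shoes", "is_indexable": True, "is_duplicate": True},
    {"url": "https://example.com/catalog/shoes?page=2", "is_indexable": False, "is_duplicate": True},
    {"url": "https://example.com/blog/post-1", "is_indexable": True, "is_duplicate": False},
    {"url": "https://example.com/catalog/hats", "is_indexable": True, "is_duplicate": True},
]
by_indexability, by_segment = summarize_duplicates(rows)
print(dict(by_indexability), dict(by_segment))
```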
If you need to analyze similar content, click on the appropriate segment of the “Content duplication overview” diagram and set the desired percentage of similarity in the filters.
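To give a feel for what a similarity percentage can mean, the sketch below compares two texts by the share of three-word sequences (shingles) they have in common, a measure known as Jaccard similarity. This is a swapped-in, commonly used technique chosen for illustration; how JetOctopus actually computes similarity is not described here, and the function names, URLs and threshold are assumptions.

```python
def shingles(text, size=3):
    """Set of overlapping word sequences of the given size."""
    words = text.lower().split()
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity_percent(text_a, text_b, size=3):
    """Jaccard similarity of the two shingle sets, as a percentage."""
    set_a, set_b = shingles(text_a, size), shingles(text_b, size)
    if not set_a and not set_b:
        return 100.0  # two empty texts are treated as identical
    return 100.0 * len(set_a & set_b) / len(set_a | set_b)


def similar_pairs(texts, threshold=80.0):
    """Return (url_a, url_b, score) for every pair at or above the threshold."""
    urls = sorted(texts)
    pairs = []
    for index, url_a in enumerate(urls):
        for url_b in urls[index + 1:]:
            score = similarity_percent(texts[url_a], texts[url_b])
            if score >= threshold:
                pairs.append((url_a, url_b, round(score, 1)))
    return pairs


# Made-up page texts keyed by hypothetical URLs.
texts = {
    "/shoes-a": "red running shoes with free delivery and easy returns",
    "/shoes-b": "red running shoes with free delivery and fast returns",
    "/hats": "blue winter hats for kids",
}
print(similar_pairs(texts, threshold=50.0))
```

Lowering the threshold surfaces more near-duplicates but also more false positives, which is the same trade-off you make when choosing the similarity percentage in the filters.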
All duplicate content on your website should be viewed in context, as some duplicate content is normal. How serious it is depends on the number of duplicated pages, the scale of the website and other technical characteristics.
When analyzing duplicate content, pay attention to the following points: