Why Partial Technical Analysis Harms Your Website

Jan 30, 2019


Have you ever faced an SEO drop? So have we. Guided by data from a partial crawl, our SEO specialist deleted relevant content, which led to losses of traffic and SERP positions. The lesson we learned is that a partial technical site audit is useless and even harmful for websites. Why? Read on to avoid repeating our mistake.

Any missing piece of technical data can result in wrong insights

When you are losing positions in the SERP, you need to act quickly. There is no time to scan the whole website, so you analyze a quarter of the total page volume to find high-priority bugs. Website owners widely use this method, but it is wrong.

A partial approach to a website audit is more dangerous than it seems, and here is why. Our first business is Hotwork, a 5M-page job aggregator. Looking for new opportunities to increase SEO traffic, we analyzed 200K pages and found a lot of trashy auto-generated content. We extrapolated this conclusion to all 5M pages and blindly deleted all auto-generated content. That was a crucial mistake! We expected an improved crawl budget and better rankings, but instead we faced a 30% SEO drop: the deleted pages were a big share of the site, and not all of them were trash. There were high-quality auto-generated pages too, with good positions in the SERP. Only when we checked each URL and returned the relevant pages to the website structure did our site regain its lost positions. That took up to three months. We lost time and effort, but we got a priceless lesson: the results of a partial crawl can never be extrapolated to the whole website.

Using partial analysis, you could repeat the story of the Titanic. The huge liner sank after hitting an iceberg. With partial analysis, you see only the tip of the iceberg and underestimate the danger. The earlier you see the real scale of slow load times, wrong status codes, thin content, duplicates, and other crucial technical problems on your website, the more time you have to maneuver.

Partial crawling would help if the URL sample were random, but that's impossible!

If sociologists want correct survey results, they interview people of different ages, genders, and professions. A random sample is the most reliable way to achieve undistorted outcomes. A technical audit works the same way: you could analyze a few webpages and get accurate insights, but how will you make a programmed web crawler choose URLs in random order?

Techopedia defines a web crawler as a tool that analyzes one page at a time until all pages have been processed. Since a web crawler follows a fixed algorithm rather than artificial intelligence, it scans your URLs one by one and can't pick webpages for analysis at random.
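The contrast can be sketched in a few lines of Python. A statistician with the full URL list up front (say, from a sitemap export) can draw an unbiased sample, but a crawler only sees URLs as it discovers them, so its "sample" is always the first pages it happens to reach. The URLs below are synthetic placeholders:

```python
import random

# Assume we somehow had the complete URL list in advance (synthetic here).
all_urls = [f"https://example.com/page/{i}" for i in range(10_000)]

# A statistician's approach: an unbiased random sample of 1,000 URLs.
random.seed(7)
unbiased_sample = random.sample(all_urls, 1_000)

# A crawler's "sample" of the same size: simply the first 1,000 URLs it
# discovers, which skews heavily toward shallow, well-linked pages.
crawler_sample = all_urls[:1_000]
```

The crawler never has `all_urls` until the crawl is finished, which is exactly why its partial sample can never be random.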

Let's take internal linking structure analysis as an example of how partial crawling provides misleading insights. A web crawler starts from the landing page and dives deeper level by level through the internal links on your webpages. A partial analysis of the first 100K pages reveals the structure of your website at one, two, or three clicks' distance, but e-commerce websites contain 5-10+ levels, where the picture will be different! You stay blind to the mess in your links and don't see the scale of the issue. So, unless you are analyzing a specific segment where you had a product update, never draw conclusions about fundamental on-page SEO criteria from partial data. Such conclusions will likely be wrong.
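A rough sketch shows why a capped crawl stays shallow. A breadth-first crawler with a fixed page budget spends it all on the first click-depth levels of a wide site, so deeper levels never appear in the report at all. The link structure below is entirely synthetic:

```python
from collections import deque

def crawl_depths(start, links, page_limit):
    """Breadth-first crawl capped at page_limit; returns {url: click depth}."""
    depths = {start: 0}
    queue = deque([start])
    while queue and len(depths) < page_limit:
        url = queue.popleft()
        for nxt in links.get(url, ()):
            if nxt not in depths and len(depths) < page_limit:
                depths[nxt] = depths[url] + 1
                queue.append(nxt)
    return depths

# Synthetic site: every page links to 5 children, 5 levels deep (3,906 pages).
links = {}
def build(url, depth):
    if depth == 5:
        return
    children = [f"{url}/{i}" for i in range(5)]
    links[url] = children
    for c in children:
        build(c, depth + 1)
build("home", 0)

partial = crawl_depths("home", links, page_limit=100)
full = crawl_depths("home", links, page_limit=10**9)
print(max(partial.values()), max(full.values()))  # prints "3 5": the capped crawl never reaches the deepest levels
```

Everything at depth 4 and 5, the majority of this synthetic site, is simply invisible to the 100-page crawl.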

Treating the audit as a one-time task doesn't give true insights into the challenges you're facing

Forbes author George Bradt names stability as a key quality of a successful business. Consistency in conducting SEO audits is crucial for your website, too. Google and users appreciate websites with an optimized interface, navigation, and content. Here are the main reasons why you should run a full technical audit of your website regularly:

- Google's algorithms develop constantly. Unless you adapt your website to Google's new recommendations, you lose positions in the SERP. A full technical audit reveals the areas of your website that should be optimized for smooth sailing among competitors.

- Users' behavior and needs change. People's demands and desires are not stable, and Google ranks higher those webpages that are relevant to users' queries. A full technical audit involves log file analysis, which shows which webpages are important to Google and which are not. Make your website's hierarchy clear and simple to give your visitors exactly what they want.

- The technical state of your website is not stable. Every single change on a webpage can impact the technical state of your website. If you decide to migrate your website to a new CMS, redesign webpages, or add a new block with fresh content, you should conduct an SEO audit to see how your website performs in Google after the changes, comparing data before and after the updates. Comparing crawls is a super revealing option. An in-depth technical analysis is a test drive of the innovations on your website.
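At its simplest, a crawl comparison diffs two snapshots of the same site. Here is a minimal sketch, assuming each crawl boils down to a URL-to-status-code map (the data is made up):

```python
def diff_crawls(before, after):
    """Compare two {url: status_code} snapshots from successive crawls."""
    # URLs that returned a healthy status before but an error after the update.
    broken = {u for u, s in after.items() if s >= 400 and before.get(u, 200) < 400}
    removed = set(before) - set(after)   # pages that disappeared
    added = set(after) - set(before)     # pages that are new
    return broken, removed, added

before = {"/a": 200, "/b": 200, "/c": 301}
after  = {"/a": 200, "/b": 404, "/d": 200}

broken, removed, added = diff_crawls(before, after)
# /b broke after the update, /c disappeared, /d is new
```

A real comparison tracks many more criteria (load time, titles, canonicals), but the before/after logic is the same.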

When you treat a technical audit as a one-time task, you still get insights for search optimization today, but tomorrow, after another product update, something can be accidentally broken. We strongly recommend running a full crawl of your site after each product update and watching the dynamics of the main criteria.

Consolidation of partial crawling data is time-consuming and risky

Desktop crawlers are limited in crawling capacity by the resources and memory of the computer they run on. Most likely, you'll be limited to crawling only a few thousand URLs per crawl. While this can be fine for small websites with 1-2K pages, it still takes plenty of time to glue the pieces of data into a single picture. If you crawl an e-commerce website of around 1M pages, you first need to slice it into 200K-page parts, crawl it part by part, then combine the data in Excel into one big picture (without losing any part), and only after these time-consuming tasks can you start searching for meaningful insights. Sounds exhausting and ineffective, right?
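To see where the gluing step goes wrong, here is a minimal sketch of merging several partial crawl exports with Python's standard library, deduplicating overlapping URLs along the way. The column names and the in-memory "files" are hypothetical stand-ins for real CSV exports:

```python
import csv
import io

def merge_crawl_parts(parts):
    """Merge several partial crawl exports, keeping the first row per URL."""
    merged = {}
    for part in parts:
        for row in csv.DictReader(part):
            # Overlapping parts produce duplicate URLs; keep the first seen.
            merged.setdefault(row["url"], row)
    return list(merged.values())

# Two hypothetical partial exports, shown here as tiny in-memory files.
part1 = io.StringIO("url,status\n/a,200\n/b,404\n")
part2 = io.StringIO("url,status\n/b,404\n/c,301\n")

rows = merge_crawl_parts([part1, part2])  # 3 unique URLs out of 4 rows
```

Even this toy version has to decide how to handle overlaps; at 200K rows per part, a single missed or double-counted file silently skews every conclusion downstream.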

You also need to take the human factor into consideration. Since a single SEO specialist would beat their head against this data for days, you have to give the task to several members of your team. The more people are engaged in the process, the higher the chance that pieces of technical data will be lost. A copy-and-paste approach to analysis requires full attention, but after a while, interest in repetitive tasks wanes and the quality of work decreases. As a result, you spend plenty of time and effort but still get distorted crawling insights. Due to the complexity and scope of the described process, analysis can easily turn into paralysis of your SEO strategy.

Today, plenty of website owners put their businesses at risk by conducting technical SEO audits manually or partially. You can hire a tech geek who can cope with processing partial crawling data, but this is definitely not a scalable approach. Consider cloud-based crawlers that save you truckloads of time and resources and give you problem visibility, more data, many more filters to play with, and a crawl comparison option, all of which lead to true SEO insights and quick SEO traffic uplifts.

Read our clients’ case studies here.


Ann Yaroshenko is a Content Marketing Strategist at JetOctopus. She holds a Master's degree in publishing and editing and a Master's degree in philology, and has two years of experience in Human Resources Management. Ann has been part of the JetOctopus team since 2018.

Auto classified, 20m pages crawled
What problem was your SEO department working on when you decided to try our crawler?
We needed to detect all possible errors in no time because Google Search Console shows only 1000 results per day.

What problems did you find?
That’s quite a broad question. We managed to detect old, unsupported pages and errors related to them. We also found a large number of duplicated pages and pages with 404 response code.

How quickly did you implement the required changes after the crawl?
We are still implementing them because the website is large and there are lots of errors on it. There are currently four teams working on the website. In view of this, we have to assign each particular error to a particular team and draw up individual statements of work.

And what were the results?
It's quite difficult to measure results right now because we constantly work on the website and make changes, but a higher scan frequency by bots would mean the changes are productive. That said, around one and a half months ago we enabled indexing of all the paginated pages, and this has already affected our statistics.

Having seen the crawl report, what was the most surprising thing you found? (Were there errors you’d never thought you’d find?)
I was surprised to find so many old, unsupported pages outside the website's structure. There were also a large number of 404 pages. We are really glad we've managed to get a breakdown of the website's subdirectories, which helped us decide which team to start working with.

You have worked with different crawlers. Can you compare JetOctopus with the others and assess it?
Every crawler looks for errors and finds them. The main point is the balance between the scanned pages and the price. JetOctopus is one of the most affordable crawlers.

Would you recommend JetOctopus to your friends?
We’re going to use it within our company from time to time. I would recommend the crawler to my friends if they were SEO optimizers.

Your suggestions for JetOctopus.
To refine the web version ASAP. There are a few things we were missing badly:
Thank you very much for such a detailed analysis. Currently, we are thinking over a redirects problem.