March 2, 2023
by Sofia

Analysis of the activity of marketing bots

During our comprehensive scan of the internet, we observed a remarkable trend: the number of websites that block marketing bots is increasing year by year. Details are highlighted in the article “Top Insights from JetOctopus Big Data Research”. It appears that website owners are becoming increasingly cautious about allowing various services to crawl their pages. In this article, we will look at why this trend has emerged and whether marketing bots pose a threat to your website. We will explore what marketing bots are, how to analyze their activity, and how to block them if you choose to do so.

Marketing bot: what is it?

Marketing bots are programs used for a wide range of marketing tasks. For example, they can collect external links, traffic data and page data, among other things. SEO specialists often use marketing bots to collect information about backlinks. If you have started link building, you will want to know whether the links are really placed on partner sites and whether dofollow is used. Most SEO specialists use Majestic, Ahrefs, Semrush and other tools for this. Additionally, many marketing tools are used to analyze product reviews and to monitor brand mentions.

Some bots can be used for data scraping and parsing. For example, ad aggregator sites may use a variety of programs to collect information from your site.

In addition to marketing bots analyzing your website and its content, there are also advertising bots. If you use paid targeted advertising in Google, Bing, Facebook, etc., then you need to analyze the visits of the corresponding advertising bots in the logs. This activity is expected, and it’s crucial to monitor the reverse fact: if you advertise on social networks or search engines, and advertising bots don’t visit your website, it’s worth checking if you’ve blocked them. Because if you block ad bots, they won’t be able to get information from your website and won’t show ads to potential customers.

It’s also essential to consider search engine bots. Most SEO specialists focus their work on several known search engines, depending on the region and language. However, there are more than 50 search engines. And all of these search engines can crawl your website.

As you can see, your website can be interesting not only to users and Google, but also to other search engines, various services and even data collection tools. So, understanding how marketing bots work is crucial. By analyzing their activity and impact on your website, you can make informed decisions about whether to block them or not.

How do marketing bots work?

The operation principle of marketing bots is quite simple: they visit your website and scour the HTML code to find relevant information. Marketing bots can be divided into several types.

Firstly, we can distinguish marketing bots that do not conceal information about themselves in the user agent string. Such bots send the same kind of information to your server as a regular user, and your security system (or you) can recognize which bot is visiting your website based on this information. Such “white-hat” services also often publish the IP addresses and user agents they use to analyze your website. For instance, JetOctopus always informs its clients about the IPs and user agents used to scan the website.

Such open data allows website owners and administrators to identify a second group of marketing bots: bots that emulate other user agents. Such bots may transmit incorrect or incomplete information to your web server. For example, a bot with a user agent string corresponding to Googlebot Smartphone could visit your site, but upon inspecting the IP, you may discover that it is not a real Googlebot. By the way, using JetOctopus, you can validate Googlebot by IP and check whether it is the real Google or some service trying to crawl your website under the guise of Google.

However, services that emulate the behavior of search engines or other bots should not be confused with malware. Malicious bots should be blocked. To learn how to identify malicious and fake bots, read the article “Fake bots: what it is and why you need to analyze it”.
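If you want to run a similar check by hand, a visitor claiming to be Googlebot can be tested with a reverse DNS lookup followed by a forward lookup. The sketch below is only an illustration under a few assumptions: the accepted domain suffixes are the ones Google documents for its common crawlers, the function names are made up for this example, and the forward lookup covers IPv4 only. It is not how JetOctopus performs its validation.

```python
import socket

# Reverse DNS suffixes assumed for Google's common crawlers.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def is_google_hostname(hostname):
    """Return True if the hostname sits under one of the assumed Google domains."""
    return hostname.rstrip(".").lower().endswith(GOOGLE_SUFFIXES)


def default_reverse(ip):
    # PTR lookup: IP address -> primary hostname
    return socket.gethostbyaddr(ip)[0]


def default_forward(hostname):
    # A-record lookup: hostname -> list of IPv4 addresses
    return socket.gethostbyname_ex(hostname)[2]


def verify_googlebot(ip, reverse=default_reverse, forward=default_forward):
    """Reverse-then-forward DNS check for an IP that claims to be Googlebot."""
    try:
        hostname = reverse(ip)
        if not is_google_hostname(hostname):
            return False
        # The hostname must resolve back to the same IP, otherwise the
        # PTR record could simply have been forged by the IP owner.
        return ip in forward(hostname)
    except OSError:
        # Covers failed lookups (socket.herror, socket.gaierror).
        return False
```

The lookups are passed in as parameters so the logic can be tried without network access; calling `verify_googlebot(ip)` with an address taken from your logs uses real DNS.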

Secondly, white-hat marketing bots follow the same rules as search engines. In particular, the prohibitions in the robots.txt file are observed. That is, before starting to scan your website, marketing bots analyze the robots.txt file. After analyzing the robots.txt file, data collection, parsing, scanning, or backlink searches commence.
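To see what a rule-abiding bot would conclude from your robots.txt, you can evaluate the file with Python's standard `urllib.robotparser` module. This is a minimal sketch: the robots.txt content and the bot name “ExampleMarketingBot” are made up for illustration, and real bots may interpret edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; "ExampleMarketingBot" is a made-up name.
ROBOTS_TXT = """\
User-agent: ExampleMarketingBot
Disallow: /

User-agent: *
Disallow: /admin/
"""


def build_parser(robots_txt):
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_crawl(parser, user_agent, url):
    """Return True if the rules allow this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(ROBOTS_TXT)
print(may_crawl(parser, "ExampleMarketingBot", "https://example.com/products/"))
print(may_crawl(parser, "Googlebot", "https://example.com/products/"))
```

With these rules, the made-up marketing bot is disallowed everywhere, while other bots are only kept out of `/admin/`.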

How to analyze the activity of marketing bots

Using JetOctopus, you can analyze the activity of not only Googlebot and other search engines, but also marketing bots. To do this, you need to integrate logs: this can be done in three ways (more information: Log file integration: everything you need to know).

Next, go to the “Logs” section. I recommend starting the analysis with general data in the “Overview” dashboard.

This dashboard contains a chart that displays how often your website is crawled by marketing bots compared to other types of bots.

Make sure that marketing bot activity is not higher than search engine activity. If marketing bots (highlighted in purple on the chart) crawl your website more often than search bots, I recommend checking whether search bots are blocked and analyzing the status codes. Perhaps excessive marketing bot activity is overloading your web server, so search engines receive 5xx response codes and reduce their crawl frequency as a result. Another probable reason is that search engines are blocked in the robots.txt file. A similar situation can arise if most of your pages are closed from indexing with meta robots or in the HTTP header. Search bots rarely crawl non-indexable pages, whereas marketing bots usually do not pay attention to indexing rules.
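If you also keep raw access logs, the same comparison can be approximated with a short script. This is a minimal sketch that assumes logs in the common “combined” format. The user agent substrings are illustrative rather than a complete list, and the names `LOG_LINE`, `classify` and `summarize` are made up for this example.

```python
import re
from collections import Counter

# Combined log format: ip - - [time] "request" status size "referer" "user agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) \S+ "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Illustrative substrings only; extend them to match the bots in your own logs.
MARKETING = ("ahrefsbot", "semrushbot", "mj12bot", "dotbot")
SEARCH = ("googlebot", "bingbot")


def classify(agent):
    """Assign a user agent string to a rough bot group."""
    agent = agent.lower()
    if any(token in agent for token in MARKETING):
        return "marketing"
    if any(token in agent for token in SEARCH):
        return "search"
    return "other"


def summarize(lines):
    """Count hits per group and 5xx responses served to search bots."""
    hits = Counter()
    search_5xx = 0
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that are not in the expected format
        group = classify(match["agent"])
        hits[group] += 1
        if group == "search" and match["status"].startswith("5"):
            search_5xx += 1
    return hits, search_5xx
```

If `hits["marketing"]` clearly exceeds `hits["search"]` and `search_5xx` is not zero, that matches the overload scenario described above. Note that this sketch classifies by user agent string alone and does not validate the bots by IP.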

Below, the “BOTS” chart shows the ratio of marketing bot crawls to crawls by all other bots.

By clicking on the desired segment of the diagram or on “Marketing Tools”, you can go to the data table.

In the data table, you will see a list of all marketing bot visits. Pay attention to the two columns: “Bot” and “User Agent String”.

Here you can find information about marketing bots scanning your website.

JetOctopus only considers validated user agents as marketing bots, so you will only see official services like Ahrefs, Moz, and Semrush in the list.

You can add filters to analyze specific bots. For instance, to analyze only the Moz bot, click the “Add filter” button, select “User Agent String” – “Contains”, enter the name of the user agent, and click “Apply”.

Filter the data as you need: JetOctopus can quickly display the information you need, even if you have millions of lines of logs.

You can export all data to a format convenient for you: Excel, Google Sheets, CSV or Google Data Studio (Looker). Click on the export button and select the desired format.
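If you work with exported or already parsed log records outside the tool, a rough equivalent of the “User Agent String” – “Contains” filter followed by a CSV export might look like the sketch below. The record fields (`ip`, `url`, `status`, `user_agent`) and the function names are assumptions made for this example, not the JetOctopus export schema.

```python
import csv
import io


def filter_by_agent(records, needle):
    """Keep records whose user agent contains the given text (case-insensitive)."""
    needle = needle.lower()
    return [r for r in records if needle in r["user_agent"].lower()]


def to_csv(records, fields=("ip", "url", "status", "user_agent")):
    """Serialize records to CSV text, ignoring any extra keys."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()
```

For example, `to_csv(filter_by_agent(records, "dotbot"))` returns CSV text containing only the visits whose user agent mentions “dotbot”.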

What to pay attention to when analyzing marketing bots

When analyzing the activity of marketing bots, there are several important metrics to pay attention to. 

The first one is how actively the marketing bots are crawling your website, and whether they are overloading your web server. This is crucial because if the bot activity is excessive, it can lead to slower load time and potentially prevent real users and search bots from accessing your website.

Another important metric to consider is the status codes that your web server returns to marketing bots. Some security software may automatically block marketing tools and return a 403 status code. If you want to analyze your website with one of these tools, you need to unblock its user agent. The same applies to other non-200 status codes: if they are predominant in your marketing bot logs and you need the data, investigate the reasons behind them and fix them as necessary.
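A quick way to see which status codes dominate for each bot is to group visits by status class. The sketch below assumes each visit has already been reduced to a `(bot_name, status_code)` pair; the function names are made up for this example.

```python
from collections import Counter, defaultdict


def status_breakdown(records):
    """Count status classes (2xx, 3xx, 4xx, 5xx) per bot."""
    breakdown = defaultdict(Counter)
    for bot, status in records:
        breakdown[bot][f"{status // 100}xx"] += 1
    return breakdown


def non_200_share(records, bot):
    """Return the fraction of a bot's visits that did not get a 200 response."""
    statuses = [status for name, status in records if name == bot]
    if not statuses:
        return 0.0
    return sum(1 for status in statuses if status != 200) / len(statuses)
```

A bot whose visits are mostly 403 is probably being blocked by security software, while a high share of 3xx or 404 responses points to the URLs the bot is discovering rather than to blocking.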

In summary, it’s essential to monitor the activity and impact of marketing bots on your website’s performance and ensure that they are not causing any issues that could harm your SEO or user experience.

When analyzing marketing bot activity on your website, it’s important to identify which marketing tools are crawling it. Some may be useful to you, while others may be used by your competitors and bring you no benefit. IP analysis can also help you identify invalid marketing bots, since most services publish a list of the IPs from which they scan websites.
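The IP analysis mentioned above comes down to checking whether a visitor’s IP falls inside the ranges a service publishes. Here is a minimal sketch using Python’s standard `ipaddress` module. The bot name and the ranges are placeholders taken from address blocks reserved for documentation, so replace them with the list the service actually publishes.

```python
import ipaddress

# Placeholder ranges from blocks reserved for documentation;
# "ExampleMarketingBot" is a made-up name.
PUBLISHED_RANGES = {
    "ExampleMarketingBot": ["192.0.2.0/24", "2001:db8::/32"],
}


def ip_matches_bot(ip, bot, published=PUBLISHED_RANGES):
    """Return True if the IP lies inside any range published for the bot."""
    address = ipaddress.ip_address(ip)
    for cidr in published.get(bot, []):
        network = ipaddress.ip_network(cidr)
        # Compare only addresses and networks of the same IP version.
        if address.version == network.version and address in network:
            return True
    return False
```

A visit whose user agent names a service but whose IP is outside that service’s published ranges is a candidate for the “invalid bot” group.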

Load time is a key metric to monitor. If the load time is long for marketing bots, it may be the same for search bots and users, and load time can affect both user experience and search engine optimization. Therefore, pay attention to this metric and, if necessary, analyze the same data for users and Google.

Status codes are another important factor to consider. If marketing bots crawl mostly 3xx or other non-200 pages, this could indicate problems with internal linking.

Lastly, you can analyze marketing bot visits using all JetOctopus dashboards and data tables. Simply select the desired dashboard and click on “Bots” – “Marketing Tools” at the top of the page. After that, data will be displayed only for marketing bots.

You may be interested in: The Ultimate Guide to Log Analysis – a 21 Point Checklist

Are marketing bots harmful?

Most marketing bots and services are not dangerous to your website. If a tool publishes its user agents and IP addresses, this is a good sign that the tool is trustworthy. With this data, you can block the bot at any time.

However, there are cases when blocking marketing bots is necessary, such as when they overload your web server. Additionally, you may have a legal reason to block them, or you may not want your website’s information to be analyzed by marketing tools. But keep in mind that blocking bots means you will not receive information about your website from them as well.

How to block marketing bots

Marketing bots can be both useful and harmful to your website, and there may be situations where you need to block them.

Most marketing bots have a unique user agent and follow the rules in the robots.txt file. Thus, you can easily block them by adding a rule for the relevant bot to the robots.txt file. Here’s an example:

User-agent: Marketing Bot
Disallow: /

Another option is to block the bot with your website’s security system. If you’re using Cloudflare or another security system, you can configure it to block a specific marketing bot’s user agent or IP address.

You can also block a user agent or IP address at the web server level.
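In practice, this kind of block is usually written in the web server’s or firewall’s own configuration. To illustrate the logic only, here is the same idea expressed one layer up, as middleware for a Python WSGI application. This is a sketch under that assumption: `block_user_agents`, `demo_app` and the bot name are made up for this example.

```python
def block_user_agents(app, blocked_tokens):
    """Wrap a WSGI app and answer 403 to user agents containing a blocked token."""
    tokens = tuple(token.lower() for token in blocked_tokens)

    def middleware(environ, start_response):
        agent = environ.get("HTTP_USER_AGENT", "").lower()
        if any(token in agent for token in tokens):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden"]
        # Everyone else is passed through to the wrapped application.
        return app(environ, start_response)

    return middleware


def demo_app(environ, start_response):
    """Tiny stand-in application used only to demonstrate the wrapper."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello"]


# "ExampleMarketingBot" is a made-up user agent token.
application = block_user_agents(demo_app, ["ExampleMarketingBot"])
```

Keep in mind that a user agent block only stops bots that identify themselves honestly. Bots that emulate other user agents have to be blocked by IP address.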

About Sofia
Technical SEO specialist. Sofia has almost 10 years of experience, of which the last 5 years in JavaScript SEO. She is convinced that SEO is a very technical part of digital marketing. And without logs and in-depth data analysis, you can't do effective SEO.
