Analyzing search engine logs is a core task for technical SEOs. Log files contain 100% accurate information about how search engines crawl your website, which makes log analysis indispensable for technical SEO strategy.
However, analyzing raw logs with standard tools like Kibana, Datadog, or exporting them into CSV files can be frustrating and inefficient. That’s why a specialized SEO log analyzer such as JetOctopus is the smarter choice. In this article, we’ll walk through how to integrate logs into JetOctopus, what integrations are supported, common pitfalls, and how to solve them.
Tip: You can use our free 7-day trial after a demo call to test log integrations and start getting insights from search engine logs right away.
For a full checklist of what to analyze in your logs, see: The Ultimate Guide to Log Analysis – 21 Point Checklist.
If you haven’t used a log analyzer before, you may be wondering where to get the logs from.
1. If you have a large website and development team, then it is 100% that the logs are already being collected. Usually, various programs and services are used for visualization, for example, Kibana, Datadog, Greylog, Grafana, Databox, Looker, and so on. Unfortunately, logs in such services may not be stored for a long time, because their number is very large.
To get the logs, you can contact the development teams (best to contact DevOps) and get access to one of the services that collect the logs. Next, you can select the desired period and export the data to a text file.
This is what a record of a Googlebot visit looks like in one of the services.
2. Another way is to get Access Log records from hosting. Access Log contains raw logs of all users of your website. Mostly, the data is stored for the last day. Then the logs are archived. Logs are deleted at the end of each month.
What do raw logs mean? This means that you will receive a large file with dozens and hundreds of lines without separated sections with user agents, messages, IPs, etc. Handling such files manually is very inconvenient.
To access the raw logs, you need to connect to your account with FTP or SSH.
3. You can also contact DevOps to integrate logs directly from your web server.
Logs contain standard data:
This data is usually transmitted by the user agent in the request header to your web server, which is why the logs are also called “server logs”. And they should not be confused with logging changes on your website. For example, logging can contain information about when and which users edited the content of the page.
JetOctopus offers several convenient options for log integration. Choose the one that fits your infrastructure and needs.
For clients with Cloudflare Enterprise accounts, JetOctopus offers a native integration that taps into Cloudflare’s Logpush service. This provides complete, structured logs with all the essential fields needed for SEO analysis. Enterprise-level security, scalability, and support make it a reliable option for big brands running on Cloudflare.
Time lag: ~1 hour.
For websites hosted on Amazon Web Services, JetOctopus integrates with CloudFront log streaming. Logs are delivered to an S3 bucket, which JetOctopus fetches and processes automatically. This integration works particularly well for global websites that rely on AWS infrastructure, as it scales effortlessly and supports high-traffic environments.
Time lag: 1–24 hours.
Fastly CDN is widely used by enterprises and e-commerce platforms for its speed and advanced caching. JetOctopus integrates with Fastly’s log streaming to capture detailed bot activity.
Time lag: ~1 hour.
CloudFlare is a useful technology to gather website logs when other methods are not appropriate. This integration is based on Workers technology, additional fees on CloudFlare may apply.
Using this method, you will receive data in real-time. This is a very cool feature that allows you to monitor the behaviour of bots immediately during site migrations or updates.
How it works: theory
Each time bot makes a request, your server sends a small package of data (on average, 200-300 bytes) on our servers in real-time. We analyze this data immediately and store it in the system where data appears without delays in the Raw logs report.
How it works: practice
This technology is built on the base of the UDP protocol. The main advantage of this technology is that it doesn’t impact your server, and even if all our servers break down, it won’t impact your site’s productivity at all.
Security
The only way to intercept data via Live Stream log data transmission is to get direct access to the core routers between your server and our server. You don’t have to worry because it is almost impossible to realize.
Please remember that your server doesn’t send us any sensitive data such as passwords, users’ credentials, etc.
You need to create a location and choose a way to store the log files with our technical team (FTP, for example), and JetOctopus will fetch these files daily. This method allows you to receive data with a delay of 1 hour to 1 day. You won’t have to do anything manually.
How it works
Together, we choose the way and place of log storage and share all needed access credentials.
Then, your system administrator should set up automatic data exporting for the previous day time that is available at the beginning of the present day.
It’s only a 1-time setup from the client’s side that will enable all next log updates in the storage, no need to export files manually every day.
Every morning JetOctopus retrieves fresh log data from the mentioned storage, analyze it, and import it in our system, so the dashboards will be updated automatically.
Security
This way of logs integration is the most secure. Files transfer is conducted via the HTTPS protocol. Data is only accessible through the IP and with the password.
Processing such files may take a little longer than integrating directly from your web server. But you can independently choose the period and data for analysis according to your limits.
How to do it
Your system administrator should set up data extraction for a particular period and transfer data via FTP, S3, etc.
Security
This way of logs integration is the most secure. Files transfer is conducted via the HTTPS protocol. Data is only accessible through the IP and with the password.
Actually, JetOctopus supports all types of log files. If there is an error with the file type during the integration, our technical team will manually integrate your file within 24 hours and the logs will be displayed in JetOctopus reports.
No, you don’t need to do this. We use only search engine logs, user logs in complete security: we do not use any personal data of users. However, you can separate user logs to reduce the file size when downloading manually. This will speed up the processing of log files. But please note that if you are unsure about which logs are 100% from users and which are not, we recommend not separating the loglines. After all, JetOctopus also checks whether the Googlebot is real or it was a fake bot emulating Google. We do reverse DNS Lookup. Fake bot data is very important for analyzing your website.
More information: How to analyze what types of search robots visit your website.
1. An error occurred with the file type during manual download.
Our technical team will solve your question within 24 hours, no need to worry!
2. Logs stopped being updated in logs reports.
Usually, the reason lies in changes in your web server settings or access settings. You need to contact DevOps or the site administrator to verify the connection with JetOctopus. If you use Cloudflare or AWS, you can check the connection with JetOctopus in the interface.
If everything is fine on your side, contact our online chat: we will help you!
3. Outdated logs are displayed in the data tables.
If you downloaded the log files manually, we recommend re-uploading them. Pay attention to the type of download: Append logs data, Skip updates for uploaded days, Update log data for existing days. If you need to update data for days that already have logs uploaded, select the last type of managing logs. If you want to add all log lines from the file, select the first one. If it is necessary to add data only for dates without logs, choose “Skip updates for uploaded days”.
4. The number of logs in JetOctopus differs from the number of downloaded loglines. It is likely that there are many records of fake bot visits among the uploaded data. Check the type of user agent for which you filtered the logs in JetOctopus.
Log file is usually a text file that contains each step Googlebot makes here, it includes server IP, client IP, timestamp of the visit, URL requested, Http status code, user-agent, method (get/post), etc.
What we need:
Apart from those main parts, there are also secondary but still meaningful logs components that provide clues about bots’ behavior:
From our experience, different users have specific attitudes and fears connected with the security of the data. And we totally get it.
So before using any tool we have to highlight that JetOctopus follows GDPR of personal data and doesn’t use any sensitive data like POST requests with users’ passwords, invoices data, etc. We don’t collect, analyze, and save this data.
We do not process and we do not save users’ log records, even if you provide us with this data.
An IP address is only saved in case the User-Agent line contains the word “bot” in it.
If you feel like you need an additional legal justification behind the data transfer we sign a mutual NDA (Non-disclosure agreement). It is not an obligatory part but is done at your request and you can suggest your own NDA document if needed.
Our Privacy Policy is disclosed on the website, where you have full access. Please make sure to read it carefully. It aims to help you understand what data we collect, what we use it for, and how you can exercise your rights.
Our technical team will be happy to help you solve all log issues.