July 11, 2022
by Sofia

How to join datasets

Merging data from different tables is much more convenient with the “Join Dataset” tool. The option is available in all data tables: logs, Google Search Console (GSC) and crawl. This way, you can merge crawl or log data with the most effective pages in GSC. Each data table also offers many other datasets that may be useful to join.

How “Join dataset” works

JetOctopus stores data from logs, crawls and Google Search Console in separate data tables, so you can use all the tools independently. For instance, you do not need to wait until the crawl is finished to start working with the logs or GSC. However, there are many situations in which joining datasets is required, for example, when you want to check whether all URLs found during a crawl have been visited at least once by search bots. With JetOctopus you can join datasets quickly and easily.

1. Select the data table on which you want to join the dataset. It can be any data table (logs, GSC, or crawl).


Please note that if you work with logs or GSC and want to join a dataset with crawl results, the results of the crawl you currently have open will be used.


To join a dataset with another crawl, select the desired one from the list of crawls.

2. Select the desired dataset and set the conditions. You can also import your own dataset; see “How to import your own datasets” below.


Each data table has its own datasets. Imported datasets are displayed in all data tables. For any dataset, you can choose whether you want to analyze data that is present in both datasets, or only data that is not present in the dataset you selected.

Note that you can combine as many datasets as you want at once.
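Conceptually, the two conditions behave like a semi-join ("present") and an anti-join ("not present") keyed on URL. The Python sketch below only illustrates that idea outside of JetOctopus. The `join_dataset` function, the sample rows and the dataset names are hypothetical and are not part of any JetOctopus API.

```python
from typing import Iterable


def join_dataset(rows: list[dict], dataset_urls: Iterable[str], present: bool = True) -> list[dict]:
    """Keep rows whose URL is (present=True) or is not (present=False) in the dataset."""
    urls = set(dataset_urls)
    return [row for row in rows if (row["url"] in urls) == present]


# Hypothetical crawl rows and two hypothetical datasets keyed by URL.
crawl_rows = [
    {"url": "https://example.com/", "status": 200},
    {"url": "https://example.com/a", "status": 200},
    {"url": "https://example.com/b", "status": 404},
]
bot_log_urls = ["https://example.com/", "https://example.com/a"]
archived_urls = ["https://example.com/a"]

# Several conditions can be chained: visited by bots AND not in "Archived pages".
visited = join_dataset(crawl_rows, bot_log_urls, present=True)
visited_not_archived = join_dataset(visited, archived_urls, present=False)
print([row["url"] for row in visited_not_archived])
```

Each call narrows the previous result, which is roughly what happens when you stack several joined datasets on one data table.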

Dataset Bot Logs

This is one of the most interesting datasets to join with crawl results. It is an excellent way to determine which pages are not known to search engines.


If you select “URL is not present in Bot Logs”, the results will show all the URLs that our crawler found but that search engines did not visit. And vice versa: you can filter for URLs that were crawled by JetOctopus and visited by search bots. These are the most promising pages of your website.

You can also choose the period, type of search engine and domain.
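To make the effect of the period and search engine settings concrete, here is a minimal Python sketch of the same logic applied to an exported list of bot hits. The tuple layout, the bot names and the `visited_urls` helper are assumptions made for illustration; they do not reflect how JetOctopus stores or queries logs.

```python
from datetime import date

# Hypothetical, simplified bot log hits: (URL, bot name, date of visit).
bot_hits = [
    ("https://example.com/", "googlebot", date(2022, 7, 1)),
    ("https://example.com/a", "bingbot", date(2022, 7, 3)),
    ("https://example.com/b", "googlebot", date(2022, 5, 20)),
]
crawled_urls = {"https://example.com/", "https://example.com/a", "https://example.com/b"}


def visited_urls(hits, bot, start, end):
    """Return URLs hit by the given bot within the inclusive date range."""
    return {url for url, name, day in hits if name == bot and start <= day <= end}


seen = visited_urls(bot_hits, "googlebot", date(2022, 7, 1), date(2022, 7, 31))
not_visited = sorted(crawled_urls - seen)  # "URL is not present in Bot Logs"
visited = sorted(crawled_urls & seen)      # crawled and visited by the bot
print(not_visited)
print(visited)
```

Note how the result depends on the filters: `/b` was visited by the bot, but outside the chosen period, so it still lands in the "not present" list.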

Dataset Queries in GSC

Select the desired period, condition, device and country. This is an extremely useful dataset if you are analyzing the performance of your website by country and by device.


The Pages Problems dataset contains pages which have been added to the list of problem pages in the “Ideas” – “SEO problem” section.


Integrate Ahrefs and Google Analytics to combine datasets with data from those services. This is an interesting way to audit the relationship between on-page optimization, bot activity, external linking and traffic.


You can find a list of all possible datasets to join in each menu under filters.

How to import your own datasets

You can also import your own datasets. Go to the menu “Tools” – “My data” – “Import file”.


Next, upload a list of URLs or your own dataset with additional columns in CSV or TXT format. After the data is preloaded, you will be taken to the settings page.

1. In the “Title” field, enter a meaningful name for the dataset, for example, “Archived pages”.

2. If the “First Row Is Header” checkbox is checked, the first row will be used as the column titles. Uncheck it if you have uploaded a list of URLs without a column title.

3. If you have a specific CSV file, adjust the delimiter, enclosure, etc. in “Advanced settings”.


4. Next, configure the columns and select the data type if there are several columns. If there is only one column with URLs, select the “URL key”.

5. Then click “Load data” and wait for the import to complete.


After a successful import, you can merge your own dataset with all others.
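If you generate the import file yourself, it helps to control the header row, delimiter and enclosure explicitly so that they match what you set during import. The sketch below uses Python's standard `csv` module; the `build_csv` helper, the `note` column and the sample URLs are hypothetical, and the comma delimiter with double-quote enclosure is only an assumed choice.

```python
import csv
import io


def build_csv(rows, header=("url", "note"), delimiter=",", quotechar='"'):
    """Serialize rows to CSV text with a header row and explicit delimiter/enclosure."""
    buffer = io.StringIO()
    writer = csv.writer(
        buffer,
        delimiter=delimiter,
        quotechar=quotechar,
        quoting=csv.QUOTE_MINIMAL,  # quote only fields that contain special characters
        lineterminator="\n",
    )
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


rows = [
    ("https://example.com/old-page", "archived, 2021"),
    ("https://example.com/promo", "archived"),
]
csv_text = build_csv(rows)
print(csv_text)

# Reading the text back with the same settings confirms the columns stay intact.
parsed = list(csv.reader(io.StringIO(csv_text), delimiter=",", quotechar='"'))
```

The field that contains a comma is wrapped in the enclosure character, which is why the delimiter and enclosure chosen during import need to match the ones used to write the file.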

To edit or delete your own dataset, go to “My Data” – “Overview”. Then find the dataset you need and select the action you want to perform: edit, delete, or save as a segment.


About Sofia
Technical SEO specialist. Sofia has almost 10 years of experience, the last 5 of them in JavaScript SEO. She is convinced that SEO is a very technical part of digital marketing, and that without logs and in-depth data analysis you can't do effective SEO.


