Merging data from different tables is much more convenient if you use the “Join Dataset” tool. You can use this option in all data tables: logs, Google Search Console and crawl. In this way, you can merge the data from the crawl or logs with the most effective pages in the GSC. You will find many other datasets that may be useful to join in each data table.
JetOctopus stores data from logs, crawls and Google Search Console in separate data tables. Therefore, you can use all the tools independently: you do not need to wait until the crawl is finished to start working with the logs or GSC, and so on. However, there are many situations when joining datasets is required. For example, you may want to check whether all URLs found during a crawl have been visited at least once by search bots. With JetOctopus you can join datasets quickly and easily.
1. Select the data table on which you want to join the dataset. It can be any data table (logs, GSC, or crawl).
Please note that if you are working with logs or GSC and want to join a dataset with crawl results, the results of the crawl you currently have open will be used.
To join a dataset with another crawl, select the desired one from the list of crawls.
2. Select the desired dataset and set the conditions. You can also import your own dataset, see information below.
Each data table has its own datasets. Imported datasets will be displayed in all data tables. For any dataset, you can choose whether you want to analyze data that is present in both datasets, or only those that are not present in the dataset you selected.
Note that you can combine as many datasets as you want at once.
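The "present" and "not present" conditions described above behave like set membership tests applied to the URLs of the current data table. The following is a minimal sketch of that idea only, not JetOctopus code: the function name `join_filter` and the sample URLs are hypothetical and chosen for illustration.

```python
def join_filter(base_urls, conditions):
    """Keep URLs from base_urls that satisfy every condition.

    Each condition is a (dataset, must_be_present) pair, where dataset
    is a set of URLs. Hypothetical helper, for illustration only.
    """
    result = []
    for url in base_urls:
        if all((url in dataset) == must_be_present
               for dataset, must_be_present in conditions):
            result.append(url)
    return result


# Made-up sample data: URLs found by a crawl and two datasets to join.
crawl = [
    "https://example.com/",
    "https://example.com/a",
    "https://example.com/b",
    "https://example.com/c",
]
bot_logs = {"https://example.com/", "https://example.com/a"}
gsc = {"https://example.com/", "https://example.com/c"}

# Several datasets combined at once: present in bot logs AND in GSC.
in_logs_and_gsc = join_filter(crawl, [(bot_logs, True), (gsc, True)])

# A single "not present" condition: crawled but absent from bot logs.
not_in_logs = join_filter(crawl, [(bot_logs, False)])

print(in_logs_and_gsc)
print(not_in_logs)
```

Each extra dataset simply adds one more condition to the list, which mirrors combining several datasets in one filter.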
Dataset Bot Logs
This is one of the most interesting datasets to join with crawl results. It is a good way to determine which pages are not known to search engines.
If you select “URL is not present in Bot Logs”, then in the results you will see all the URLs that our crawler found, but which were not visited by search engines. And vice versa: you can filter URLs that are crawled by JetOctopus and visited by search bots. These are the most promising pages of your website.
You can also choose the period, type of search engine and domain.
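To make the "URL is not present in Bot Logs" condition concrete, here is a small sketch that first narrows log records by period, search engine and domain, and then lists crawled URLs with no matching visit. The `LogHit` record, the `visited_urls` function and all sample values are assumptions made for this example; they do not describe how JetOctopus stores or queries logs.

```python
from datetime import date
from typing import NamedTuple
from urllib.parse import urlparse


class LogHit(NamedTuple):
    """One bot visit from a log file (hypothetical structure)."""
    url: str
    bot: str
    day: date


def visited_urls(hits, bot, start, end, domain):
    """Return the set of URLs visited by `bot` on `domain` within [start, end]."""
    return {
        h.url
        for h in hits
        if h.bot == bot
        and start <= h.day <= end
        and urlparse(h.url).netloc == domain
    }


# Made-up sample data.
hits = [
    LogHit("https://example.com/a", "googlebot", date(2024, 1, 5)),
    LogHit("https://example.com/b", "bingbot", date(2024, 1, 6)),
    LogHit("https://example.com/c", "googlebot", date(2023, 12, 1)),
]
crawled = [
    "https://example.com/a",
    "https://example.com/b",
    "https://example.com/c",
]

seen = visited_urls(
    hits, "googlebot", date(2024, 1, 1), date(2024, 1, 31), "example.com"
)

# Crawled URLs with no matching bot visit in the chosen period.
not_visited = [url for url in crawled if url not in seen]
print(not_visited)
```

Changing the period or the bot changes which URLs count as visited, so the same crawl can produce different results for different settings.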
Dataset Queries in GSC
Select the desired period, condition, device and country. This is an extremely useful dataset if you are analyzing the performance of your website by country and by device.
The Pages Problems dataset contains pages which have been added to the list of problem pages in the “Ideas” – “SEO problem” section.
Integrate Ahrefs and Google Analytics to combine datasets with data from Ahrefs or Google Analytics. This is an interesting way to audit the relationship between on-page optimization, bot activity, external linking and traffic.
You can find a list of all possible datasets to join in each menu under filters.
You can also import your own datasets. Go to the menu “Tools” – “My data” – “Import file”.
Next, upload a list of URLs or your own dataset with additional columns in CSV or TXT format. After the data is preloaded, you will be taken to the settings page.
1. In the “Title” field, enter a meaningful name for the dataset, for example, “Archived pages”.
2. If the “First Row Is Header” checkbox is checked, the first row will be used as the title of the column. Deactivate this checkbox if you have uploaded a list of URLs without a column title.
3. If you have a specific CSV file, adjust the delimiter, enclosure, etc. in “Advanced settings”.
4. Next, configure the columns and select the data type if there are several columns. If there is only one column with URLs, select the “URL key”.
5. Then click “Load data” and wait for the import to complete.
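The import settings above (header row, delimiter, enclosure, URL key column) correspond to common CSV parsing options. The sketch below shows how such options affect parsing, using Python's standard `csv` module. The `load_dataset` function, its parameter names and the sample file are hypothetical; this is not the JetOctopus importer.

```python
import csv
import io


def load_dataset(text, first_row_is_header=True, delimiter=",",
                 enclosure='"', url_column=0):
    """Parse CSV text into (header, rows keyed by the URL column).

    Hypothetical helper that mimics the import settings described above.
    """
    reader = csv.reader(io.StringIO(text), delimiter=delimiter,
                        quotechar=enclosure)
    rows = [row for row in reader if row]  # skip empty lines
    if not rows:
        return [], {}
    if first_row_is_header:
        header, rows = rows[0], rows[1:]
    else:
        # No header row: generate placeholder column names.
        header = [f"column_{i + 1}" for i in range(len(rows[0]))]
    return header, {row[url_column]: row for row in rows}


# Made-up file that uses ";" as the delimiter and quotes one URL
# because it contains the delimiter character.
sample = (
    "url;status\n"
    '"https://example.com/old;page";archived\n'
    "https://example.com/legacy;archived\n"
)

header, data = load_dataset(sample, delimiter=";")
print(header)
print(sorted(data))
```

If the delimiter were left at the default comma here, each line would be read as a single column, which is the kind of problem the "Advanced settings" step is meant to solve.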
After a successful import, you can merge your own dataset with all others.
To edit or delete your own dataset, go to “My Data” – “Overview”. Then find the dataset you need and select the action you want to perform: edit, delete, or save as a segment.