If you need to crawl all the images on your website, you can use JetOctopus. By default, we do not crawl images, so you need to configure the crawl specifically for further image analysis.
Go to the desired project and click the “New crawl” button.
In the basic settings, enter the domain or list of URLs where you want to find images.
In “Advanced settings”, select the “Process images data” checkbox. This means that while crawling, JetOctopus will analyze image data.
If you need to analyze only certain image formats, do not use “Include/Exclude URLs” for this. That setting limits which page URLs are crawled, and therefore which pages’ code is searched for image links. In other words, you would be telling the crawler to look for images only on pages whose own URL contains .jpg or .png. That’s not quite what’s needed, is it?
Use “Custom Extraction” to count the number of images on your page. For example, to count the number of jpg images, select “Extract Type” – “Regex” and use “\.jpg”. Next, select “Count Items” and test the rule.
You can do this for any image format.
Read more: Custom Extraction Guide with multiple use-cases.
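To see what such a rule counts, here is a minimal Python sketch that applies the same regex to a page’s source. It only illustrates the idea and is not how JetOctopus implements Custom Extraction; the function name count_matches and the sample HTML are made up for this example.

```python
import re


def count_matches(page_html: str, pattern: str = r"\.jpg") -> int:
    """Count non-overlapping regex matches in the page source."""
    return len(re.findall(pattern, page_html))


# Hypothetical page fragment: two jpg images, one png image, one jpg link.
sample = (
    '<img src="/img/shoe.jpg" alt="Shoe">'
    '<img src="/img/bag.jpg" alt="Bag">'
    '<img src="/img/logo.png" alt="Logo">'
    '<a href="/downloads/catalog.jpg">Catalog</a>'
)

print(count_matches(sample))             # 3: the plain rule also counts the link
print(count_matches(sample, r"\.png"))   # 1

# A stricter (assumed) pattern that only counts .jpg inside an <img> src.
img_jpg = r'<img[^>]+src="[^"]+\.jpg'
print(count_matches(sample, img_jpg))    # 2
```

Note that a plain “\.jpg” rule counts every occurrence in the code, including links that are not inside an <img> element. If that matters for your report, test a narrower pattern before starting the crawl.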
If you only need to scan images on certain types of pages, then use “Include/Exclude URLs”. For example, you can crawl only product pages and get a list of those that have no images or only an empty <img> element. Users usually pay attention to the image when making a purchase, so it is important that products have images.
In addition to the settings for crawling the images, pay attention to all the other settings: How to configure a crawl of your website.
In particular, check whether the images are present in the regular HTML code or are added on the client side. If the images are rendered on the client side, you will need to run JavaScript crawling.
Read more: How to configure crawl for JavaScript websites.
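One rough way to decide between the two modes is to look at the raw HTML that the server returns, before any JavaScript runs, and see whether it already contains <img> elements with a src. The Python sketch below does this with the standard library only. The class and function names are hypothetical, and the check is a heuristic: images loaded through CSS backgrounds or data-src attributes would not be found.

```python
from html.parser import HTMLParser


class ImgSrcCollector(HTMLParser):
    """Collects the src value of every <img> found in raw HTML."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:  # skip <img> elements with a missing or empty src
                self.sources.append(src)


def static_image_sources(raw_html: str) -> list:
    """Return image URLs that are visible without executing JavaScript."""
    collector = ImgSrcCollector()
    collector.feed(raw_html)
    collector.close()
    return collector.sources


# Hypothetical responses: one server-rendered page, one client-rendered shell.
server_rendered = '<div><img src="/img/shoe.jpg" alt="Shoe"></div>'
client_rendered = '<div id="app"></div><script src="/bundle.js"></script>'

print(static_image_sources(server_rendered))  # ['/img/shoe.jpg']
print(static_image_sources(client_rendered))  # []
```

If the list is empty for pages that clearly show images in the browser, the images are most likely inserted on the client side, and a JavaScript crawl is the safer choice.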
By using image crawl, you can discover growth points for your website.
1. Using Custom Extraction, check which pages do not contain images. You can also use the following option: How to check image count on the page. However, this way you won’t be able to detect empty <img> elements.
2. Find out which image format is most common on your website. Next, using Google Analytics, analyze the browsers used by your customers. Unfortunately, not all browser versions support newer image formats such as webp, so provide a fallback image in a widely supported format (for example, jpg or png) if some of your visitors use browsers that do not support webp.
3. Check for images with a missing alt attribute: How to find and export all images without alt attributes.
4. Analyze images as links: How to do a deep audit of all internal and external image links.
5. Identify images that do not have lazy loading: How to find lazy load images and how to analyze it.
6. Analyze images with non-standard height and width, and other image issues. You can find more information here: How to check all images on your website.
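Several of the checks in this list (empty <img> elements, missing alt attributes, missing lazy loading, most common format) can be illustrated on a single page with a short Python sketch. It is only an approximation of what a crawler reports, not JetOctopus code. The names ImageAudit and audit_images are invented for this example, and “lazy loading” here means only the native loading="lazy" attribute, not JavaScript-based lazy loading.

```python
from collections import Counter
from html.parser import HTMLParser
from os.path import splitext
from urllib.parse import urlparse


class ImageAudit(HTMLParser):
    """Collects simple per-page image statistics from raw HTML."""

    def __init__(self):
        super().__init__()
        self.total = 0
        self.empty_src = 0    # <img> with no src or a blank src
        self.missing_alt = 0  # <img> with no alt attribute at all
        self.not_lazy = 0     # <img> without loading="lazy"
        self.formats = Counter()

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        self.total += 1
        src = (attributes.get("src") or "").strip()
        if not src:
            self.empty_src += 1
        else:
            # Take the extension from the URL path, ignoring query strings.
            ext = splitext(urlparse(src).path)[1].lower()
            self.formats[ext or "(none)"] += 1
        if "alt" not in attributes:
            self.missing_alt += 1
        if (attributes.get("loading") or "").lower() != "lazy":
            self.not_lazy += 1


def audit_images(page_html: str) -> ImageAudit:
    audit = ImageAudit()
    audit.feed(page_html)
    audit.close()
    return audit


# Hypothetical product page fragment.
page = (
    '<img src="/img/shoe.webp" alt="Shoe" loading="lazy">'
    '<img>'
    '<img src="/img/bag.jpg?v=2">'
    '<img src="" alt="">'
)
result = audit_images(page)
print(result.total, result.empty_src, result.missing_alt, result.not_lazy)
# 4 2 2 3
print(result.formats.most_common())
# [('.webp', 1), ('.jpg', 1)]
```

An empty alt="" is counted as present here, because it is a valid way to mark decorative images. Only a completely missing alt attribute is flagged.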