Beginner’s Guide to Log File Analysis (2022)

Nov 16th, 2021

Log file analysis was one of the primary SEO techniques in the early years of search and digital marketing, but the rise of various SaaS products has turned it into something of a dying art. There are still insights to be gained from analysing your raw access logs, though, and this guide aims to give you the information you need to decide whether it’s a worthwhile investment of your time and, if so, how to go about it.

Log file analysis is the process of reviewing, either manually or with a tool or platform, the data your site’s servers store whenever a request for a resource (web page, CSS/JS file, image etc.) is registered. In doing so, the analyser can reveal issues with various parts of the site, possible SEO opportunities and the general behaviour of the search engine crawlers that roam the web.
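To make that a little more concrete, here is a minimal sketch of the kind of question log analysis can answer: which status codes is a given crawler receiving? It assumes each hit has already been turned into a simple record; the `hits` list, its keys and the function name are made up for illustration and are not taken from any particular tool.

```python
from collections import Counter

def status_counts_for_crawler(hits, crawler="Googlebot"):
    """Count HTTP status codes for hits whose user agent mentions the crawler."""
    counts = Counter()
    for hit in hits:
        # Case-insensitive substring match on the user agent string.
        if crawler.lower() in hit["user_agent"].lower():
            counts[hit["status"]] += 1
    return counts

# Made-up records, for illustration only.
hits = [
    {"path": "/", "status": "200",
     "user_agent": "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"},
    {"path": "/old-page/", "status": "404",
     "user_agent": "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"},
    {"path": "/", "status": "200",
     "user_agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"},
]

print(status_counts_for_crawler(hits))
```

A cluster of 404s or redirects for a crawler is the sort of issue this kind of tally would surface.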

While log files can be extremely useful for SEO, the information stored is pretty basic – this includes:

  • The HTTP status code of your server’s response (2XX, 3XX, 4XX etc.)
  • The IP address of the client making the request
  • The type of request – most commonly GET or POST, depending on whether it’s a request to receive or provide data
  • A time stamp which states the date and time the request was received by the server
  • The URL path of the resource requested (the image, web page or file URL)
  • The user agent requesting the resource (the software that retrieves, renders and allows for easy use of the web) – generally a web browser such as Chrome or Firefox, or a crawler such as Googlebot

These files will vary in size depending on how large the site is, how much traffic the site gets and how regularly the logs are archived.
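As a rough illustration of how the fields listed above sit within a single hit, the sketch below pulls them out of one line with a regular expression. It assumes the widely used "combined" log layout; your server may log different fields or a different order, and the sample line, `LOG_PATTERN` and `parse_line` are invented for this example.

```python
import re

# Pattern for the common "combined" access log layout (an assumption;
# check your own server's log format before relying on it).
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ '
    r'\[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<user_agent>[^"]*)"'
)

def parse_line(line):
    """Return a dict of fields for one hit, or None if the line doesn't match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None

# Made-up sample hit, for illustration only.
sample = (
    '203.0.113.5 - - [16/Nov/2021:10:15:32 +0000] '
    '"GET /blog/log-file-analysis/ HTTP/1.1" 200 5123 '
    '"-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"'
)

hit = parse_line(sample)
print(hit["status"], hit["method"], hit["path"])
```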

The only servers I have access to at the time of writing use the cPanel GUI, but most platforms are fairly similar, so while the options may not be in the same place, they’ll generally have the same descriptions. To find your log files, you’ll need to access your server management platform. If you use a CDN, you’ll find your logs there instead, as the CDN handles most requests before they reach your server, so the server won’t receive most of the data.

Once you’re in, you’ll have two potential options: you can select ‘File Manager’ and scroll down to the ‘Logs’ folder, or you can select the ‘Raw Access’ option, which will allow you to download the most recent files.

You’ll receive a compressed file (.gz or similar), which you can extract with WinRAR and then open in your preferred spreadsheet program or SaaS log file analyser. If you’re opening it in a spreadsheet, however, you’ll usually need to separate the text into columns, as each hit will generally paste into a single cell.
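If you’d rather not extract the archive by hand, a .gz file can also be read directly with a few lines of code. This is only a sketch using Python’s standard gzip module; the function name and the filename in the usage comment are placeholders rather than anything cPanel provides.

```python
import gzip

def read_log_lines(path):
    """Yield each hit (one line of text) from a gzip-compressed access log."""
    # "rt" opens the archive in text mode; errors="replace" stops an odd
    # byte in a request from halting the read.
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            yield line.rstrip("\n")

# Hypothetical usage with a placeholder filename:
# for line in read_log_lines("example.com-Nov-2021.gz"):
#     print(line)
```

Reading line by line like this avoids loading the whole file into memory, which helps with the larger logs busy sites produce.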

If you’re in Excel, you can do this with the ‘Text to Columns’ option on the ‘Data’ tab.

The result should be columns that fit the following structure (sometimes there are a couple more, sometimes a couple fewer):

  • IP Address
  • D/M/Y/HH:MM:SS
  • Method/Query
  • HTTP Status Code
  • File Size/Bytes Downloaded
  • User Agent
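The same split into columns can be sketched in code for anyone who prefers to prepare a CSV before opening it in a spreadsheet. The example below assumes the common "combined" log layout, splits each hit on spaces while treating double-quoted text as a single column, and keeps the six columns listed above. The sample line and function name are made up for illustration, and lines containing escaped quotes may need extra handling.

```python
import csv
import io

def split_into_columns(lines):
    """Split raw hits on spaces, treating double-quoted text as one column."""
    rows = []
    for parts in csv.reader(lines, delimiter=" ", quotechar='"'):
        if len(parts) < 10:
            continue  # skip blank or unexpected lines
        ip, _, _, date, zone, request, status, size, _referrer, agent = parts[:10]
        # The bracketed time stamp is split in two by its space; rejoin it.
        timestamp = f"{date} {zone}".strip("[]")
        rows.append([ip, timestamp, request, status, size, agent])
    return rows

# Made-up sample hit, for illustration only.
sample_lines = [
    '203.0.113.5 - - [16/Nov/2021:10:15:32 +0000] '
    '"GET /blog/log-file-analysis/ HTTP/1.1" 200 5123 '
    '"-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"'
]

rows = split_into_columns(sample_lines)

# Write the columns out as CSV text (to memory here; a file works the same way).
output = io.StringIO()
writer = csv.writer(output)
writer.writerow(["IP Address", "Date/Time", "Method/Query",
                 "HTTP Status Code", "Bytes Downloaded", "User Agent"])
writer.writerows(rows)
print(output.getvalue())
```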
