Our Logz.io integration allows us to automatically get log file data every time we crawl a website, meaning that log file data is kept up to date, and there’s no need to manually upload data.
Once set up, Lumar will query your Logz.io account and retrieve all the URLs that received traffic from search engine and AI bot crawlers within the specified date range, along with the counts of hits from Google’s desktop and mobile crawlers. This data is used to populate reports in the Discoverability > Crawl Budget section of SEO projects, as well as reports in our dedicated GEO analysis.
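To give a sense of the kind of query this involves, here is a minimal sketch using the Logz.io Search API. This is not Lumar’s actual implementation: the field names (user_agent, uri) and the date range are assumptions that depend on how your logs are parsed, and the API endpoint may vary by region.

```python
# A minimal sketch (not Lumar's implementation) of querying the Logz.io Search API
# for Googlebot hits per URL over the last 7 days. Field names are assumptions --
# they depend on how your logs are parsed in Logz.io.
import requests

LOGZIO_API = "https://api.logz.io/v1/search"  # regional endpoints (e.g. api-eu.logz.io) may differ
API_TOKEN = "YOUR-LOGZIO-API-TOKEN"           # placeholder

query = {
    "size": 0,
    "query": {
        "bool": {
            "must": [
                {"match_phrase": {"user_agent": "Googlebot"}},
                {"range": {"@timestamp": {"gte": "now-7d/d", "lte": "now/d"}}},
            ]
        }
    },
    # Count hits per requested URI (use the keyword sub-field if your mapping has one)
    "aggs": {"hits_per_url": {"terms": {"field": "uri.keyword", "size": 500}}},
}

response = requests.post(
    LOGZIO_API,
    json=query,
    headers={"X-API-TOKEN": API_TOKEN, "Content-Type": "application/json"},
    timeout=30,
)
response.raise_for_status()
for bucket in response.json()["aggregations"]["hits_per_url"]["buckets"]:
    print(bucket["key"], bucket["doc_count"])
```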
How to Connect Logz.io to Your Lumar Account
First, you’ll need to add Logz.io to your Lumar subscription. To discuss your options, please speak to your Customer Success Manager or contact our support team.
Once you’ve added Logz.io to your subscription, we’ll take on most of the work to get the integration up and running, but there are a couple of steps we’ll need you to do:
1. Let us know who needs access
Let us know who the users will be and what permissions they should have (Read Only, User, or Admin) so we can get everyone set up for you with the right access.
2. Ship your logs to Logz.io
You’ll need to work with your internal resources to confirm which system you’ll use for log file management. Most of our customers use the S3 Bucket method, for which you can find instructions on the Logz.io website here. Essentially, the way this works is as follows:
- Bot data is stored on your servers and made available via an external storage method.
- Data is stored in an S3 bucket and is periodically fetched by Logz.io automatically (see the sketch after this list).
- Logz.io parses the data, which is then available to Lumar via the API integration.
- When a crawl runs with Logz.io integrated, bot data is pulled into Lumar via the API, overlaid with crawl data, and displayed to show bot activity and potential crawl budget inefficiencies.
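As an illustration of the S3 route, here is a minimal sketch of uploading a rotated access log to a bucket that Logz.io is configured to fetch from. The bucket name, prefix, and log path are placeholders, and this is not an official shipping script; the fetcher itself is configured in your Logz.io account.

```python
# A minimal sketch of the S3 route described above: upload a rotated access log
# to a bucket that Logz.io periodically fetches from. Bucket name, prefix, and
# log path are placeholders.
from datetime import datetime, timezone

import boto3

BUCKET = "example-access-logs"            # placeholder bucket name
PREFIX = "weblogs/"                       # prefix the Logz.io S3 fetcher watches
LOG_PATH = "/var/log/nginx/access.log.1"  # the most recently rotated log file

s3 = boto3.client("s3")
key = f"{PREFIX}access-{datetime.now(timezone.utc):%Y-%m-%d-%H%M}.log"
s3.upload_file(LOG_PATH, BUCKET, key)
print(f"Uploaded {LOG_PATH} to s3://{BUCKET}/{key}")
```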
Logz.io also provides a number of other methods for shipping your data. You can explore those options here.
Once you’ve confirmed how your log files are stored, let us know so we can ensure we can parse the logs correctly.
Next, we will export the log file summary from Logz.io and add it to a test crawl in Lumar to confirm that everything is parsed correctly, troubleshooting any issues with the Logz.io team. Once that is done, we’ll create the API token and run a test crawl to verify the integration is working. Then you’re all set!
What Information Do I Need to Ship?
At a minimum, the following information needs to be included in the log file data. This ensures we can successfully set up and validate the integration, and provide useful analysis in the Lumar platform (a sample parsing sketch follows the list):
- IP Address: The IP addresses of visitors and bots, to provide insight into website traffic and search engine crawling behavior. Search engines publish the IP ranges their crawlers use, so this information can be used to verify the bots that are hitting your site.
- Uniform Resource Identifier (URI): The specific URL or resource requested by a web crawler or browser, as recorded in the server’s log files (the request path and query string). Depending on the system this could be, for example, /dresses/ or /product/, etc.
- Host: This is the host domain (e.g. https://www.lumar.io/), which is especially useful if you have multiple domains.
- UA String: This is the long version of the User Agent string to identify exactly what is requesting the information. Note that each system handles the spaces in strings differently (e.g. some use %20 and others will use +).
- HTTP Response: The response code of the page (e.g. 200, 404, etc.) to see if a specific HTTP request has been successfully completed.
- Timestamp: The time that the request was made, to establish the precise moment events occur.
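To make these fields concrete, here is a minimal sketch that extracts them from one line of a typical combined-format access log. The sample line and regular expression are illustrative only; your server’s log format, field order, and encoding may differ.

```python
# A minimal sketch: extract the required fields from a typical combined-format
# access-log line. The sample line and regex are illustrative only.
# Note: the combined format does not include the host; many setups add it as an
# extra field or keep separate logs per host.
import re
from urllib.parse import unquote_plus

SAMPLE = (
    '66.249.66.1 - - [12/May/2025:09:30:01 +0000] "GET /dresses/?page=2 HTTP/1.1" '
    '200 5120 "https://www.example.com/" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"'
)

LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<uri>\S+) [^"]+" '
    r'(?P<status>\d{3}) (?P<size>\S+) "(?P<referer>[^"]*)" "(?P<user_agent>[^"]*)"'
)

match = LOG_PATTERN.match(SAMPLE)
if match:
    entry = match.groupdict()
    # Some systems encode spaces in the UA string as %20 or +; normalise them here.
    entry["user_agent"] = unquote_plus(entry["user_agent"])
    print(entry["ip"], entry["timestamp"], entry["uri"], entry["status"], entry["user_agent"])
```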
There is also additional data that can be included in the log file analysis. While this is not required, it can greatly enhance the insights you get from Lumar with log file data:
- Accurate UA Name: This is the nickname of the user agent that is accessing the site (e.g. Googlebot). Some systems may group multiple UA Strings (see above) under the same name, so this can help group requests together for deeper insights (see the sketch after this list).
- Request Size: The size of the page or resource that is returned following the request. This can be helpful to identify, for example, empty pages or significantly large pages that can be reduced.
- Time to First Byte (TTFB): The response time from request to delivery, which gives insight into page speed.
- OS/Device: The operating system or device, which gives deeper insights into how users are accessing your site.
- City and/or Country: Helps break the traffic down by geography, again for deeper insights. This can also help identify if content is taking longer to serve to a particular place, and identify geo redirects.
- Referer: The referring URL or domain the request came from, to help identify the source of traffic.
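Building on the IP Address and Accurate UA Name points above, the following sketch shows one common way to group raw UA strings under a bot nickname and to verify a claimed Googlebot hit with a reverse-then-forward DNS lookup. The bot list is illustrative and not the one Lumar uses.

```python
# A minimal sketch (not Lumar's implementation) of two enrichment steps:
# grouping raw UA strings under a bot nickname, and verifying a claimed
# Googlebot hit via a reverse-then-forward DNS lookup.
import socket

# Illustrative substring -> nickname mapping; extend with the bots you care about.
BOT_NAMES = {
    "Googlebot": "Googlebot",
    "bingbot": "Bingbot",
    "GPTBot": "GPTBot",
    "PerplexityBot": "PerplexityBot",
}

def bot_name(user_agent: str) -> str:
    """Map a full UA string to a friendly bot name, or 'Other'."""
    for needle, name in BOT_NAMES.items():
        if needle.lower() in user_agent.lower():
            return name
    return "Other"

def is_verified_googlebot(ip: str) -> bool:
    """Reverse DNS must resolve to googlebot.com/google.com, and the forward
    lookup of that hostname must return the original IP."""
    try:
        hostname = socket.gethostbyaddr(ip)[0]
    except socket.herror:
        return False
    if not hostname.endswith((".googlebot.com", ".google.com")):
        return False
    try:
        return ip in socket.gethostbyname_ex(hostname)[2]
    except socket.gaierror:
        return False

print(bot_name("Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"))
print(is_verified_googlebot("66.249.66.1"))
```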
How to Add Log File Data to Your Project
Once the integration is set up, you can then add Log File Data to any relevant projects. From the project view, click on the ellipses to the right of the project you want to add the log file data to, and choose Edit project.
Navigate to step 2 of the project setup (Sources) and scroll down to open up the Log Summary section.
Once you’ve done this, wait a moment and a query will populate. You can then click on the query to open the settings and change anything that needs amending.
Where Do I Find the Data Once It's Integrated?
Once log data is included in your crawl and a new crawl has completed, you’ll predominantly see data populated in the Discoverability > Crawl Budget section of the SEO crawl. You’ll also find a report for pages in the Log Summary, and one for pages not in the Log Summary, in the Crawl Overview section. These reports are helpful for understanding how search bots are interacting with your site and for identifying issues such as error pages with bot hits, or out-of-date content that is still receiving bot hits.
If you have our GEO reporting as part of your subscription, you’ll also find reports in the AI Discovery (AI Crawlability and AI Indexability) and AI Inclusion sections. Here, these visualizations and reports can help you understand how AI bots are interacting with your site, and which content is available for their use. We will be developing more detailed reports in this section, so watch this space!
Find Out More
Once your integration is set up, you may need to do some of the following. Follow the links to help articles with details on how to do this: