Here is a list of common crawl errors, what they mean and how to resolve them:
RequestTimeoutError
Chrome Errors: net::ERR_TIMED_OUT
Description: The request timeout period elapsed before a response was received. The web server may have struggled to keep up with the crawl rate.
Potential Solution: If this is happening for only some URLs in a crawl, reduce the crawl speed and try again. If it happens for every URL, the web server may have been down, so check that the website can be accessed in a browser. If the website is accessible, this may be an indication that the crawl was blocked and the crawler would need to be whitelisted.
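If you are reproducing the timeouts with your own scripts, reducing the request rate usually means adding a delay between retries. A minimal sketch in Python; the function names and delay values are illustrative and not part of Lumar:

```python
import time

def backoff_delay(attempt, base=1.0, cap=30.0):
    """Delay (seconds) before retry `attempt` (0-based): 1, 2, 4, ... capped at `cap`."""
    return min(base * (2 ** attempt), cap)

def fetch_with_retries(fetch, url, retries=3):
    """Call `fetch(url)`, sleeping with exponential backoff after each timeout.

    `fetch` is any callable that raises TimeoutError on a timed-out request.
    """
    for attempt in range(retries):
        try:
            return fetch(url)
        except TimeoutError:
            time.sleep(backoff_delay(attempt))
    raise TimeoutError(f"{url} still timing out after {retries} attempts")
```

If the URLs respond once the delays are in place, the server was most likely struggling under load rather than blocking the crawler.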
DOMTimeoutError
Description: The DOMContentLoaded event was not received within the rendering timeout period.
Potential Solution: Try reducing the crawl rate, or test the pages in another rendering tool such as Google's Mobile-Friendly Test.
ConnectionError
Description: A connection could not be established with the web server.
Potential Solution: This may be caused because the server is blocking the requests, or is overloaded. Try reducing the crawl rate, or see if the crawler can be whitelisted.
HostNameResolutionError
Chrome Errors: net::ERR_NAME_NOT_RESOLVED, net::ERR_TUNNEL_CONNECTION_FAILED
Description: The domain name of the URL could not be resolved to an IP address.
Potential Solution: If the domain is not a public website (for example a staging website) then you may need to add custom DNS records. Otherwise speak to support@lumar.io.
EmptyResponseError
Chrome Errors: net::ERR_EMPTY_RESPONSE
Description: Nothing was returned from the server in the response.
Potential Solution: The most common cause for this is user-agent blocking, where the server blocks Lumar because it is crawling with a Googlebot user agent without originating from Google. Changing the user agent in the Advanced Settings may allow the next crawl to run successfully.
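To confirm the block is user-agent based, you can request the page manually with a different user agent. A quick sketch with Python's urllib; the URL and UA string are just examples:

```python
import urllib.request

def request_with_ua(url, user_agent):
    """Build a request carrying a custom User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})

# A server that returns an empty response to Googlebot-like agents may
# respond normally to a generic browser user agent.
req = request_with_ua("https://example.com/", "Mozilla/5.0 (compatible; TestFetch/1.0)")
```

Opening the request with `urllib.request.urlopen(req)` then shows whether the empty response depends on the user agent.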
SslCertificateError
Description: The remote server's SSL certificate is invalid. Lumar will not crawl a domain with an invalid certificate by default.
Potential Solution: We'd recommend checking the domain's SSL certificate validity. In the meantime, if you would still like to run a crawl, you can tick the "Ignore invalid SSL certificate" option under the Crawl Restrictions section of the Advanced Settings and then crawl again.
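A common cause of an invalid certificate is expiry. Python's standard ssl module can check a certificate's notAfter date; fetching the certificate itself (e.g. via ssl.get_server_certificate) needs network access, so only the date check is sketched here:

```python
import ssl
import time

def cert_expired(not_after, now=None):
    """True if a certificate's notAfter date, in the format found in
    certificates (e.g. 'Jan  5 09:34:43 2018 GMT'), is in the past."""
    expires = ssl.cert_time_to_seconds(not_after)
    return expires < (now if now is not None else time.time())
```

Browser dev tools and online SSL checkers report the same notAfter field if you prefer not to script it.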
BrowserProxyError
Description: The proxy used to make requests failed.
Potential Solution: Review the proxy settings in the crawl. If this occurs using the default IP, then speak to support@lumar.io.
AbortedRequestError
Description: The request connection was aborted before a response was received.
Potential Solution: This may be caused because the server is blocking the requests, or is overloaded. Try reducing the crawl rate, or see if the crawler can be whitelisted.
MultipleContentDispositionError
Chrome Errors: net::ERR_RESPONSE_HEADERS_MULTIPLE_CONTENT_DISPOSITION
Description: The error indicates that the Content-Disposition header sent back from the server contains multiple values.
Potential Solution: Review the Content-Disposition response header to see if it contains multiple comma-separated values.
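A quick way to spot this from a raw header value is to look for a comma outside any quoted filename. A naive sketch; real header parsing has more edge cases than this covers:

```python
import re

def has_multiple_values(content_disposition):
    """True if the raw Content-Disposition value appears to contain several
    comma-separated values. Quoted strings are blanked out first so commas
    inside filenames (e.g. filename="a,b.txt") are not counted."""
    unquoted = re.sub(r'"[^"]*"', '""', content_disposition)
    return "," in unquoted
```

If the check is True, the server is emitting the header more than once (or joining two values), which Chrome rejects as a security precaution.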
InvalidRedirectError
Chrome Errors: net::ERR_INVALID_REDIRECT
Description: The response from the server indicated that the URL is a redirect, however the redirect location could not be parsed from the Location response header.
Potential Solution: Review the response headers for the page and determine if the Location header value is a valid URL.
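Note that a Location header may legally be a relative reference, which the client resolves against the request URL. A sketch of both the resolution and a basic validity check using urllib.parse; the URLs are examples:

```python
from urllib.parse import urljoin, urlparse

def resolve_location(request_url, location):
    """Resolve a Location header value against the request URL and return the
    absolute target, or None if the result is not a parseable http(s) URL."""
    target = urljoin(request_url, location)
    parts = urlparse(target)
    if parts.scheme in ("http", "https") and parts.netloc:
        return target
    return None
```

A None result suggests the Location value could not be turned into a usable http(s) URL, which matches what the crawler reports with this error.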
InvalidResponseError
Chrome Errors: net::ERR_INVALID_RESPONSE
Description: The response from the server could not be parsed.
Potential Solution: Check if the page works when manually tested in the browser to see if it is persistent. If it does not work then there may be an issue with the web server. If it does work, the issue can potentially be caused by an overloaded server, so reducing the crawl rate may help.
FailedToGetRequestBody
Chrome Errors: net::ERR_INVALID_RESPONSE
Description: The request content was evicted from the inspector cache because of its size, so a static copy of the response body could not be acquired. The page may still appear to render correctly in the browser even though the body cannot be retrieved from the cache.