What is the purpose of serpstatbot?
Our bot constantly crawls the web to add new links and track changes in our link database. We provide our users with access to one of the largest backlink databases on the market for planning and monitoring marketing campaigns.
What happens to the crawled data?
Crawled data is added to the backlink index. You can access the data from this index in Serpstat.
How do you handle 404 or 301 pages?
We collect historical data to ensure that temporary changes don't affect your site's profile. If links to pages returning a 404 or 301 status code still exist, serpstatbot will continue to find and follow them. See Google's 404 policy for more information.
Does the bot crawl links with the rel="nofollow" attribute?
Yes, it does. Serpstatbot crawls nofollow links as well.
How can I block serpstatbot?
Serpstatbot adheres to the robots.txt standard. If you don't want the bot to crawl your website, add the following to your robots.txt file:
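A minimal example that blocks the entire site (to block only part of it, replace / with a more specific path):

```
User-agent: serpstatbot
Disallow: /
```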
Always make sure the bot can retrieve the robots.txt file itself; if it cannot, it will crawl your site by default.
If you think that serpstatbot is not following your robots.txt directives, please contact us via email: email@example.com. Include your website's URL and log entries showing the bot trying to retrieve pages it was not supposed to.
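Before reporting a problem, you can sanity-check how your rules apply to serpstatbot locally. A sketch using Python's standard urllib.robotparser (the ruleset and URLs below are illustrative, not your actual robots.txt):

```python
from urllib.robotparser import RobotFileParser

# Illustrative ruleset: block /private/ for serpstatbot, with a 10-second delay.
rules = """
User-agent: serpstatbot
Disallow: /private/
Crawl-delay: 10
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# Check which URLs the ruleset allows serpstatbot to fetch.
print(parser.can_fetch("serpstatbot", "https://example.com/private/page.html"))  # False
print(parser.can_fetch("serpstatbot", "https://example.com/public.html"))        # True
print(parser.crawl_delay("serpstatbot"))                                         # 10
```

If the parser's answers differ from the bot's observed behavior, that is exactly the kind of evidence worth attaching to your email.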
What commands in robots.txt does serpstatbot support?
Serpstatbot supports the following additions to robots.txt:
- Crawl-delay of up to 20 seconds (higher values are capped at 20)
- Following redirects within the same site when retrieving robots.txt
- Simple pattern matching in Disallow directives consistent with Yahoo’s wildcard specification
- Allow directives take precedence over Disallow directives when they are longer (more specific)
- Failures to retrieve a robots.txt file (for example, 403 Forbidden) are treated as the absence of any restrictions; in this case, the bot will crawl all accessible pages.
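As an illustration of the pattern-matching and precedence rules above, a ruleset such as the following (the paths are hypothetical) disallows everything under /private/ while explicitly allowing one page; the Allow directive wins because it is longer than the Disallow pattern:

```
User-agent: serpstatbot
Disallow: /private/*
Allow: /private/press-kit.html
```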
Why doesn’t my robots.txt block work on serpstatbot?
There are several reasons:
- Off-site redirects when requesting robots.txt: serpstatbot only follows redirects within the same domain.
- If several domains run on the same server, the server may log access to all of them in a single file without recording the domain name. Add the domain name to the access log or split logs on a per-domain basis.
How can I slow down serpstatbot?
You can slow down the bot by adding the following settings to your robots.txt file:
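For example, to ask the bot to wait 10 seconds between requests (10 is an illustrative value; anything above 20 is capped at 20):

```
User-agent: serpstatbot
Crawl-Delay: 10
```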
Crawl-Delay must be an integer: it specifies the number of seconds to wait between requests. Serpstatbot accepts a delay of up to 20 seconds between requests to your site; a higher Crawl-Delay reduces the load on your site. The parameter is also honored when set for User-agent: *.
If our bot finds that you have set a Crawl-Delay for any other bot, it will automatically slow down its own crawling as well.