Sometimes, while running a large task, an SEO software user finds that data extraction from the search engines is no longer possible. You can tell by clicking the blue "view logs" link on the finishing screen after any operation: there are red "failed" tasks, and their error logs contain "temp blocked" messages.
Another symptom: when you search for something in Google in your browser, you land on the "Sorry…" page, which says your computer might be sending automated queries and your request cannot be processed right now. There's no need to be alarmed, though, because this is a very common and minor issue. It is called a temporary block, and it lasts around two hours at most. During this time, however, you won't be able to retrieve any data from the search engine using your IP address, so it's more of an inconvenience than a problem.
Search engines may temporarily block IP addresses that send a large number of queries in quick succession. Each request to a search engine is a query. Whether you're looking up a keyword's rank, going to the next page of results, harvesting competition for a keyword, updating Google PageRank, updating the link popularity of a domain or searching for a site's backlinks, those are all queries. If a search engine decides that the queries coming from your IP address are too frequent, your IP will be temporarily blocked.
Each of the SEO PowerSuite tools queries search engines one way or another, which is why all of them are equipped with settings that help prevent temporary blocks. Rank Tracker works with the search engines the most, so this post is mostly about setting up your Rank Tracker, although I will cover search safety aspects in relation to the other tools as well. Understanding the way our tools operate is crucial for minimizing instances of temporary blocks.
The number of simultaneous tasks
First and foremost the number of simultaneous tasks our software performs can make a huge difference (you can adjust this setting in "Preferences -> Misc. Global Settings"). This setting matters especially if you are interested in only one or several search engines in your project.
For instance, if your Rank Tracker is set to run 5 simultaneous tasks and you are checking ranks in Google only (or several international Googles like Google.co.uk, Google.com.au or Google.de), it means you will be directing 5 queries to Google at once. Although the regional search engines are practically different and they produce different results, querying them all counts as querying one search engine, Google.
If, on the other hand, you had 5 different search engines (e.g. Google, Bing, Blekko, Alltheweb and Yandex), each of the 5 simultaneous tasks would be querying a different search engine. Each search engine would then be getting only one request from your IP address at a time, which is absolutely safe and natural.
So if you plan to check rankings in one search engine or in several regional sections of a search engine, please keep in mind that the "number of tasks" value is the number of queries you will direct at that search engine at the same time. In this case the safest approach is to limit the number of simultaneous tasks to 2-3. Combined with human emulation delays of 1-2 seconds, 2 simultaneous tasks will let you check an average project (50-150 keywords).
If you have a large list of different search engines then usually there is no need to decrease the number of simultaneous tasks.
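To make the arithmetic above concrete, here is a minimal Python sketch. It is only an illustration, not how Rank Tracker schedules its work: the function names are hypothetical, and it assumes the tasks are spread evenly across search engines, with every regional Google counted as Google.

```python
from math import ceil


def engine_family(engine: str) -> str:
    """Map a search engine name to its family (hypothetical helper).

    Regional Googles such as Google.co.uk or Google.de all count as Google.
    """
    name = engine.lower()
    if name == "google" or name.startswith("google."):
        return "google"
    return name


def concurrent_queries_per_engine(simultaneous_tasks: int, engines: list) -> dict:
    """Estimate how many queries hit each engine family at the same time.

    Assumption: tasks are spread evenly across the distinct engine families.
    """
    families = sorted({engine_family(e) for e in engines})
    per_family = ceil(simultaneous_tasks / len(families))
    return {family: per_family for family in families}


if __name__ == "__main__":
    # Three regional Googles still mean 5 simultaneous queries to Google.
    print(concurrent_queries_per_engine(5, ["Google.co.uk", "Google.com.au", "Google.de"]))
    # Five different engines mean one query per engine at a time.
    print(concurrent_queries_per_engine(5, ["Google", "Bing", "Blekko", "Alltheweb", "Yandex"]))
```

With a Google-only project the estimate equals the number of simultaneous tasks, which is why lowering that setting to 2-3 matters most in this case.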
Please keep in mind that the same applies to the other tools, for instance SEO SpyGlass. When you are harvesting backlinks for a site in different search engines, or when you are updating ranking factors, make sure you spread the queries between these tasks, and do not check only Google PageRank for 500 backlinks using 10 simultaneous tasks. Such a frequency of requests might be regarded as excessive toolbar querying and can lead to a temporary block of your IP address in Google. Balancing the number of simultaneous tasks against the number of queried data sources is important for avoiding temporary blocks.
Rank checking method
This section applies to Rank Tracker only and it concerns the way the tool is parsing search engines' results. You can tweak this setting in "Preferences -> Rank Checking precision -> Max. Results". The four rank checking options are:
1) "Successive search" (plain page by page search from result #1 to the maximum result number you've specified in "Preferences -> Rank Checking precision -> Max. Results").
Rank Tracker makes one query per search engine result page, the regular page with 10 results on it that most users see. To check a keyword ranking at #55 (page 6), Rank Tracker will make 6 queries each time you run a check. This is the default method and it's great for watching the dynamics of keywords that are found on pages 1-3, although it definitely causes excessive querying when your project has keywords ranking closer to the lower end of the top 100 results.
2) "Last found position"
When you start a project for your website, the first check is run using the page-by-page "Successive search" method. All subsequent checks start at the result page where the rank was found previously. If the rank is not found there, the next and the previous pages are checked, gradually widening the checked page span until the rank is found. So in order to dig out result #55 Rank Tracker will make 6 queries during the first check, and each subsequent check will require only 1 query, unless the rank has changed. This is generally a good method for checking keywords ranking on pages 4-10, although once your rankings improve, successive search could be enough.
3) "100 results per page" (refine results using the "ten results per page search").
The SE results are checked using the "maximum results per page" option of a search engine. The maximum number for Google is 100 ( http://www.google.com/advanced_search ), so with one query Rank Tracker loads a Google results page with 100 results on it. After the position has been found, Rank Tracker double-checks it using the "last found position" method, because rankings obtained using the "maximum results per page" method very often differ from those that most users see. Most probably the reason for that is internal, caused by unsynchronized search engine databases. So when you use this method Rank Tracker will use 2 queries to find a keyword ranking at #55 each time. This rank checking method is usually the best one for a project containing keywords ranking across a wide range of positions.
4) "100 results per page" (without refining results in the "ten results per page search").
Once the rankings are obtained after scanning the maximum results pages, the check is stopped. Only one query is used to find the keyword ranking at #55, but its ranking might be reported at e.g. #50 or #62, which would not be accurate. It's a great rank checking method for a newly created project with a lot of keywords when you have no idea where they rank yet. Running a fast check of all your keywords using this method gives you an idea of your general visibility, or the visibility of your competitors. Although the positions might not be totally accurate, you can still get an idea of whether the sites are generally present in the top results for a set of keywords. However, when you wish to see the exact rankings you will need to double-check the keywords using either the "Successive search" or the "Last found position" method, depending on how high their ranks are.
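The "Last found position" search described in option 2 can be pictured as an expanding search around the page where the rank was seen last. The Python sketch below is only an illustration under assumptions: the function name is hypothetical, and the exact order in which Rank Tracker visits the neighbouring pages (next page first, then previous) is a guess based on the description above.

```python
def page_check_order(last_page: int, max_page: int) -> list:
    """Return the order in which result pages would be checked.

    Starts at the page where the rank was last found, then alternates
    between the following and preceding pages, widening the span each
    step. Assumes 1 <= last_page <= max_page.
    """
    order = [last_page]
    for offset in range(1, max_page):
        for page in (last_page + offset, last_page - offset):
            if 1 <= page <= max_page:
                order.append(page)
    return order


if __name__ == "__main__":
    # A keyword last seen at #55 sits on page 6 of 10.
    print(page_check_order(6, 10))
```

If the rank has not moved, the very first page in this order already contains it, so a single query is enough.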
As you can see, each way of checking your ranks is different and uses a different number of queries. The above example concerned rank checking of just one keyword, but imagine a project with hundreds of keywords: the way you check their positions can really influence the convenience of the whole process.
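The query counts from the four methods can be put side by side in a short Python sketch. The method labels and the function name are made up for this illustration. It assumes 10 results per regular page, a rank inside the top 100, and, for "Last found position", a rank that stays on the same page between checks.

```python
from math import ceil

RESULTS_PER_PAGE = 10  # the regular results page most users see


def queries_needed(method: str, rank: int, first_check: bool = False) -> int:
    """Estimate queries needed to find one keyword at the given rank.

    Method labels are hypothetical shorthand for the four options above.
    """
    pages_to_rank = ceil(rank / RESULTS_PER_PAGE)
    if method == "successive":
        return pages_to_rank
    if method == "last_found":
        # The first check is page by page; later checks start at the known page.
        return pages_to_rank if first_check else 1
    if method == "max_results_refined":
        return 2  # one 100-results query plus one refining query
    if method == "max_results_unrefined":
        return 1  # one 100-results query, position may be inexact
    raise ValueError(f"unknown method: {method}")


if __name__ == "__main__":
    for method in ("successive", "last_found", "max_results_refined", "max_results_unrefined"):
        per_keyword = queries_needed(method, rank=55)
        print(method, per_keyword, "per keyword,", per_keyword * 300, "for 300 keywords")
```

For a 300-keyword project where everything ranks around #55, that is the difference between roughly 1800 queries per check and a few hundred.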
Search safety settings
In "Preferences -> Search safety settings" of every SEO PowerSuite tool there are four features you can use: human emulation, user agents, proxy rotation and CAPTCHA settings.
1) Human Emulation
Human emulation is the simplest and the most effective feature, although not the fastest one. Come to think of it, the reason a search engine blocks your queries is that they are numerous and frequent. So addressing the cause of the temporary blocks is the easiest solution: if you're penalized for querying too fast, you can slow down your querying. Human emulation inserts delays between all your queries. Your rankings update takes longer, but you do not risk getting blocked, because Rank Tracker's behavior is humanized and "looks natural" to the search engines. There is an option to insert a delay before searching for each next keyword as well as before going to the next result page for that keyword; that is, each new task is run after a delay. Human emulation can be great when you do not care how much time the rankings check will take (e.g. when you schedule this task for the night). In addition, enabling minor delays can help a lot when combined with other settings, namely the number of simultaneous tasks and the rank checking method.
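For readers who like to see the idea in code, the delay logic can be sketched in a few lines of Python. This is not Rank Tracker's implementation, just a generic illustration: the function and its parameters are hypothetical, and the 1-2 second range is the one mentioned earlier in this post.

```python
import random
import time


def run_with_delays(tasks, min_delay=1.0, max_delay=2.0, sleep=time.sleep):
    """Run callables one after another with a random pause between them.

    `tasks` is any iterable of zero-argument callables (for example, one
    per keyword or per result page). `sleep` is injectable so the pauses
    can be skipped or recorded when experimenting.
    """
    results = []
    for index, task in enumerate(tasks):
        if index > 0:
            # A randomized pause makes the request rhythm less mechanical.
            sleep(random.uniform(min_delay, max_delay))
        results.append(task())
    return results


if __name__ == "__main__":
    # Stand-in tasks; a real task would fetch one page of results.
    fake_tasks = [lambda n=n: f"page {n} checked" for n in range(1, 4)]
    print(run_with_delays(fake_tasks, min_delay=0.1, max_delay=0.2))
```

The trade-off is visible right away: the total run time grows by roughly one average delay per query, which is why this approach suits overnight scheduled checks.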
2) User agents
The selected user agent defines the way Rank Tracker appears to a search engine (the kind of OS, browser etc.). By default Rank Tracker randomizes user agents, and that normally works best. Enter a specific user agent only if you wish to check the way a search engine returns results for a particular user.
3) Proxy rotation
Proxy rotation is the fastest feature, but it might require additional investment. The essence of proxy rotation is that each new task (remember the number of simultaneous tasks above) is run using a different proxy server. When you query a search engine via a proxy server, your IP address is not involved in any way: the search engine sees the query as coming from the proxy server. That is, if you are rotating as many proxy servers as you have simultaneous tasks, each query will reach a search engine from a different IP address, none of these addresses being your actual IP. Thus, each of the proxy servers will be querying its targets with natural frequency. Another advantage of the proxy rotation feature is that when you check rankings for a large number of keywords and one of the proxy servers gets blocked, your queries are directed through another proxy server in the list, and you still get data.
SEO PowerSuite tools let you search for publicly available proxy servers and use those in the proxy rotation feature. This feature can be a time saver and can help you complete rank checking in your project. However, you might not want to rely on free public proxies all the time. First, some of the proxies that can be found online are overused and already blocked by Google. Second, regional proxy servers (you can see the country they are located in) can return altered results, because Google aims at providing different results for users from different countries. For instance, if you are checking Google.com and are in the USA, you would not be interested in the Google.com results for Germany that you will get using a German proxy server. So in order to make proxy rotation the ultimate safety feature, you can purchase a list of exclusive proxies located in your country or in the country you are interested in tracking results for. The best thing about exclusive proxies is that only you are using them and they are very responsive. However, you will need to pay for them, though usually not much. There are many companies selling exclusive proxies online, and you can find a good offer that will spare you the inconvenience of temporary blocks.
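The rotation and fallback behavior described above can be sketched as a simple round-robin in Python. This is a generic illustration, not the tools' actual code: the class name is hypothetical, and the proxy addresses are placeholders from a documentation-only IP range.

```python
from itertools import cycle


class ProxyRotator:
    """Hand out proxies in round-robin order, skipping blocked ones."""

    def __init__(self, proxies):
        self._proxies = list(proxies)
        self._blocked = set()
        self._cycle = cycle(self._proxies)

    def mark_blocked(self, proxy):
        """Remember that a proxy got a temporary block."""
        self._blocked.add(proxy)

    def next_proxy(self):
        """Return the next usable proxy, or raise if none is left."""
        for _ in range(len(self._proxies)):
            proxy = next(self._cycle)
            if proxy not in self._blocked:
                return proxy
        raise RuntimeError("all proxies are blocked")


if __name__ == "__main__":
    # Placeholder addresses, not real proxy servers.
    rotator = ProxyRotator(["192.0.2.10:8080", "192.0.2.11:8080", "192.0.2.12:8080"])
    print(rotator.next_proxy())  # first task
    print(rotator.next_proxy())  # second task
    rotator.mark_blocked("192.0.2.12:8080")
    print(rotator.next_proxy())  # the blocked proxy is skipped
```

With at least as many proxies as simultaneous tasks, each task in a batch goes out through a different address, which matches the setup recommended above.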
4) CAPTCHA settings
The CAPTCHA settings section lets you enable CAPTCHA images in Rank Tracker or any other tool. Once you get temporarily blocked you are offered a CAPTCHA image code to prove you're human. Having enabled CAPTCHA display you will be able to enter the code and finish your rankings update.
A very convenient option for an SEO company or anyone dealing with large projects is to purchase and enter an anti-CAPTCHA key. That way any CAPTCHAs you are offered will be recognized promptly by our partner CAPTCHA detection service, and you will pay only for correctly recognized CAPTCHAs.
* While using Rank Tracker please remember that a KEI update using Google AdWords is slightly different from rank checking. A KEI update consists of two processes: the competition update and the number of searches update. When you update competition, Rank Tracker loads the first results page for each keyword to get the total number of results, so this is similar to rank checking. It IS a query to Google, and search safety considerations make sense here. On the other hand, when you send a query to Google AdWords to get the number of searches for a set of keywords, you will be asked to enter a CAPTCHA once. That does not mean you are blocked; it is the normal procedure. One CAPTCHA will suffice to get results for a large number of keywords at once. Enabled proxy rotation might increase the number of AdWords CAPTCHA codes, as each proxy in line will be offered a CAPTCHA code. Generally, the best strategy for a project with a large number of keywords is to update competition and number of searches separately, with your search safety settings enabled during the competition update and disabled during the number of searches update.
Search engines API keys
You can enable API keys or find links to obtain them in "Preferences -> Search engines API keys". An API key gives access to a search engine's resources and is meant to be used in external applications (such as Rank Tracker). However, practice shows that API keys are not reliable in terms of rankings accuracy: the keyword positions obtained via an API key usually differ a lot from the actual rankings. That is why getting live rankings by querying the search engines directly each time is a better option. On the other hand, an API key is great for other purposes, for instance backlink gathering in SEO SpyGlass or searching for indexed pages in WebSite Auditor. In the case of Yandex, the main Russian search engine, an API key is almost obligatory, because that search engine temporarily blocks your IP really fast, and at the same time returns correct rankings via an API key.
As you can see, there are ways to organize data extraction from the search engines in a way that will neither cause any inconveniences to you, nor overload their servers with unnaturally frequent requests. What matters is to combine the search safety features with other general settings and to adequately arrange your querying, spreading it in time and between data sources of interest.