Beginner's Guide to Google PageRank:
How It Works & Why It Still Matters in 2018
When a child is born, its parents have its future perfectly planned. However, in no time, they realize that their child is not into cello and family fishing, but rather into tattoos, bad-influence friends, and skydiving.
Well, something similar happened with PageRank, the brilliant child of Google founders Larry Page (who gave the child his name, playing on the concept of a web page) and Sergey Brin. It helped Google become the search giant that dictates the rules for everybody else, and at the same time it created an array of complicated situations that at some point got out of hand.
Definition.
PageRank (or PR for short) is a mathematical algorithm that evaluates the quality and quantity of links to a webpage. This evaluation helps it determine a relative score of the page's importance and authority.
Each link from one page to another counts as a so-called vote, and the weight of that vote depends on the score of the page that casts it. That page's score, in turn, depends on the scores of the pages that link to it, and so on. The calculation seems like a pain in the, erm, neck, but we will come to it a bit later.
Google Toolbar.
To cut a long story short, Google initially made these scores public through its Toolbar, which showed a rough value of the score right in the browser. The scores ranged only from 0 to 10 and seemed to be on a logarithmic scale. The real meaning of this scale is the following:
|Toolbar PageRank (log base 10)||Real PageRank|
|0||0 - 10|
|1||100 - 1,000|
|2||1,000 - 10,000|
|3||10,000 - 100,000|
|4||and so on.|
This step led to much controversy, and now I understand Google's current intention to avoid straightforward statements when talking about ranking signals. Each piece of such data will be misunderstood anyway.
Over time, Google cut back its support for the Toolbar, and on April 15, 2016 it officially shut off Toolbar PageRank data to the public, making it the real secret sauce of its ranking mechanisms.
Though PageRank is not public, it still exists. Thus, I would like to explain how it works, what it does, and why it is still important. Plus, I know just the alternative to it when it comes to ready-made scores.
Calculation of PageRank.
Every webmaster should understand how PageRank really works. This knowledge is essential for the kind of optimization where an SEO clearly sees what exactly has to be done to get a result, as well as what has gone wrong when rankings drop.
The calculation seems kind of tricky. As you remember from my foreword, the PageRank score of each page depends on the scores of the pages that link to it. But we can't know the scores of those pages until we calculate them. So the definition seems circular and impossible to calculate.
However, it is not that bad. The PageRank score can be calculated using a simple iterative algorithm, and it corresponds to the principal eigenvector of the normalized link matrix of the web. It means that it is possible to start calculating the score of a particular page without knowing the final values of the pages that link to it.
Why so? The matter is that each time we run the calculation, we are getting a closer estimate of the final value. We need to remember each calculated value and repeat the calculations a number of times till the numbers stop changing much.
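The iteration described above can be sketched in a few lines of Python. This is only an illustration on a made-up three-page graph, not Google's actual implementation: the function name `pagerank`, the `tol` and `max_iter` parameters, and the sample `graph` are all assumptions, and the update rule is the classic form from the original PageRank paper with the commonly assumed damping factor of 0.85.

```python
def pagerank(links, d=0.85, tol=1e-6, max_iter=100):
    """Iteratively estimate PageRank for a small link graph.

    links maps each page to the list of pages it links to.
    Update rule (classic form): PR(A) = (1 - d) + d * sum(PR(T) / C(T)).
    """
    pages = list(links)
    scores = {page: 1.0 for page in pages}  # arbitrary starting guess
    for _ in range(max_iter):
        new_scores = {}
        for page in pages:
            # Each linking page passes on its score split across its outlinks.
            incoming = sum(
                scores[src] / len(links[src])
                for src in pages
                if page in links[src]
            )
            new_scores[page] = (1 - d) + d * incoming
        # Stop once the numbers "stop changing much".
        change = max(abs(new_scores[p] - scores[p]) for p in pages)
        scores = new_scores
        if change < tol:
            break
    return scores


# Hypothetical graph: A links to B and C, B links to C, C links back to A.
graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(graph)
for page, score in sorted(ranks.items()):
    print(page, round(score, 3))
```

The starting guess of 1.0 is arbitrary. Any other starting value should settle on the same scores, just after a different number of passes.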
Google recalculates PageRank scores after each crawl of the Web. As it increases the number of documents in its rebuilt index, the initial approximation of PageRank decreases for all documents. Plus, it is considered that PageRank favors older pages. A new page, however good, cannot have lots of quality backlinks yet, and thus receives a lower score.
The PageRank formula also contains a damping factor (d). According to the PageRank theory, there is an imaginary surfer who is randomly clicking on links, and at some point he gets bored and stops clicking. The probability that the surfer will continue clicking at any step is the damping factor. This factor is introduced to stop some pages from having too much influence: the total vote a page passes on is damped down by multiplying it by d, generally assumed to be 0.85.
It is also considered that the average PageRank of all web pages equals one. And the PageRank formula provides that even if a page has no backlinks, it will get a small score of 0.15 (1 minus the damping factor).
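For reference, the formula that the paragraphs above describe in words is usually written as in Brin and Page's original paper. The symbols are those of the paper rather than of this article: T_1 ... T_n are the pages linking to page A, C(T) is the number of outbound links on page T, and d is the damping factor.

```latex
PR(A) = (1 - d) + d \left( \frac{PR(T_1)}{C(T_1)} + \cdots + \frac{PR(T_n)}{C(T_n)} \right)
```

With d = 0.85 and no backlinks the sum is empty, so PR(A) = 1 - 0.85 = 0.15, which is the minimum score mentioned above.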
One of the best articles on the matter of the PageRank calculation process and its maximum efficiency is this one by Ian Rogers. He gives us some examples (simple hierarchy, inclusion of page reviews, looping, extensive interlinking, and so on) and accompanies them with observations and principles.
The most important observations are:
- On linking back. The homepage usually has the highest PR score as it has the most incoming links. However, if the external pages do not link back to it, they do not cast any votes and the average PR score gets lower. Thus, it makes sense to link the external sites back to the homepage so that those "votes" are not wasted.
  - Trying to abuse the PR calculation by building a site structure that concentrates PR on the homepage just won't work and will harm your UX. It is better to risk some PR for the sake of a great user experience, which can earn you more PR than you gave up.
- On hierarchy. Hierarchy concentrates votes and PageRank into one page.
- On structure. A well-structured site will amplify the effect of any contributed PR.
- On internal linking. Internal linking minimizes the damage when you take the risk and give away votes by linking to external sites.
  - If a group of pages does not contain external links, the number of internal links does not have any effect on the average PR score of the site.
- On spammy links. Thousands of spam pages pointing at your homepage will add up and give you a nice PR score. However, once Google spots it (and it does so really quickly), your site will no longer be found in SERPs. At the same time, lots of pages (or at least a few) with unique content that point to your homepage are the best recipe.
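The observation about closed groups of pages can be checked with a small experiment. The sketch below is an illustration under stated assumptions, not a reproduction of Ian Rogers' examples: it uses the classic formula PR(A) = (1 - d) + d * sum(PR(T) / C(T)) with d = 0.85, and the two three-page sites and all function names are made up. Neither site links anywhere outside itself; one has three internal links and the other has six.

```python
def pagerank(links, d=0.85, iterations=100):
    """Classic iterative PageRank on a dict of page -> outgoing links."""
    scores = {page: 1.0 for page in links}
    for _ in range(iterations):
        # The comprehension reads the previous scores, then replaces them.
        scores = {
            page: (1 - d) + d * sum(
                scores[src] / len(out)
                for src, out in links.items()
                if page in out
            )
            for page in links
        }
    return scores


def average_score(links):
    scores = pagerank(links)
    return sum(scores.values()) / len(scores)


# Hypothetical closed sites: no page links to anything outside the site.
sparse_site = {"home": ["about"], "about": ["contact"], "contact": ["home"]}
dense_site = {
    "home": ["about", "contact"],
    "about": ["home", "contact"],
    "contact": ["home", "about"],
}

print(round(average_score(sparse_site), 6))
print(round(average_score(dense_site), 6))
```

Both averages come out the same. Under this formula, every vote cast inside a closed group stays inside it, so extra internal links only redistribute the score between pages.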
Controversy of PageRank.
PageRank is an ingenious invention that makes Google so efficient and authoritative. However, every ingenious invention can be easily perverted, manipulated, and used for completely different purposes. This very fate befell PageRank.
Let's talk about the reasons why public PageRank was ruinous for the Web and why it has been and still is important.
Vices of PageRank.
Google made its PageRank public to show people that this new search engine was able to:
- find those pages that users really look for
- show users which pages are the best and safest
The future search giant developed its Toolbar for Internet Explorer (later also available for Firefox). The Toolbar showed a PageRank score (ranging from 0 to 10) when it was enabled by the user.
While ordinary users were not that interested in pages' scores, SEOs of every caliber felt that this was a great opportunity to make a difference for their customers. Their obsession with PageRank made everyone feel that this ranking signal was more or less the only important one, even though pages with a lower PR score can beat those with a higher score! What did we get as a result?
- Link farms.
A new market emerged to meet this specific demand, and the manipulation of PR scores began. Yes, the era of link farms.
Of course, Google did not like this development. It started to fight back. The most famous action was against the SearchKing network, which was penalized and removed from Google results. SearchKing filed a suit against Google, but Google won.
After that, link selling went underground. Google still managed to find such networks, but it did not really matter: as one network closed, new ones took its place. As long as there are people ready to pay for a PageRank boost, these schemes will never cease to exist.
- Link spam.
Public PageRank also unleashed link spam. Yes, spammy comments with links in every imaginable place. It was possible to leave hundreds or thousands of comments linking straight to a particular site, and these links mattered! Dream, eh?
Not really. It became such a pain in…yes, the neck that Google felt pressed into decisive action. Thus, in 2005, nofollow was introduced: a new value for the rel attribute of HTML link and anchor elements. It was a way to prevent such links from passing PageRank votes along.
Unfortunately, it did not end link spam. However, nofollow is still widely used today by an array of major social platforms.
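To make the attribute concrete, here is a minimal sketch that scans a piece of HTML and reports which links carry rel="nofollow". It uses only Python's standard html.parser module. The `LinkCollector` class name and the sample markup with its example.com URLs are made up for illustration.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (href, is_nofollow) pairs from anchor tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href")
        if href is None:
            return
        # rel may hold several space-separated values, e.g. "nofollow noopener".
        rel_values = (attr_map.get("rel") or "").lower().split()
        self.links.append((href, "nofollow" in rel_values))


# Hypothetical comment markup: one nofollow link and one ordinary link.
sample_html = """
<p>Great post! <a href="https://example.com/spam" rel="nofollow">cheap deals</a></p>
<p>See also <a href="https://example.org/guide">this guide</a>.</p>
"""

collector = LinkCollector()
collector.feed(sample_html)
for href, is_nofollow in collector.links:
    print(href, "nofollow" if is_nofollow else "followed")
```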
Conclusion: Even when Google killed its Toolbar and PageRank became secret, it was obvious that it would not bring peace to the Web. As long as people know that Google's ranking mechanism hugely depends on the links, links will continue to be farmed and sold.
Importance of PageRank.
There can't be a question whether PageRank is important or not. Of course it is! Google is positive that PageRank still matters. It helps the search engine determine the most trusted material for a particular query. It just is not public anymore. It has gone back to being a secret-sauce ingredient.
After all, you know perfectly well the formula for a higher PageRank value for your site:
quality (not quantity!) of your backlinks + efficiency of internal linking
It means that when optimizing, we subconsciously follow the requirements for a higher PageRank score.
Moreover, the PageRank mechanism is entirely general, so it can be applied to any graph or network in any field. Currently, the PR formula is used in bibliometrics, social and information network analysis, and for link prediction and recommendation. It's even used for system analysis of road networks, as well as in biology, chemistry, neuroscience, and physics.
Optimization for PageRank.
Following the formula of backlinks and internal links mentioned above, I've put together a quick optimization checklist with some tips.
1. Check InLink Rank.
Yes, PageRank scores are not visible, but it does not mean there aren't any calculation alternatives to them. For example, it is possible to use SEO PowerSuite's InLink Rank to calculate that exact PageRank score (here from 0 to 100), as this feature is based on the same algorithm!
Go to the backlink checker SEO SpyGlass, create a project for your site, and jump to the Backlinks dashboard. Switch to the InLink Rank/Domain InLink Rank tab, select those backlinks that you would like to check, and click Update InLink Rank.
When the value of some backlinks is low (red color), this is a strong indicator that a link can be spammy or of very low quality. You may want to get rid of such a link or run the additional check I've mentioned below.
2. Check for risks and mistakes.
In order to understand whether an external link is harmful for your site, you have to check the linking page's:
- nofollow tag;
- anchor text.
All these factors can be checked in SEO SpyGlass:
1. In your project, staying in the Backlinks dashboard, go to the Penalty Risk tab, select or add the backlinks that you would like to keep an eye on, and hit Update Penalty Risk. If there is some risk percentage in the Penalty Risk column, click the "i" button to open the Detected Penalty Risk factors pop-up window.
2. In the same tab, check the Links Back column for the nofollow attribute (it might have been switched off) and the Anchor Text column (it might have been changed).
While backlinks can get out of hand, your site's internal links are under your full control. This allows you to make the journey through your site attractive for both search engines and users. Make sure to follow these requirements:
1. Site structure is shallow.
It is favorable when a page is 2-3 clicks away from the homepage. Nobody wants to click through 10 links to get to the needed one. Thus, if possible, it is better to reduce the number of clicks needed to reach a page. On complicated websites this goal can be achieved using breadcrumbs, tag clouds, and internal search.
To check the click depth of your pages, create a project in WebSite Auditor or open an existing one, and jump to Site structure > Pages. Then sort the pages by Click depth.
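Click depth is simply the smallest number of clicks needed to reach a page from the homepage, so for a small site it can be sketched as a breadth-first search over the internal link graph. This is an illustration only, not how WebSite Auditor computes it. The `click_depths` function and the sample `site` map are assumptions.

```python
from collections import deque


def click_depths(links, home):
    """Return the minimum number of clicks from home to each reachable page."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:  # first visit is the shortest path
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


# Hypothetical internal link map: page -> pages it links to.
site = {
    "/": ["/blog", "/products"],
    "/blog": ["/blog/post-1"],
    "/products": ["/products/widget"],
    "/blog/post-1": ["/blog/post-1/comments"],
}

depths = click_depths(site, "/")
# Pages deeper than two clicks are candidates for better internal linking.
deep_pages = sorted(page for page, depth in depths.items() if depth > 2)
print(deep_pages)
```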
2. All important pages are linked.
If a page is not linked from any other page on your site, it will be completely invisible to Google and users. And if this poor orphan is an important page, this is a disaster.
To prevent this situation, you can visualize your site structure with the help of WebSite Auditor. Open your project, go to Site Structure > Visualization, and voila! Your site structure is in the palm of your hand, with all the links exposed.
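If you already have a list of your pages and their internal links (for example, from a crawl export), orphans can also be spotted with a few lines of code. The sketch below is an assumption-laden illustration: `find_orphans` and the sample `site` map are made up, and it treats the homepage as exempt because visitors reach it directly.

```python
def find_orphans(links, home):
    """Return pages that no other page links to (the homepage is exempt)."""
    linked = {
        target
        for src, targets in links.items()
        for target in targets
        if target != src  # a page linking to itself does not count
    }
    return sorted(page for page in links if page not in linked and page != home)


# Hypothetical internal link map: every known page -> pages it links to.
site = {
    "/": ["/blog", "/pricing"],
    "/blog": ["/"],
    "/pricing": ["/"],
    "/old-landing-page": ["/"],
}

orphans = find_orphans(site, "/")
print(orphans)
```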
3. Nofollow attribute is used correctly.
In early 2005, Google implemented "nofollow", a new value for the rel attribute of HTML link and anchor elements. After that, the so-called "PageRank sculpting" came into fashion. It is the act of strategically placing the nofollow attribute on certain internal links in order to funnel PageRank towards the most important pages.
However, such tactics do not work anymore as the nofollow attribute does not redirect PageRank to other links. Now link juice is divided among all links coming from a page, including the nofollow links. But those nofollow links do not pass link juice further. Thus, when it is reasonable, it is better to remove the link than make it nofollow.
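A bit of arithmetic shows why sculpting stopped paying off. The sketch below is a simplified model, not Google's actual computation: it assumes a page passes on d times its score, split evenly across its links. The function name, the `sculpting_works` flag, and the sample numbers are all made up for illustration.

```python
def share_per_followed_link(page_score, followed, nofollowed, d=0.85,
                            sculpting_works=False):
    """Score passed through each followed link under a simplified model.

    sculpting_works=True models the old behavior, where nofollow links were
    left out of the split. False models the behavior described above, where
    they still count in the split but pass nothing on.
    """
    if followed == 0:
        return 0.0
    divisor = followed if sculpting_works else followed + nofollowed
    return d * page_score / divisor


# Hypothetical page with a score of 1.0, 6 followed and 4 nofollow links.
old_share = share_per_followed_link(1.0, 6, 4, sculpting_works=True)
new_share = share_per_followed_link(1.0, 6, 4)
# Removing the 4 nofollow links instead of tagging them:
removed_share = share_per_followed_link(1.0, 6, 0)

print(round(old_share, 4), round(new_share, 4), round(removed_share, 4))
```

In this model, tagging the four links as nofollow simply throws their share away, while removing them gives each remaining link the larger share that sculpting used to provide.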
If you want to dig into some internal linking strategies, refer to this post.
The future of PageRank.
One consequence of the PageRank algorithm and its manipulation is that backlinks (and link building in general) have often been treated as black-hat SEO. Thus, not only Google has been combating the consequences of its own child's tricks, but also mega-sites such as Wikipedia, The Next Web, Forbes, and many others, which automatically nofollow all their outgoing links. That means fewer and fewer PageRank votes. What, then, is going to help search engines rank pages in terms of their safety and relevance?
It is clear that something new should emerge to fill that nofollow gap. Here and there it is believed that some search engines may use so-called implied links to rank a page. Implied links are, for example, references to your brand. They usually come with a tone: positive, neutral, or negative. The tone defines the reputation of your site. This reputation serves as a ranking signal to search engines.
I'm not saying here that those linkless mentions are 100% ranking signals. There is no sure evidence of that besides Google and Bing dropping some hints at it. You can check out Gary Illyes (Google Webmaster Trends Analyst) talking about the importance of brand mentions in his keynote at Brighton SEO:
"If you publish high-quality content that is highly cited on the internet — and I'm not talking about just links, but also mentions on social networks and people talking about your branding, crap like that. Then you are doing great."
Or Duane Forrester (formerly senior product manager at Bing) talking about unlinked mentions being as strong a signal as backlinks at SMX West 2016:
"Years ago, Bing figured out context and sentiment of tone, and how to associate mentions without a link. As the volume grows and trustworthiness of this mention is known, you'll get a bump in rankings as a trial."
Or Google's Panda Patent saying that implied links can have the same weight as backlinks; and so on and so forth.
While backlinks surely matter a great deal, you can try a new implied-link-building technique. Yes, you can't use a backlink checker to find mentions, but you can easily track them. There are plenty of monitoring tools that can do it for you.
Try to find one that tracks mentions not only on social media platforms but also across the whole Web (forums, news sites, blogs, etc.). For example, the Awario app has its own web crawler that allows it to look into every corner of the Internet.
Make sure people say good things about your business, and if there are some issues, try to solve them to show that you care about your customers. Even if it is not a proven link-building technique yet, it is surely a reputation-building strategy.
I hope that the PageRank formula no longer seems scary and complicated, and that you now understand how it works and will use this knowledge to your advantage.
One more important thing to keep in mind is that this factor is just part of the story of what helps pages rank high in SERPs. Yes, it was the first one used by Google, but now there are lots of ranking factors; they all matter, and they are all taken into account for ranking. Content is deemed the most essential one. You know this: content is king, and there is no way around it. User experience is the new black (with the new Speed Update, it will become even more important).
What are your thoughts on PageRank? Share them in the comments section!