One of the things that is evolving in my philosophy of SEO is how I look at the role of trust. Trust was not something that was important during AltaVista’s heyday, when keyword density was king, or even during the early days of Google when PageRank in its purest form ruled the day.
Another big thing in my mind these days is that the number of factors involved in ranking algorithms, and the importance of each factor, have changed. When the PageRank paper was published, you pretty much had a blueprint for how it all worked. However, knowledge is power, and in this case the power was in the hands of the spammers.
As a result of these factors, numerous patents have been published by each major search engine on a variety of topics related to ranking, yet these patents no longer provide a clear roadmap to ranking algorithms. They provide hints as to what the search engines could choose to use as ranking signals, but they don’t tell us what they do use. For this we have to rely on intuition, judgment, and testing. In evaluating ranking signals I believe there are two major factors:
- Noisiness of the signal. Does a strong positive always, or nearly always, mean a good, relevant site? Does a strong negative always, or nearly always, mean a poor, or less relevant site?
- Importance of the signal. Assuming that we have a signal that is not noisy, how significant an indicator is it when compared to other signals? What made links such a powerful element is that they were, and still are, a powerful indicator of relevance and quality.
One example of a noisy signal is Bounce Rate. In principle, the idea is that when a user goes to a site and returns to the SERPs after a relatively short period of time, this is an indicator that the result was not a good one. But the problem with it is that on a reference search (e.g. zip code for Charlotte) the user may have gotten what they wanted in just a few seconds.
One of the signals that I think has low noisiness and a high degree of importance is trust. One important paper on this topic was published in 2004 by Yahoo! and Stanford University. The paper was titled Combating Web Spam with TrustRank. The paper proposes that the search engines use human editors to identify a set of highly trusted seed pages. Then, “once we manually identify the reputable seed pages, we use the link structure of the web to discover other pages that are likely to be good”.
The general notion is that the closer a web page is to a highly trusted page (closer as measured in number of link hops), the more likely it is to be a trustworthy page. You can think of a TrustRank damping factor that reduces the overall trust level of a page based on the number of hops from the human-reviewed seed sites. The paper also suggests that links placed on pages with lots of links (even if they are on one of the human-selected seed pages) tend to be placed with less care than links on pages with very few links. As a result, the trust communicated by two selected seed pages can differ.
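To make the idea concrete, here is a minimal sketch of TrustRank-style propagation. The graph, the seed set, and the 0.85 damping value are all illustrative assumptions, not anything the search engines have confirmed; the point is just that trust decays per hop and is split among a page's outlinks, so link-heavy pages pass along less trust per link.

```python
def propagate_trust(graph, seeds, damping=0.85, iterations=20):
    """Toy TrustRank-style propagation.

    graph: {page: [pages it links to]}; seeds: set of hand-picked trusted pages.
    Seed pages hold trust 1.0; everything else starts at zero.
    """
    trust = {page: (1.0 if page in seeds else 0.0) for page in graph}
    for _ in range(iterations):
        incoming = {page: 0.0 for page in graph}
        for page, outlinks in graph.items():
            if not outlinks:
                continue
            # Trust is damped on each hop and divided among the page's
            # outlinks, so each link on a link-heavy page carries less.
            share = damping * trust[page] / len(outlinks)
            for target in outlinks:
                incoming[target] += share
        # Seeds keep their full trust; other pages take what flowed in.
        trust = {page: (1.0 if page in seeds else incoming[page])
                 for page in graph}
    return trust
```

For example, with one seed linking to two pages, each of those pages ends up with 0.85 / 2 = 0.425 of the seed's trust, and a page one hop further out gets 0.85 × 0.425, illustrating both the per-hop decay and the split among outlinks.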
The researchers who wrote the paper on TrustRank also authored an interesting paper on a concept they call spam mass. This paper lays out a method for calculating the percentage of a web site’s total PageRank that results from being linked to by spam pages. The higher this ratio is, the greater the likelihood that the site is itself a spam site. There are some obvious problems with this idea, as your competitor could buy links to your site from thousands of poor-quality sites and potentially trash your rankings by giving you a high spam mass. Nonetheless, the concept is an interesting one.
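The ratio at the heart of spam mass is simple to express. In this rough sketch, the inputs (per-link PageRank contributions, each labeled spam or not) are hypothetical; a real engine would derive both the contributions and the spam labels from its own data.

```python
def spam_mass(contributions):
    """Fraction of a site's inbound PageRank that comes from spam pages.

    contributions: list of (pagerank_contribution, is_spam) tuples,
    one per page linking to the site.
    """
    total = sum(pr for pr, _ in contributions)
    if total == 0:
        return 0.0
    from_spam = sum(pr for pr, is_spam in contributions if is_spam)
    # A ratio near 1.0 suggests the site's rank comes mostly from spam.
    return from_spam / total

# e.g. three inbound links, two of them from known spam pages:
links = [(0.6, False), (0.3, True), (0.1, True)]
```

With those made-up numbers the ratio comes out around 0.4, i.e. roughly 40% of the site's inbound PageRank arrives via spam, which also shows the vulnerability described above: anyone who points enough spam links at you pushes that ratio up.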
Once you start thinking about these things, it is easy to come up with fresh ideas. For example, the trust damping factor you might apply for each link between a given page and the highly trusted seed pages could vary depending on the TrustRank level of the intervening web pages. It could also vary depending on the trust level of the domains on which those pages reside. For example, the home page of http://www.usa.gov may be deemed to be a highly trusted seed page.
In three hops you may find yourself on pages fairly well removed from the seed page. But if those three hops take you to a different page on USA.gov, do you lower the TrustRank as much as you would if you had transitioned to a completely different domain? Maybe not. You can also think about the notion of “Reverse TrustRank”. This is the notion that if your site links to spammy sites, this should lower its TrustRank. This thought should provide ample motivation to make sure that you take care not to link to any bad sites. Better still, you should screen your site for this on a regular basis. After all, the quality domain you link to today may have a different owner tomorrow, and that new owner may have poor intentions. Don’t let this happen to you!
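The domain-aware damping idea above can be sketched in a few lines. The 0.95 same-domain and 0.80 cross-domain multipliers are invented for illustration; the only claim is the shape of the idea, namely that a hop which stays on the same domain could be damped less than a hop to a different one.

```python
from urllib.parse import urlparse

def hop_damping(from_url, to_url, same_domain=0.95, cross_domain=0.80):
    """Per-hop trust multiplier: damp less when the link stays on-domain."""
    if urlparse(from_url).netloc == urlparse(to_url).netloc:
        return same_domain
    return cross_domain

def path_trust(path, seed_trust=1.0):
    """Multiply per-hop damping along a chain of links from a seed page."""
    trust = seed_trust
    for from_url, to_url in zip(path, path[1:]):
        trust *= hop_damping(from_url, to_url)
    return trust
```

Under these assumed values, two hops that stay on www.usa.gov leave more trust (0.95 × 0.95) than two hops across unrelated domains (0.80 × 0.80), which is exactly the "maybe not" intuition in the paragraph above.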
So we don’t have as clear a roadmap to ranking factors as we used to, but we can still use common sense. Take the time to learn the basic tactics that search engines can use to do their job. Don’t get overly hung up on any one factor. Most of the time, if you build a great site and promote it properly you should achieve good results. But knowing what things search engines are likely to rely on can really help you understand what you need to do to improve your rankings.
In addition, the way the search engines use and measure trust most likely varies significantly from the papers referenced above, but my own experimentation convinces me that they are measuring it, and using it in a significant way. We factor this into our thinking at STC on a day-to-day basis.