Eric Enge and Bill Slawski talk about Search Engine ranking factors
Podcast Date: October 19, 2007
The following is a written transcript of the October 19, 2007 podcast between Bill Slawski and Eric Enge:
Eric Enge: Hi, I'm Eric Enge, the President of Stone Temple Consulting; you can see our website at www.stonetemple.com. We are here today with Bill Slawski, the owner of the well-known SEO by the Sea blog and the Director of search marketing at KeyRelevance, and we plan to talk about search engine ranking signals. You can see the SEO by the Sea blog website at www.seobythesea.com, and you can see the KeyRelevance site at www.keyrelevance.com. How are you doing today Bill?
Bill Slawski: I'm doing fine Eric, how are you today?
Eric Enge: Hey, I'm doing great.
Bill Slawski: That's good to hear.
Eric Enge: So, let's dive in. One of the things I really like that you've posted about a couple of times, I think a year ago October and more recently, was the different kinds of ranking signals that search engines can use. You covered twenty different signals in each of the two posts. I'd really like to get your thoughts on which of those signals search engines might be using now that aren't necessarily things everybody understands they are using.
Bill Slawski: Okay. Just a quick thing about the genesis of those posts; I tend to cover a lot of different patents that come out from the search engines every once in a while. I find it useful to try to tie a lot of those posts together and extract ideas from them. It seems like that was a good opportunity; looking at the way search engines may take results and re-rank them. I really hadn't seen anybody do that in the industry, so I wanted to last October put a lot of those together and some of them were very obvious. We probably want to skip over those really quickly, but it's just things like filtering duplicate content out, removing multiple relevant pages from the same site, etc. Sometimes you get those indented, sometimes you don't and you see a little link, click here to see more.
There are other ones that are happening that are a little bit less obvious, like sorting for country-specific results. So, if the search engine thinks that you are in the UK, it might present results in a little bit different order than if it thinks you are in the US. You can also set language preferences on your browser or at the search engine level for most search engines, so that if your preference is English and you type in a word that might have meaning in more than one language, like rendezvous, it's going to try to give you English results rather than French results.
With my background in law and the legal field, a lot of legal terms actually have French origins. Appeal, appellate, things like that, and terms like defendant, and that can get a little bit confusing if the search engine doesn't know which language you are speaking. There are a lot of re-rankings that happen in a smaller niche area, like changing results based on commercial intent. I am not sure that any search engine really has folded that into its main search, but an example of this is Yahoo's Mindset.
When you go there, you see a little slider bar, and you can slide the bar back and forth from shopping to informational, and it re-orders results. Microsoft has produced a lot of papers on commercial intent. They may or may not use those today. Some queries are informational in nature, some are transactional.
Eric Enge: Right. So, when someone uses the word buy in a query, that's obviously transactional.
Bill Slawski: Right.
Eric Enge: Right, as opposed to "digital camera reviews" which is obviously informational.
Bill Slawski: Right. So, with the search engine, if the query is informational or transactional, it can rank results based upon that type of intent. One of the other things we talked about is looking at more than one query, where you have a query session. If the types of queries that you use one after another suggest you are looking for commercial results, will it change the order of results to give you more commercial results? Say you type in, let's say, Portland, Maine, and then you type in seafood restaurants. Perhaps the search engine starts giving you search results that have to do with seafood restaurants in Portland, Maine, and gives you overviews of places like that.
Eric Enge: Right. I think there is already evidence that they are aware of your location based on reverse IP lookup.
Bill Slawski: Not just reverse IP lookup, if you are using a cell phone it might do cell tower triangulation. They might use global positioning satellite information; they might have a query history, if they are collecting web history and search history, showing that you do a lot of searches in that geographic area related to them.
Eric Enge: Right. Well, it gets really interesting if you are sitting in say Boston, and you just did a query on Portland Maine, and then do a query on seafood. So, are you really looking for seafood in Boston or are you looking for seafood in Portland Maine?
Bill Slawski: Right. And, we've been hearing a little bit about generating advertising that's taking advantage of consecutive queries to show ads related to that stuff. Could we see the same type of thing from organic results? It's possible. A lot of these ranking factors that I've written about are, to one degree or another, being used or are very close to being able to be used. But, like I said, not always within the main context of the search engine, maybe within a smaller sphere like Mindset or Yahoo's Y!Q.
It can take certain contextual information from pages that the website owners can tag. This page may be about thirty different restaurants in Boston, but they've only tagged two of them. So, you have to really learn about those. So, the website owner is determining some of the relevance.
Personalization is another area where you've got to turn it on to get the full impact. But, there may be things going on behind the scenes during the normal regular web search that influence the results that you see, and that aggregate data from users who search for things similar to what you search for and who tend to select pages similar to the pages you select. You may not have to be signed in or logged into a personalized service for them to carry that information over from one search to another.
It may be done based upon, say, triples of data. They see other searchers who perform searches similar to yours, and go to pages, select pages similar to the pages you select. There may be 2, 3, 4, 5 different queries in a row in a query session. So, that may influence the next pages that you see.
Eric Enge: Right. They know, for example, what the majority of people clicked on when they did three queries similar to yours and then did the fourth query. They'd follow that pattern, and they can potentially take that result, if it wasn't the No. 1 result, and make it the No. 1 result by the time you get to it.
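(Editor: the session-based promotion Eric describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a description of what any search engine actually does. The function name `rerank_by_session_clicks`, the shape of `click_log`, and the example URLs are all hypothetical.)

```python
from collections import Counter

def rerank_by_session_clicks(results, session_queries, click_log):
    """Move the result most clicked by similar past sessions to the top.

    results: ranked list of URLs for the current (last) query.
    session_queries: the searcher's queries in order, current query last.
    click_log: hypothetical records of (earlier_queries, query, clicked_url).
    """
    history = tuple(session_queries[:-1])
    current = session_queries[-1]
    # Count clicks only from past sessions with the same query sequence.
    clicks = Counter(
        url
        for past_history, query, url in click_log
        if past_history == history and query == current
    )
    if not clicks:
        return list(results)  # no matching sessions, keep the original order
    top_url, _ = clicks.most_common(1)[0]
    if top_url not in results:
        return list(results)
    # Promote the majority choice; everything else keeps its relative order.
    return [top_url] + [url for url in results if url != top_url]

# Made-up data: two of three past "Portland, Maine" sessions clicked b.example.
log = [
    (("portland maine",), "seafood restaurants", "b.example"),
    (("portland maine",), "seafood restaurants", "b.example"),
    (("portland maine",), "seafood restaurants", "a.example"),
    (("boston",), "seafood restaurants", "c.example"),
]
ranked = ["a.example", "b.example", "c.example"]
print(rerank_by_session_clicks(ranked, ["portland maine", "seafood restaurants"], log))
```

(A real system would presumably match similar rather than identical query sequences and blend this with many other signals; exact matching is used here only to keep the sketch short.)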
Bill Slawski: There is a transition there, going from a straightforward keyword matching type search to a more of a recommendation type search.
Eric Enge: What's your sense as to how much of that recommendation model is actively in place now?
Bill Slawski: It's hard to tell. I think we are moving more and more towards it. Part of that is triggered by building the statistical model, and doing some machine learning. The more searches that people conduct, the more information that the search engines are able to take and use in a meaningful way, the more you'll see there. The search engines have an incredible amount of data, and when we hear talk of infrastructure updates at the search engines, one of the things that we need to consider when they are talking about infrastructure is their ability to switch on and switch off different ranking mechanisms.
Eric Enge: Right, given the global distribution of their data centers.
Bill Slawski: Yes. They may assign different weights to different queries, different categories, different classes of websites, different searchers, and it's possible when you are doing a search that you even have the results from more than one ranking algorithm in front of you at once. Your choice which you click on may not increase the rank of that website, but rather increase the use of that algorithm that produced that website. (Editor: think about this comment a bit, it's a real mouthful).
Eric Enge: Right. Yeah, there are an intense number of things they can look at. One thing that I'd love to get your take on, for example, is how often a particular website is bookmarked by someone, and how that can affect ranking.
Bill Slawski: There is a lot of user behavior information that search engines can collect, and user bookmarks is one of them. The amount of distance somebody scrolls down a page, the amount of time somebody spends on a page before they return to search results, whether or not they will come back to a page after looking at some other pages. Those are all things that search engines use to say hey, this is an important page, this isn't an important page; this page matches well with this particular query, etc.
Bookmarking is one of those things; it's an active browsing activity that's outside the normal scope of the search engine. But, if you build in a bookmark tool, or if you watch traffic carefully through ISP information, or toolbar information you can make use of that data. Ask came out with a patent application, where they talk about looking into traffic, and seeing where people go, and seeing how long they spend at places. It even mentioned watching along as people used other search engines, and seeing what results they clicked for specific queries on those search engines.
Eric Enge: Right. The interesting thing to me is when you think about something like bookmarks, say Google's own bookmarks. Certain people promote and put on their page something that says "bookmark us". It seems to me that that would introduce a significant amount of noise into the process, in terms of using that signal compared to sites that don't have a "bookmark this" button on their content. It makes it a very difficult signal to place too much weight on.
Bill Slawski: Webmasters have always come up with ways to extend their relationship with visitors. Newsletter subscriptions, email update forms or buttons, "send this page to a friend" emails. You've been able to save pages on your browser as a bookmark. Bookmarking services like del.icio.us and others have been around for a little while. There were similar bookmark services that came out in the late '90s that didn't use a tagging system, but they were around. I think what you have to do when you talk about sites with bookmarking buttons is recognize that if that's being used as a signal, it's just one signal of many.
One of the other patents that was interesting was actually one of three that talked about building profiles for web pages and creating traffic estimates; it was originally written in the context of paid search. But, it talked about classifying different types of sites by subject, by volume of visits, by search, by bookmarking, and so on, to try to get a sense of what the site was like, and build a profile for it. A more recent patent application talked about working on profiles for sites based upon adding site search to the site, and learning what the site was like based upon how people used that site search, what they looked for, how successful they were in finding things, and so on.
We have other tools that the search engines are using, such as Google Analytics, Website Optimizer, and so on. So, they are learning a lot about how people interact with individual websites, being able to profile those websites, aggregating the profiles, finding the sites that are similar in a lot of ways that aren't taking advantage of, say, Google Analytics or Website Optimizer, or a "bookmark this page" button, and so on. They are still able to find enough points of similarity that they can put the sites together, cluster them together, so they know if these sites are somewhat alike. Bookmark activity by itself is just one signal amongst many.
Eric Enge: Right. So, the individual signal may be noisy, but the cumulative effect of all the signals isn't.
Bill Slawski: Right.
Eric Enge: Of course, the other thing you could do is group sites that have "bookmark this" buttons together with other sites that actively request bookmarks, and weight them differently. So, the value of their bookmarks is different than for the sites that don't have such buttons.
Bill Slawski: Absolutely. It's the same as if you take a small Alzheimer's site that deals in one particular subject matter. It's going to have a different type of profile, and provide different signals than, say, a blog. You have different quality signals, and signals of importance, with the blog, like the number of RSS subscribers. By having multiple quality signals and grouping a big number of sites together, you can compare them based upon that.
Eric Enge: So, let's talk a little bit about how the different search engines are approaching this, at least Google, Yahoo, and Microsoft. Do you have a sense as to how Google is looking at one type of signal set, Yahoo at a different type, and Microsoft at yet another type, in terms of their emphasis?
Bill Slawski: There are different types of strategies that they may be pursuing. We know all the major search engines focus upon keyword matching. They may try to find sites that match the intent of the searcher, and one of the basic tenets of information retrieval is trying to be as precise as possible, and trying to recall as many pages as possible that might be relevant. We've heard from some of the folks at the search engines that those are goals, and they are a long way from fulfilling them, but they are trying.
You have Google, Yahoo, and Microsoft; they can't be clones of each other. They can't do things exactly the same way. They are trying to reach the same goal, but their paths to these goals are different. We have patents that may exclude one from doing the same thing that the others are doing, and we have different corporate cultures. You take a look at Yahoo, which started as a directory and portal, and a lot of that remains. They try to build sites or try to acquire sites that have a strong community, such as Flickr, del.icio.us, and so on, where they are user based, and there is user content generation behind them.
Google started out with search, and the idea behind most of the services Google provides isn't the generation of content, but they both try to use user generated content in what they do.
Eric Enge: Yahoo appears to have more assets in that regard, because they have del.icio.us, and Flickr, and Yahoo Answers, for example. And so, it strikes me that they will be faster to leverage social data. I could be wrong, but it's just something that struck me from looking at it from the outside.
Bill Slawski: It may seem like that on the surface. The goals, I think, in all cases are very similar: trying to get people answers that match their intent.
The approach is different, and part of it I think does have to do with that background, it's a different background. If I want to add Microsoft to that equation, Microsoft, I think has an approach that tries to be more contextual. If you are searching for a certain type of information they try to understand the intent behind the search.
Eric Enge: That's an outgrowth of their neural net algorithm, right? It allows them to look at the data in a different way.
Bill Slawski: The RankNet algorithm, which uses a machine learning approach. I think some of it comes from people from the operating file system search side coming over to the search engine and saying "okay, when we worked on calendaring searches, we tried to find the date of all events; when we worked on email searches, we tried to find the date of the email we've sent or the date of the email we just received". So, the different contexts of searches had different best answers. So, how can we take that and apply that to search in a search engine?
I think, trying to look at context, there are the quality signals that they look for in web pages. It tends to be based more upon on-site results, but they also look at on-site factors of pages pointing to other pages to see how good or how poor a referral that is. The PageRank method of ranking pages is based upon academic citations. The more citations that point to a page the better, but higher quality citations pointing to the page are an even better signal. Everybody can point to academic papers that may be infamous because they do something wrong. So, we want to point to the high quality stuff, and we want to count citations that are high quality themselves.
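(Editor: to make the citation idea concrete, here is a minimal PageRank-style power iteration in Python. It is a sketch of the textbook formulation, not Google's production system. The damping factor of 0.85 is the value conventionally quoted for the algorithm, and the tiny link graph and page names are made up for illustration.)

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook PageRank by power iteration.

    links: dict mapping each page to the list of pages it links to.
    Returns a dict of page -> score; the scores sum to 1.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with the same small baseline.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its score on, split evenly among its links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score everywhere.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank

# "x" and "y" each receive exactly one link, but x's comes from a
# well-cited hub and y's comes from a page that nobody links to.
graph = {
    "p1": ["hub"], "p2": ["hub"], "p3": ["hub"],
    "hub": ["x"],
    "lonely": ["y"],
}
scores = pagerank(graph)
print(scores["x"] > scores["y"])
```

(Both pages have one citation each, yet "x" ends up with the higher score because its citation comes from a page that is itself well cited, which is the point Bill makes about counting high quality citations.)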
Eric Enge: Right. Did you, by the way, see what Microsoft did with their shopping search? They added this thing which is automatically scanning user reviews, finding criteria that a lot of users talked about for a given product, say a digital camera, and then determining how many of those were positive versus negative, and having a rating for those criteria.
Bill Slawski: There is a hidden aspect to that that doesn't get a lot of discussion. Are you familiar with visual segmentation papers, and patents, or VIPS stuff from Microsoft?
Eric Enge: I am not, tell us about it.
Bill Slawski: Okay. The idea is that, for a lot of search, we may search upon content found at specific URLs. So, we rank an individual page, or we take content from the individual page. That's not necessarily the best way to go about it. Pages can be about more than one topic, right? You can have a page that's about restaurants in the Grange Village that actually reviews twenty different restaurants. So, how do you use those individual reviews in local search for Google? How do you take those individual reviews and break them down, and point them to the different restaurants?
In a local search database, Google came up with visual gap segmentation, where they are looking at not just the HTML code, but also the white spaces themselves. They are breaking the pages into different parts, and pointing the different sets of paragraphs at the local search results, so that if you look at reviews in Google Local, you might find reviews coming from pages that have reviews for lots of different restaurants, or hotels, or different types of things.
Microsoft has talked about breaking pages down into segments, and looking at where the links appear upon those pages. That was something they were talking about a couple of years ago. They have been talking about what they call object level search, where instead of looking at the full page they are looking at parts of the pages, and they are saying "this page shows thirty reviews for this product, let's break it down into individual reviews, and count them each as a separate entity".
It may link to the particular product, it may just mention it, but we are going to break this down, and we are going to look at the different products, and we are going to count all the reviews, and we are going to look to see if the reviews are positive or negative. If the site we are looking at has a ranking system, one star through five stars, we might take that information. The idea is that they are indexing information within blocks, within segments in a page, instead of on a page level. Google's talked about doing that, Yahoo has recently, and a couple of patents talked about doing that too. Google talked about something called Agent Rank at the beginning of the year.
This is where they break down a page into segments, and look for the authors of those different segments. So, if you have a blog post that has thirty comments on it from people other than the person who wrote the original post, you have thirty segments, thirty objects. In Agent Rank those different objects may be ranked differently based upon the reputation of the person who wrote each one. So, we are again on a smaller scale than page ranking, which is an interesting approach.
Eric Enge: It's a way of dealing with user generated content, right? Micro-analyzing the individual components. I think that's where that spatial analysis comes in; it's just simply recognizing the individual component.
Bill Slawski: With spatial analysis they have talked about being able to distinguish between header and footer, main content, sidebars, where a link within the main content on a page is worth more than a link on the sidebar, maybe. Nothing says that PageRank has to be evenly distributed amongst outgoing links.
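(Editor: Bill's point that a page's value need not be split evenly among its outgoing links can be illustrated with a short Python sketch. The segment names and the weights in `SEGMENT_WEIGHTS` are invented for illustration; nothing in the discussion says what weights, if any, a search engine uses.)

```python
# Hypothetical relative weights per page segment; the numbers are made up.
SEGMENT_WEIGHTS = {"main": 1.0, "sidebar": 0.3, "header": 0.2, "footer": 0.1}

def distribute_link_value(page_value, links):
    """Split a page's value among its outgoing links by segment weight.

    links: list of (target_url, segment) pairs found on the page.
    Returns a dict of target_url -> share of page_value.
    """
    weights = [SEGMENT_WEIGHTS.get(segment, 0.0) for _, segment in links]
    total = sum(weights)
    if total == 0:
        return {}  # no links in any recognized segment
    shares = {}
    for (target, _), weight in zip(links, weights):
        # A target linked more than once accumulates each link's share.
        shares[target] = shares.get(target, 0.0) + page_value * weight / total
    return shares

page_links = [
    ("a.example", "main"),
    ("b.example", "sidebar"),
    ("c.example", "footer"),
]
shares = distribute_link_value(1.0, page_links)
print(sorted(shares, key=shares.get, reverse=True))
```

(With an even split each of the three links would pass along a third of the page's value; here the link in the main content passes along the most and the footer link the least.)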
Eric Enge: Right. The really simple stuff is that the higher up a link is in the page content, the more valuable it is, but that's too simple really for evaluating a link's value. It seems intuitively more interesting to evaluate whether the link is embedded in the content. That removes any doubt, well, it removes most doubt I should say, as to a link being purchased. Links in the left navigation or on the right sidebar are certainly more subject to being a purchased link, and they are certainly more likely to not really be integrated into the unique content of that page.
One funny thing I've got to tell you about is that I had a client who had a link which they had bought, and when we work with clients we always work with people to replace those paid links with natural links. Put that aside for a moment; the link was right in the middle of this content, so it was like this perfect hard-to-detect link, except after the link it said "this is really a great site I know, because they paid me to say so". So, a human review component exposed that link rather quickly, I am afraid.
Bill Slawski: It's possible that an automated review system might target a window of content around your text link itself and look for certain words that might trigger either flagging the content for human review or might say don't count this one. That type of extended window around the link is something I have seen alluded to.
One of the other differences in strategies between the search engines is how they implement universal search. One of the things that interests me is how they decide which results to show from where, and with Google we've had a number of possibilities, and a number of different models. The format of the question itself may trigger certain results, and you could have a query such as "define:" and then some word, and you will see a definition. For question and answer type stuff, you will still get a definition. If you ask a question, ask in a question format something like what is Derek Jeter's date of birth, you'll get an answer in a Q&A type format. We are seeing more information extraction ideas showing up in some of these patents. For example, simply choosing which result to show. Ask.com does it differently. They segment their results pages into different sections, and so instead of having to choose whether or not they are going to display a certain result, they'll just show most of them. If you search on the name of a famous person, they are going to show celebrity type stuff.
Eric Enge: Right. They are more likely to show images. This is something that Microsoft focused on in their Searchification announcement, not only showing more images, but showing rankings for celebrities, and more.
Bill Slawski: We have had certain sections on the page that are likely to show certain different types of things; we have a section which shows query refinements, and we have an images section. With Google they are looking at a statistical model, the user query, and their user query repository. For example, consider people who have been searching for lots of pictures of lions.
The search engine will show pictures of lions, and perhaps there have been many stories in the news recently about lions. Yes, let's show those news results in there somewhere. It might be the Detroit Lions, but it's still lions. We are getting user behavior and user information influencing what gets shown in those results, and it's filling in the gaps on a page of different types of results. So, that's a different strategy; that's one way that the search engines differ.
Eric Enge: We should wrap up, and the last question I would like to ask you is for the average webmaster out there. How do they deal with all those things in terms of trying to understand it, and how it affects what they do, or what is the smartest thing to do, because what the search engines keep telling us is to make great content, and for users, and call it a day.
Bill Slawski: It's just such a broad statement that it's really not helpful. Success really means having a good marketing plan and a good business plan, and your marketing plan should include more than just what you do online. But, when you go to the online part of it, it doesn't hurt you to set up a strong foundation for success with your website in terms of making it easy for search engines to crawl, having unique content on each page, unique titles, meta tags, and so on.
Using the language that your audiences are likely to use to search for the stuff, understanding who your audiences actually are; making it easy for them to complete tasks, making a usable website, being persuasive without being overbearing, understanding where your potential customers like to go online, maybe advertising there, or participating if it's a community or something like that. You go fishing where the fish are, kind of thing.
The other thing I think is important is recognizing that there are different types of searches people conduct. People search for information about stuff; they try to conduct transactions; they look for ways to navigate to stuff. We've had three types of queries based upon that. Most searches tend to be informational; people want to find out how they can do something themselves and save money, or just find the information itself. So, if you have an Ecommerce site that doesn't help people use their information, help them make informed shopping decisions, you are not going to get so many queries. You are not targeting as big an audience.
But, if you make a site that's engaging, that makes it as easy as possible for people to shop, but also helps them learn about what they may be buying, you are that much more likely to succeed. In these days of universal search, thinking more about the images that you use, adding video, thinking about audio, using podcasts like your podcast here, those are good ideas.
You're creating an interesting user experience for your visitors; you are providing them different ways to learn about what you provide. When you throw pictures on your website, make them good, strong, interesting pictures that help supplement the content that you feature on your page, but that can also stand alone, that by themselves are interesting and might attract people to your web pages; the same with videos, the same with podcasts.
Eric Enge: Indeed. Well great, Bill. Thanks for coming to speak with me and the audience today. I could talk about this for hours on end, but that would make the podcast a little long. So, thanks again.
Bill Slawski: Oh, thank you very much Eric. It's been a pleasure.
About the Author
Eric Enge is the Founder and President of Stone Temple Consulting (STC). STC offers Internet marketing optimization services, including SEO, Social Media and PPC optimization, and its web site can be found at: https://www.stonetemple.com.