Adam Lasnik Discusses WebSpam with Eric Enge
Published: February 4, 2008
What follows is an interview with Adam Lasnik, who is a Search Evangelist at Google. Adam has become extremely well known in the community as a new voice for communications from Google to the webmaster community. Here is his bio:
Before there was a public Internet, Adam was e-mailing. Before there was Netscape or Internet Explorer, he was surfing the Web. He's written comprehensive search engine optimization reports, managed sponsored ad campaigns for Fortune 500 companies, and provided broad communications consulting to successful startup companies. Adam earned MBA and law degrees -- focused on Global Electronic Communications and Commerce issues -- and then moved to Germany to serve as an entrepreneurial consultant to a multinational IT company.
Grateful for the international experience but fascinated by the burgeoning American dot.com scene, he hopped over to San Francisco and joined the high tech PR firm Niehaus Ryan Wong as an Interactive Strategist, helping clients understand and leverage the power of online communities. When the dot.com boom turned to bust, Adam spent the next years broadening his online communications and advertising chops with a mix of small and large companies. In 2006, Adam became Google's first Search Evangelist, dedicated to building stronger relationships between Google and Webmasters.
Eric Enge: Let's talk about hit counters to start. We have seen some search terms where a large percentage of the sites that come up in the top 10 have a very heavy percentage of their back-links coming from a hit counter. It seems to me that these links are obviously highly irrelevant, and they are not really endorsements of their content.
Adam Lasnik: Well, first as you know, what you see in "Link:" is not always fully representative of the back-links that we know about. As I understand it these are hidden links, so if someone puts a counter on their page, and there is a link there that the site owner doesn't actually know about, they are not editorially giving that link. And, hidden links are definitely against our webmaster guidelines, so right there that I think confirms your thought that this is not a practice that we condone or support. And, when we are aware of it, we will take action against the site that is establishing these counters and distributing them.
Eric Enge: Yes, indeed. So, let's say someone has a Tupperware site, and then they start distributing a widget with unrelated content, like a hit counter. They allow people to use this hit counter, and the links back to their site are not invisible, but they are not relevant. Even if it's visible, it would seem to me that, that's an edgy practice.
Adam Lasnik: It is.
Eric Enge: After all, there can't be any reason for them to just want to distribute a hit counter except to get the links.
Adam Lasnik: This definitely is entering a little more of a grey area, whereas hidden links are an absolute no-no. Relevance also is important, but it's not mandatory. So, in this particular case, what we were trying to do in the aggregate is ask ourselves what type of links would this site be getting naturally, because it's all about the natural linking structure.
In your example, it seems most likely these folks that are putting that widget on their blog or their site would not be choosing to link to the other sites otherwise. So, while it may not necessarily be a clear webmaster guideline violation, it's something that we probably would not look very favorably upon. Because these links are essentially still not being editorially given; they are being snuck in even if they are visible.
Eric Enge: Right. It's basically compensation for using the hit counter in this case. There is barter going on rather than an editorial vote.
Adam Lasnik: I think we want to also distinguish this from other situations whereby a company ("great counters for you") makes a really neat hit counter with great graphics, and Wizbang whatever, and, people do put that counter on their site, and it says if you'd like to get a counter like this, check out great counters for you.
That's related, and that is a link that the person might not have otherwise necessarily added on their site without the benefit they received, but, in this particular case it makes sense. Here, relevance does highlight a situation which is realistically more okay.
Eric Enge: Right. We are clearly in one of those areas where the line is a little blurry, right?
Adam Lasnik: Right.
Eric Enge: Let's say you have a company that currently hasn't been distributing a hit counter and that's competing with some of the companies in the examples that I emailed you. They see somebody doing this, and those companies are ranking first, second, third, fourth, or fifth for big terms in their space. It's one of those situations where the website owner has to wonder if Google is going to cover their back if they don't do this shady practice.
Adam Lasnik: Sure. If I were talking to these folks that are looking at this particular practice and are frustrated, I would mention three things to them. First is a caveat, what you see or what you may believe is the reason for high ranking is not actually always the case. It's quite possible that we have already discounted all or the majority of those links. But, they may have other really high quality back-links from reputable sites, from relevant sites that are helping boost them towards the top.
Now, that may not be the situation. It's possible that at least temporarily this has given them a boost. So, I would mention two other things. One is that we certainly encourage any Webmaster, any Google user in general, to report spammy practices: go to our spam report page within Webmaster tools, or just report the spam from Google using the external tool.
Eric Enge: Right.
Adam Lasnik: Or, submit a paid links report, because as you said this does smack of bartering for payment. The third thing I would mention is that even when certain techniques, whether they are grey hat or black hat may succeed in a given time period, it is certainly not a sustainable practice, and not a practice that is a good one to emulate. If any of these other folks were thinking that if they implement such a counter that they will rocket to the top position, they need to remember that what works today may not work tomorrow, but more importantly this is a practice that we frown upon. We use all the means we have available, algorithmic and manual, to detect, and to make appropriate adjustments based upon this tactic.
Eric Enge: Moving along, the common belief is that anchor text from inbound links is a powerful factor. Is it possible that you can get too much of a good thing where your distribution looks unnatural? After all, the most likely thing people are going to use for anchor text when they link to you is the name of your site, the name of your business, or the name of your page.
You could end up with very odd looking distributions if you start steering the anchor text. Can you talk a little bit about what kind of problems that causes?
Adam Lasnik: First let me confirm that anchor text is important. It does play a role in how we view pages, and especially so in some particular cases. For instance, with videos, images, or flash movies, we lack other signals that we can use to help us understand what that page is about.
Useful descriptive anchor text can be great, not only for the user who gets a better idea what he or she is going to be clicking through to, but also for Google to better understand what that page is likely about. On the flip side of things, what I think you are touching upon here is what I like to refer to as a smell test: if something doesn't smell right, smells fishy, doesn't look or seem natural, that's certainly going to raise a red flag for us.
Someone else asked me about a similar issue with regard to anchor text a while back, so let me give you a real-life example here. Say you are in a bar, and someone walks up to you and says, "try Minty-mint gum, the refreshing cool choice". You are like okay, and then a second person a few minutes later walks up to you, and says "hey, try Minty-mint gum, the refreshing cool choice", and then a third person, and a fourth person, and so on. You start to get both annoyed and suspicious. It's not necessarily because of what they wrote, or in this particular case what they said. It's the unnatural sound of it.
We recognize that in many cases like you highlighted there aren't that many different ways to link to a video or link to a page. People are going to often times on their own accord, link to it in the same way, and that's fine. But if we see a whole ton of links, with very similar or identical anchor text, and that those account for the majority of links to a particular page or resource, that's going to fail the smell test, and that's going to raise red flags; and we would reserve the right then to not count most or even all of those links.
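The smell test described above can be made concrete with a minimal Python sketch that measures how concentrated a page's inbound anchor text is. This is an illustration only, not Google's method: the helper name `anchor_text_concentration` and the sample anchors are hypothetical, and the interview gives no specific threshold.

```python
from collections import Counter


def anchor_text_concentration(anchors):
    """Return (most_common_anchor, share) for a list of anchor-text strings.

    Anchors are compared case-insensitively with surrounding whitespace removed.
    Hypothetical helper for illustration; not how any search engine scores links.
    """
    normalized = [a.strip().lower() for a in anchors if a.strip()]
    if not normalized:
        return None, 0.0
    text, count = Counter(normalized).most_common(1)[0]
    return text, count / len(normalized)


# Made-up sample: four of the five links use the same phrase.
sample = [
    "Minty-mint gum",
    "minty-mint gum",
    "Minty-Mint Gum ",
    "minty-mint gum",
    "a gum review I liked",
]
top_anchor, share = anchor_text_concentration(sample)
```

A site owner could run something like this over an export of their own back-links to see whether one phrase accounts for the majority of them, which is the pattern described above as failing the smell test.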
Eric Enge: Right. Let's say you have the perfect domain for a space. Let's say the name of your business is Blue Widgets, and you sell blue widgets, and this is a killer search term in your space. You are going to get an awful lot of links to your site with the anchor text "blue widgets". Let's then say that you have a competitor, which is "Thomas Dewey and Sons".
Doesn't "Blue Widgets" gain something of an advantage just because they got a really good domain name?
Adam Lasnik: Yeah, I think people overemphasize the value of good domain names, and I don't want to be single-handedly destroying this wonderful multi-bazillion dollar business where domains that are short words or very descriptive words for a business go for bazillions of dollars. But, I would say that while a domain name can be a factor in some ways in how we view sites and how we view links, I would really say that it is a relatively minimal factor.
What's more important is the content on that page, the history of that page and that site frankly. And so, we see often times companies with really wacky names, that aren't even necessarily descriptive of what they do, but they are brand leaders or they provide really great content and tools. They are probably going to get a massive amount of links relative to their space. It's less important that the anchor text link up directly with or even very close to the name of the domain, and, more important that it is related to what the user will find on that page.
What's nice about this is that it's something that companies have a heck of a lot more control over, as opposed to going out and trying to get the best selections from a very limited barrel of perfect domain names, that exactly name what they do.
Eric Enge: Right. So, Thomas Dewey and Sons isn't screwed in competing with Blue Widgets.
Adam Lasnik: There are many companies that are successful, many sites that are successful; that have names that sound neat, they are clever, and they are funny, even though they don't have the critical keywords in them. I would suggest that it's really a negligible factor quite frankly.
Eric Enge: Let's talk a little bit about internal links, specifically I have seen recommendations from Google that a hundred links per page is about as far as you should go. What happens when there are too many links on a page?
Adam Lasnik: There is actually a historical technical reason why we said try and keep it to a hundred links. In the really early days of Google with so many sites out there, we actually had at that time relatively limited resources. In our earlier days, we had to make tradeoffs on how broadly the Googlebot would crawl, and how deeply it would crawl.
We actually had this set limit; if the particular page is greater than a certain length then just grab the first part of that page.
Eric Enge: Right, first 100k or something.
Adam Lasnik: Exactly. I don't remember the exact number, but it was limited to a certain size. We found that the number of links also correlated with that. So, we realized that we would be doing Webmasters a favor by essentially reminding them or urging them to put their content on a number of different pages including your links, so that there is a greater chance we will be able to see, index, and appropriately digest them, and include those pages and the text on those pages in our cache.
Now, at this point, as you might guess, we have a heck of a lot more resources and the Googlebot is a lot more hardy, and it's not getting indigestion after even really, really, surprisingly large pages. However, there is a case for recommending a smaller number of links based upon user interests. What we found is that sites that have a lot of links, two hundred, three hundred, or five hundred, tend to have links that have not been strongly editorially vetted.
We would rather see fewer links that the webmaster has actually looked over, and that they are maintaining to make sure they are still fresh. And, that's not to say that there aren't webmasters who actually are familiar with the three hundred to five hundred great resources that they or their team have checked out.
But then, you get to a point where how many users really want to scroll down on a page particularly as more and more users today are using alternate devices such as mobile phones, screen readers, and so on. For the user it largely makes sense to break these up by whatever criteria makes sense for the user. Also, from a PageRank perspective, having a ton of links means that each one of those links is going to pass less juice.
Let me give another real world analog here. You ask a friend to tell you their favorite movies, and she says "let me tell you my two hundred favorites right now". You are going to view that less powerfully. That's going to have less effect on your perception of those movies than if your friend tells you her top four favorites.
Eric Enge: Right. Of course I can easily come up with examples, where it makes sense to have more than a hundred, or maybe even in some cases more than two hundred links on a page. For example, you have a web page that lists all the high school websites in Ohio. You can either organize those alphabetically right on that one page, or you can provide some sort of alphabetical index that users have to drill down through. My bet is that the users would prefer to see them all on one page.
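The point about each link passing less juice can be illustrated with a small Python sketch based on the simplified PageRank model from the originally published paper, in which a page's score is divided evenly among its outbound links after applying a damping factor. The function name `share_per_link` is hypothetical, the damping value of 0.85 is the figure commonly cited from that paper, and nothing here is claimed to reflect how Google computes rankings in practice.

```python
def share_per_link(page_score, outbound_links, damping=0.85):
    """Score passed through each link under the simplified published model.

    Assumes every outbound link receives an equal share; an illustrative
    simplification, not a description of a production ranking system.
    """
    if outbound_links <= 0:
        raise ValueError("outbound_links must be positive")
    return damping * page_score / outbound_links


# Same page score, spread over 4 links versus 200 links.
few = share_per_link(1.0, 4)
many = share_per_link(1.0, 200)
```

Under these assumptions, each of four links carries fifty times the share that each of two hundred links does, which mirrors the movie analogy above.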
Adam Lasnik: Yes, I can think of other examples as well. I think that that does sound like it's a case where the user may just want to do control F, and search on that page for a particular high school name. That's fine given that the site listing of all those high schools is probably less concerned about giving a whole lot of Google juice to their three favorite high schools.
Your example is of a resource that independent of PageRank considerations quite likely makes sense for the user.
I have seen a case for instance in which a fellow listed a whole ton of swing dancing resources on one page. He could have compartmentalized that, but it's just a bit hard. For example, there are teachers that teach in more than one city, so you can't really divide it up by city, there are sites that contain videos and informative text. So, in this particular case it made sense to list them on one page; he knew them, he had vetted them, he liked them, there you go.
Eric Enge: Right. And then there is a cousin situation where you have 200 links on a page, but there are ten of them that you want to focus the PageRank on, so you take the hundred and ninety of those links, and you NoFollow them.
Adam Lasnik: If you really love those particular sites that you want to link to, what I would recommend would be to actually talk a little bit about those four schools. Did you go to them, or those four resources or whatever; why do you like those, why are those interesting to your users, and actually have some text around them.
That is more likely to capture a user's attention, and what's more powerful about that from a PageRank perspective since it's more likely to capture their attention, they are more likely to go to them. And then, by extension they are also probably more likely to mention them, and link to them on their own site, their own blog, etc.; so by calling attention to those sites even without using the whole NoFollow thing, you are essentially setting up a chain reaction which again in the aggregate is likely to result in greater link love for those pages.
Eric Enge: Right.
Another thing that I have been looking at lately, and it seems to be everywhere, is what I call the default document problem. It seems like a lot of CMS systems will have the internal links on a site refer to the homepage of the site as www.site.com/index.html, and effectively create duplicate content. Is that a correct interpretation of how that works?
Adam Lasnik: I definitely think you are correct in understanding that this is pretty widespread. I have also seen various CMS platforms that the way they work ends up creating essentially duplicate URLs or canonicalization problems. However the happy news here is that in the vast majority of cases I have seen, we do the right thing. Our Googlebot looks at the page and if it sees a page that is pretty much identical to something it has already seen, it will automatically make a determination regarding which page makes more sense, and it will run with this URL for that given page.
In the vast majority of cases, it really has no negative effect at all. Now, one thing that you didn't touch upon, but I think it's also an important consideration here because this can be more painful for the Webmaster than it is for the Googlebot: if the webmaster decides later on to change to a better CMS for instance, and that CMS tends to have pages ending in ".php" instead of ".html", then there definitely is a problem.
If folks have actually linked to that index.html, they are either going to hit a 404 error, or you are going to have to go through and do 301s on all the HTMLs to PHPs. So, by simply doing site.com at least for the homepage, you are going to end up with a situation that is more convenient, and you won't risk canonicalization problems. There is one other tip here though that can also help Webmasters that have this particular problem.
That is to submit an XML sitemap using our webmaster tools, because what we've been doing increasingly is taking a look at the URLs that are submitted on that sitemap, and using that as a canonicalization hint. So, if we are uncertain whether we should be using the, just site.com, or site.com/index.html, but, you list site.com and not site.com/index.html in your sitemap; we are going to be more apt to go with that.
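As a minimal sketch of the sitemap tip, the following Python builds a sitemap that lists the preferred home page URL and leaves out the /index.html variant. The `urlset`, `url`, and `loc` elements and the namespace come from the public sitemaps.org protocol; the function name `build_sitemap` and the `about.html` page are hypothetical, and the host is the `www.site.com` placeholder used in the interview.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return sitemap XML (as a string) with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# List the preferred home page URL, not the /index.html variant.
sitemap_xml = build_sitemap([
    "http://www.site.com/",
    "http://www.site.com/about.html",  # hypothetical second page
])
```

The resulting string would be saved as a file on the site and submitted through webmaster tools, as described above.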
Eric Enge: So, if all your internal links point to index.html and the world links to site.com, and one of the pages basically gets filtered out, you are wasting some of your link juice in the process, right?
Adam Lasnik: Actually that is not something the webmaster should worry about. When we canonicalize stuff on our end, we also combine PageRank. So, if we see that people are linking to the exact same resource in three different ways, again thankfully in the majority of cases that I have seen, we are able to not only know that's the same page, we are also able to take the different links, the different URLs that are linking there, and combine that PageRank so that it gets the total PageRank from those links, and it's not separated out.
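The idea of treating several URLs as one resource and combining their credit can be sketched in Python as follows. This is only an illustration: the list of default document names, the normalization rules, and the use of plain inbound link counts as a stand-in for PageRank are all assumptions, and the interview does not describe how Google actually performs canonicalization.

```python
from collections import defaultdict
from urllib.parse import urlsplit, urlunsplit

# Assumed list of default document names; real sites may use others.
DEFAULT_DOCUMENTS = ("index.html", "index.htm", "index.php")


def canonical_url(url):
    """Map URL variants of the same page to one form (illustrative rules)."""
    parts = urlsplit(url)
    path = parts.path or "/"
    for name in DEFAULT_DOCUMENTS:
        if path.endswith("/" + name):
            path = path[: -len(name)]  # keep the trailing slash
            break
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, parts.query, "")
    )


def combine_link_counts(link_counts):
    """Sum inbound link counts across URLs that share a canonical form."""
    combined = defaultdict(int)
    for url, count in link_counts.items():
        combined[canonical_url(url)] += count
    return dict(combined)


# Made-up counts for three ways of linking to the same home page.
counts = combine_link_counts({
    "http://www.site.com": 40,
    "http://www.site.com/": 25,
    "http://www.site.com/index.html": 10,
})
```

In this made-up example the three variants collapse into a single entry for `http://www.site.com/` with a combined count of 75, rather than three separate, weaker entries.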
Eric Enge: OK. So, as long as the Googlebot is able to figure it out, they are in good shape.
Adam Lasnik: Absolutely. As I said, it's best if they can make the changes on their end to avoid greater hassles or avoid what is at least a slight risk of some problems. But, if they are unable to make those changes, I would not spend a lot of sleepless nights worrying.
Just one more quick thing to note on that as well: at the very least, webmasters should be consistent within their own site. So, it's best to link to a page the same way within your site to the greatest extent you can.
Eric Enge: Right. Because, then you are creating different kinds of problems, and it is harder for you to canonicalize.
Adam Lasnik: That's right.
Eric Enge: Okay, so the next one is a flash question. More and more people want to do flash sites because of how nice they can be made to look. There are two solutions that people talk about for showing HTML text to a bot that corresponds to a Flash movie. One is Scalable Inman Flash Replacement (SiFR) and the other is SWFObject. Can you provide me with a sense as to whether one is preferable over the other from Google's perspective?
Adam Lasnik: We are pretty technologically agnostic in this context. I have seen some SiFR sites that have been indexed just great. I haven't happened to catch any of the SWFObject based flash sites, so, I can't give a definite answer on that one, but the key thing here is that if the text that is essentially gracefully rendered outside of the flash for those who don't have it, is identical to what folks that do have flash capabilities in their browser are seeing, then generally there is not going to be a problem.
The times in which we have seen problematic uses of this technology involved when someone has just a very basic description or menu set in the flash like airplane reservations, train travel, and busing. And then, outside of the flash, in the HTML they have great bargains on airfare, wonderful great rates, train trips first class, business class etc., that augments and extends what was in the flash, that's a no-no, because that's showing different users different things, and that is also essentially akin to what we would see as keyword stuffing.
Regardless of the technology used to make flash content available to all the users, make sure that content is identical, and you are likely to be on the safe side of things. I would also note too that, not too long ago one of our guides on our webmaster help group wrote a really good blog entry on best practices for flash. So, I would encourage your readers to check that out as well. (http://googlewebmastercentral.blogspot.com/2007/07/best-uses-of-flash.html)
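A site owner who wants to check the "identical content" rule could compare the text shown in the Flash movie with the text in the HTML fallback. The Python sketch below assumes both texts are already available as plain strings (extracting text from a Flash file is out of scope here); the function names are hypothetical, and the comparison is a simple self-check, not a description of anything Google runs.

```python
def normalize(text):
    """Collapse whitespace and ignore case so trivial differences don't count."""
    return " ".join(text.split()).lower()


def same_content(flash_text, html_fallback_text):
    """True when the Flash text and the HTML fallback say the same thing."""
    return normalize(flash_text) == normalize(html_fallback_text)


# Strings adapted from the travel example above.
flash_menu = "Airplane reservations, train travel, and busing"
matching_fallback = "airplane reservations,  train travel,\nand busing"
stuffed_fallback = (
    "Great bargains on airfare, wonderful great rates, "
    "train trips first class, business class"
)

matches = same_content(flash_menu, matching_fallback)
stuffed = same_content(flash_menu, stuffed_fallback)
```

Here `matches` is true because the fallback differs only in spacing and case, while `stuffed` is false because the fallback adds promotional text that the Flash version does not show.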
Eric Enge: My understanding is that with SiFR as you call it, you have a situation where the visible text and the flash movie are integrally linked, and guaranteed to be identical. Whereas with SWFObject, you are able to create an HTML object, but the content is relatively arbitrary. In other words, you could keyword stuff were you so inclined. Wouldn't that make SiFR a safer technology, because there would be no lack of trust from an algorithmic perspective?
Adam Lasnik: Basically what you are saying is that SiFR doesn't even really allow you to be bad. From that perspective I can totally understand how you perceive that to be safer. But, I want to stress that we want people to feel free to use whatever technology is best for their site and for their users. If they are trying to decide between the two, SiFR might be safer bet as you put it, but, if folks already are using SWFObject, and they are keeping the content the same, I think that's great too.
Eric Enge: Can you talk a little bit about your extended efforts in terms of communicating with webmasters?
Adam Lasnik: It generally falls into two different categories; online and offline, and both are equally important. In the online area we've actually expanded what we call our user groups now to more than a dozen. These are the webmaster help groups we started off in English, and we've more than tripled the number of guides just in that group. We now have more than a dozen languages, including most recently Hebrew and Turkish. So, we are expanding across the globe to help webmasters everywhere. We have also realized that we could not personally tackle every question that comes up on these boards. There are simply hundreds of threads a week just on the English one alone.
What we have really supported and pushed towards is empowering the community with any of those groups. In fact that's been a large part of my role over the last month. I've gone to different Google offices around the world, and worked with the Google guides there to help them build these communities, to help them communicate with the webmasters in a scalable and a friendly way all over. We've also been, I think on a bit of a tear with the blog.
You may have seen, both in the English blog and now in the German blog, that we've been posting more articles. We have also been expanding our help center. We have committed ourselves to support webmasters all over the world. We already have at least 20 languages the help center has been translated into, and we are working on more.
Whenever we add or revise content, we have the responsibility to also make that content available not just in English, but in all the other languages that we support. We are also looking to make a lot of our outreach more interactive. One of the most popular things we did in the user groups was actually called popular picks where we just asked folks point blank hey, what are some of the questions that are affecting you; what are some of the specific things you want to know, and we are going to commit to answering a big handful of these with detailed replies. We are going to see more of that, and not just in English group, but in the other ones as well.
As for our offline outreach we are also pushing the international aspect of this. So just last year in addition to being on both coasts of the US, we visited webmasters in Australia and China, Germany, Sweden, England, and there are probably more countries I am missing in that list. We want to continue to do that, because while we started in America, we have indexed a huge number of pages outside of the States. One thing I touched upon there was scalability, and so I know that people have also asked us well, why can't you just hire a million people, so you can answer every single question on the forum, and you can do ten blog posts a day.
I think what people don't necessarily realize is we have committed to having people on the frontlines that are actually the ones immersed in what they are talking about. So, for instance the people that are answering questions about webmaster tools are typically the ones that are actually involved in coding and building and refining those tools. People that are answering questions about webmaster guidelines and spam, those are the people that sit with Matt Cutts and the Webspam team.
It's not scalable to simply go and just hire a whole ton of people purely for the purpose of answering questions, because those aren't the people that are making things happen, that understand at a fundamental and deep level how things work, how things are changing, and how webmasters can take advantage of the new tools that we have.
Eric Enge: Can you talk about some of the fun stuff coming up?
Adam Lasnik: Sure. We're also looking to increase the transparency to share more with webmasters and to share it faster, again in all the languages we support. We are looking to expand the help center with multimedia content, because videos are some of the most popular help content we have ever featured.
We are also increasing the number of user groups that we have. Right now we have more than a dozen. I'd love to double that in the next years, so that people in other countries can ask in their native languages questions and interact with the webmasters in their language, and in their comfort zone.
One more thing too just to give you an idea of direction, as I mentioned we've been going to a lot of conferences, myself included, and I love doing that. But, I think one of my biggest passions which is also shared by a lot of my teammates is helping out a lot of these webmasters that don't even think of themselves as webmasters. The folks that can't afford to go to conferences; they are just running a small business. One of the greatest challenges they've had, and definitely the thing that I feel most passionate about in this area is how do we reach these folks? How do we help teach them best practices, how do we find out what their special concerns are with regards to webmastering and getting in Google and so on?
There is just so much amazing content out there, especially quite frankly outside of the States where people tend to already have a lot of resources and knowledge about webmastering. So, how do we reach these folks, how do we let them know that we are out here? So, to work with them, to help them, to listen to them; and how do we do that in a scalable way without actually cloning all of us and sending us to every country in the world? Figuring that out is not going to be easy, but it's definitely what makes my work really exciting, really rewarding everyday when I come in.
Eric Enge: Indeed. You touched briefly upon webmaster tools. I think that's a very interesting component of this too, because when you want to dig a little deeper and get some insight in how the Googlebot sees your website, it offers a really interesting dataset that you really can't get anywhere else.
Adam Lasnik: That's absolutely true. Webmaster Tools has seen amazing growth, and just from a meeting this morning I also know some really neat things that will be coming out, but I can't talk about them just yet. But, some great tools that are actually coming have come directly out of requests from webmasters. Look at what tools we have rolled out just over the last few months.
Our webmaster trends analysts see what the webmasters' concerns are, and what the feature requests are. They actually sit directly with the webmaster tools people, so on a day-to-day basis they can say hey, would it be possible for us to do this, or you know what would it take for us to add this tool, to add this information? And, so I think that's also one of the areas of webmaster communications that I am most proud of and most excited about as well.
Eric Enge: Well, excellent. Thanks Adam!
Adam Lasnik: My pleasure, thank you Eric!
About the Author
Eric Enge is the Founder and President of Stone Temple Consulting (STC). STC offers Internet marketing optimization services, including SEO, Social Media and PPC optimization, and its web site can be found at: http://www.stonetemple.com.