Why You Want to Use Your Blog to Build Trust

Rand just put up a post about The Vast Ocean Between Shoemoney & SEOmoz and Why You Should Be Able To Trust Blog Links. Other than the fact that he singled me out in it, I think it’s a great post. I started to write this post as a comment on SEOmoz, but it just got way too long. If you want to read about this topic, read Rand’s post, and then come back to this one.

I do think there is a balance between the time you invest in blogging and content development, and making a living.

When I look at SEOmoz, I see a CEO, and a company, committed to building a reputation by offering a wealth of free information with nothing asked for in return. I know of no one who has shared more about their company. For reference, see SEOmoz’s sharing of its 2006 financials (in detail) and more data, such as this post: More SEOmoz stats than you can shake a stick at.

This level of openness builds an enormous amount of trust, and deservedly so. I can tell you from my personal experience that this openness is something that is quite evident when dealing with Rand in person.

Ultimately, that reputation and trust should help SEOmoz build up its premium subscriber base and its ability to get high value clients. But speaking for myself, I can tell you that I don’t begrudge a single penny that SEOmoz makes, and I was personally disappointed that SEOmoz did not make more than they reported in 2006.

So now, the other side of the coin. Lack of disclosure and lack of openness builds mistrust. You become unsure about how to value the information you are receiving. You get uncomfortable with a person when you know they are not telling you something. This is not somewhere you want to be in this social web of ours.

The social web is far too efficient at spreading this type of reputation and trust information around. And, it gets more efficient every day, so this trend is going to continue for the foreseeable future. In other words, the genie is out of the bottle, and has no intention of going back in.

So I agree wholeheartedly with Rand’s positioning that bloggers should be open about how they are being compensated (saying “I’m getting paid”, for example, is enough detail), and that readers deserve that, but I also think that it’s in the blogger’s self interest to be open. There is a BIG difference (should I say vast ocean?) between being paid to write a review, and getting compensated for your efforts in the way that SEOmoz does.

FYI – all of the content development efforts of STC are uncompensated (in the direct sense). This includes answering questions in comments, and in emails I receive from people, without there ever being a chance of getting a penny out of it. I really enjoy doing it (that’s compensation too!), and we do have companies that have become clients as a result of our efforts.

Should you NOINDEX your RSS feed?

One of the questions you see swirling about the forums and blogs these days is whether or not you should Noindex your RSS feeds to avoid duplicate content problems. The source of the problem is that RSS feeds are being crawled by the search engines. In addition, many people are now recommending that you include the entire content of your articles directly in your feed. You can read more about this in my recent interview of Rick Klau of FeedBurner.

So if the search engine sees the content on your site, and also sees it in your feed, will that be seen as duplicate content? There certainly have been instances of finding RSS feeds in the search results of the search engines. For those of you worried about this possibility, you can see Yahoo’s spec for Noindexing RSS feeds here. My understanding is that both Google and Yahoo! will honor this Noindex request of your feed.
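If you want to check whether a feed already carries a Noindex request, a small script can do it. The sketch below is illustrative only: it assumes the request is expressed as an XHTML-namespaced `meta` element with `name="robots"` and `content="noindex"` inside the feed, which is my assumption about the convention and should be confirmed against the spec mentioned above. The function name and the sample feed are made up for the example.

```python
import xml.etree.ElementTree as ET

# Assumption: the noindex request is an XHTML-namespaced <meta> element.
XHTML_NS = "http://www.w3.org/1999/xhtml"


def feed_requests_noindex(feed_xml: str) -> bool:
    """Return True if the feed contains a robots meta element asking for noindex."""
    root = ET.fromstring(feed_xml)
    for meta in root.iter(f"{{{XHTML_NS}}}meta"):
        name = (meta.get("name") or "").strip().lower()
        content = (meta.get("content") or "").lower()
        # The content attribute may hold several comma-separated directives.
        directives = {d.strip() for d in content.split(",")}
        if name == "robots" and "noindex" in directives:
            return True
    return False


# Hypothetical feed used only to exercise the function.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <xhtml:meta xmlns:xhtml="http://www.w3.org/1999/xhtml"
                name="robots" content="noindex" />
    <item><title>First post</title></item>
  </channel>
</rss>"""

if __name__ == "__main__":
    print(feed_requests_noindex(SAMPLE_FEED))
```

A check like this only tells you what the feed is asking for. Whether a given search engine honors the request is a separate question.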

But let’s step back, and think about how search engines try to deal with duplicate content. They are always trying to figure out who the authoritative source of the content is. It should be pretty obvious to the search engine when it’s crawling an RSS feed, and it should also be relatively obvious that the RSS feed for a site’s content is not the authoritative source.

In my interview with Rick Klau, and my earlier conversation with Google’s Adam Lasnik, we talked about this issue.

Rick points out, as I did above, that it’s really obvious to a search engine when it is crawling a feed, and that feeds are not the authoritative source of content. In addition, providing a feed often helps a search engine more rapidly find new content on your site.

Adam Lasnik commented that during his time at Google he had not heard of any instances of a site being negatively affected by a duplicate content issue with an RSS feed.

Based on this input, I would conclude that there is extremely little risk in letting your feed get indexed. After I first learned about the issue, I did move forward and Noindex our feeds here at STC. But then, after the conversations with Rick and Adam, we became convinced that it’s just not a problem, and this is the recommendation we make to our clients as well.

17 Poor Quality Signals Your Site May Be Sending

You see the postings in the forums. People talk about their sites going in and out of the index on a regular basis. Their sites go in for 5 days, and then are out for 10. It’s a wrenching experience.

In my recent interview with Adam Lasnik, Adam explained that it simply means that Google is detecting what it considers to be some poor quality signals with regard to your site. The other thing that Adam outlined is that the reason for the in and out behavior is that Google is tweaking their algorithms on a regular basis.

Based on some of my experience in helping some people with these problems, a regular basis could be as frequent as weekly, or even more often. It’s interesting to contemplate why Google would be making these tweaks so frequently. It could be that it’s a part of constantly testing search quality with the live index. It would be reasonable to speculate (and I am speculating) that they have a variety of automated measurement tools in place to see how various tweaks affect overall click-through rates, bounce backs, searches per user per session, etc.

But if this problem is happening to you, you want to have some idea about what to do. Here is a list of things that you may want to look for:

  1. Too large a percentage of your links are reciprocal
  2. Lack of high quality inbound links
  3. Too large a percentage of your inbound links are not relevant to your site topic
  4. No outbound links
  5. Outbound links to poor quality sites
  6. Too large a percentage of your outbound links are Nofollowed (note: does not apply to blog comments and forums, where this is expected behavior)
  7. No coherent topic for your site
  8. Too large a percentage of your pages with duplicate content
  9. Too large a percentage of your pages with minimal content
  10. Titles are duplicated
  11. More than 5 keywords in your keywords metatags on your pages
  12. Meta descriptions are the same on every page
  13. Image alt tags that are way too long
  14. Keyword stuffing
  15. Hidden text, or almost hidden text
  16. Hidden Links, or almost hidden links
  17. Web server downtime too high
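A few of the on-page items in this list (duplicated titles, identical meta descriptions, long keyword lists, pages with minimal content) are easy to check mechanically. Here is a minimal sketch of such a check, assuming you have already extracted the title, meta description, keywords, and body text for each page. The function name, the dictionary layout, and the 50-word cutoff for "minimal content" are my own illustrative choices, not thresholds Google has published.

```python
from collections import Counter


def audit_pages(pages, max_keywords=5, min_words=50):
    """Flag a few on-page signals from the list above.

    pages: list of dicts with the keys 'url', 'title', 'description',
    'keywords' (a list of strings) and 'text'. All thresholds are assumptions.
    """
    report = {}

    # Item 10: titles that appear on more than one page.
    title_counts = Counter(p["title"].strip().lower() for p in pages)
    report["duplicate_titles"] = sorted(t for t, n in title_counts.items() if n > 1)

    # Item 12: the same meta description on every page.
    descriptions = {p["description"].strip().lower() for p in pages}
    report["same_description_everywhere"] = len(pages) > 1 and len(descriptions) == 1

    # Item 11: pages with more than max_keywords entries in the keywords metatag.
    report["too_many_keywords"] = [
        p["url"] for p in pages if len(p["keywords"]) > max_keywords
    ]

    # Item 9: share of pages with very little text (cutoff is a guess).
    thin = [p for p in pages if len(p["text"].split()) < min_words]
    report["thin_content_share"] = len(thin) / len(pages) if pages else 0.0

    return report


# Made-up pages used only to exercise the function.
SAMPLE_PAGES = [
    {"url": "/a", "title": "Widgets", "description": "Buy widgets",
     "keywords": ["widgets"], "text": "word " * 200},
    {"url": "/b", "title": "widgets ", "description": "Buy widgets",
     "keywords": ["a", "b", "c", "d", "e", "f"], "text": "short page"},
]

if __name__ == "__main__":
    print(audit_pages(SAMPLE_PAGES))
```

The link-related items (reciprocal links, inbound link quality and relevance) need link data from outside your own pages, so they are left out of this sketch.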

So if your site is going in and out of the index, this is a list of things for you to think about. It’s natural to agonize about what is going on, and many webmasters that are experiencing this problem are honestly trying to run good web sites. But, the in and out behavior of your site tells you that you have something to work on.

Searchology Event Review With Pictures

This is an overview of the day that Google called Searchology (May 16, 2007). In this post I will cover some of the aspects of the event other than the announcements themselves. This will include a series of pictures from my trip to the Googleplex, with some comments about each one, and what Google seemed to be trying to accomplish with the event.

1. The first picture shows the ‘Plex as viewed from right outside Building 43, which is where Matt Cutts, Adam Lasnik, and crew reside. The funny thing is that this picture looks a bit like a war zone. In fact, the architecture looks quite neat in person. Note that the tables lying on the ground you see in the foreground are there because they are about to set up for an event of some sort, perhaps a concert, which is a common event at the ‘Plex.

Google Campus

2. One of the new projects on the Google campus is an herb garden:

Google Herb Garden

3. Elliot Schrage, Google’s Vice President of Global Communications and Public Affairs, chaired the event. He introduced the plan for the day, and each of the speakers:

Elliot Schrage

4. Craig Silverstein, Google’s Technology Director, was next up. He spoke about the “ghost of search engines past”. Craig was employee number 1 at Google, and was involved when it was still in the dorm rooms. Sergey Brin’s dorm room was used for business, and Larry Page’s dorm room was used as the machine room. This all took place during the height of the Internet bubble.

The rack you see on the left is one of the very first racks of machines used by Google. When the search engine was first launched in 1998, it had only 25,000 web pages in its index. Sometimes the result set was only 4 web pages, so the scoring system was not that critical. However, this was still complicated enough that the notion of a human edited directory no longer worked.

By 2000, they had introduced “Giga Google”. With this, they scaled to millions of pages. Suddenly they had new problems to look at. Replication of the data to the East coast was a big issue they had to face and solve. And, now the scoring system became critical, because most searches offered up more than 4 results.

Here is Craig, giving his talk:

Craig Silverstein

5. Next, we had Ben Gomes, who was introduced as Google’s search quality czar, and Kerry Rhoden, a senior person on the usability team. They showed neat examples of how they use Eye Tracking studies to model human behavior, and improve usability. Another interesting tidbit from this is that they store a whole extra copy of the web for their testing purposes.

So when they test new algorithms that they are not ready to push live yet, they have the complete set of data to use for testing purposes. Then they provided several examples of usability changes that Google has made over the years:

  1. “Did you mean” lines intended to offer spelling corrections to users who misspell their query.
  2. Onebox results that attempt to show you the answer without you having to click through to anyone’s web site.
  3. Query Refinements that provide users with a list of links designed to help them tailor their search quickly. As an example, look at the search results for the search phrase “Cancer“, and notice the section titled “Refine results for cancer:”.
  4. Site links. For example, when you search on “Circuit City“, you will get the main site, but you will also get direct links to the most visited pages on that site.
Ben Gomes and Kerry Rhoden

6. Then we had Udi Manber, Google’s VP of Engineering. He continued the theme of the day to this point: that what Google is doing is really, really hard. The organizers of this event wanted the press to get that message loud and clear. One of the most interesting statements made by Udi is that 20 to 25% of the queries that Google sees in any given day are queries that they have never seen before.

Wow. 20 to 25%. That took a while to sink in. Talk about reinforcing the value of a long tail strategy. After a few introductory comments, Udi moved on to talking about some things that Google is doing to improve. He started with a list of queries for which they successfully map one search phrase to another that is a better fit for what the user wants. The examples below go from easiest to hardest, with links in those cases that are already live:

  1. GM = General Motors
  2. Ramstein AB = Ramstein Air Base
  3. ab ca successfully translates AB into Alberta
  4. typing – words per minute text brings up a first result which is a tool that will give you a typing speed test
  5. “unchanged lyrics van halen” will be mapped to “lyrics to unchained van halen”
  6. “overhead view of bellagio pool” will be mapped to Bellagio pool pictures
  7. “F-15 launch launched from a sub” will be mapped to “F-15 submarine launch”
  8. “distance from Zurich, Switzerland to lake Como, Italy” will be mapped to “train Milan Italy Zurich Switzerland”. Why? Because it happens to provide the distance the user originally requested.

Google is also looking at Cross Language Information Retrieval. This was not officially announced, but what they plan to do is to accept a user’s query in their natural language, translate it into every other language they have in their database (12 languages to start), get the best results, translate the web pages with the best results, and present the results back to the user. One key part of how they do this is that they will end up keeping on hand 12 copies of the web, pre-translated into all 12 languages they will support initially.

Udi-Manber

Last up was Marissa Mayer, Google’s Vice President, Search Products & User Experience. Marissa made all of the official announcements. This has been covered in many places, so here are a few links you can follow to find the details of the announcements:

Marissa told us that Universal Search was something that she originally suggested back in 2001, but they couldn’t do it then, because the infrastructure challenges were too great. They really needed to solve the problem of how to have a common relevance calculation across all of their search properties without dumping hundreds of millions of queries per day onto each of them. She also provided some examples of queries that demonstrate Universal Search in action:

  1. restaurants in mountain view, ca shows a map with the locations of several restaurants
  2. nosferatu presents a result at Google video where you can play the result. In fact, you can play the entire movie right there inline in the search results page.
  3. “Mexican poetry” will show results directly from Google book search, and yes you will be able to read the entire book right there.
  4. i have a dream will allow you to view a video of Martin Luther King’s very famous speech inline in the search results
  5. big wheels races presents a video of such a race down Lombard street in San Francisco, complete with crashes
  6. Clay Bavor brings up a time lapsed video of Clay Bavor building a portrait of Abraham Lincoln, using only pennies. Gray scaling is done through the use of the tarnishing and dirtiness level of the pennies.
  7. things you can’t do when you are not in a pool brings up a hilarious video. Note that you get results here from video properties that are not owned by Google too, and you can still play them inline.

Here is Marissa giving her presentation:

Marissa Mayer

Lastly, there was a question and answer session. From left to right in this picture, we have:

  1. Sergey Brin
  2. Udi Manber
  3. Marissa Mayer
  4. Alan Eustace, Senior Vice President, Engineering & Research
  5. Elliot Schrage
Craig Silverstein

Overview of Google’s Announcements this past Wednesday

May 16th was a fascinating day. I had the pleasure of sitting in on a major press announcement by Google in Mountain View, at an event that Google called “Searchology”. In this announcement, Google announced four significant enhancements to search as we know it:

  1. Universal Search: This is the biggie. Now when you use Google’s web search interface, you will get results from their web index, and from book search, local search, Google Maps, image search, news search, and blog search. Some of the highlights include the presentation of video thumbnails with one click access to see the video, or even an entire movie, inline on the web results search page, direct display of images, one click access to pulling the entire text of a book, etc.

    What this gets you is a potential relevance boost in the results you get. For my detailed thoughts about this, check out my articles on Search Engine Watch: Will Universal Search Drive Universal Domination? and Will Universal Search Drive Google’s Vertical Search Properties?.

  2. Contextual Navigation Links: As another step in this integration, Google is now listing their search properties across the top of their results pages. These are being done on a contextual basis. So the links you see listed across the top will vary depending on the search query you have just entered. If you search on Britney Spears picture, you will see a link for “Images”, but if you type in Nosferatu (a classic horror film), you will see links for both “Images” and “Video”.

    Of course, as we mentioned before, you can play the full movie (all 84 minutes) right there in the search results.

  3. Universal Navigation Bar: Google also announced a Universal Navigation Bar, which shows up across the very top of their results pages. This lists all the major products. And for those of you who were painfully aware that it took 8 clicks to get to your Gmail account previously, you can now do it in one click.
  4. Google Experimental: Many of the things that are normally found over at Google Labs will now be found over at Google Experimental, but Google Labs is still being continued. The difference is that you, as a user, can sign up for Google Experimental, and then all of your searches will use the new features they are testing. Want to see the latest stuff in advance? Now you can make it a part of your everyday experience.

    One great thing you can see right now through Google Experimental is the “view” concept. If you search on thomas jefferson view:timeline, you will see a completely different presentation of the search results. Here is a screen shot of what it looks like:

    Google Timeline View

    Or, you can search on pga tours view:map and get a completely different search result.

Another cool thing that Udi Manber, Google’s VP of Engineering, talked about as a future development was the notion of Cross Language Information Retrieval. As part of this, they will be keeping many whole copies of the web, each translated into different languages (with 12 languages to start). They will then take your search query and translate it into each of those 12 languages, find the best result, and then translate that result back into the language of the searcher.

There was a lot of stuff in here, and participating in the experience was great fun. I will be writing more about this here in this blog, and at Search Engine Watch over the next few weeks.

Tomi Poutanen talks about algorithmic and social search

I had a great discussion with Tomi Poutanen last week about the future of algorithmic and social search. Tomi identified 3 challenges faced by algorithmic search:

  1. The size of the web they are searching and indexing
  2. Subjective queries, such as “what’s the best hotel in New York” can’t be addressed by algorithmic search
  3. The spammers use algorithms too, and there is an inherent arms race between spammers and search engines

But Tomi does feel that algorithmic search is still better for navigation and deep research. So, ultimately, there is a home for both.

We also talked about the return of tagging, and I asked Tomi why tagging would work this time, when the metatags of the past were such a dismal failure. It turns out that there is one key difference. Metatags were used by webmasters to indicate what their site was about, and hence were highly subject to SPAM.

But with social search, tagging is done by people to help them bookmark content they like. On sites such as del.icio.us and Flickr, the most heavily tagged content rises to the top. So it’s a bit like an election process. It becomes very hard to spam this, because spamming of other people’s tagging decisions is difficult to do algorithmically. In addition, if a spammer succeeded in influencing how their site ranked for one type of tag, the return just is not that high.
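To make the election analogy concrete, here is a toy sketch of ranking by tag in which each person gets one vote per page. It is not how del.icio.us or Flickr actually rank content; the function name and the sample data are invented for illustration. The point is only that when votes are counted per distinct user, one account tagging its own site over and over gains nothing.

```python
from collections import defaultdict


def rank_by_tag(bookmarks, tag):
    """Rank URLs for a tag by the number of distinct users who applied it.

    bookmarks: iterable of (user, url, tag) tuples (a hypothetical format).
    Returns a list of (url, vote_count) pairs, highest count first.
    """
    users_per_url = defaultdict(set)
    for user, url, applied_tag in bookmarks:
        if applied_tag.lower() == tag.lower():
            # A set means repeat tagging by the same user counts once.
            users_per_url[url].add(user)
    return sorted(
        ((url, len(users)) for url, users in users_per_url.items()),
        key=lambda pair: (-pair[1], pair[0]),
    )


# Made-up bookmarks: three people tag one page, a spammer tags their own page three times.
SAMPLE_BOOKMARKS = [
    ("ann", "a.example", "seo"),
    ("bob", "a.example", "seo"),
    ("cat", "a.example", "SEO"),
    ("spammer", "spam.example", "seo"),
    ("spammer", "spam.example", "seo"),
    ("spammer", "spam.example", "seo"),
    ("bob", "b.example", "seo"),
    ("ann", "b.example", "photos"),
]

if __name__ == "__main__":
    for url, votes in rank_by_tag(SAMPLE_BOOKMARKS, "seo"):
        print(url, votes)
```

In this sample the spammer's page ends up with a single vote, the same as a page tagged once by an ordinary user. A spammer would need many separate accounts to move the ranking.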

For more, check out the full transcript of my interview with Tomi Poutanen.