Report Generating SEOs

SEO is not a report generating business. It’s a roll your sleeves up and dig in kind of activity. What’s sparking this post is that a few times recently I have been shown the large formal reports that some SEO firms offer. Very pretty. Glossy. 80% to 90% completely filled with non-unique content (i.e. not specific to the customer, but instead consisting of general SEO advice).

But reports do nothing for you. There is plenty of free SEO advice out there, being shared by people who regularly break new ground. The client could read those free reports and figure out what makes sense for their business themselves if they had access to the right type of resources and time. But they don’t. That’s why they went looking for an SEO firm in the first place.

Action drives results. Here are just three examples of typical client needs we have seen over the past few months:

  1. Duplicate content issues. To help you understand the complexity of this, here is a list of 12 ways webmasters create duplicate content. Figuring out where the duplicate content is, and how it is generated is a hard core exercise requiring real effort.
  2. Site moves, with significant renaming of URIs. This requires developing a specific and detailed map of the old site and the new site and defining the 301 redirect plan to minimize any collateral damage.
  3. Saving the best for last – link campaigns. There is no way that this is anything other than a custom activity. It’s too wrapped up in the content and tools available on the site, the budget of the client, the availability of resources to generate new content, and the topic area of the site.
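To make the site-move example concrete, here is a minimal Python sketch of what a redirect plan might look like once the old and new sites have been mapped. The paths and the `build_redirect_plan` helper are made up for illustration. The sketch assumes the map is a simple old-path to new-path dictionary, and it collapses chains (A to B to C) so each old URL can be 301-redirected in a single hop.

```python
def build_redirect_plan(url_map):
    """Resolve every old path to its final destination.

    url_map is a hypothetical {old_path: new_path} dictionary.
    Raises ValueError if the map contains a redirect loop.
    """
    plan = {}
    for old in url_map:
        seen = {old}
        target = url_map[old]
        # Follow chains so the final plan never redirects to a redirect.
        while target in url_map:
            if target in seen:
                raise ValueError(f"redirect loop involving {old}")
            seen.add(target)
            target = url_map[target]
        plan[old] = target
    return plan


# Illustrative paths only; a real map comes from crawling both sites.
url_map = {
    "/products.html": "/catalog/",
    "/catalog/": "/shop/",
    "/about-us.html": "/about/",
}

plan = build_redirect_plan(url_map)
for old, new in plan.items():
    print(f"301 {old} -> {new}")
```

The real work is still in building the map by hand. The code only checks that the finished plan has no chains or loops.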

An SEO really needs to be someone who is a business analyst, with a strong mix of both marketing and technical skills. They also need to be able to dig in and get their hands dirty. If they don’t, there is no way that they are going to help their client increase the level of traffic they get from search engines.

A Review of Unica’s Affinium NetInsight Web Analytics

Latest up in our series about web analytics packages is Unica’s Affinium NetInsight, with our article: 12 Cool Things You Can Do with NetInsight. And there are lots of cool things you can do with it. The article comes with screen shots and a description for each of the cool things. Here are the headlines:

  1. Creating custom dashboards
  2. Ad-Hoc Analysis
  3. Drag, Drop, and Drill, Drill, Drill Down
  4. Correlate Data
  5. A/B Analysis mode
  6. Integrate Offline Customer Data
  7. Examine Individual Click streams
  8. Robot/Spider analysis
  9. Remarketing
  10. Ask NetInsight Wizard
  11. Heat Map Overlay
  12. Date Comparison Reporting

Comments on the article? Feel free to leave them here.

15 Things About How Google Handles Duplicate Content

Duplicate Content is one of the most perplexing problems in SEO. In this post I am going to outline 15 things about how Google handles duplicate content, leaning heavily on my interviews with Vanessa Fox and Adam Lasnik. If I leave something out, just let me know, and I will add it to this post.

  1. Google’s standard response is to filter out duplicate pages, and only show one page with a given set of content in its search results.
  2. I have seen in the SERPs evidence that large media companies seem to be able to show copies of press releases and do not get filtered out.
  3. Google rarely penalizes sites for duplicate content. Their view is that it is usually inadvertent.
  4. There are cases where Google does penalize. This takes some egregious act, or the implementation of a site that is seen as having little end user value. I have seen instances of algorithmically applied penalties for sites with large amounts of duplicate content.
  5. An example of a site that adds little value is a thin affiliate site, which is a site that uses copies of third party content for the great majority of its content, and exists to get search traffic and promote affiliate programs. If this is your site, Google may well seek to penalize you.
  6. Google does a good job of handling foreign language versions of sites. They will most likely not see a Spanish language version and an English language version of a site as duplicates of one another.
  7. A tougher problem is US and UK variants of sites (“color” vs. “colour”). The best way to handle this is with in-country hosting, which makes it easier for Google to detect which country each version is meant for.
  8. Google recommends that you use Noindex metatags or robots.txt to help identify duplicate pages you don’t want indexed. For example, you might use this with “Print” versions of pages you have on your site.
  9. Vanessa Fox indicated in her Duplicate Content Summit at SMX that Google will not punish a site for applying NoFollow to a large number of internal links. However, the recommendation is still that you should use robots.txt or NoIndex metatags.
  10. When Google comes to your site, they have in mind a number of pages that they are going to crawl. One of the costs of duplicate content is that when the crawler loads a duplicate page, one that they are not going to index, they have loaded that page instead of a page that they might index. This is a big downside to duplicate content if your site is not (more) fully indexed as a result.
  11. I also believe that duplicate content pages cause internal bleeding of page rank. In other words, link juice passed to pages that are duplicates is wasted, and this is better passed on to other pages.
  12. Google finds it easy to detect certain types of duplicate content, such as print pages, archive pages in blogs, and thin affiliates. These are usually recognized as being inadvertent.
  13. They are still working on RSS feeds and the best way to keep them from showing up as duplicate content. The acquisition of FeedBurner will likely speed the resolution of that issue.
  14. One key thing they use as a signal for which page to select from a group of duplicates is which page is linked to the most.
  15. Lastly, if you are doing a search and you DO want to see duplicate content results, just do your search, get the results, append the “&filter=0” parameter to the end of the search results URL, and refresh the page.
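The last item above can also be done in code. This is a small Python sketch, assuming the results URL looks like the example shown (the exact URL is an assumption, and the `with_filter_disabled` helper is hypothetical). It only rewrites the URL, and it relies on Google honoring the `filter=0` parameter as described in item 15.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_filter_disabled(results_url):
    """Return the search results URL with filter=0 set in its query string."""
    parts = urlsplit(results_url)
    # Drop any existing filter parameter so it is not repeated.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "filter"
    ]
    query.append(("filter", "0"))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Example results URL (assumed shape, for illustration only).
print(with_filter_disabled("https://www.google.com/search?q=duplicate+content"))
```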

Here is a summary of Ways to Create Duplicate Content, and Adam Lasnik’s post on Deftly Dealing with Duplicate Content that explains how you handle this problem on your site.
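As one illustration of the robots.txt approach from item 8, here is a rough Python sketch that checks whether a rule actually blocks the “Print” versions of pages. It assumes the print pages live under a `/print/` directory on `www.example.com`; both are hypothetical, as is the `is_crawlable` helper. It uses the standard library's `urllib.robotparser`, which interprets the rules in its own way and may not match Google's crawler in every edge case.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that keeps crawlers out of the print versions.
robots_txt = """\
User-agent: *
Disallow: /print/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())


def is_crawlable(path, agent="Googlebot"):
    """Check a path on the example site against the rules above."""
    return parser.can_fetch(agent, "https://www.example.com" + path)


for path in ("/articles/duplicate-content.html", "/print/duplicate-content.html"):
    status = "crawlable" if is_crawlable(path) else "blocked"
    print(f"{path}: {status}")
```

The alternative named in item 8 is to put a robots meta tag with a `noindex` value in the head of each print page.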

Search Engine Land Announces Sphinn

At 11:00 am today, Third Door Media and Search Engine Land will announce a new social news site known as Sphinn. This site will offer users the ability to submit articles about interactive marketing, search, and social media, and then other users can review those articles and vote on them. As with other social news sites, such as Digg and Reddit, articles that receive the most votes will be shown on the home page.

I had the opportunity to preview the new site over the past couple of days, and think it will offer people a quick way to find the hottest articles in the search and social media landscape. This can act as a proxy for scanning tons of individual search marketing and social media blogs. How important that feature is to people will, in my opinion, determine the success level of the site. Relatedly, if the readers come, then those who submit articles will certainly come, as it will become a target for getting traffic.

Creating an account was easy, and submitting my first article and joining a discussion or two were easy as well. Unlike Digg and Reddit, where submitting your own content is taboo, users are actively encouraged to submit their own content. Honestly, I think this makes more sense. Self promotion should be OK on these types of sites, and if you submit garbage you will get predictable results and eventually lose interest, as it will be a poor way to spend your time.

Google VP Search Quality, Udi Manber

Just last week I had the pleasure of speaking with Google’s VP of Search Quality, Udi Manber. We focused our conversation on various aspects of search quality, Google’s algorithms, and Google’s development processes. Understandably, Udi was not going to provide a tremendous amount of detail about these things, because these are areas that Google considers a proprietary advantage.

Nonetheless, the conversation proved to be very interesting. For example, it was suggested that Google is looking quite closely at what can be done with a variety of social search inputs. Clearly, this is an opportunity for Google to leverage human input in a scalable way. It will be interesting to see what comes out of this investigation at Google.

Rather than tell the story here, just check out the interview.

Page Rank, and Query Specific Page Rank

Page Rank gets a bad rap sometimes. It’s easy to understand why. People got obsessed with Page Rank a few years back, and for a long time, people would not let go of the notion that Page Rank was the only thing you needed to worry about in SEO. For that reason, many really smart SEOs started to downplay it greatly, and to suggest that Page Rank is meaningless.

But that’s overdoing it. Page Rank is still very, very important, and I still use it.

Page Rank still provides the best way of measuring the importance of a page, or a site. So let’s step back for a minute and talk about what I mean by importance. For example, why is Amazon more highly ranked than Joe’s Book Store? Because Amazon has a lot more links (page rank). Ultimately, a search engine has to decide two things about each page, in response to each search query:

  1. How relevant is the page to the search query – they do this by textual analysis of the page, the site, and an analysis of the relevance of the inbound links (by analyzing the text on the linking page, and the text on the linking site, and the relevance of the links to the linking site). As you can see, this rapidly becomes a highly recursive process, that provides the best results if you do this on a search query by search query basis. You can think of every page on the web as having its own “relevance score” with respect to every single search query.
  2. How important the page is compared to other pages that are relevant to the search query – This is a page rank calculation, as filtered by the relevance of the inbound links. I think of this as “query specific page rank”. So if the query is “bananas”, and your page is about bananas, and you have an inbound link from a site about selling used cars, that link will not add very much to the importance score of your page for the query bananas. But if the inbound link is from a page about bananas, on a site about bananas, and uses the word bananas in the anchor text of the link, the inbound link will add a tremendous amount to the importance score of your page. The final kicker is an evaluation of the importance of the linking page and the linking site. One simple way to do that is to look at their page rank. To get really artful though, look at how the linking page ranks in the SERPs for the query bananas. If it’s on the first page, you have a killer link. So with this notion of query specific page rank, you have a way of thinking about your linking strategy, and you are acknowledging that page rank is still at the core of that strategy.
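The bananas example can be put into a toy Python sketch. To be clear, this is not Google's algorithm. The `InboundLink` fields, the 0 to 1 relevance scale, the anchor text bonus, and all the numbers are assumptions invented purely to illustrate “query specific page rank”: each link contributes the importance of its source, scaled by how relevant that source is to the query.

```python
from dataclasses import dataclass


@dataclass
class InboundLink:
    source_page_rank: float  # importance of the linking page (made-up scale)
    page_relevance: float    # 0..1, relevance of the linking page to the query
    site_relevance: float    # 0..1, relevance of the linking site to the query
    anchor_has_query: bool   # does the anchor text contain the query term?


def query_specific_score(links, anchor_bonus=1.5):
    """Sum each link's importance, filtered by its relevance to the query."""
    total = 0.0
    for link in links:
        relevance = link.page_relevance * link.site_relevance
        if link.anchor_has_query:
            relevance *= anchor_bonus  # assumed weighting, for illustration
        total += link.source_page_rank * relevance
    return total


# For the query "bananas": a strong but off-topic link vs. a weaker on-topic one.
used_cars_link = InboundLink(6.0, 0.05, 0.05, anchor_has_query=False)
banana_link = InboundLink(4.0, 0.9, 0.9, anchor_has_query=True)

print("used cars link:", query_specific_score([used_cars_link]))
print("banana link:   ", query_specific_score([banana_link]))
```

With these made-up numbers, the on-topic link is worth far more than the off-topic one even though the used car page has the higher page rank, which is the point of the example above.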

In SEOmoz’s Search Engine Ranking Factors survey of top SEOs, the top 10 ranking factors selected were:

  1. Keyword Use in Title Tag
  2. Anchor Text of Inbound Link
  3. Global Link Popularity of Site
  4. Age of Site
  5. Link Popularity within the Site’s Internal Link Structure
  6. Topical Relevance of Inbound Links to Site
  7. Link Popularity of Site in Topical Community
  8. Keyword Use in Body Text
  9. Global Link Popularity of Linking Site
  10. Rate of New Inbound Links to Site

Looking at this list, and our notion of “Query Specific Page Rank”, items 2, 3, 5, 6, 7, 9, and 10 all fall into that category. That’s 7 out of 10 – not too shabby.

So, ultimately, page rank still counts for a lot. And when you extend that thinking to Query Specific Page Rank, and relate it to the most important keywords for your site, you are really on the right track with your SEO strategy.