29 Tidbits from my Interview of Matt Cutts

It is always a pleasure when I get a chance to sit down with Matt Cutts. Google’s Webspam chief is always willing to share what he can for the benefit of webmasters and publishers. In this interview we focused on discussing crawling and indexation in detail.

Starting with this interview, I have also decided to provide the interview series with a bit of a new look. I am going to continue to publish the full transcript of interviews in the STC Articles Feed and on the articles page on our site, but I am going to use the related blog posts as a way of highlighting the most interesting points from the interview (for those of you who want the abridged version).

One of the more interesting points was their focus on seeing all of the web’s content, regardless of whether it is duplicate content, an unreadable file format, or whatever. The crawling and indexing team wants to see it all. You can control some of how they deal with it, but they still want to see it. Another interesting point was that listing a page in robots.txt does not necessarily save you anything in terms of “crawl budget”. (But wait, there’s more!)

What follows are some of the more interesting statements that Matt made in the interview. I add my own comments to the end of each point.

  1. Matt Cutts: “there isn’t really any such thing as an indexation cap”
    My Comment: Never thought there was one, but it’s always good to confirm.
  2. Matt Cutts: “the number of pages that we crawl is roughly proportional to your PageRank”
    My Comment: Most experienced SEO professionals know this, but it is a good reminder of how the original PageRank defined in the Brin-Page thesis still has a big influence on the world of SEO.
  3. Matt Cutts: “you can run into limits on how hard we will crawl your site. If we can only take two pages from a site at any given time, and we are only crawling over a certain period of time, that can then set some sort of upper bound on how many pages we are able to fetch from that host”
    My Comment: This will likely be a factor for people on shared (or under-powered) servers.
  4. Matt Cutts: “Imagine we crawl three pages from a site, and then we discover that the two other pages were duplicates of the third page. We’ll drop two out of the three pages and keep only one, and that’s why it looks like it has less good content”
    My Comment: Confirmation of one of the costs of duplicate content.
  5. Matt Cutts: “One idea is that if you have a certain amount of PageRank, we are only willing to crawl so much from that site. But some of those pages might get discarded, which would sort of be a waste”
    My Comment: More confirmation
  6. Eric Enge: “When you link from one page to a duplicate page, you are squandering some of your PageRank, correct?”
    Matt Cutts: “It can work out that way”
    My Comment: Yes, duplicate content can mess up your PageRank!
  7. Matt Cutts: “If you link to three pages that are duplicates, a search engine might be able to realize that those three pages are duplicates and transfer the incoming link juice to those merged pages”
    My Comment: So Google does try to pass all the PageRank (and other link signals) to the page it believes to be canonical.
  8. Matt Cutts: re: affiliate programs: “Duplicate content can happen. If you are operating something like a co-brand, where the only difference in the pages is a logo, then that’s the sort of thing that users look at as essentially the same page. Search engines are typically pretty good about trying to merge those sorts of things together, but other scenarios certainly can cause duplicate content issues”

    and

    Matt Cutts: re: 301 redirect of affiliate links: “People can do that”, but then “we usually would not count those as an endorsement”
    My Comment: Google will take links it recognizes as affiliate links and not allow them to pass juice.

  9. Matt Cutts: re: link juice loss in the case of a domain change: “I can certainly see how there could be some loss of PageRank. I am not 100 percent sure whether the crawling and indexing team has implemented that sort of natural PageRank decay”
    My Comment: In a follow on email, Matt confirmed that this is in fact the case. There is some loss of PR through a 301.
  10. Matt Cutts: No HTTP status code during redirect: “We would index it under the original URL’s location”
    My Comment: No surprise! (There is a redirect-checking sketch after this list.)
  11. Matt Cutts: re: use of rel=canonical: “The pages you combine don’t have to be complete duplicates, but they really should be conceptual duplicates of the same product, or things that are closely related”
    My Comment: Consistent with prior Google communication
  12. Matt Cutts: “It’s totally fine for a page to link to itself with rel=canonical, and it’s also totally fine, at least with Google, to have rel=canonical on every page on your site”
    My Comment: Interesting way to protect your site from unintentionally creating dupe pages. Just be careful with how you implement something like this. (There is a quick canonical-tag-checking sketch after this list.)
  13. Matt Cutts: “the crawling and indexing team wants to reserve the ultimate right to determine if the site owner is accidentally shooting themselves in the foot and not listen to the rel=canonical tag”
    My Comment: The canonical tag is a “hint” not a “directive”
  14. Matt Cutts: re using robots.txt to block crawling of KML files: “Typically, I wouldn’t recommend that. The best advice coming from the crawler and indexing team right now is to let Google crawl the pages on a site that you care about, and we will try to de-duplicate them. You can try to fix that in advance with good site architecture or 301s, but if you are trying to block something out from robots.txt, often times we’ll still see that URL and keep a reference to it in our index. So it doesn’t necessarily save your crawl budget”
    My Comment: One of the more important points of the interview: listing a page in robots.txt does NOT necessarily save you crawl budget. (See the robots.txt sketch after this list.)
  15. Matt Cutts: “most web servers end up doing almost as much work to figure out whether a page has changed or not when you do a HEAD request. In our tests, we found it’s actually more efficient to go ahead and do a GET almost all the time, rather than running a HEAD against a particular page. There are some things that we will run a HEAD for. For example, our image crawl may use HEAD requests because images might be much, much larger in content than web pages”
    My Comment: Interesting point regarding the image crawler. (A HEAD-versus-conditional-GET sketch follows this list.)
  16. Matt Cutts: “We still use things like If-Modified-Since, where the web server can tell us if the page has changed or not”
  17. Matt Cutts: re faceted navigation: “You could imagine trying rel=canonical on those faceted navigation pages to pull you back to the standard way of going down through faceted navigation”
    My Comment: This should conserve PageRank (and other link-related signals), but it does not help with crawl budget. Net-net: sites with low PageRank cannot afford to implement faceted navigation, because the crawler won’t crawl all of their pages.
  18. Matt Cutts: “If there are a large number of pages that we consider low value, then we might not crawl quite as many pages from that site, but that is independent of rel=canonical”
    My Comment: Lots of thin content pages CAN kill you.
  19. Eric Enge: “It does sound like there is a remaining downside here, that the crawler is going to spend a lot of its time on these pages that aren’t intended for indexing.”
    Matt Cutts: “Yes, that’s true. … You really want to have most of your pages have actual products with lots of text on them.”
    My Comment: Key point is the emphasis on lots of text. I would tweak that a bit to “lots of unique text”.
  20. Matt Cutts: “we said that PageRank Sculpting was not the best use of your time because that time could be better spent on getting more links to and creating better content on your site”
  21. Matt Cutts: more on PR sculpting: “Site architecture, how you make links and structure appear on a page in a way to get the most people to the products that you want them to see, is really a better way to approach it than trying to do individual sculpting of PageRank on links”
    My Comment: Google really does not want you to sculpt your site.
  22. Matt Cutts: “You can distribute that PageRank very carefully between related products, and use related links straight to your product pages rather than into your navigation. I think there are ways to do that without necessarily going towards trying to sculpt PageRank”
    My Comment: Still the best way to sculpt your site – with your navigation / information architecture.
  23. Matt Cutts: on iFrame or JS sculpting: “I am not sure that it would be viewed as a spammy activity, but the original changes to NoFollow to make PageRank Sculpting less effective are at least partly motivated because the search quality people involved wanted to see the same or similar linkage for users as for search engines”
    My Comment: An important insight into the crawling and indexing team’s mindset. Their view is that they want to see every page on the web, and they will sort it out.
  24. Matt Cutts: “I could imagine down the road if iFrames or weird JavaScript got to be so pervasive that it would affect the search quality experience, we might make changes on how PageRank would flow through those types of links”
    My Comment: Even though a particular sculpting technique may work now, there is no guarantee that it will work in the future.
  25. Matt Cutts: “We absolutely do process PDF files” … “users don’t always like being sent to a PDF. If you can make your content in a Web-Native format, such as pure HTML, that’s often a little more useful to users than just a pure PDF file” … “There are, however, some situations in which we can actually run OCR on a PDF”
    My Comment: Matt declined to indicate if links in a PDF page will pass PageRank. My guess is that they do, but they may not be as effective as HTML links.
  26. Matt Cutts: “For a while, we were scanning within JavaScript, and we were looking for links. Google has gotten smarter about JavaScript and can execute some JavaScript. I wouldn’t say that we execute all JavaScript, so there are some conditions in which we don’t execute JavaScript. Certainly there are some common, well-known JavaScript things like Google Analytics, which you wouldn’t even want to execute because you wouldn’t want to try to generate phantom visits from Googlebot into your Google Analytics”.

    and

    Matt Cutts: “We do have the ability to execute a large fraction of JavaScript when we need or want to. One thing to bear in mind if you are advertising via JavaScript is that you can use NoFollow on JavaScript links”
    My Comment: You can expect that their capacity to execute JavaScript will increase over time.

  27. Matt Cutts: “we don’t want advertisements to affect search engine rankings”
    My Comment: Nothing new here. This is a policy that will never change.
  28. Matt Cutts: “might put out a call for people to report more about link spam in the coming months”
  29. Matt Cutts: “We do a lot of stuff to try to detect ads and make sure that they don’t unduly affect search engines as we are processing them”
    My Comment: Also not new. Google is going to keep investing in this area.
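
Points 9 and 10 above are easy to sanity-check for yourself. Here is a rough sketch (my own illustration, not something from the interview) that walks a redirect chain and prints the status code of each hop, so you can confirm that a moved URL really answers with a 301 rather than a 302. It assumes the third-party requests library, and the URL is just a placeholder.

```python
# Illustrative sketch: inspect the redirect chain for an old URL and report
# the status code of each hop (301 vs. 302, etc.). Uses the third-party
# `requests` library; the URL below is a placeholder.
import requests

old_url = "http://www.example.com/old-page.html"  # placeholder URL
response = requests.get(old_url, allow_redirects=True)

# Each entry in response.history is one redirect hop.
for hop in response.history:
    print(hop.status_code, hop.url, "->", hop.headers.get("Location"))

print("Final:", response.status_code, response.url)
```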
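
Point 12 is just as easy to spot-check on your own pages. The following is a minimal sketch, using only Python’s standard library, that pulls the rel=canonical tag out of a page and compares it to the page’s own URL; again, the URL is a placeholder.

```python
# Illustrative sketch: extract the rel=canonical target from a page and see
# whether the page canonicalizes to itself (which, per point 12, is fine).
# Standard library only; the URL below is a placeholder.
from html.parser import HTMLParser
from urllib.request import urlopen


class CanonicalFinder(HTMLParser):
    """Records the href of the first <link rel="canonical"> tag seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "link" and (attrs.get("rel") or "").lower() == "canonical":
            if self.canonical is None:
                self.canonical = attrs.get("href")


url = "http://www.example.com/widgets/blue-widget"  # placeholder URL
finder = CanonicalFinder()
finder.feed(urlopen(url).read().decode("utf-8", errors="ignore"))

if finder.canonical is None:
    print("No rel=canonical tag found")
elif finder.canonical.rstrip("/") == url.rstrip("/"):
    print("Page canonicalizes to itself")
else:
    print("Page canonicalizes to:", finder.canonical)
```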
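
For point 14, keep in mind that robots.txt only controls fetching: a blocked URL can still be referenced in the index if other pages link to it. Here is a small sketch, with placeholder URLs, that checks whether Googlebot is allowed to crawl a given URL according to your robots.txt.

```python
# Illustrative sketch: check whether a URL is disallowed for Googlebot in
# robots.txt. Per point 14, a disallowed URL is not fetched, but it can still
# show up in the index as a bare reference if other pages link to it.
# The domain and path below are placeholders.
from urllib.robotparser import RobotFileParser

rp = RobotFileParser("http://www.example.com/robots.txt")  # placeholder domain
rp.read()

kml_url = "http://www.example.com/maps/locations.kml"  # placeholder path
if rp.can_fetch("Googlebot", kml_url):
    print("Googlebot may crawl this URL")
else:
    # Blocked from crawling, but not necessarily kept out of the index; a
    # crawlable page with a noindex meta tag (or simply letting Google crawl
    # and de-duplicate) is the way to keep it out of search results.
    print("Googlebot is blocked from crawling this URL")
```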
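
Finally, points 15 and 16 contrast a separate HEAD request with a conditional GET. This rough sketch shows both; it assumes the third-party requests library, and the URL and timestamp are placeholders.

```python
# Illustrative sketch: a plain HEAD request versus a conditional GET with an
# If-Modified-Since header. An unchanged page can answer the conditional GET
# with a bodyless 304, while a changed page returns the full content in the
# same round trip. Uses the third-party `requests` library; the URL and the
# timestamp below are placeholders.
import requests

url = "http://www.example.com/some-page.html"  # placeholder URL

# HEAD: headers only, but the server does most of the same work anyway.
head_resp = requests.head(url)
print("HEAD:", head_resp.status_code, head_resp.headers.get("Last-Modified"))

# Conditional GET: either a 304 "not modified" or the updated page, in one request.
resp = requests.get(url, headers={"If-Modified-Since": "Sat, 01 Jan 2011 00:00:00 GMT"})
if resp.status_code == 304:
    print("Not modified since the given date; no body transferred")
else:
    print("Fetched", len(resp.content), "bytes of content")
```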

So if you got this far, you must be really interested in Matt’s thoughts on search and webspam. Check out the rest of the interview for more!

Comments

  1. says

    So I assume he is talking about one domain to another? Not internal 301 redirects that are done very often through design changes?

    What about the Webmaster Tools feature that signals directly to Google that a domain is moving? Is there decay there?

    What about the canonical tag that Google has preached to use? Is there decay there?

    Why does Google say something like this without explaining the full story?

  2. says

    Hi Evans,

    No question, a 301 redirect is the best solution for dealing with moved content. The cost, in terms of lost link juice, is most likely small. The reason why there is any cost at all? As I noted in the interview, it’s because the person who gave the link to the content linked to it in its original location, not the new location, and that context might have impacted their willingness to link to that content.

    Eric

  3. says

    Sure Eric, that makes sense, but since the linker did not know the URL changed, should decay occur? No, but hey, it’s up to Google, right? It seems like they are more protecting their own algo or relevancy by doing this.

  4. says

    We have seen a tremendous amount of success from targeting your top products and then building out your site from there. The link juice does follow, and the bottom line is that you convert better on those products (which is really what matters).

  5. says

    So what you are saying, Eric, is that it behoves us to find our backlinks… get a list ready of who is linking to us, find their addresses, and send out that mailing as soon as the 301s are implemented (like when moving domains)… that would mean some time spent doing this. Is it worth the cost of the time?

    It would certainly be sensible to make sure any high ranking sites that linked to one’s site were informed and you followed up to make sure it had been done.

    Perhaps it’s a question of balance here, if the cost is not extremely high?

    It’s a good interview, btw, and it’s good that you made the key factors into links on here. I am not much of an SEO person, I just know the basics, but I could follow the interview quite easily, though a few terms went over my head. One aspect interested me, about JavaScript: I wanted to know more about JS links being followed. (It might help to have more links to terms within the post, and fuller details about certain things linked to as well.)

    I have a friend who uses a JS menu at the top of the site with a text link menu at the bottom… she says it’s so that the last thing the bot sees are the links, and they see the content first. I prefer to make more user-friendly and accessible website(s) in the first instance, so I wanted to know more about this aspect. Any resources you can point me to?

    Thanks for taking the time to do this.

    Tina Clarke

    • says

      Hi Tina,

      There is no question that after you move some content or a domain, it makes sense to go to the top pages linking to that content and ask them to update their links. This is a strong positive signal to Google that the move is legit. Remember that (I believe) the key issue is that the person who originally granted the link may not have chosen to do so with the content in its new location.

      So, if you have 20% of the people who linked to the old location update their links to the new location, this is a strong positive signal. Then they might even credit the other 80% of the links fully (just a theory, but you can see why it might be true).

      Regarding the JavaScript, there is no clear published data on this. We know that Google executes some, but not all, JavaScript. To me, the best practice is to review what Amazon (or eBay) does, because Google is likely to make that work.

  6. thank you eric says

    Great interview for people like me that know very little. If I can post a few extra questions for your knowledgeable community.

    My policy on my tiny ecommerce site, which sells products that I manufacture locally, is no outgoing links. Amazon does not allow outgoing links, so I copied them, and I also thought that if Google looked at my site and saw good-quality incoming links, which it has, and no outgoing links, it would know the links were earned, not swapped. I have no ads, as I make money from ecommerce.

    Should I carry on with this strategy? I am emailed constantly by sites wanting a link in exchange for one on their site.

    My other question is this: Google’s results for shopping terms seem to me to be all affiliate sites, which, as I understand it, are adverts with a lot of content. I don’t understand, then, when Matt says search results should not be ads?

    • says

      Hi thank you eric – I’d avoid swapping links with sites that are not related to yours. I’d also avoid swapping links at a level where it starts to become a large percentage of your overall link profile.

      I think what Matt said is that the search results should not be other search results. The other thing he said is that Google tries to treat ads as if they were not editorial votes for purposes of search ranking. Highly commercial sites, such as some affiliate sites (not all affiliate sites are highly commercial) can appear in Google’s results if they are well optimized and have a good link profile.

  7. says

    Excellent post and interview, thank you Eric. I’d agree that many affiliate sites do appear very highly in search results when those sites are well optimised and take care to be full of content that is original in all except the names of the products they are selling. I’d also agree that a small number of relevant links from quality sites is far better than a stack of links from unrelated sites.
    I do have one question about duplicate content, specifically articles. Let’s say you write an article to promote a product, and this article is not published on your site but distributed through article directories, and republished on blogs, etc. Is the fact that so many inbound links are coming from duplicated content likely to adversely affect your site’s SE profile and ranking? I’ve not been able to find a satisfactory answer for this.

    • says

      Hi Joe – my experience is that this is not a problem. Of course, there are always extremes that someone might go to with any SEO technique. So as a component of your marketing strategy this is fine. One other note – make sure you are focusing on quality article directories, because the crappy ones won’t help you.

  8. says

    “In a follow on email, Matt confirmed that this is in fact the case. There is some loss of PR through a 301.”

    Let’s back up for a second here: what would count as some loss, a PR drop from, e.g., 4 to 2? I had that before, and Google also takes their sweet time to pass any PR from example1.com to example2.com when a 301 is in place.

    The last site I did this on, it took them over 30 days, and half of the PR was slashed.

    Change domains and use a 301 only if you have no other alternatives.

    Thanks to both of you for a great interview,
    Emil

  9. Rowan says

    Just referring to the section in the main interview regarding 302 redirects for ads. Squidoo is now using 302’s for all inc content links. Not sure how long this has been the case, I’ve only recently started using it. Does this mean that links from Squidoo do not pass PageRank?

  10. says

    28th. “might put out a call for people to report more about link spam in the coming months”

    Does this mean that in the near future there will definitely be an end to splogs, aka spam blogs?

  11. Jack says

    Eric,
    Thanks for sharing. Great interview. Regarding PDFs, what is the impact of posting a PDF to a page for download that is very similar (if not identical) in content to the web page itself? Is there any duplicate content issue here to be concerned about?

  12. Diane Tuman says

    Question: If I have a handyman business in a large metropolitan region, and I wanted to capture businesses in certain neighborhoods, is it beneficial to create 5 different websites that target keywords for those locations and create original content for all those sites, but have the same e-mail contact address and phone number on each site? Is that cool? Or, would I get penalized? Or, is it not worth it?

  13. says

    Hi Diane – my own personal preference is to put all the content on a single domain. The biggest reason for this preference is that every web site you create is a new marketing problem. I.e., you have to go through the process of getting links to each of the sites.

    That said, I do know that sometimes having content on a separate domain can work pretty well for a small business like yours. But, we have never tried that, so perhaps someone else can comment on that.

  14. Diane Tuman says

    Thanks, Eric. If I created domains with strong keywords in the URL and redirected them to my main site, does that help? Or is it all a waste of time and money?

  15. says

    Eric:

    My favorite tidbit is the reference to using optimal site architecture in consideration of PageRank decay as you move further from the root.

    I am a fan of theming content for content-rich sites, but for powerhouse landing pages, nothing gets SERP respect more than using flat site architecture to get that link equity working on behalf of a site.

    The tier link method Matt suggested about keeping your main pages in the root, then linking to ten more second tier pages and then 10 more from the second tier to the third (in order of importance) is something we have been preaching for a while and I must say, it is the golden grail when it comes to buoyant page rank.

    You have a commendable array of interviews, but I think this one is my all-time fave. It’s nice to get confirmation on so many levels for SEO tactics and techniques we employ, and to hear validation from the source is nice for a change.

    Kudos to you (for getting the interview) and thanks Matt for these morsels of SEO validation.

    All the best…

  16. says

    Fantastic interview Eric, and some really good questions (and answers from Matt) to clear a number of cloudy SEO areas.

    Loved the SEOMoz cartoon as well!

  17. Jim says

    Enjoyed the interview but frustrated we didn’t learn anything advanced SEOs didn’t already know.

  18. says

    Hi Jack – The PDF is a tricky one. Google does not seem to think PDFs are a great idea, so my guess is that they would treat the PDF as a duplicate page and ignore it.

    There is no great harm in that, as this should not affect the ability of your other pages to rank.

  19. says

    Hi Eric

    Nice interview and summary, thanks. I don’t know if anyone else has picked up on this yet:

    MC: “One thing to bear in mind if you are advertising via JavaScript is that you can use NoFollow on JavaScript links”

    From this can we assume that some JS links pass PR?

  20. says

    Hi Reuben – Google can, and does, execute lots of JavaScript to allow it to find links, and Google will allow those links to pass PR. However, what is not known is which forms of JavaScript they are able to execute.

  21. says

    “You can try to fix that in advance with good site architecture”

    OK, but what if you’ve got poor architecture in a site that you are working on, but you’ve also got backlinks, printed promotional materials, emails, etc. that promote specific addresses on that site? If you redo the site, you’ll have to use 301s to get people to the right place, and the 301s “don’t pass juice.”

    So, what’s more important, or is there a way around this dilemma?

    • says

      Hi Karl – 301s do pass the juice. There is some loss, but it is pretty small. So if your site architecture is messed up, it is still worth fixing it, and then using the 301s to pass the juice around.

  22. says

    I am a big fan of Matt, and tonight I’m going to read this whole interview.
    Thanks, stonetemple.com, and also Eric, for providing us with this great material.

    Again thanks

  23. says

    I think the fact that Matt is still using the term PageRank, even though they took it away from Webmaster Tools, and despite the rumors going around that PR won’t matter anymore, goes a long way in saying that PageRank is still a very important part of the Google algorithm and will be for at least the near future.

  24. says

    “The tier link method Matt suggested about keeping your main pages in the root, then linking to ten more second tier pages and then 10 more from the second tier to the third (in order of importance) is something we have been preaching for a while and I must say, it is the golden grail when it comes to buoyant page rank.”

    Are you saying to put your main pages in the root, then put your second lot into a folder and the third lot into another folder? I do it this way on all my sites apart from my very first site, an 11-year-old site.

    Thanks

  25. says

    “Hi Karl – 301s do pass the juice. There is some loss, but it is pretty small. So if your site architecture is messed up, it is still worth fixing it, and then using the 301s to pass the juice around.”

    I was reading elsewhere that you should use 302s for a short term while you get all your backlinks updated to the new page(s), then switch to 301s, so that the ‘loss’ is likely to be less.

    This does make sense

    What do you think?

    Tina

  26. says

    Hi Eric

    This is my first time to read anything from you, but I have to say you’ve got great stuff. Getting an interview with Matt already establishes you as an expert, but reading your excellent summary plus comments assures me you know what you’re talking about :)

    Whatever we think, even Matt said in the interview that it’s something that needs to be tested. Testing is something we all should be doing consistently and it’s tests that reveal the best way to navigate through all the SEO mumbo jumbo!

    Most of us are aware of what works: great content and good structure are a recipe for success. Everyone tries for some shortcuts, but in the end it’s a great user experience that gets people coming back, and possibly a link while you’re at it ;)

    Thanks, I’ll be reading more of your posts now!

  27. says

    Throwing in that your sitemap.xml can only get you so far. Building links back to the category pages helps the SEs crawl from there. And if you are doing any article submission, be sure to link the other sources back to the original on your site.

  28. says

    A new client has used all the sitemap variations: .xml, a .htm with a list, a .html with a list and descriptions but no template around it, .xml.gz, and .xsl. When I asked him about it he said he was using /www.automapit.com/index.html. Isn’t that overkill? Isn’t all you need an .xml, plus a .htm with a template, links, and descriptions?

    Thanks
    Tina

  29. says

    Very, very long and very, very informative interview, Eric. I am not an expert in SEO in any way. I would rather describe myself as a “hobby webmaster”. Some of the issues covered were therefore over my head, but I was able to understand most of the interview (with Google opened in another window). I am not quite sure how I can use this new knowledge with regards to my own website, but wiser I am. Thanks

  30. Linda says

    Hi Eric,
    Great interview! What was the date of the actual interview? (not the publication date…)
    Thanks!
    Linda

  31. says

    It’s awesome that you got a chance to visit with Matt. I gleaned a lot from reading the interview.

    Thanks for taking the time to share it.

  32. says

    When you say, in section 4, “and that’s why it looks like it has less good content”, are you referring to “TrustRank”, the overall quality of the site?
    For the 301, we have noticed changes; can you confirm?

    Cordially

  33. says

    The crawling section, with regard to the “NoIndex and NoFollow metatags” question, cleared up a lot of confusion on that topic, and on how Google will make its bots crawl the site. Some Google clarification, finally.

    Excellent interview and post.

  34. Anil Kumar Pandey says

    Nice set of interview questions. A must-read for every SEO. Thanks to Eric for sharing this informative interview with us.

  35. says

    The reporting of spam would be a good addition, but how Google would be able to check all the reports is the question that would need resolving, as there is sure to be competitor activity with wrongful allegations.

  36. says

    The most interesting fact for me was the confirmation that 301 redirects don’t pass 100% of the PageRank. I know there have been rumors about that for a long time, but it’s hard to explain it without proof.

  37. says

    Just re-read this, and it’s a shame to see that a lot of the things Matt suggested be actioned are still not done – nearly a year now, guys!

  38. says

    Interesting interview. A lot of the information was new to me!
    Is there a newer interview online? Maybe I should search a bit…

  39. says

    You appear to give the impression that the majority of what Cutts says is for the benefit of search users; am I right? Fair play, given you do need to step back sometimes and assess people’s motives. Cutts’ paycheck derives from Google, therefore he will always have their interests first and foremost (again, fair play), but there are a number of disingenuous comments in there which I am inclined to believe are false truths. You always have to consider that whatever the guy says, or appears to intimate, is in fact the way they would ‘LIKE’ you to do things and not necessarily the way they currently CAN do things themselves – remember, it’s a machine running a program, and that will always have its limitations.

  40. says

    Nice interview. I have seen several interviews with Mr. Cutts over the years, and this is one of the better ones, IMO. Interesting to see that he confirms that a 301 makes you lose some link juice.

  41. says

    What is the difference between “merged” and “discarded” pages? At one point in the interview, Mr. Cutts says, paraphrasing, that if three crawled pages have the same content, two might be discarded. Later, still paraphrasing, he says that the link juice to three similar pages is not necessarily lost, because they can be merged. Are merging and discarding two different processes? What about “omitted” pages, not mentioned in the interview, but mentioned at the end of SERPs in “In order to show you the most relevant results, we have omitted some entries very similar…” Is this yet a third category?

    • Eric Enge says

      Hi Dominic –

      By discarding, Matt is referring to Google ignoring one of the duplicate pages in favor of the other. The ignored page is effectively “discarded” and cannot show in the search results.

      Merging as discussed here refers to the PageRank/link juice. If Google sees two duplicates, and sees that both copies have some links to them, they *may* treat it as if the links that pointed to the discarded page went to the page they did not discard. In other words, they “merge” the link juice that points to the two pages into link juice pointing to the one page. However, my opinion is that this is not something they do particularly effectively.

      Omitted in the context you mention does refer to something slightly different in that pages can be omitted even if they are not fully duplicate. They may get omitted from being displayed if Google does not think that they are valuable pages to show as a response to the search you did, even though they do match at some level. Note though, that you can go ahead and ask Google to show you the omitted pages, but you cannot ask them to show you the discarded pages.

      Hope that helps!

      • says

        Yes it does help a lot. Still a bit confused. Just checking my understanding. Discarded pages can be merged or not. If they are merged, they are indirectly indexed, but under a canonical URL, just like in a 301 redirect, so that their link properties can be added. If they are not merged, they are simply not a part of the index at all. If they are omitted, they are a part of the index, not discarded, not merged, but simply omitted for that particular query. The part that confuses me is why Google would discard a crawled page (without a noindex meta tag) that is similar to another one that is acceptable? What could be wrong with that page and not with the other one that is similar?

        • Eric Enge says

          Hi Dominic,

          When we talk about a discarded page, we are talking about one that is 100% duplicate (or very close to that). There is no reason to keep it, as it adds no unique value to the user. So the page will not be shown in the search results. It may in fact be in the index, but never be shown in search results because Google has a copy of the same content that it can return.

          • says

            Shouldn’t these pages always be merged with the almost identical one? Let’s say they are 100% identical; why not merge them?

  42. Eric Enge says

    In principle, they should but in practice they don’t. I believe this is because it is much more complex than it seems on the surface. It is easy for us to say do that when we think of two pages side by side that appear identical. However, the web has trillions of web pages, and there is a lot of complexity in everything that Google does.

    • says

      Interesting! The canonical tag should then help. They say that it is handled like a 301 redirect, except that they check for obvious flaws and, of course, they reserve the right to change the way it works if people abuse it.

  43. says

    Soooo,

    What I am getting are mixed signals, then. A newsletter I just got from Social Media Sun uses your article to suggest we stop doing news from press releases, or any kind of “rewrite” journalism. I feel bad taking the “angry” approach of late, being a fan of Cutts and Google for years, but to be honest, Panda and subsequent Google updates have all but killed me, and no doubt thousands of other good writers.

    All of this favors The New York Times, and does in citizen journalists or anyone wanting to break into writing for the web. Not everyone has 24 hours a day to play SEO expert, be a techie, follow the Google Official Blog, get hammered by trolls over at Webmaster, let alone take care of editing and so on.

    So the reason no one reports on our press releases is for fear of stumbling over some Google rewrite algorithm? But SFGate can embed PRWeb releases and scrapers can steal my articles? Then, I have researchers telling me Google is giving what I would call “no surfer left behind” readers what they want – “munchable bites” – I thought Digg was all but dead?

    Sorry, but I feel like that Armadillo running back and forth in the middle of the road. Which way do I go now? Are we elevating here? That’s all I want to know.

    Phil

    • Eric Enge says

      Hi Phil – It’s hard for me to comment on the specifics of the situation, but of course, the concept of creating news from press releases is an old one. One large factor is whether or not the press release is simply being republished. Well-branded news sites certainly have little trouble doing exactly that, but that is because of their brand. For less well-branded sites, I would suspect that it’s a good idea to add thoughtful analysis and commentary, or to combine it with other content. Of course, even in a world without Google that would be true.

      But as I say, I don’t know enough about the situation, or about how Google will treat this kind of news spreading, to comment.

  44. says

    Thanks for the investment in time to summarize key points from the interview with Matt. Also thanks for your editorial comments, as they proved helpful to me in clarifying/supporting points which were made.

    Thanks again!

    Andy

  45. says

    It would be good if you could get another interview with Matt and ask him about the new social media aspects and their effects, like Google Plus, as I hear reports of big jumps when linking your website to Google Plus.

    That aside, a good read and a good interview with Matt.
