Cedric Dupont is the Product Manager for SearchWiki at Google, Inc. He has been with Google for almost 2 years (since June 2007). Prior to Google he worked as a consultant at Bain & Company, where he worked on strategy and due diligence for Private Equity deals for funds in the US and Europe. Prior to Bain he worked as a Senior Research Engineer at Volkswagen.
Eric Enge: Can you provide an overview of what SearchWiki is?
Cedric Dupont: SearchWiki is a feature we announced a few weeks back. It is a feature that gives users more control over their search results. So, you can re-rank results or you can add a result that you feel is missing. You can delete results that you don’t want and you also can add notes to specific results for yourself.
The next time that you do a search, we’ll respect that new modification, so we’d show your modification again. So, the changes that you make only affect your own rankings, but you can also see how other people have ranked or modified their searches by going to a special page, what we call the All-Notes page.
You can go to the All-Notes page, do a search for something you think is popular, and you’ll see a bunch of websites that have been promoted and deleted for this search, and a bunch of comments that have been added as well. What we show in terms of rankings is an aggregate number of deletions and promotions from our other searchers.
For comments, we associate the comments with a nickname that is pulled from your Google account. The reason we are doing this is because the web is becoming more and more participatory in nature. So, people interact, they are no longer just passively pulling from each corner of the Web. People are interactive, they post stuff, they comment on things, they want to change what they see. Search is adapting to this phenomenon, and therefore becoming more interactive and more participatory.
Eric Enge: What were your goals for the whole program?
Cedric Dupont: Our goal, as it always is, is to make our users happy. So, this is taking customization to the next level. Say there are results that you don’t want to see for certain queries, you can delete them. If there are results that you think are missing, you can add them, and we’ll respect that every time you are searching in the future. So, we want to give you exactly the results that you would like to see for a query. Some people are probably already happy with what Google does for them normally, and they’ll be even happier in the future with all the search improvements that we are making every day. But, some people just really want their search results the way they want them, and for these people SearchWiki brings them that level of control.
The other thing that we’ve seen people use SearchWiki for is repeated searches. I am sure you can think of things that you keep searching for over and over again and you still don’t remember the URL. SearchWiki will remember your searches and make sure that you’ll find whatever you’re looking for much quicker and easier, by putting that site on top. So, that’s the behavior that we are seeing, and it’s the behavior that we have also seen in the experiments that preceded this one, so there are basically no surprises here.
Eric Enge: Right. So, do you anticipate allowing a more refined type system where the user can pick the first result, the second result and the third result, and re-order at that level?
Cedric Dupont: Yes. There are ways that you can order it in a very granular fashion. And, it just depends on what order it is in. Every time you promote a result, it goes to the bottom of all of the results you have promoted so far. If you promote it again, then it will go to the top, so there is a way of playing around with the order so you can basically get the exact arrangement that you want. But really what we’ve seen in user tests is that people just say, no, this thing is part of the bucket of things that should be on top, and this thing is something that I want to see. It’s very rare that people want to take result #8 and say no, this should really be result #5. That’s not something that we have seen people demand, at least not in user studies.
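Editor's sketch: the promotion behavior described in this answer can be modeled with a small per-query structure. This is only an illustration under stated assumptions, not Google's implementation; the class name `QueryOverrides` and its methods are hypothetical. It assumes a first promotion places a result below earlier promotions, and promoting it again moves it to the top of the promoted bucket.

```python
from typing import List, Set


class QueryOverrides:
    """Hypothetical per-user, per-query edits layered on the default ranking."""

    def __init__(self) -> None:
        self.promoted: List[str] = []  # shown first, in this order
        self.deleted: Set[str] = set()

    def promote(self, url: str) -> None:
        if url in self.promoted:
            # Promoting an already promoted result moves it to the top.
            self.promoted.remove(url)
            self.promoted.insert(0, url)
        else:
            # A first promotion lands below the earlier promotions.
            self.promoted.append(url)
        self.deleted.discard(url)

    def delete(self, url: str) -> None:
        if url in self.promoted:
            self.promoted.remove(url)
        self.deleted.add(url)

    def apply(self, default_results: List[str]) -> List[str]:
        # Promoted results come first; deleted results are dropped.
        rest = [u for u in default_results
                if u not in self.deleted and u not in self.promoted]
        return self.promoted + rest
```

A result that is promoted but absent from the default list still appears in the output, which loosely mirrors adding a result that you feel is missing.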
Eric Enge: Right. So, you mentioned that one common use is repeated searches. Are there other common uses that you can comment on?
Cedric Dupont: Yes, there are. We also see that users use SearchWiki as a means of expression. So, one experiment that we launched was during the election. There were websites that were created just for the Proposition 8 campaign. When you would search for Proposition 8 and go through SearchWiki notes, you could see a similar amount of opinions that go one way or another. You would also see it just as a means of expressing an opinion, which is interesting in this case of course.
People start using Google as a bookmarking tool, and this makes it just much, much faster to find what you want. We also found people that are really out there to clean up the web. They want to clean spam out of their search results, and they really go in and have a large volume of deletions of websites that they consider spammy. So, these are the major cases that we observed.
Eric Enge: Right. So, obviously this is a major step towards personalization of search. Do you see interaction of this program with other personalization initiatives?
Cedric Dupont: Personalized search is a little different. We are trying to adjust search to the profile of the searcher. Here, it’s much more direct, so it’s a much simpler feedback tool. I mean, you are telling us this is what I want to see on top and this is what I don’t want to see, and we give that back to you every time you search. So, at this point it’s completely independent of personalized search.
Everyone starts with a different page, and you start doing SearchWiki actions on those from a different page.
Eric Enge: Is there a plan to offer a way to opt out for people who want one?
Cedric Dupont: SearchWiki is a feature of search and we typically don’t let people opt out of features. You can’t just say, well, I don’t want any news results, or I don’t want any book results appearing on my page. I wouldn’t talk about opt out in any case, but what I can imagine is that people who don’t want to see SearchWiki for one reason or another could choose a setting that turns that off for their particular account. But, at this point we don’t see an opt out feature being available, and I think the numbers show that it is not something that is widely requested or needed.
Eric Enge: Right. So, a while back at the Web Conference in Europe Marissa Mayer commented a bit on the potential for SearchWiki to be used as a ranking signal for other people’s searches. Can you talk about that a little bit?
Cedric Dupont: We are constantly looking at every signal that we can imagine to improve our search. SearchWiki is a new feature and it will provide a new signal. We are carefully looking at it to see if it’s usable. We just launched a few weeks ago, and at this point there is really nothing to announce. That doesn’t mean we are closing the door on using SearchWiki as an additional signal though.
At this point, we are mostly focusing on making the SearchWiki users happy, and increasing the usefulness of that feature more than anything else.
Eric Enge: But in the aggregate you could potentially collect some interesting data, right?
Cedric Dupont: A simple example is if we suddenly get a very large fraction of dissatisfied users with a site that is shown across the board, we could manually review that site. These are some very simple things that we can imagine doing early on, but again, at this point we’d like to improve the usefulness of the features before anything else.
Eric Enge: Right. Now, that certainly makes sense. I mean I could see that it certainly has the potential for spamming tactics that people could try to implement. It’s part of the landscape of what you would have to deal with in using it as a ranking signal.
Cedric Dupont: And, that’s not unlike what happens in the web in general. So yes, we are certainly very much aware that we have to create a robust spamming control.
Eric Enge: Sure. But, you guys are used to that.
Cedric Dupont: Yes, exactly.
Eric Enge: Even without the issue of spamming, you have to consider the general noisiness of the signal, and how good an indicator it really is of what should be the best search order.
Cedric Dupont: Right. It’s important to understand that the first order improvement is what you created for yourself. And then, the second order is whether the signal is powerful for one thing or another. So, really how people are using SearchWiki and how is it improving their results, that is what’s really important to us right now.
Eric Enge: Are you going to be doing some things in terms of evaluating the reputation of participants who provide comments?
Cedric Dupont: We are always looking at things like comment quality, quality of ratings and things like that. Evaluating the trustworthiness of each of these inputs will have to be part of any signal that we would use for ranking. So yes, we’ll have to do it right if we want to use the signal more broadly, and not every rating or comment will be considered equal if we do our job right.
Eric Enge: Right. There should be some method that allows you to basically track user behavior, and the kinds of things that they are doing and seeing. Maybe evaluating whether or not they are a set of users that are considered trustworthy enough to be part of that bigger evaluation.
Cedric Dupont: Yes, and these are very simple things. Behaviorally you can imagine the type of spamming that you would get. For instance, comments that are posted many, many times across many, many sites would be a warning signal. Maybe a comment that has been posted many, many times to many, many sites is less valuable than another less-posted comment. Again, it’s still very early and we have only been released for a few weeks, but these are the types of things we are observing carefully and we are very optimistic on how well we can use them to limit abuse.
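Editor's sketch: the warning signal mentioned in this answer, the same comment posted across many sites, could be approximated by counting distinct sites per normalized comment text. This is a rough illustration only; the function names and the `max_sites` threshold are assumptions, and nothing here reflects how Google's automated systems actually work.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial variants match."""
    return " ".join(text.lower().split())


def widely_repeated_comments(
    comments: Iterable[Tuple[str, str]], max_sites: int = 3
) -> Dict[str, int]:
    """Map each comment text seen on more than max_sites sites to its site count.

    comments is an iterable of (site, comment_text) pairs.
    """
    sites_by_text: Dict[str, Set[str]] = defaultdict(set)
    for site, text in comments:
        sites_by_text[normalize(text)].add(site)
    return {
        text: len(sites)
        for text, sites in sites_by_text.items()
        if len(sites) > max_sites
    }
```

A flagged comment would not necessarily be removed; as the answer suggests, it might simply be treated as less valuable than a less-posted comment.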
Eric Enge: Right. It’s good that you’re giving users what they want by itself right out of the box. Given the problems with ranking signals, the most famous being links of course, being able to diversify sources of information is a good thing. I mean, you could see an interaction that somebody seems to have all kinds of great things happening in the world of links. Everybody encounters their site in the results, but a large percentage seem to be removing it. So, that becomes suggestive of bad behavior.
Cedric Dupont: Yeah, it’s an interesting idea.
Eric Enge: And, of course you can do it vice-versa, right? If large numbers of users are promoting their site, but the links aren’t following. I mean, those are two very simple examples, but it’s really good to have multiple signals to offset each other.
Cedric Dupont: Right. So, we are really looking forward to the types of things that we’ll see. And again, it’s a very recent launch, and we have a few things that we want to deploy and that we think will make it more useful for users. We are very excited, and we are looking forward to this being a great thing for searchers.
Eric Enge: What are you doing to deal with inappropriate comments today?
Cedric Dupont: Well, today when you come across inappropriate comments you can flag them as inappropriate. And, what that does is it begins the review process. If the comments need to be taken down from public view, they will be. Of course we also have automated tools that review content.
Eric Enge: So, the automated tools look for things like bad language?
Cedric Dupont: I can’t share specifics, but I can imagine that that’s one of the things that you could look at.
Eric Enge: Are you considering any other kind of comment verification?
Cedric Dupont: Not at this point. Any filter would have to be very, very scalable, because we have a lot of users and a lot of searchers. So, at this point we haven’t considered that, but we do have automated systems that try to figure out whether a comment is appropriate or not.
Eric Enge: Have you thought about implementing something where you could keep private notes which aren’t available to the general public?
Cedric Dupont: At this point, no. Comments are publicly visible, but they are associated to your nickname, which obviously is anonymous in a sense, because you can choose whatever you want as your nickname. And, it doesn’t point to a specific URL or something that can tell me that this John Dale is the same as that John Dale.
Eric Enge: Sure.
Cedric Dupont: Your comments are as anonymous as you want to make them, depending on how anonymous your nickname is. And so, we feel like that strikes the right balance.
Eric Enge: It would also be interesting to potentially have a way to get all your comments through a permalink or something like that.
Cedric Dupont: That’s an often requested feature. Imagine that somebody asked you what the best LCD TV to buy is. All you’d have to send is a permalink of your search for LCD TV, where you have promoted and demoted the sites that you think are relevant or not. You may have added a few notes here and there, and you just send that as a link. And so, now the person receiving it has the full context, including your notes. That’s a very, very frequently requested feature, and we think it would be valuable, yes.
Eric Enge: This would allow people to share their conclusions, and you can see that playing into question and answer sites where people ask these questions.
Cedric Dupont: Yes, that request often comes up in forums. People want to be able to just send a link instead of having to write and then save a bunch of URLs. It would be something valuable.
Eric Enge: Right. And, what can you say about how the order of the comments is determined?
Cedric Dupont: We try to rank the better comments first, but we also try to show some variety and try not to show duplicate comments.
Eric Enge: Or, perhaps extremely similar comments.
Cedric Dupont: Right. To be completely honest, there is a lot of work to be done here. These are things that are pouring in right now, and we’re improving the ranking system as much and as fast as we can. But, this is definitely an area where you will see changes, because if you play with it, you will see that it can be improved pretty dramatically. So again, whatever I would say today would probably be wrong tomorrow. All these things we’re talking about are really only relevant for extremely top level sites where there are a lot of comments.
Eric Enge: Do you take into account user votes on the comments?
Cedric Dupont: Things like that, yes. There are many things that come into this ranking function, and it will improve quickly because it is so young.
Eric Enge: It’s a pretty novel thing, because a lot of places on the web display comments in the order they are entered, so you don’t attempt to rank them.
Cedric Dupont: Yes, and there is a benefit to that but there is a drawback too, which is that if somebody is really motivated he can make sure that his comment will always be there, because he updates it very quickly. On the positive side, it guarantees some freshness and some discovery, so chronological order is not a bad thing to do as a start. We find it useful to rank according to relevance, but I don’t think chronological is bad at all.
Eric Enge: They are two different things with different goals.
Cedric Dupont: Yes. Exactly.
Eric Enge: When you have a traditional-threaded discussion there is a reason for the chronology because of the way things feed off of each other. Here you are trying to surface to the top the comments that add the most value to the search results.
Cedric Dupont: Right. You mentioned something that’s very important I think.
You write notes to yourself, and you don’t see the context of the other notes at that point unless you go to the all-notes page and then do it there. But, there is no guarantee that your comment would be following this other comment by this other person that you are responding to. That’s an important thing to note. For our purposes, the conversation aspect of things is less useful and less clear. It’s not like you are on a certain web property where you have some assurance that other people will go with the same interest. So, we don’t really worry about supporting a conversation here.
Eric Enge: The goal is different.
Cedric Dupont: Yes.
Eric Enge: Have you thought about broadening SearchWiki so that users can do more than just remove a site from a query, and instead say, I don’t want to ever see this site?
Cedric Dupont: Yes, that’s another good suggestion. It is a tricky problem, because you may find one site completely relevant for one search and completely irrelevant for another search. It’s difficult to just delete everything across the board. In addition, the site itself could improve over time.
Eric Enge: Right. But, I think the biggest thing that you hit on already is that the user may not know what they are doing. If they don’t think about all the scenarios where they might be searching, they won’t realize that a site may be a perfect result for them in another case.
Cedric Dupont: Right. So, the choice that we made is that deletions are query-specific. So, when you delete a site for a certain query, we assume you really know what you are doing at this point. You did a search and deleting a particular site made complete sense. When you promote a site, we do something a little bit more complicated. These sites may be promoted for similar searches, because there is no harm in promoting them again for queries that are very much related to your first query.
So, the short answer is we do this sort of thing for promotions but we don’t do it for deletions at this point. The risk of what could happen when it doesn’t work is much larger from deletions than it is from promotion.
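Editor's sketch: the asymmetry described here, with deletions applied only to the exact query and promotions carried over to related queries, might look like the following. The interview does not say how related queries are identified, so the `related` function below is a placeholder assumption (word overlap with an arbitrary threshold), and all names are hypothetical.

```python
from typing import Dict, List, Set


def related(query_a: str, query_b: str, threshold: float = 0.5) -> bool:
    """Placeholder similarity: Jaccard overlap of the query words."""
    a, b = set(query_a.lower().split()), set(query_b.lower().split())
    if not a or not b:
        return False
    return len(a & b) / len(a | b) >= threshold


def personalize(
    query: str,
    default_results: List[str],
    promotions: Dict[str, List[str]],
    deletions: Dict[str, Set[str]],
) -> List[str]:
    # Deletions apply only to the exact query they were made on.
    removed = deletions.get(query, set())

    # Promotions also carry over to queries judged related.
    promoted: List[str] = []
    for past_query, urls in promotions.items():
        if past_query == query or related(past_query, query):
            for url in urls:
                if url not in promoted and url not in removed:
                    promoted.append(url)

    rest = [u for u in default_results
            if u not in removed and u not in promoted]
    return promoted + rest
```

The design choice follows the reasoning in the answer: a promotion that carries over by mistake only adds a result, while a deletion that carries over by mistake would hide one.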
Eric Enge: Thanks Cedric!
Cedric Dupont: Thank You Eric!