Bill Tancer discusses how search data can tell us about the offline world

Bill Tancer

Bill Tancer is the General Manager, Global Research at Hitwise. Bill brings 12+ years of marketing, market research and corporate strategy experience to Hitwise. As the GM for Global Research, he provides cutting-edge research and insight into online consumer behavior and the application of online competitive intelligence.

Bill’s analysis of the online landscape has been quoted extensively in the press, including the Wall Street Journal, New York Times, USA Today and Business Week. Bill is a frequent guest on CNBC, and has been interviewed on MSNBC, NPR, CNN Radio and CBS Radio. In addition to speaking at keynote events, Bill is the author of a weekly online column for TIME, “The Science of Search.” He was also named one of Television Week’s “12 to Watch” for 2008 and is currently on the advisory board for the Pew Internet and American Life Project.

Prior to joining Hitwise, Bill led market research and strategy teams at LookSmart, Zaplet, NBC Internet and Pacific Bell. Bill also covered the Internet sector for Gartner Group as a senior technology marketplace consultant.

Bill has a Bachelor of Science degree in Quantitative Management from the University of Florida and a Juris Doctor degree from the Walter F. George School of Law, Mercer University.

Interview Transcript

Eric Enge: Tell me what it’s like to be a data geek?

Bill Tancer: Well, it’s interesting. Since I have started speaking about Hitwise data, and I have been at the company almost 5 years now, I find that I am not alone in being a data geek. The more people I talk to about data, the more people are coming out of the data closet and saying hey, you know what, I am a data geek as well. I think I have been into data since I was very young. And, the thing that I enjoy most about what I am doing right now is that I am finding out there are a lot more people out there like me.

Eric Enge: It gets into just how you want to go about solving your problems and what kind of information you might base your decision on.

Bill Tancer: Yes. I think it starts with a curiosity about how things work and what makes us tick. Hidden in the data are a number of different sources that can tell us things like how we react to what happens in the offline world and what we can tell about our reaction based on what we do in the online world.

It’s that curiosity that starts everything, figuring out what makes us tick. And then, from there I think the next step is finding the business application for what we learn from that initial curiosity.

Eric Enge: Right. And, that’s one of the things that’s wonderful about the book (Click) and the things that you have been speaking about; there are some things that are not naturally easy to quantify.

Bill Tancer: Yes, absolutely. A lot of things aren’t easy to quantify, and there is very much an art and a science to a lot of the analysis that we do. There are a number of examples that we go through where you start with a very simple case and you think that perhaps searches on personalities in a reality television show, like Dancing with the Stars, should correlate very well with the vote on the show for those same personalities. But it actually takes getting into those search terms and finding out what the variations are in the way people are searching, and where they are going from that search, to find the intent behind it. And then you have to factor that intent into your analysis.

Eric Enge: Right. And, you gave the example in your SMX presentation about how you were sitting in a hotel room one night and you were watching Dancing with the Stars while working on a presentation for the next day, and you did a quick check to see who was getting the most search volume. And, you predicted that Stacy Keibler would win, but it didn’t turn out that way, did it?

Bill Tancer: No, it didn’t. And, that was the first example where it became very clear to me that it was necessary to take that second step in the analysis to actually get behind the search term and figure out what the intent is of these searches. Thus, in this example of Dancing with the Stars, if there is a search on Stacy Keibler does it really equate to somebody who has the intent to vote for her, or is there some other motive?

With that specific example it turned out that a lot of people were searching for pictures of Stacy Keibler. So, when you looked at searches for her versus Drew Lachey, where people were actually interested in his performance on Dancing with the Stars, you have to make some adjustments in your predictions.

Eric Enge: Right. If I remember correctly, you also did something where you saw what the demographic or age range was and you realized that 18 to 24 year old males probably weren’t going in to vote on Dancing with the Stars.

Bill Tancer: That’s right. That’s one of the advantages of having the Hitwise data in our system behind this analysis. I can take Stacy Keibler’s name, and beyond just charting the volume of searches for Stacy Keibler, I can look at things like what are the combinations and ways that people are searching for her? What are the terms that they are using? And then I could even take the most popular searches for Stacy Keibler and find out where people go when they search on her.

That’s where you can start to infer the intent. And, what I did in this example is I looked at some of the top sites like the Official Women Of Wrestling site, which had males 18 to 24, and 25 to 34 as its primary demographic. From there you can start to put together your analysis and figure that this demographic is probably not very likely to vote on a reality television show. And again, looking at the search terms that ended up with people visiting that site, they were looking for pictures of Stacy Keibler.

That’s what led to something we call the Stacy Keibler Correction Coefficient, where we adjust our prediction based on what we see in the search term data. So, that’s a good example of where there is art to this analysis; you really do have to make some logical inferences when you look at the data. At the same time I try and keep things very simple.
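To make the idea of an intent correction concrete, here is a minimal sketch in Python. It is not Hitwise's actual method: the function name, the search volumes, and the intent fractions are all hypothetical, chosen only to show how weighting raw search volume by the estimated share of show-related searches can reverse a prediction.

```python
def intent_adjusted_share(search_volume, vote_intent_fraction):
    """Weight raw search volume by the estimated fraction of searches
    that reflect interest in the show, then normalize to shares.

    Both arguments are dicts keyed by contestant name. The fractions
    are assumptions an analyst would infer from search-term variations
    and downstream sites; they are not measured values.
    """
    adjusted = {
        name: search_volume[name] * vote_intent_fraction[name]
        for name in search_volume
    }
    total = sum(adjusted.values())
    if total == 0:
        raise ValueError("no intent-weighted search volume to normalize")
    return {name: value / total for name, value in adjusted.items()}


# Hypothetical numbers for illustration only.
search_volume = {"Stacy Keibler": 1000, "Drew Lachey": 400}
vote_intent_fraction = {"Stacy Keibler": 0.2, "Drew Lachey": 0.8}

shares = intent_adjusted_share(search_volume, vote_intent_fraction)
raw_leader = max(search_volume, key=search_volume.get)
adjusted_leader = max(shares, key=shares.get)

print("Leader by raw search volume:", raw_leader)
print("Leader after intent adjustment:", adjusted_leader)
```

With these made-up inputs, the contestant with the most raw searches is no longer the predicted winner once searches unrelated to the show are discounted, which is the pattern described in the Dancing with the Stars example.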

Eric Enge: It seems like to do this kind of analysis there is a certain amount of creativity needed, because you don’t necessarily know where your Stacy Keibler Correction Coefficient is going to come from.

Bill Tancer: Yes, there is creativity. I think the first place creativity shows up is just figuring out what to analyze because there are so many things to be analyzed. We are capturing data on over a million different websites broken into 172 different industry categories. It’s a dataset that updates every single day. One of the challenges that we face as analysts with this dataset is determining what to analyze.

It can come from a variety of different sources, from seeing something on TV like I did to reading something in the news or even hearing a conversation in passing. It will cause a flag to go up and I’ll think I should look to the data to see if I can corroborate or dispute what I just heard.

Eric Enge: Alright. So it’s one of those things where you need a data geek; you need someone who has a certain amount of creativity and passion. And then, you need to make sure that the person doing that analysis is really focusing on things that are actionable.

Bill Tancer: That’s right. And, that’s another step. Once we get beyond the initial curiosity to answer a question, and then get to combining the art and science of making predictions based on this data, the next step is to say, okay, what’s the actual insight that we can get from this? I am not going to make a business out of predicting a reality television show result, but if we take what we’ve learned in that whole analysis, we can apply it to business.

We can apply the fact that people will often search on things based on what’s happening in the offline setting, like television. If search correlates with that offline activity, there must be some very valuable insight on the effect of a product placement, or a television show on a brand, and using search terms as an indication of brand equity.

Eric Enge: Right. So, you can measure the impact of your campaign. It gives you another gauge on how you did. How much search traffic did you generate?

Bill Tancer: Right, exactly. And, not just how much search traffic you generate, but what can that tell us about our brand? One of the hardest things to get at is brand associations: what’s our brand equity? Not only is it difficult, but it’s also time-consuming and costly. What I’ve found is that search term data can give you very valuable, quick insight into what’s happening with your brand.

Eric Enge: In your SMX presentation you talked about understanding what innovators and early adopters were doing.

Bill Tancer: That’s right. One of my favorite parts of the book was this idea that we had, where we looked at websites that have gained popularity very quickly, sites like YouTube. If we could look at the segments of visitors to those sites and even roll it back to a time before those particular sites were really popular, we should be able to identify who are the early adopters of new technology.

In the book I talk about Everett Rogers, who is a professor and sociologist who came up with his diffusion of innovation curve. He studied how long it takes products or technology to diffuse throughout a population. From that he developed the diffusion of innovation curve that identifies the adopters in the product lifecycle from innovators to the early adopters, to the early mainstream, late mainstream, and so on.

What I found very fascinating with this data is that we could actually visualize this curve that Everett Rogers talked about back in the 1960s and 1970s. If I took our data from YouTube, rolled it back to before the site became very popular, and tracked things by Mosaic segment, I could figure out which segments are the early adopters of this technology. If I can do that, and we did that across a number of different sites, I could then look at what these same users are doing today, and that could actually give us some insight into what might be popular tomorrow.

Eric Enge: You had a really good example that you gave about the Google Chrome situation, which actually revealed a new kind of Stacy Keibler Correction Coefficient.

Bill Tancer: Yes. That was an interesting example. I spoke at a Google Authors event, and I was talking about this analysis and what we had found, and one of the engineers came up to me afterwards. He said, “We’d be really interested to see what happened to Google Chrome, because we think that the early adopters for this technology are probably Mac users and also Linux users.”

An interesting correction coefficient happened in the offline world when Google rolled out Google Chrome. They released it only for Windows; it wasn’t available for Linux or for the Mac. We took a look at our data to figure out which segments were adopting it. We looked to see if they were the early segments that we identified in our previous analysis, and they weren’t, just as the engineer had expected. The ‘Young Digerati’, the ‘Bohemian Mix’, and the ‘Money and Brains’, which are our names for the segments that we’ve identified in the past as early adopters, are primarily Mac users, and some of them are Linux users.

We found different segments were adopting Google Chrome. I don’t know what that’s going to mean for Chrome going forward. I imagine that Google will come out with versions for other operating systems. But, it’s been really interesting to track; to see what effect that entry point has had on the adoption of that technology versus what happens when they come out with support for other operating systems.

Eric Enge: Right. So, you could imagine a different situation where perhaps they anticipated the best audiences for early adoption of their product and, based on that, launched with better support for Mac and Linux environments.

Bill Tancer: Yes. I think it’s a little too early to tell for Google Chrome, but it’s going to be very interesting to follow how this goes starting with a different entry point: entering more in the middle of the curve versus the beginning of the curve, and how that affects the long-term adoption of this new browser.

Eric Enge: Right. Google is in a unique position there. They can withstand different types of adoption curves, probably more than many businesses can.

Bill Tancer: Probably.

Eric Enge: So, just another thing to maybe think about is how businesses can apply this kind of thinking, with the whole discussion we’ve had about the top sites for affluent people. Another example you have used is that the top sites for this segment were all the finance sites, such as Merrill Lynch, Schwab, Fidelity, and so forth. But during the initial part of the downturn they suddenly were looking more at entertainment and diversionary sites.

Bill Tancer: Yes. We looked at the most affluent segments that we track. We went to August of 2007 and we looked at where people in one affluent segment, called the Upper Crust, were going. What we found was that the top twenty sites they were visiting were sites like their Schwab account, Merrill Lynch account, Fidelity, Yahoo Finance, Google Finance.

We pulled that same set of top-twenty sites for that segment in August 2008, and those brokerage accounts were at the bottom of the top twenty, and at the top were a lot of the diversionary sites, game sites, and celebrity blogs like TMZ. What the data was showing us was that as things were trending down, it looked as though this affluent segment was deciding not to pay attention to their portfolio, because they would rather engage in other activity.

It did change as the brokerage firms and the investment banks began getting in trouble. Suddenly, this group was going back and actually trying to get into their accounts. We saw it in the search term data, where there were a number of instances of terms such as “how to log into my account.” These people had been away from their accounts so long that they had forgotten how to actually get in and check their portfolios.

The insight there for businesses is the need to understand what’s happening with your consumer. Don’t always assume that you know how your consumer is reacting to market conditions, as in this example. You might think you know, but behavioral data, actual observed behavior, can tell us what people are actually doing, and that’s just one example.

Eric Enge: Right. So, one hypothetical scenario here is that you might want to change where your ad campaigns are running depending on where the audience you are trying to reach is going.

Bill Tancer: That’s right.

Eric Enge: There are probably a million ways that you can spin different insights into these different things. But, the big lesson here is that there is a lot of valuable business information that can be obtained from data as long as you avoid getting fooled by a Stacy Keibler Correction Coefficient.

Bill Tancer: That’s right. You have to be careful and actually be looking out for the Stacy Keibler Correction Coefficient. You need to be looking out for the intent behind the behavior that you are observing, making sure you are factoring everything in, and that there isn’t some outside variable that’s affecting what you are seeing in the data.

Eric Enge: Right. You can trick yourself pretty easily actually. So, how is search data different from the other kinds of data collection techniques that people use?

Bill Tancer: Well, if we go all the way back to traditional market research, there are surveys and focus groups and a number of different observed behaviors. Then there is all we have available to us in data today, which is search term data, visits to websites, visits to categories of sites, etc. They both have their advantages and disadvantages. The advantage of traditional market research tools like surveys and focus groups is that you can ask specific questions and get answers.

There are a few challenges to traditional market research. One is getting representative users so that you really have something reflective of the population that you are trying to predict or extrapolate to. There is also the issue of having a sample size that’s big enough to make an accurate extrapolation to the population. And then, the other issue is that we don’t always say how we truly feel or how we would act.

Oftentimes observed behavior is different from what we say we would do. With online data we also have the advantage of being able to collect on a very large sample of users. The challenge with observed internet behavior is that you are not really asking the question. You are making an inference from behavior, and that’s where you can get tripped up. That’s where the difficulty lies.

Eric Enge: Right. I think it would be interesting if there was a way, for example, to draw a correlation between the various political polls that we are seeing out there all the time, and what people actually do when they vote in this election.

Bill Tancer: Yes. Politics is unfortunately one place where the predictions don’t work very well. The closer you get to an election, the more crossover traffic happens: more people are visiting websites with opposing viewpoints or searching on opposing candidates.

It’s probably due to swing voters and people also checking out the positions of an opposing candidate as a way of formulating their own arguments to support their position or their candidate. And, because of that crossover traffic, there are confounding variables that make it impossible to make a prediction.

Eric Enge: Can you talk a little bit about the genesis of Click? What gave you the desire to write the book?

Bill Tancer: I think one of the crucial points in my career at Hitwise was when we decided to start our blog, which is still active. We were posting some of these interesting things that we were seeing in the data. The response that we got was incredible, both in terms of traffic and in terms of influencers who were calling us, asking us for more data and really engaging with this data.

I think at that point I realized I was probably going to write a book on this topic. From that point I started writing a column, and I am currently writing a column for Time Magazine called “The Science of Search,” which is pretty much the same as Click. There is a lot of interest in this specific topic: what can be inferred about ourselves from what we do online.

Eric Enge: Indeed. So, it’s the natural progression of being a data geek.

Bill Tancer: Yeah, the product life cycle of a data geek.

Eric Enge: Thanks Bill!

Bill Tancer: My pleasure. Thank you Eric!
