Misuse of Big Data Can Cost You at the Cash Register

The good folks at BloomReach shared some data with me recently. The data showed how gift-oriented search queries differ from other types of queries. In reviewing it, I realized that it provides an excellent example of how drawing premature conclusions from data can lead to costly mistakes.

As background on the source of the data, BloomReach provides a product called BloomSearch that is in use by a large number of e-commerce web sites. The product enables those sites to modify their product pages at scale so that they can capture a lot more long tail search traffic, resulting in significant incremental revenue.

As a result of this, BloomReach has access to lots of information on how these sites perform. Let’s take a look at a sample of the data!

The sample shows data for 9 anonymous e-tailers, with bounce rates for 2 different types of queries – “gift” and “non-gift”. BloomReach found that gift queries contain certain obvious terms like “gift” or “present”, or sometimes less obvious phrases like “mother’s day flowers” or “Valentine’s Day chocolates”.

Some gift queries occur at the same time every year, while others are unpredictable and ongoing (e.g. birthdays and anniversaries). For example, “housewarming wine basket” is a gift query that was included. “Non-gift” queries represent all other queries.

Graph Search & Social Search With Bing’s Stefan Weitz

Key Points

  1. Initially when Bing launched social search, they wanted to carve out a distinct space for the social results. Later on it became clear that these worlds were blending together and it made less and less sense to keep them in a separate space.
  2. Bing is now indexing 30 times more data from Facebook than they had previously. On average, people will see about 5 times more results than before.
  3. While Bing is doing a much better job of harnessing relevant information from users’ friends, they are also focusing on relevant “expert” information from influential bloggers, subject matter experts…
  4. Even though search and social results are blending, they are still kept separate because really, how can anyone decide which of those to rank more highly?
  5. The notion of a Like is still a little bit perplexing from a ranking perspective. What does a Like mean for a page? Does the user like the design, the content, or maybe just the picture? Bing tends not to just use a pure Like signal to do ranking.
  6. Shares are basically the same as Likes – not used a ton for web ranking except in velocity (like the way Twitter is used for discovering news).
  7. Which types of queries are best suited to social search is still uncharted territory. It may be that in social search every query should have a person as an answer, even a very definitive, objective query like “what’s the height of Mount Everest.”
  8. Bing’s social search combines four different services, applies a layer of machine intelligence on top, and then applies a layer of semantic knowledge on top of that to deliver a single result, something no one else is doing right now.
  9. When someone changes privacy settings or deletes a post on Facebook, Bing gets that update in real time. The result is then purged from Bing’s results in minutes, not hours or days.
  10. The social pieces in the Facebook experience were all developed by Facebook. Bing uses their own algorithm on the social search data for their social search results. It is completely independent of what Facebook does with Graph Search, even though it operates on the same data set.
  11. When you search on Bing, it gives you the web results plus all the different updates that come from the Facebook social graph. On Facebook it really pivots more around the person and their interests.

Full Interview Transcript

Eric Enge: Let’s talk about Bing and Social Search!

Stefan: Initially when we launched social search, we really wanted to carve out a distinct space for the social results. That was done partially from a user experience standpoint to identify the fact that we think social results are often very different than web results. The web results are what the web knows about your query; the social results are what people know about your query.

As we really got into it, it became clear that a lot of times these worlds were blending together and it made less and less sense to actually keep them separated off in that carved-out space. They are still separate in the new experience, but it’s much more in line with the overall experience than it was before.

Let me show you what that looks like. If I try something simple, like Hawaii, what we get are the web results on the left-hand side. In the middle, you get our snapshot, which pulls in data and services from across the web. You can see people who were born in Hawaii, who their governor is, celebrities who are from there, all sorts of different things.
