This post will provide some more info on the September 26, 2007 Searchification event that Microsoft put on. In particular, this post will focus on the core search engine improvements part of the presentation. I will provide some brief comments on each announcement, as well as pictures of the speakers.
1. Introduction by Brad Goldberg. Brad Goldberg, general manager of the Windows Client Product Management Group at Microsoft, who manages the search team from a business perspective, started things off for the event:
Brad spoke about data that indicates some of the basic problems with search. For example, 40% of search queries fail to provide an answer, and 50% of these queries require refinement before an answer is found. People find that getting what they want requires a high level of cost and commitment.
One of the more interesting things he spoke about was the search market share data from comScore:
|Engine||Users||User share||Query share|
Based on this data, he stated that Microsoft’s focus is on getting more repeat queries from their user share, or doing a better job of delighting their current customers.
2. Overview by Satya Nadella. Satya Nadella, Group VP, Search and Advertising Platform Group, was up next and dug into a bit more detail about Microsoft’s areas of focus for this update:
In a pre-show discussion I had with him, Satya commented that Microsoft’s infrastructure is finally catching up, and that this is enabling them to do much more with their search product.
The last major update that Microsoft did was in September of 2006. Microsoft has been doing rolling updates through the year, including a number of performance and relevance changes. In this release, several of Microsoft’s search products were affected:
- A major update was done to the web search index
- Mobile Search
- Shopping Search
- Health Search
- Image Search
- A Microsoft video search product was announced
Here is a screen shot of one of Satya’s slides:
Next up was a summary of customer feedback:
The data in the above pie chart was based on an analysis of user click behavior. The sidebar point about relevance was based on an analysis of over 10,000 feedback submissions.
Given the preponderance of concerns about relevance, Microsoft did some further research to get a better understanding of the nature of the relevance concerns. This broke out as follows:
Based on what Microsoft learned from these analyses, they invested in 6 major areas:
- Coverage – They increased their index size from about 5B pages to about 20B pages
- Query intent – Making a better determination of what the user is really looking for
- Query refinement – Determining how to refine a query to provide the user with better results
- RankNet improvements – A variety of tweaks to Microsoft’s Neural Net algorithm to improve results
- Structured information extraction – Doing a better job of using structured databases to improve relevance
- Rich answers – Incorporation of blended data from verticals, such as image and video search
For those of you who want a brief definition of what RankNet is, you can see it here:
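As a rough companion to that definition, the pairwise idea in the published RankNet formulation (Burges et al., 2005) can be sketched in a few lines of Python. This is an illustration only and says nothing about how Live Search implements it in production; the function names and the example scores are my own assumptions. The model assigns each page a score, turns the score difference between two pages into a probability that the first should rank above the second, and is penalized with a cross-entropy loss when that probability disagrees with the labeled preference.

```python
import math


def pairwise_probability(score_i: float, score_j: float) -> float:
    """Modeled probability that page i should rank above page j (a sigmoid of the score gap)."""
    diff = score_i - score_j
    # Branch on the sign so math.exp never receives a large positive argument.
    if diff >= 0:
        return 1.0 / (1.0 + math.exp(-diff))
    exp_diff = math.exp(diff)
    return exp_diff / (1.0 + exp_diff)


def ranknet_loss(score_i: float, score_j: float, target: float) -> float:
    """Cross-entropy between the labeled preference and the modeled probability.

    target is 1.0 if page i should rank above page j, 0.0 if below, 0.5 for a tie.
    Uses the algebraically equivalent form (1 - target) * diff + log(1 + exp(-diff)).
    """
    diff = score_i - score_j
    softplus = max(-diff, 0.0) + math.log1p(math.exp(-abs(diff)))
    return (1.0 - target) * diff + softplus


# Hypothetical scores: the loss is small when the preferred page already scores higher.
loss_when_right = ranknet_loss(3.0, 1.0, target=1.0)
loss_when_wrong = ranknet_loss(1.0, 3.0, target=1.0)
```

Training a ranker this way means adjusting the scoring function to push this loss down across many labeled pairs of pages for the same query.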
Demos of Improvements. At this point in the presentation, Ramez Naam joined Satya to run some live demos:
Ramez demoed a variety of search queries and their results. Here are some of the queries that were demonstrated:
- EPRML – Microsoft did not show acronym based answers before, and now does. They also only showed about 1,700 results for this query previously, and now show more than 10,000.
- c.n.n. – Live search used to look for “c n n” after seeing this query, but now it knows to look for CNN.
- Groig Freiderich Nicolai – Live Search now auto-corrects this query to “Groig Friederich Nicolai”.
- China – This search shows off some of the rich media integration, as well as the “Related Searches” functionality at the top right of the results screen.
- Volkswagen Kaefer – Shows a German page, which probably is the best result for most users. The page can be translated on the fly, and in fact, Live Search offers a mode in which you can see the original German version and the English version side by side. You can also modify how this works using on screen controls.
- San Jose weather – now shows the weather right on the screen. This feature will be live to the public soon.
- San Jose traffic – You can now get near-real-time traffic info right on the screen.
- MSFT – Provides an intraday stock chart, along with pricing and volume information on the search results page.
- Barack Obama – News results are integrated in for those queries where that would be relevant.
- space shuttle videos – Video results are incorporated directly in the results, and you can play the videos inline on that page.
- the office – Search engines normally strip “the” from this query, but many users are actually looking for the TV show. The show now comes up in Live Search results.
- IL soccer – The new Live Search understands that “IL” in this query means Illinois.
This is a sampling of some of the more interesting queries demonstrated during this part of the presentation.
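To make the “c.n.n.” case concrete, here is a minimal sketch of how a query preprocessor might collapse dotted acronyms before lookup. Microsoft did not describe how Live Search actually handles this, so the regular expression, the function name `normalize_query`, and the choice to uppercase the result are all assumptions on my part.

```python
import re

# Hypothetical pattern: two or more single letters each followed by a dot,
# optionally ending in one bare letter, e.g. "c.n.n." or "u.s.a".
# The trailing lookahead keeps ordinary dotted names like "a.b.com" untouched.
_DOTTED_ACRONYM = re.compile(r"\b(?:[A-Za-z]\.){2,}[A-Za-z]?(?![A-Za-z])")


def normalize_query(query: str) -> str:
    """Collapse dotted acronyms in a query into a single uppercase token."""
    return _DOTTED_ACRONYM.sub(
        lambda match: match.group(0).replace(".", "").upper(),
        query,
    )


print(normalize_query("c.n.n. news"))
```

A real engine would presumably combine a rule like this with query logs and click data rather than rely on a single pattern.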
Ultimately, the objective of this effort was to increase the relevance of their search results. Microsoft then did some testing with live human subjects to assess relevance. Each participant was trained on how to assess relevance. They had this group do a large number of searches and presented them with results in a format where they did not know which search engine the results were from.
Net-net, the results of this testing showed a dramatic improvement in Microsoft’s relevance scores. Here is a chart showing the details:
In summary, it is clear that Microsoft made a lot of improvements to their core search results. The real tale of the tape will emerge from tens of millions of searches done by real users. That said, a number of different issues were presented, and Microsoft has addressed them, so this represents good progress.
The other concern I have is whether the underlying strategy of trying to capture more market share from their existing users will work. After all, is the reason they capture the initial search, but not the follow-on searches, simply that users already know which search engine they trust?
This could provide some resistance to getting comfortable with Live Search results. However, I can tell you that I am more intrigued by Microsoft’s improvements and Live Search than I have ever been. I know I will be doing some testing of it on an ongoing basis.