This is an overview of the day that Google called Searchology (May 16, 2007). In this post I will cover some of the aspects of the event other than the announcements themselves. This will include a series of pictures from my trip to the Googleplex, with some comments about each one, and what Google seemed to be trying to accomplish with the event.
1. The first picture shows the ‘Plex as viewed from right outside Building 43, which is where Matt Cutts, Adam Lasnik, and crew reside. The funny thing is that this picture looks a bit like a war zone. In fact, the architecture looks quite neat in person. Note that the tables lying on the ground in the foreground are there because they are about to set up for an event of some sort, perhaps a concert, which is a common event at the ‘Plex.
2. One of the new projects on the Google campus is an herb garden:
3. Elliot Schrage, Google’s Vice President of Global Communications and Public Affairs, chaired the event. He introduced the plan for the day, and each of the speakers:
4. Craig Silverstein, Google’s Technology Director, was next up. He spoke about the “ghost of search engines past”. Craig was employee number 1 at Google, and was involved when it was still in the dorm rooms. Sergey Brin’s dorm room was used for business, and Larry Page’s dorm room was used as the machine room. This all took place during the height of the Internet bubble.
The rack you see on the left is one of the very first racks of machines used by Google. When the search engine was first launched in 1998, it had only 25,000 web pages in its index. Sometimes the result set was only 4 web pages, so the scoring system was not that critical. However, this was still complicated enough that the notion of a human-edited directory no longer worked.
By 2000, they had introduced “Giga Google”. With this, they scaled to millions of pages. Suddenly they had new problems to look at. Replication of the data to the East Coast was a big issue they had to face and solve. And now the scoring system became critical, because most searches offered up more than 4 results.
Here is Craig, giving his talk:
5. Next, we had Ben Gomes, who was introduced as Google’s search quality czar, and Kerry Rhoden, a senior person on the usability team. They showed neat examples of how they use eye-tracking studies to model human behavior and improve usability. Another interesting tidbit from this is that they store a whole extra copy of the web for their testing purposes.
So when they test new algorithms that they are not ready to push live yet, they have the complete set of data to use for testing purposes. Then they provided several examples of usability changes that Google has made over the years:
- “Did you mean” lines intended to offer spelling corrections to users who misspell their query.
- Onebox results that attempt to show you the answer without you having to click through to anyone’s web site.
- Query Refinements that provide users with a list of links designed to help them tailor their search quickly. As an example, look at the search results for the search phrase “Cancer”, and notice the section titled “Refine results for cancer:”.
- Site links. For example, when you search on “Circuit City”, you will get the main site, but you will also get direct links to the most visited pages on that site.
6. Then we had Udi Manber, Google’s VP of Engineering. He continued the theme of the day to this point: that what Google is doing is really, really hard. The organizers of this event wanted the press to get that message loud and clear. One of the most interesting statements made by Udi is that 20 to 25% of the queries that Google sees in any given day are queries that they have never seen before.
Wow. 20 to 25%. That took a while to sink in. Talk about reinforcing the value of a long tail strategy. After a few introductory comments, Udi moved on to talking about some things that Google is doing to improve. He started with a list of queries for which they successfully map one search phrase to another that is a better fit for what the user wants. The examples below go from easiest to hardest, with links in those cases that are already live:
- GM = General Motors
- Ramstein AB = Ramstein Air Base
- ab ca successfully translates AB into Alberta
- typing – words per minute text brings up a first result which is a tool that will give you a typing speed test
- “unchanged lyrics van halen” will be mapped to “lyrics to unchained van halen”
- “overhead view of bellagio pool” will be mapped to Bellagio pool pictures
- “F-15 launch launched from a sub” will be mapped to “F-15 submarine launch”
- “distance from Zurich, Switzerland to lake Como, Italy” will be mapped to “train Milan Italy Zurich Switzerland”. Why? Because it happens to provide the distance the user originally requested.
Google is also looking at Cross Language Information Retrieval. This was not officially announced, but what they plan to do is to accept a user’s query in their own language, translate it into every other language they have in their database (12 languages to start), get the best results, translate the web pages with the best results, and present the results back to the user. One key part of how they do this is that they will end up keeping on hand 12 copies of the web, pre-translated into all 12 languages they will support initially.
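To make the flow described above a bit more concrete, here is a minimal Python sketch of the translate, search, translate-back loop. It is only an illustration under stated assumptions, not Google's implementation: the function names, the toy dictionary, the toy index, and the scores are all hypothetical, and for simplicity it translates the winning pages at query time rather than keeping pre-translated copies of the web as described above.

```python
# Hypothetical toy data: a tiny phrase dictionary and one tiny index per language.
TOY_DICTIONARY = {
    ("en", "fr", "cheese"): "fromage",
    ("fr", "en", "guide du fromage"): "guide to cheese",
}

TOY_INDEX = {
    "en": {"cheese": [(0.6, "cheese basics")]},
    "fr": {"fromage": [(0.9, "guide du fromage")]},
}


def toy_translate(text, src, dst):
    """Stand-in for a translation system; returns the text unchanged if unknown."""
    return TOY_DICTIONARY.get((src, dst, text), text)


def toy_search(query, lang):
    """Stand-in for a per-language search; returns (score, page) pairs."""
    return TOY_INDEX.get(lang, {}).get(query, [])


def cross_language_search(query, user_lang, languages, translate, search, top_k=3):
    """Translate the query into each language, search, and translate the best pages back."""
    candidates = []
    for lang in languages:
        # Step 1: express the query in the language of the index being searched.
        q = query if lang == user_lang else translate(query, user_lang, lang)
        # Step 2: collect scored results from that language's index.
        for score, page in search(q, lang):
            candidates.append((score, lang, page))

    # Step 3: keep the best results across all languages.
    candidates.sort(key=lambda c: c[0], reverse=True)

    # Step 4: translate the winning pages back into the user's language.
    results = []
    for _score, lang, page in candidates[:top_k]:
        results.append(page if lang == user_lang else translate(page, lang, user_lang))
    return results


print(cross_language_search("cheese", "en", ["en", "fr"], toy_translate, toy_search))
```

In this toy run the French page outranks the English one and comes back translated, which is the whole point of the approach: the best answer may not exist in the language the user typed in.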
Last up was Marissa Mayer, Google’s Vice President, Search Products & User Experience. Marissa made all of the official announcements. This has been covered in many places, so here are a few links you can follow to find the details of the announcements:
- Overview Searchology coverage by me
- My detailed analysis of Google’s announcement at Search Engine Watch
- Overview coverage by Search Engine Watch
- Searchology coverage by Search Engine Land
Marissa told us that Universal Search was something that she originally suggested back in 2001, but they couldn’t do it then, because the infrastructure challenges were too great. They really needed to solve the problem of how to have a common relevance calculation across all of their search properties without dumping hundreds of millions of queries per day onto each of them. She also provided some examples of queries that demonstrate Universal Search in action:
- restaurants in mountain view, ca shows a map with the locations of several restaurants
- nosferatu presents a result at Google Video where you can play the result. In fact, you can play the entire movie right there inline in the search results page.
- “Mexican poetry” will show results directly from Google Book Search, and yes you will be able to read the entire book right there.
- i have a dream will allow you to view a video of Martin Luther King’s very famous speech inline in the search results
- big wheels races presents a video of such a race down Lombard Street in San Francisco, complete with crashes
- Clay Bavor brings up a time-lapse video of Clay Bavor building a portrait of Abraham Lincoln, using only pennies. Gray scaling is done through the use of the tarnishing and dirtiness level of the pennies.
- things you can’t do when you are not in a pool brings up a hilarious video. Note that you get results here from video properties that are not owned by Google too, and you can still play them inline.
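The “common relevance calculation” Marissa mentioned can be pictured with a small Python sketch. This is a guess at the general idea only, not how Google does it: it assumes each search property (web, video, books) already returns scores on one shared scale, and the property names, titles, and scores below are all made up for illustration. It also sidesteps the infrastructure problem she described, since it simply asks every property for results.

```python
import heapq

# Hypothetical results per search property, as (score, title) pairs on a shared scale.
TOY_RESULTS = {
    "web": [(0.8, "Nosferatu overview"), (0.5, "Silent film history")],
    "video": [(0.9, "Nosferatu full film")],
    "books": [(0.4, "Book on early cinema")],
}


def blend(results_by_property, top_k=3):
    """Merge results from every property into one list ranked by the shared score."""
    merged = [
        (score, prop, title)
        for prop, results in results_by_property.items()
        for score, title in results
    ]
    # nlargest returns the top_k entries ordered from highest score to lowest.
    return heapq.nlargest(top_k, merged, key=lambda r: r[0])


for score, prop, title in blend(TOY_RESULTS):
    print(f"{score:.1f} [{prop}] {title}")
```

With comparable scores, a video can outrank a web page on merit and appear inline, which matches the examples above.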
Here is Marissa giving her presentation:
Lastly, there was a question and answer session. From left to right in this picture, we have:
- Sergey Brin
- Udi Manber
- Marissa Mayer
- Alan Eustace, Senior Vice President, Engineering & Research
- Elliot Schrage