Building a Quality Custom Search Engine

by Eric Enge

One of the key factors that will determine the long-term success of Google’s Custom Search Engine announcement is whether users will find value in Custom Search Engines (CSEs). Put differently, will CSEs differ enough from Google’s core search results to be worth the trouble? Users are not going to use these things unless they improve their lives in some way that they care about.

We do know that users care about searching. When a user begins a search, they want to find something. Since their real work begins when they are done searching, they want the search to be done fast, perhaps even instantly. We have all experienced it: we do an initial search, find that the results are not what we want, refine our search, and try again. Fundamentally, this whole process is a waste of our time until we get the result we want.

So here is the promise of custom search engines – if we find one we know and trust, we can find the answer we are looking for faster.

So now that we know what users are looking for, can we deliver it? Yes we can! The trick is to find those things that custom search engines can do better than core search. What I learned from discussions with Google Custom Search Engine creator Shashi Seth is that Google does not know the user’s context. In other words, if a user types in “Ford”, are they looking for information on the company, a place to buy one, reviews of Fords, information on how they are manufactured, or something else?

In simple terms, we don’t know if the query is being made by a consumer, a media person, or an automobile engineer. If we knew these things, we might be able to provide better results. However, it gets slightly more complicated: web sites don’t typically do a good job of indicating what type of audience they are tailored for, either. So what to do?

Enter the CSE. Google’s Custom Search Engines enable human editors (Subject Matter Experts, or SMEs) to filter Google’s core search results. What this means is that the SME can decide which sites to include in or exclude from the results, and can attach weighting parameters to either raise or lower the rankings of particular sites.

Using this mechanism, the SME can create a CSE that is specifically targeted to a given audience. For example, they can create a CSE for automobile engineers that includes only those sites that provide engineering and design related information (design magazines, related patent filings, any sites where relevant forums have been set up, etc.).

This will provide the auto engineer a directly relevant result, without the clutter of sites that are intended for other audiences. Ultimately, this will save them time and energy in doing their work.
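To make the mechanism concrete, here is a minimal sketch in Python of what the editor's choices amount to: an include list, an exclude list, and per-site weights applied to a set of core results. This is not Google's implementation or API. The names `Result` and `apply_cse`, the `score` field, and the `.example` domains are all invented for illustration, and I am assuming that a weight simply multiplies a result's score.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass
class Result:
    url: str
    score: float  # hypothetical relevance score from the core engine


def site_matches(hostname, site):
    """True if hostname is the site itself or one of its subdomains."""
    return hostname == site or hostname.endswith("." + site)


def apply_cse(results, include=None, exclude=(), weights=None):
    """Filter and re-rank core results using an editor's site lists.

    include: if given, only results from these sites are kept.
    exclude: results from these sites are always dropped.
    weights: site -> multiplier (assumed); above 1 promotes, below 1 demotes.
    """
    weights = weights or {}
    kept = []
    for r in results:
        hostname = urlparse(r.url).hostname or ""
        if include is not None and not any(site_matches(hostname, s) for s in include):
            continue
        if any(site_matches(hostname, s) for s in exclude):
            continue
        factor = 1.0
        for site, w in weights.items():
            if site_matches(hostname, site):
                factor = w
                break
        kept.append(Result(r.url, r.score * factor))
    # Highest adjusted score first.
    return sorted(kept, key=lambda r: r.score, reverse=True)


# Made-up core results for the query "Ford".
results = [
    Result("https://dealer.example/ford-deals", 0.9),
    Result("https://reviews.example/ford-focus", 0.8),
    Result("https://autodesign.example/ford-chassis", 0.6),
    Result("https://patents.example/ford-transmission", 0.5),
]

# A CSE for automobile engineers: design and patent sites only,
# with patent filings promoted.
ranked = apply_cse(
    results,
    include={"autodesign.example", "patents.example"},
    weights={"patents.example": 1.5},
)
for r in ranked:
    print(r.url, r.score)
```

In this sketch the dealer and review pages drop out entirely, and the patent filing moves ahead of the design article because of its weight. That is the effect described above: the same core results, narrowed and reordered for one audience.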

So now that we have the idea, let’s look at a couple of examples. First, let’s compare a CSE designed to provide medical information to doctors with core Google search (the Google results are on the right):

[Images: Doctors CSE Diabetes Results (left), Google Diabetes Results (right)]

As you can see, more relevant results have been brought up to the first page of the results, increasing the chances of the doctor getting their answer right away. Let’s also compare the results of a CSE designed for patients with core Google search results (the Google results are on the right):

[Images: Patient CSE Diabetes Results (left), Google Diabetes Results (right)]

You can see that both CSEs have been optimized for their respective audiences. So CSEs do in fact address a real problem in algorithmic search. Our experience in reviewing CSEs is that about 10% of people fully understand this issue at this point. But more and more people will come to understand it over time, and this will create increasing usage.

