In a test of more than 850,000 search queries, we found that Google returned some form of rich answer (also known as a knowledge box) more than 19% of the time. If you are not familiar with what a “rich answer” is, the following is an example:
In short, Google provides a direct answer to your question, rather than requiring you to click through to another web page to get that answer. In some cases, Google provides only part of the answer.
UPDATE: We ran this same query set again in July 2015. Find out how much Rich Answers had increased in this query set by then!
Although we selected queries that were more likely to trigger a rich answer, a rate above 19% still shows how far rich answers have come. In this article, I will share the details of our findings, and I will also discuss how well Google is doing in attributing results to their original sources.
The query set was assembled using a mix of Google Autocomplete (AKA Google Suggest), Bing Suggest, and a query set manually assembled here at Stone Temple Consulting. The counts of each query type were as follows:
The manually assembled set of queries covered a variety of categories, including:
If you prefer to get the data via a quick video, here is a video that has Caitlin O’Connell of Stone Temple interviewing me about the study:
And here is a shareable slide deck presentation of the major findings of this study:
Queries Extracted From “Suggested” Searches
This data focuses on queries extracted from Google Autocomplete and Bing Suggest. These queries were mined equally from Bing and Google – we took 250,000 from each for a total of 500,000 queries. We did it this way to avoid biasing the results towards one search engine or the other. Here is what we saw:
In case you are not familiar with what I mean by a suggested query, here is an example from Google Autocomplete:
Other Types of Search Queries
Suggested search queries are a valuable source, but we felt that relying on them alone would be limiting. To that end, we generated more than 355,000 additional queries to test, as detailed above.
Information Extracted From Third Party Domains
Google has clearly invested a lot in extracting information from third-party domains. Over 73% (more than 122K) of the rich answer boxes provided by Google include a link to a third-party URL cited as a source. Google uses a wide variety of sources for information. Authority appears to matter, as nearly 48% of the results that cite a third-party source come from sites with a Moz Domain Authority (DA) of 100, and Google draws on 31 domains with that DA.
Bing also relies heavily on third-party sources, with more than 87% (4,273) of its results including a link. Of the Bing results that cite a third-party source, 74% come from Wikipedia, which has a Moz Domain Authority of 100.
Is Google Providing Credit Where Credit is Due?
In total, we found 166,366 rich answers in our testing. Of these, 42,160 (about 25%) did not include an attribution link.
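The attribution split can be checked directly from the two counts reported above. The following is a minimal Python sketch of that arithmetic; the constant and function names are illustrative only and are not part of the tooling used in the study.

```python
# Counts taken from the study text above.
TOTAL_RICH_ANSWERS = 166_366
NO_ATTRIBUTION = 42_160


def percent(part: int, whole: int) -> float:
    """Return `part` as a percentage of `whole`."""
    return 100.0 * part / whole


# Rich answers that did include an attribution link.
with_attribution = TOTAL_RICH_ANSWERS - NO_ATTRIBUTION

pct_with = percent(with_attribution, TOTAL_RICH_ANSWERS)
pct_without = percent(NO_ATTRIBUTION, TOTAL_RICH_ANSWERS)

print(f"With attribution:    {with_attribution:,} ({pct_with:.1f}%)")
print(f"Without attribution: {NO_ATTRIBUTION:,} ({pct_without:.1f}%)")
```

In other words, roughly three out of four rich answers carried an attribution link, and roughly one in four did not.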
We took a deep dive into the queries that included no link. We broke them down into 51 categories to speed the review, and found that the overwhelming majority of the queries with no attribution contained public domain information.
This included queries such as “how many quarts in a gallon,” “what is the capital of Washington state,” “how tall is the empire state building,” and “when is Barack Obama’s birthday.”
The most notable exception was in the case of song lyrics, where Google sometimes shows the lyrics to a song as shown here:
I asked Google about this, and a Google spokesperson said the following: “We only show the lyrics for songs for which we have the appropriate rights.” From this statement, it seems quite likely that Google is licensing the rights to the lyrics that it does show.
Do Rich Answer Boxes Cost Sites Traffic?
For about 75% of the rich answers we saw, Google does provide attribution, and these do appear to follow best practices for Fair Use (though I am not a lawyer, so that’s not really my call to make).
There are many sites that have received traffic for many years by publishing public domain information. Clearly, these sites will lose traffic when Google publishes the answer directly in the search results.
There is little to be done about this, as Google has just as much right to publish such information as anyone else.
There is also the case of Google licensing information, as it appears to have done with song lyrics. You can read an excellent case study by Glenn Gabe about this here. Once again, Google has every right to do this.
Last, but not least, there were 1,871 (1.1%) rich answer results in our testing that presented results in list format (these are sometimes called step by step results). These extract detailed information from 3rd party web sites. Google started showing these results back in June of 2014.
Of these, 1,366 (73%) include incomplete steps, as marked by a “…”. Here is an example of such a result:
In addition, 799 (42.7%) of these list-format results include a More Items indicator at the bottom of a partial list, as shown here:
This would seem to me to be a huge win for the publisher, as it would strongly entice the user to click through to the site to get the complete list.
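The list-format percentages quoted above follow from the raw counts. Here is a minimal Python sketch that recomputes them; the names are illustrative, and the only inputs are the figures reported in this article.

```python
# Counts taken from the study text above.
TOTAL_RICH_ANSWERS = 166_366
LIST_FORMAT = 1_871        # rich answers shown in list (step by step) format
INCOMPLETE_STEPS = 1_366   # list results with steps truncated by "…"
MORE_ITEMS = 799           # list results with a More Items indicator


def percent(part: int, whole: int) -> float:
    """Return `part` as a percentage of `whole`."""
    return 100.0 * part / whole


list_share = percent(LIST_FORMAT, TOTAL_RICH_ANSWERS)
incomplete_share = percent(INCOMPLETE_STEPS, LIST_FORMAT)
more_items_share = percent(MORE_ITEMS, LIST_FORMAT)

# The two truncation signals add up to more than the number of list
# results, so at least this many results must show both signals.
min_overlap = INCOMPLETE_STEPS + MORE_ITEMS - LIST_FORMAT

print(f"List format share of rich answers: {list_share:.1f}%")
print(f"List results with incomplete steps: {incomplete_share:.1f}%")
print(f"List results with More Items: {more_items_share:.1f}%")
print(f"Minimum results showing both signals: {min_overlap}")
```

Because the two counts sum to more than 1,871, some list results must carry both an incomplete step and a More Items indicator, which only strengthens the incentive to click through.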
The bottom line is that, from a traffic perspective, some of the rich answer results will end up taking traffic away from sites that were previously getting that traffic. However, all our examinations suggest that Google is playing it by the book.
From a publisher’s perspective, the risk of traffic loss is greatest if you rely heavily on public domain info, or if you rely on info that Google can easily (and cheaply) license.
Different Query Types Included in the Study
There are a number of different features and sub-features that are evident in the rich answers provided by Google. The following table will show what these are, and how often they showed up in our 855K+ tested queries:
What do these features look like? It would be a bit lengthy to show what they all look like, but here is an example image that shows 3 of them:
I recently wrote a post on LinkedIn that shows many different types of rich answers here. It includes screenshots of a dozen very different types of rich answer results.
Google has already made a huge investment in developing rich answers in search, and changes keep happening all the time. In my annual industry predictions article on Forbes, I predicted major additional advances in this area by Google in 2015. This is viewed as a critical initiative within Google.
So let me take that prediction one step further. My guess is that by the end of the year, Google will respond to about 40% of our 855K test query set with rich answers. To check that, we are going to rerun these tests on a regular basis.
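To put that prediction in perspective, here is a rough Python sketch of what a 40% rate would mean in absolute terms. It assumes a query set of exactly 855,000 (the article only says "855K+") and treats all 166,366 rich answers as Google results, so the outputs are approximations rather than study figures.

```python
# Assumption: the article reports "855K+" queries; 855,000 is used
# here as a round approximation, not an exact count.
APPROX_QUERIES = 855_000
RICH_ANSWERS = 166_366     # rich answers found in the study
TARGET_PERCENT = 40        # predicted year-end rate

# Approximate share of queries that returned a rich answer today.
current_rate = 100.0 * RICH_ANSWERS / APPROX_QUERIES

# Number of rich answers needed to hit the target on the same query set.
answers_at_target = APPROX_QUERIES * TARGET_PERCENT // 100

# How much the rich answer count would have to grow.
growth_factor = answers_at_target / RICH_ANSWERS

print(f"Current rate: about {current_rate:.1f}%")
print(f"Rich answers needed for {TARGET_PERCENT}%: {answers_at_target:,}")
print(f"Required growth: about {growth_factor:.2f}x")
```

Under these assumptions, reaching the 40% range would mean roughly doubling the number of rich answers shown for the same queries.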
This change is coming, and it would be wise to prepare for it. At a minimum, that means not relying on public domain information, or on information that Google can easily and cheaply license or derive, for your organic search traffic.
As a result, focusing on very high value content that is specific to the core value of your business is not only a good idea, it’s the only way to thrive in this new world.