Blogger: Larry Cannell
Earlier this week Google announced a number of new features coming to their Internet search engine. The most interesting to me was their plan to start returning a “snippet” about a web page in search results. Here is what they said:
“These ‘rich snippets’ extract and show more useful information from web pages than the preview text that you are used to seeing. For example, if you are thinking of trying out a new restaurant and are searching for reviews, rich snippets could include things like the average review score, the number of reviews, and the restaurant's price range:”
Sounds cool. But, the most intriguing line from the post was (emphasis added):
“We can't provide these snippets on our own, so we hope that web publishers will help us by adopting microformats or RDFa standards to mark up their HTML and bring this structured data to the surface.”
Which made me think about how Google has been marketing their Google Search Appliance (GSA), which they sell to enterprises to crawl and index content on intranets. They have been quite vocal about how a GSA can be dropped into an intranet and immediately return great search results, with virtually no effort at all. For example, here is a line from a whitepaper on Google’s enterprise search site:
“A come-as-you-are approach to indexing eliminates the overhead of preparing documents for admission to the body of searchable data. In any case, your data shouldn’t need a laborious makeover for your search solution to provide relevant results.”
In short, Google is saying: “Metadata!? We don’t need no metadata! Our search appliance eliminates the need for metadata.” So while other enterprise search vendors are encouraging customers to attach metadata to their documents and web pages, Google is telling the same people not to worry about “preparing documents.” (Although, to be fair, recent releases of the Google Search Appliance have added features that make better use of metadata.)
However, to deliver this new Internet search feature, Google now admits they “can't provide these snippets on our own” and that they need additional information embedded within the web page. In other words, sometimes even Google needs a little help from…don’t say it too loud…metadata.
Of course, I am being a little tongue-in-cheek here. But all kidding aside, this could be an important development. The source for these snippets is communicated to the search engine through its support for microformats and RDFa, which describe how to structure metadata embedded in a web page. This metadata provides information in a way that search engines and any other application crawling a web page can read directly, rather than having to infer it from the page's text. In the example above, the four-star review for “Drooling Dog Bar B Q” came from this metadata.
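To make "read directly" a bit more concrete, here is a rough sketch of how a crawler might pull review data out of marked-up HTML. This is purely my own illustration, not how Google does it: the page fragment is made up, the class names are modeled on the hReview-aggregate microformat as I understand it, the review count of 15 is a placeholder, and the ReviewExtractor and extract_review names are hypothetical.

```python
from html.parser import HTMLParser

# Hypothetical page fragment. The class names are modeled on the
# hReview-aggregate microformat (an assumption), and the count of 15
# is a made-up placeholder value.
SAMPLE_HTML = """
<div class="hreview-aggregate">
  <span class="item"><span class="fn">Drooling Dog Bar B Q</span></span>
  <span class="rating"><span class="average">4</span> stars</span>
  based on <span class="count">15</span> reviews
</div>
"""

# The class names this sketch treats as metadata fields.
WANTED = {"fn", "average", "count"}


class ReviewExtractor(HTMLParser):
    """Collects the text of elements whose class names a wanted field.

    Simplification: assumes every start tag has a matching end tag,
    so void elements such as <br> are not handled.
    """

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._stack = []  # one entry per open element: field name or None

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        field = next((c for c in classes if c in WANTED), None)
        self._stack.append(field)

    def handle_endtag(self, tag):
        if self._stack:
            self._stack.pop()

    def handle_data(self, data):
        # Only text directly inside a wanted element is recorded.
        if self._stack and self._stack[-1] is not None:
            field = self._stack[-1]
            self.fields[field] = self.fields.get(field, "") + data.strip()


def extract_review(html):
    """Return a dict of the wanted fields found in the given HTML."""
    parser = ReviewExtractor()
    parser.feed(html)
    return parser.fields


if __name__ == "__main__":
    print(extract_review(SAMPLE_HTML))
```

The point is that the crawler never has to guess which number on the page is the rating and which is the review count; the markup labels them. Without those class names, the same values would have to be inferred from the surrounding text.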
Although some skepticism has been expressed about Google’s efforts here, it will be interesting to see whether they get more content providers to use these standards.