Searching for Meaning With Semantic Searching

Kieron Murphy

Searching for information on the Internet can be a trying task. Contemporary search engines require you to know enough about the subject you're investigating in order to enter the right words to find out more about it. Then the process gets iterative. The more you learn about the subject, the better you get at refining search terms to get more specific information.

Most everyone has come to accept this model. Not the folks at the Xerox Research Centre Europe (XRCE), though, who today announced a new method for doing research using networked data.

Known as semantic search, the idea is to improve traditional searching by drawing on XML data from semantic networks of information and then ranking the returned results by their meaning--as opposed to their predicted relevance to keywords. This has been something of a holy grail for scientists in the field of informatics in recent years. Today, Xerox said it has come up with an implementation. Called FactSpotter, the new Xerox tool enables users to find relevant information more effectively by analyzing the meaning and context of queries in everyday language.

Xerox claims that FactSpotter combines a powerful linguistic engine with an easy-to-use interface to: comb through almost any document regardless of the language, location, format, or type; take advantage of the way humans think, speak, and ask questions; and discriminate the results, highlighting just a handful of relevant answers instead of returning thousands of unrelated responses.

"Our advanced search engine goes beyond today's typical 'keyword' search or current data-mining programs, which typically end up searching only 40 percent of all the documents that are relevant, because the keywords are too limiting," said Frédérique Segond, manager of parsing and semantics research at XRCE.

"Xerox's tool is more accurate because it delves into documents, extracting the concepts and the relationships among them. By 'understanding' the context, it returns the right information to the searcher, and it even highlights the exact location of the answer within the document."
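To make the contrast concrete, here is a minimal sketch of the difference between matching keywords and matching extracted relationships. It is not Xerox's method: FactSpotter's linguistic engine is not described in any detail here, so the regular-expression "parser," the function names, and the sample memos below are all hypothetical stand-ins chosen only for illustration.

```python
import re
from typing import NamedTuple, Optional


class Fact(NamedTuple):
    subject: str
    relation: str
    obj: str
    start: int  # character offset where the fact begins in the document
    end: int    # character offset where the fact ends


# Hypothetical toy "parser": it only recognizes "<Name> <verb> <Name>".
# A real linguistic engine would be far more sophisticated.
PATTERN = re.compile(
    r"\b(?P<subject>[A-Z]\w+) (?P<relation>acquired|sued|hired) (?P<obj>[A-Z]\w+)"
)


def extract_facts(text: str) -> list[Fact]:
    """Pull (subject, relation, object) facts and their locations from text."""
    return [
        Fact(m["subject"], m["relation"], m["obj"], m.start(), m.end())
        for m in PATTERN.finditer(text)
    ]


def keyword_search(docs: dict[str, str], words: list[str]) -> list[str]:
    """Return every document that contains all of the keywords, in any role."""
    return [
        name
        for name, text in docs.items()
        if all(w.lower() in text.lower() for w in words)
    ]


def fact_search(
    docs: dict[str, str],
    subject: Optional[str] = None,
    relation: Optional[str] = None,
    obj: Optional[str] = None,
) -> list[tuple[str, str]]:
    """Return (document name, matching passage) for facts that fit the query."""
    hits = []
    for name, text in docs.items():
        for fact in extract_facts(text):
            if (
                subject in (None, fact.subject)
                and relation in (None, fact.relation)
                and obj in (None, fact.obj)
            ):
                hits.append((name, text[fact.start:fact.end]))
    return hits


# Made-up sample documents.
DOCS = {
    "memo_a": "In 2005 Acme sued Globex over a patent.",
    "memo_b": "Globex sued Acme last year, and Acme hired Initech.",
    "memo_c": "Acme and Globex never sued anyone.",
}

print(keyword_search(DOCS, ["Acme", "sued", "Globex"]))
print(fact_search(DOCS, subject="Acme", relation="sued"))
```

In this toy setup, the keyword query returns all three memos, because each contains the three words, while the relationship query ("whom did Acme sue?") returns only the passage in memo_a where Acme is actually the one doing the suing, along with its location in the text.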

Xerox plans to deploy FactSpotter initially as part of a new toolkit for legal and regulatory researchers. The office equipment giant, which also pioneered many of the personal computer controls we now take for granted, such as the mouse cursor and the graphical user interface, said by way of example that FactSpotter could allow specific facts to be found quickly and easily among thousands (and often millions) of different documents in the discovery phase of a trial.

Separately, in response to pressure from officials in the U.S., Microsoft has agreed to open the search functionality in its new Vista operating system to third-party applications such as those provided by Google and Yahoo!--and now maybe Xerox's FactSpotter.

Who knows? Maybe the folks in Redmond got a little nervous about a search engine automatically poring over all their documents in a future legal standoff. It would make for some interesting queries.

[Editor's Note: For more on semantic searching, please see our special report "Weaving a Web of Ideas."]


Tech Talk

IEEE Spectrum’s general technology blog, featuring news, analysis, and opinions about engineering, consumer electronics, and technology and society, from the editorial staff and freelance contributors.
