Peter Mika knows how search works. He is the leader of the semantic search innovation unit at Yahoo. We are very happy to welcome this very talented researcher and hands-on practitioner at SEMANTiCS 2015.
You have been part of the semantic search developments at Yahoo from the very beginning. Where do commercial search engines still need improvement? What kind of innovation can we expect in search in the upcoming years?
Peter Mika: Search has come a long way with the still-dominant paradigm of document retrieval, but knowledge-based search has been taking over the search experience in recent years, and it will continue to do so. The document retrieval paradigm has natural limitations in that it assumes, first, that there is already a written answer somewhere on the Web, and second, that this text can be found by simple text similarity between the text and the user query. There are large numbers of queries for which these assumptions don't hold, e.g. where the answer is distributed across multiple documents, or is not written as text at all (such as data in tables), or is not stated explicitly but requires reasoning. Search companies are thus investing in information extraction and data fusion, as well as more and more advanced question-answering capabilities on top of the collected information. The need for these technologies is only increasing with mobile search, where providing results as ten blue links leads to a very poor user experience.
Companies have adopted semantic methodologies for their internal knowledge management. How does their work in establishing a semantic information network differ from yours?
Peter Mika: We are a consumer internet company, so for us there is little difference between our internal and external representations. At Yahoo, we collect and integrate data sets that we expect to be interesting for our users, and at the same time we also invest in conceptual modeling for our own sake, in particular to improve data sharing and reuse across products and teams, wherever they are in the organization.
Semantic technologies are not just about search. There are infinite application scenarios. In which domains do you think semantics can provide value? Which other technologies can complement the semantic approach?
Peter Mika: As an example of functionality beyond search, we also use the Knowledge Graph to represent user interests, which powers recommendation functionality on our homepage. In particular, we analyze the articles our users have read, and build a user profile in terms of concepts from the Knowledge Graph. This in turn is used to produce more relevant recommendations on subsequent visits. Besides knowledge representation, we also invest heavily in information extraction, in particular in identifying entities in user queries and content, as well as extracting information from web pages, email and other personal information spaces. At Yahoo Labs, we work on advancing the sciences that underlie these approaches, i.e. Natural Language Processing, Information Retrieval and the Semantic Web.
Thank you for this interview. We are looking forward to your keynote speech at SEMANTiCS 2015.