I had the chance to attend this year's International Semantic Web Conference (ISWC 2014) in Riva del Garda and to see the keynote given by Prabhakar Raghavan, Vice President of Engineering at Google.

It was a really interesting talk, full of content, but also enjoyable and very approachable for non-technical people. These are my personal takeaways from his keynote. I highly recommend watching the keynote video, which has been published online.

ISWC 2014 Logo
ISWC 2014 in Riva del Garda (Italy)

Prabhakar presented a nice overview of how search engines have evolved over the last 20 years. He started with early innovations such as continuous crawling, dating from 1995, and the introduction of recall as a measure of quality. He then went through the problem of ranking, from the "Inktomi scoring function" to the concept of PageRank, where a link to a page is a signal of endorsement or, alternatively, a description of that page's content. This highlighted how fundamental it was, and still is, to evaluate the "importance" of a page based on more than its own content.

An example of Page Rank
Mathematical PageRanks for a simple network
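
To make the "link as endorsement" idea concrete, here is a minimal sketch of PageRank computed by power iteration. It is only a toy illustration and was not shown in the keynote: the four-page graph, the function name `pagerank`, the damping factor of 0.85 and the fixed number of iterations are all assumptions chosen for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Toy PageRank by power iteration.

    `links` maps each page to the list of pages it links to.
    The damping factor and iteration count are illustrative assumptions.
    """
    # Collect every page that appears as a source or as a link target.
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)

    # Start from a uniform distribution over all pages.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page gets a small baseline share (the "random jump").
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its endorsement evenly among its outgoing links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: C is linked by A, B and D; nobody links to D.
toy_web = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(toy_web)

for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page}: {score:.3f}")
```

In this toy graph the page with the most incoming links ends up with the highest score and the page nobody links to ends up with the lowest, regardless of what the pages themselves contain.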

Then he moved on to the question of why people run queries at all. He distinguished searches for a specific URL (navigational queries), searches for knowledge (informational queries), and transactional queries, which a user performs in order to find a product to buy. He also pointed out that approximately 99% of queries nowadays contain entities. One outstanding example is what he called the "party factoid", i.e., fact checking on Wikipedia during parties.

Then he explained that people now issue queries for fine-grained, everyday tasks through mobile devices, so understanding a query depends heavily on the context. This is what he called "understanding the verb" implicit in the query. For example, you search for "restaurant", but your actual goal could be:

  • to select a good restaurant and get information about it
  • to book a restaurant you have already selected
  • to find directions to the restaurant

Then he made an important remark: over the past 15 years a query consisted of 2.5 words on average, and this is going to change with the advent of natural language questions (e.g., asking Siri).

In the end, as far as I understood, his vision of the future revolves around three key points:

  • understanding the context: the user intent given by the context
  • dealing with more complex queries (natural language questions)
  • forecasting the next query before the user issues it (Google Now)

A nice remark about the last point: if we can predict the next query, then we can run it when it is more convenient for the servers, i.e., when the load is lower.