Keynote Talks
Keynote Talk 1
Querying Text Databases and the Web: Beyond Traditional Keyword Search
Luis Gravano (Columbia University)
ABSTRACT:
Traditional keyword search --where a query is a list of keywords and
query results are a relevance-ordered list of documents-- is, of course,
a powerful query paradigm for text databases and the Web. However, more
expressive query paradigms, where both queries and their results can
exhibit a richer structure than in traditional keyword search, are often
desirable. Information extraction systems identify and extract
intrinsically structured data that is embedded in natural-language text
documents, hence enabling these alternative query paradigms.
Unfortunately, information extraction is a time-consuming process, often
involving complex text analysis, so exhaustively processing all
documents in a large text database --or on the Web-- could be
prohibitively expensive. Beyond efficiency, query result quality is also
important: information extraction is error-prone and not all extracted
data is equally likely to be correct, so query processing must account
for result quality. In this talk, I will discuss
recent work on cost-based optimization of structured queries in this
information extraction scenario, where modeling query result quality
--in addition to execution efficiency-- is a distinctive and important
challenge.
SPEAKER:
Luis Gravano (http://www.cs.columbia.edu/~gravano/) has been on the faculty of the Computer Science Department,
Columbia University, since September 1997, where he has been an
associate professor since July 2002. From January through August 2001,
Luis was a Senior Research Scientist at Google (on leave from Columbia
University). He received his Ph.D. degree in Computer Science from
Stanford University in 1997 and a B.S. degree from the Escuela Superior
Latinoamericana de Informática (ESLAI), Argentina, in 1991. Luis is an
associate editor of the ACM Transactions on Database Systems and a
recipient of a CAREER award from the National Science Foundation.
Keynote Talk 2
Structured Data and Web Documents: Better Together?
Surajit Chaudhuri (Microsoft Research)
ABSTRACT:
Keyword search over structured databases is siloed from web search in
that its results are independent of those from web search. We claim
that keyword search can benefit significantly from the knowledge of
web documents and results of web search. Conversely, traditional web
search only returns documents. We discuss how the search results can
be enhanced by additional structured information. Thus, queries over
structured data and web search can leverage each other.
SPEAKER:
Surajit Chaudhuri (http://research.microsoft.com/en-us/people/surajitc/) is a Principal Researcher and a Research Area
Manager overseeing data management research activities at Microsoft
Research. His areas of interest include self-tuning database systems,
query optimization, data cleaning, and synergy between search and DBMS
technologies. Surajit has a Ph.D. from Stanford University and is an ACM
Fellow. He was awarded the ACM SIGMOD Contributions Award in 2004 and
a VLDB 10-Year Best Paper Award in 2007.