
New England Database Society

Friday, February 19, 2010

sponsored by Sun Microsystems


NEDS

   Dependence and Truth

 Divesh Srivastava
AT&T Labs-Research

Friday, February 19, 2010, 4:00 pm
Volen 101, Brandeis University

(preceded by a wine and cheese reception at 3:00 pm, and followed by dinner at 6:00 pm)

Abstract:

The Web has made a huge amount of useful information available, but it has also made it easier to spread false information and rumors across multiple sources, making it hard to distinguish what is true from what is not. Since it is important to permit the expression of dissenting and conflicting opinions, it would be a fallacy to try to ensure that the Web provides only consistent information. However, to help separate the wheat from the chaff, it is essential to be able to determine dependence between sources. Given the huge number of data sources and the vast volume of conflicting data available on the Web, doing so in a scalable manner is very challenging.

We present a novel approach that considers dependence between data sources in truth discovery. We start from a static world where we have a snapshot of data from various data sources. We apply Bayesian analysis to determine dependence between sources, and design an algorithm that iteratively detects dependence and discovers truth from conflicting information. We then consider a dynamic world where the true values can evolve over time and sources can update their data to capture such changes. We develop a Hidden Markov Model that decides whether a source is a copier of another source and identifies the specific moments at which it copies. Experimental results on both real-world and synthetic data show that our techniques are accurate and scalable.

This is joint work with Xin Luna Dong and Laure Berti-Equille.
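
Illustration (not part of the abstract): for readers unfamiliar with the idea of alternating between dependence detection and truth discovery, the loop can be pictured with a toy sketch like the one below. It is not the Bayesian analysis or the Hidden Markov Model from the talk. It swaps in a much cruder heuristic, assumed here purely for illustration: two sources that repeat the same value on several items where the current vote says that value is false are flagged as possibly dependent, and one of them is down-weighted. All names (`vote`, `suspected_copies`, `discover`) and the parameters `min_shared_errors` and `copy_discount` are hypothetical, as is the sample data.

```python
from itertools import combinations


def vote(claims, weight):
    """For each item, pick the value with the largest total source weight."""
    totals = {}
    for src, facts in claims.items():
        for item, value in facts.items():
            by_value = totals.setdefault(item, {})
            by_value[value] = by_value.get(value, 0.0) + weight[src]
    # sorted() makes tie-breaking deterministic (values must be comparable).
    return {item: max(sorted(vals), key=vals.get) for item, vals in totals.items()}


def suspected_copies(claims, truth, min_shared_errors=2):
    """Flag source pairs that share the same value, currently judged false,
    on at least min_shared_errors items (a toy stand-in for a real
    dependence test)."""
    pairs = set()
    for a, b in combinations(sorted(claims), 2):
        shared_errors = sum(
            1
            for item, value in claims[a].items()
            if claims[b].get(item) == value and truth[item] != value
        )
        if shared_errors >= min_shared_errors:
            pairs.add((a, b))
    return pairs


def discover(claims, rounds=5, copy_discount=0.5):
    """Alternate between voting and dependence detection for a few rounds.
    Assumes every source makes at least one claim."""
    weight = {src: 1.0 for src in claims}
    truth, pairs = {}, set()
    for _ in range(rounds):
        truth = vote(claims, weight)
        pairs = suspected_copies(claims, truth)
        # Re-estimate each source's weight as its agreement with the vote.
        weight = {
            src: sum(truth[i] == v for i, v in facts.items()) / len(facts)
            for src, facts in claims.items()
        }
        # Arbitrary choice for this sketch: discount the second source of
        # each flagged pair, since the direction of copying is not modelled.
        for _, second in pairs:
            weight[second] *= copy_discount
    return truth, pairs


# Made-up data: S1-S3 agree with each other; C1 and C2 share two odd values.
claims = {
    "S1": {"i1": "a", "i2": "b", "i3": "c", "i4": "d"},
    "S2": {"i1": "a", "i2": "b", "i3": "c", "i4": "d"},
    "S3": {"i1": "a", "i2": "b", "i3": "c", "i4": "d"},
    "C1": {"i1": "a", "i2": "b", "i3": "x", "i4": "y"},
    "C2": {"i1": "a", "i2": "b", "i3": "x", "i4": "y"},
}
truth, pairs = discover(claims)
print(truth)
print(pairs)
```

The sketch only flags agreement on values the vote rejects, because agreement on accepted values is what independent, accurate sources would show anyway. It cannot tell which of two flagged sources copied from the other, nor when.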
 

Speaker's Bio:

Divesh Srivastava is the head of Database Research at AT&T Labs-Research. His current research interests include data quality, data streams, and data privacy.



Maintained by Olga Papaemmanouil olga AT cs.brandeis.edu