The Department of Computer Science and Brown's NSF-funded multi-disciplinary IGERT Program present

"Getting the Best of Both Discretization and Parametric Fitting"

Moises Goldszmidt, SRI International

Machine learning problems often confront users with many representation choices, including the level of granularity for discrete features and the family of distributions to be used for continuous features. Needless to say, these choices involve tradeoffs that influence the resulting model and its performance. In this talk I will describe an approach, based on Bayesian networks, for simultaneously representing features in different forms. I will illustrate this approach in the context of pattern classification, where continuous features are simultaneously represented in both discrete and (semi)parametric form. This dual representation frees the classifier from committing to one form or the other, and enables different features to correlate with either representation in the same model. Our empirical results show that this classifier usually achieves performance as good as or better than similar classifiers that commit to a single representation, or that include both representations without modeling the relation between them. During the talk I will place this method in the general context of inducing Bayesian networks from data and discuss a set of open problems and future work.
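
The abstract describes the dual representation only at a high level. For readers who want something concrete, below is a minimal Python/NumPy sketch of the simpler baseline idea the talk improves on: a naive-Bayes-style classifier that fits both a discretized (histogram) model and a Gaussian model for every continuous feature, then commits, per feature, to whichever representation fits the training data better. All class and method names here are hypothetical, and the per-feature selection is a crude proxy; the talk's actual method keeps both representations in one Bayesian network and models the relation between them, which this sketch does not.

```python
import numpy as np

class DualRepNaiveBayes:
    """Hypothetical sketch: each continuous feature gets two class-conditional
    models (discretized histogram and Gaussian); the better-fitting one is
    kept per feature. NOT the speaker's method, which retains both
    representations jointly in a single Bayesian network."""

    def __init__(self, n_bins=5):
        self.n_bins = n_bins

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        n_classes, d = len(self.classes_), X.shape[1]
        self.priors_ = np.array([(y == c).mean() for c in self.classes_])
        # Equal-width bin edges per feature, shared across classes.
        self.edges_ = [np.linspace(X[:, j].min(), X[:, j].max(), self.n_bins + 1)
                       for j in range(d)]
        self.hist_ = np.zeros((n_classes, d, self.n_bins))
        self.mu_ = np.zeros((n_classes, d))
        self.sd_ = np.zeros((n_classes, d))
        for i, c in enumerate(self.classes_):
            Xc = X[y == c]
            for j in range(d):
                counts, _ = np.histogram(Xc[:, j], bins=self.edges_[j])
                # Laplace smoothing keeps empty bins at nonzero mass.
                self.hist_[i, j] = (counts + 1.0) / (counts.sum() + self.n_bins)
            self.mu_[i] = Xc.mean(axis=0)
            self.sd_[i] = Xc.std(axis=0) + 1e-6
        # Commit per feature to the representation with higher training
        # log-likelihood (crude: probability mass vs. density comparison).
        self.use_disc_ = self._loglik(X, y, disc=True) > self._loglik(X, y, disc=False)
        return self

    def _bin(self, X, j):
        # Map values to bin indices 0..n_bins-1 via the interior edges.
        return np.clip(np.digitize(X[:, j], self.edges_[j][1:-1]), 0, self.n_bins - 1)

    def _feature_loglik(self, X, i, j, disc):
        if disc:
            return np.log(self.hist_[i, j][self._bin(X, j)])
        z = (X[:, j] - self.mu_[i, j]) / self.sd_[i, j]
        return -0.5 * z ** 2 - np.log(self.sd_[i, j]) - 0.5 * np.log(2 * np.pi)

    def _loglik(self, X, y, disc):
        # Per-feature training log-likelihood, summed over classes.
        total = np.zeros(X.shape[1])
        for i, c in enumerate(self.classes_):
            Xc = X[y == c]
            for j in range(X.shape[1]):
                total[j] += self._feature_loglik(Xc, i, j, disc).sum()
        return total

    def predict(self, X):
        scores = np.tile(np.log(self.priors_), (X.shape[0], 1))
        for i in range(len(self.classes_)):
            for j in range(X.shape[1]):
                scores[:, i] += self._feature_loglik(X, i, j, self.use_disc_[j])
        return self.classes_[scores.argmax(axis=1)]
```

Usage would be the familiar fit/predict pattern, e.g. `DualRepNaiveBayes(n_bins=5).fit(X_train, y_train).predict(X_test)`. The point of contrast with the talk is the hard per-feature commitment above: the dual-representation classifier avoids exactly that choice.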

Parts of this talk are joint work with Nir Friedman of the Hebrew University and Tom Lee of SRI.
