Tech Report CS-07-04

A Generative Discourse-New Model for Text Coherence

Micha Elsner and Eugene Charniak

May 2007

Abstract:

Recent models of document coherence have focused on the referents of noun phrases (NPs), ignoring their syntax. However, syntax depends on discourse function: NPs that introduce new entities are often syntactically more complex. We develop a generative model for NP syntax that describes this difference. It can be used to model discourse coherence in the Wall Street Journal; combining it with the local coherence model of Elsner ('07) yields substantial improvements. On the discourse-new detection task, our model is competitive with previous systems, performing comparably to Uryupina ('03).

(complete text in pdf)