Tech Report CS-07-04
A Generative Discourse-New Model for Text Coherence
Micha Elsner and Eugene Charniak
May 2007
Abstract:
Recent models of document coherence have focused on the referents of noun phrases (NPs), ignoring their syntax. However, syntax depends on discourse function: NPs that introduce new entities are often more complex. We develop a generative model for NP syntax that describes this difference. It can be used to model discourse coherence in Wall Street Journal text; combining it with the local coherence model of Elsner ('07) yields substantial improvements. Our model is also competitive with previous systems on the discourse-new detection task, with performance comparable to that of Uryupina ('03).