Edge-Based Best-First Chart Parsing

Eugene Charniak, Sharon Goldwater, and Mark Johnson
Best-first probabilistic chart parsing attempts to parse efficiently by working on edges that are judged ``best'' by some probabilistic figure of merit (FOM). Recent work has used probabilistic context-free grammars (PCFGs) to assign probabilities to constituents, and to use these probabilities as the starting point for the FOM. This paper extends this approach to using a probabilistic FOM to judge edges (incomplete constituents), thereby giving a much finer-grained control over parsing effort. We show how this can be accomplished in a particularly simple way using the common idea of binarizing the PCFG. The results obtained are about a factor of twenty improvement over the best prior results~--- that is, our parser achieves equivalent results using one twentieth the number of edges. Furthermore we show that this improvement is obtained with parsing precision and recall levels superior to those achieved by exhaustive parsing.