Edge-Based Best-First Chart Parsing
and Mark Johnson
Best-first probabilistic chart parsing attempts to parse efficiently
by working on edges that are judged ``best'' by some probabilistic
figure of merit (FOM). Recent work has used probabilistic
context-free grammars (PCFGs) to assign probabilities to constituents,
and to use these probabilities as the starting point for the FOM.
This paper extends this approach to using a probabilistic FOM to judge
edges (incomplete constituents), thereby giving a much finer-grained
control over parsing effort. We show how this can be accomplished in
a particularly simple way using the common idea of binarizing the
PCFG. The results obtained are about a factor of twenty improvement
over the best prior results~--- that is, our parser achieves
equivalent results using one twentieth the number of edges.
Furthermore we show that this improvement is obtained with
parsing precision and recall levels superior to those
achieved by exhaustive parsing.