Tree-bank Grammars
Eugene Charniak
By a ``tree-bank grammar'' we mean a context-free grammar created by
reading the production rules directly from hand-parsed sentences in a
tree bank. Common wisdom has it that such grammars do not perform
well, though we know of no published data on the issue. The primary
purpose of this paper is to show that the common wisdom is wrong. In
particular, we present results on a tree-bank grammar based on the Penn
Wall Street Journal tree bank. To the best of our knowledge, this
grammar outperforms all other non-word-based statistical
parsers/grammars on this corpus. That is, it outperforms the other
parsers that treat the input as a string of tags and ignore the actual
words of the corpus.
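
To make "reading the production rules directly from hand-parsed sentences" concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the implementation used in the paper: the function names (`tokenize`, `parse_tree`, `read_rules`) are hypothetical, the input is assumed to be Penn-style bracketed trees, and the example sentence is a toy tree invented here, not taken from the corpus. Rules are collected only down to the tag level, so the words themselves are ignored, matching the non-word-based setting described above. The counts are kept because a statistical grammar would presumably derive rule probabilities from them; how that is done is not specified in this abstract.

```python
from collections import Counter


def tokenize(bracketed):
    """Split a bracketed tree string into parentheses and symbols."""
    return bracketed.replace("(", " ( ").replace(")", " ) ").split()


def parse_tree(tokens):
    """Turn a token list into nested (label, children) tuples.

    Children are either sub-trees (tuples) or words (plain strings).
    """
    pos = 0

    def node():
        nonlocal pos
        assert tokens[pos] == "(", "expected an opening parenthesis"
        pos += 1
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(node())
            else:
                children.append(tokens[pos])  # a word under a tag
                pos += 1
        pos += 1  # consume the closing parenthesis
        return (label, children)

    return node()


def collect_rules(tree, counts):
    """Count one rule per internal node, skipping tag -> word nodes."""
    label, children = tree
    subtrees = [c for c in children if isinstance(c, tuple)]
    if not subtrees:
        return  # preterminal: the word is ignored (non-word-based)
    rhs = tuple(child[0] for child in subtrees)
    counts[(label, rhs)] += 1
    for child in subtrees:
        collect_rules(child, counts)


def read_rules(bracketed_trees):
    """Read context-free rules, with counts, from bracketed trees."""
    counts = Counter()
    for text in bracketed_trees:
        collect_rules(parse_tree(tokenize(text)), counts)
    return counts


# Toy hand-parsed sentence (invented for illustration).
toy_bank = ["(S (NP (DT the) (NN dog)) (VP (VBD chased) (NP (DT the) (NN cat))))"]
grammar = read_rules(toy_bank)
for (lhs, rhs), n in sorted(grammar.items()):
    print(f"{lhs} -> {' '.join(rhs)}  [{n}]")
```

On the toy tree this yields three rules: `S -> NP VP` once, `VP -> VBD NP` once, and `NP -> DT NN` twice. Every rule seen in the tree bank becomes a rule of the grammar, and no rule is written by hand.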