Immediate-Head Parsing for Language Models

Eugene Charniak
We present two language models based upon an ``immediate-head'' parser -- our name for a parser that conditions all events below a constituent c upon the head of c. While all of the most accurate statistical parsers are of the immediate-head variety, no previous grammatical language model uses this technology. The perplexities of both models improve significantly upon the trigram-model baseline as well as upon the best previous grammar-based language model. For the better of our two models, these improvements are 24% and 13%, respectively. We also suggest that improving the underlying parser should significantly improve the model's perplexity, and that even in the near term there is considerable room for improvement in immediate-head language models.