various software
Note: This webpage has moved! You should be redirected shortly.
The following software may be useful to you. It is certainly not documented or tested as well as it should be, but that's academic software for you. All code is released under the GPLv2 license. Almost all code is in Python.
I've started putting my code on BitBucket and GitHub to make it easier for other people to use and contribute to it.
- Stanford CoreNLP: Stanford's annotation pipeline. I'm one of the many, many members of this project.
- Ensemble: Linearly-interpolated Dependency Parsers: Model combination of several linear-time MaltParser parsing models. I contributed to this code, but most of it was written by Mihai Surdeanu.
- BLLIP reranking parser: My version of the parser, now on BitBucket.
- parsedyff: Visualize the differences between two treebank parse trees via graphviz.
- Parsing: Python module with parsing-related functions (running/training the Charniak parser, tree reading, evaluation).
- PyInputTree: Python interface to the InputTree structure from the Charniak parser via SWIG. This lets you traverse and view Treebank-style trees. (Update 4.7.2011: Possibly bitrotten...I recommend NLTK these days.)
- waterworks: My Python utility library (everyone else has one...) [old page, PyPI page]
Available by request:
- Hogwash: Metaclustering system (only works with the Brown CS and Stanford NLP setups, but could be modified to work with other systems). I also use this to keep track of results from my experiments, so it doubles as a basic persistence mechanism and results database. Unfortunately, this code is probably not useful to those not already familiar with it, and the system has a steep learning curve. I have been gradually working on making it more general, though...