Discriminatively Trained Deformable Part Models

Version 3
This is an old release
The latest release is available here

This is an implementation of our object detection system based on mixtures of multiscale deformable part models.
The system is fully described in [2]. An earlier version of the system was described in [1].

The distribution contains the object detection and model learning code.
It also contains models trained on the PASCAL datasets and the INRIA Person dataset.

The system is implemented in Matlab, with a few helper functions written in C/C++ for efficiency reasons.
The software was tested on several versions of Linux and Mac OS X.

To download, click here: voc-release3.1.tgz (updated on 09/13/09)

This project is supported by the National Science Foundation under Grants No. 0534820, 0746569, and 0811340.

References

Slides from a recent talk (pdf)

[1] P. Felzenszwalb, D. McAllester, D. Ramanan
A Discriminatively Trained, Multiscale, Deformable Part Model
Proceedings of the IEEE CVPR 2008

[2] P. Felzenszwalb, R. Girshick, D. McAllester, D. Ramanan
Object Detection with Discriminatively Trained Part Based Models
To appear in the IEEE Transactions on Pattern Analysis and Machine Intelligence
pdf

Recognition results - PASCAL datasets

The models included with the source code were trained on the train+val dataset from each year and evaluated on the corresponding test dataset.
This is exactly the protocol of the "comp3" competition. Below are the average precision scores we obtain in each category.

2006 data bicycle bus car cat cow dog horse mbike person sheep
without context 0.620 0.493 0.635 0.190 0.417 0.153 0.386 0.579 0.380 0.402
with context 0.623 0.502 0.631 0.236 0.437 0.185 0.429 0.625 0.401 0.431

2007 data aero bicycle bird boat bottle bus car cat chair cow table dog horse mbike person plant sheep sofa train tv
without context 0.287 0.551 0.006 0.145 0.265 0.397 0.502 0.163 0.165 0.166 0.245 0.050 0.452 0.383 0.362 0.090 0.174 0.228 0.341 0.384
with context 0.328 0.568 0.025 0.168 0.285 0.397 0.516 0.213 0.179 0.185 0.259 0.088 0.492 0.412 0.368 0.146 0.162 0.244 0.392 0.391
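
As a rough summary of the scores above, the per-category values can be averaged into a single mean AP per row. The sketch below does this for the 2006 rows only. The variable and function names are illustrative, and the means it prints are derived from the table here; they are not figures reported with the release.

```python
# 2006 per-category average precision, copied from the table above.
categories = ["bicycle", "bus", "car", "cat", "cow",
              "dog", "horse", "mbike", "person", "sheep"]
without_context = [0.620, 0.493, 0.635, 0.190, 0.417,
                   0.153, 0.386, 0.579, 0.380, 0.402]
with_context = [0.623, 0.502, 0.631, 0.236, 0.437,
                0.185, 0.429, 0.625, 0.401, 0.431]


def mean_ap(scores):
    """Unweighted mean of per-category AP scores (illustrative helper)."""
    return sum(scores) / len(scores)


# Categories whose AP is strictly higher with context rescoring.
improved = [c for c, a, b in zip(categories, without_context, with_context)
            if b > a]

print(f"mean AP without context: {mean_ap(without_context):.4f}")
print(f"mean AP with context:    {mean_ap(with_context):.4f}")
print("improved with context:", ", ".join(improved))
```

The same helper applies unchanged to the 2007 rows once their 20 values are copied in.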

INRIA Person

We also trained and tested a model on the INRIA Person dataset.
We scored the model using the PASCAL evaluation methodology on the complete test dataset, including images without people.

INRIA Person average precision: 0.869
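
For readers unfamiliar with the metric, here is a minimal sketch of an average precision computation. It assumes the 11-point interpolated form used by the PASCAL VOC 2007 development kit, and it assumes each detection has already been marked as a true or false positive (PASCAL does this with a bounding-box overlap test that is not shown). The function name and inputs are hypothetical. This is not the evaluation code shipped with the release.

```python
def eleven_point_ap(scores, is_true_positive, num_ground_truth):
    """11-point interpolated average precision (assumed VOC2007-style).

    scores            -- detection confidences, one per detection
    is_true_positive  -- booleans, True if the detection matched an object
    num_ground_truth  -- total number of annotated objects in the test set
    """
    # Rank detections from most to least confident.
    order = sorted(range(len(scores)), key=lambda i: -scores[i])

    tp = fp = 0
    precisions, recalls = [], []
    for i in order:
        if is_true_positive[i]:
            tp += 1
        else:
            fp += 1
        precisions.append(tp / (tp + fp))
        recalls.append(tp / num_ground_truth)

    # At each recall level t in {0.0, 0.1, ..., 1.0}, take the best
    # precision achieved at any recall >= t (0 if t is never reached).
    total = 0.0
    for k in range(11):
        t = k / 10
        total += max((p for p, r in zip(precisions, recalls) if r >= t),
                     default=0.0)
    return total / 11


# Toy example: three detections, two annotated objects.
ap = eleven_point_ap([0.9, 0.8, 0.7], [True, False, True], 2)
print(f"AP = {ap:.3f}")
```

Images without people contribute only false positives under this scheme, so including them in the test set can lower precision but never raises recall.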
