Compression Contest: Final Results

Last update: Wed Mar 16 00:00:01 EST 2011

Methodology

We tested your compression and decompression programs on both the non-positional and the positional index. First, we measured the time needed to compress each of the two files (Compress column) and logged the size of the corresponding compressed file (Size column). Then, for each index, we fed in 100k postings-list IDs and measured the time your query program needed to retrieve the corresponding postings lists from the compressed index (Query column). Finally, we checked that the retrieved postings lists were identical to the original ones.
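The procedure above can be sketched roughly as follows. This is only an illustration, not the harness actually used for the contest: `compress_index`, `query`, and `evaluate` are hypothetical names, the index is assumed to be an in-memory mapping from postings-list ID to a list of integers, and zlib over a comma-separated text encoding stands in for a contestant's compressor.

```python
import time
import zlib
from typing import Dict, List, Sequence

Index = Dict[int, List[int]]  # postings-list ID -> postings (assumed layout)


def compress_index(index: Index) -> Dict[int, bytes]:
    """Hypothetical stand-in compressor: zlib over comma-separated integers."""
    return {
        pid: zlib.compress(",".join(map(str, postings)).encode("ascii"))
        for pid, postings in index.items()
    }


def query(compressed: Dict[int, bytes], pid: int) -> List[int]:
    """Retrieve one postings list from the compressed index."""
    raw = zlib.decompress(compressed[pid]).decode("ascii")
    return [int(x) for x in raw.split(",")] if raw else []


def evaluate(index: Index, query_ids: Sequence[int]) -> dict:
    # Step 1: time the compression and record the compressed size.
    start = time.perf_counter()
    compressed = compress_index(index)
    compress_s = time.perf_counter() - start
    size_bytes = sum(len(blob) for blob in compressed.values())

    # Step 2: time the retrieval of every requested postings list.
    start = time.perf_counter()
    results = [query(compressed, pid) for pid in query_ids]
    query_s = time.perf_counter() - start

    # Step 3: check that every retrieved list matches the original.
    correct = all(got == index[pid] for got, pid in zip(results, query_ids))

    return {
        "size_bytes": size_bytes,
        "compress_s": compress_s,
        "query_s": query_s,
        "correct": correct,
    }
```

Calling `evaluate(index, query_ids)` returns one row's worth of measurements (size, compress time, query time) plus the correctness flag. A real harness would presumably measure on-disk file sizes and run each team's programs as separate processes.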

Results for non-positional index
# Team Size (bytes) Language Compress (s) Query (s)
1 ZM5 11,330,543 Python 71 7
2 EvilBits 13,277,460 Python 214 24
Cobb(7z) 13,568,351
3 Justabitmore 14,708,956 Racket 107 4,156*
4 Datamongers 14,951,142 Python 64 29
5 HammerheadHousefly 14,962,010 Python 75 2,242
6 BrinRank 15,326,072 Python 263 47
7 JWA 15,498,505 Python 36 28
8 StreetSharks 15,498,505 Python 31 2,597
9 GeorgeM 15,509,869 Python 20 14
10 arden 15,509,869 Python 41 5
11 emmy 15,509,869 Python 37 10
Ariadne(bzip2) 17,122,370
12 PugnaciousParsers 17,938,805 Python 58 12
13 Ding 17,975,946 Python 57 1,543
14 jim-bo 25,087,469 Java 17 2
15 CMonster 32,302,092 Python 288 1,579*
16 TeamAwesome 32,379,323 Python 239 116
Arthur(uncompressed) 69,726,288


Results for positional index
# Team Size (bytes) Language Compress (s) Query (s)
1 BrinRank 58,946,406 Python 660 165
2 HammerheadHousefly 60,211,666 Python 305 2,464
3 Datamongers 63,570,919 Python 269 135
4 EvilBits 66,300,020 Python 954 97
Cobb(7z) 69,788,951
5 PugnaciousParsers 69,911,298 Python 203 37
6 JWA 71,483,005 Python 181 259
7 emmy 71,512,899 Python 174 51
8 arden 71,513,429 Python 216 1,416
9 StreetSharks 71,534,265 Python 135 2,533
10 ZM5 71,555,096 Python 166 20
11 Ding 73,964,407 Python 226 1,607
Ariadne(bzip2) 74,423,056
12 GeorgeM 88,250,299 Python 157 63
13 CMonster 92,665,669 Python 866 1,557*
14 TeamAwesome 95,217,284 Python 703 987
15 Justabitmore 130,214,216 Racket 115 DNQ1
Arthur(uncompressed) 196,048,431
16 jim-bo XXX,XXX,XXX Java DNQ2 DNQ2

Notes
  • * = tested on 25k queries only (the provided code was too slow to be tested on the entire 100k queries)
  • DNQ1 = timeout (2h) reached while querying the compressed index with 25k queries
  • DNQ2 = failed to create the compressed index
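One way a time limit like the one behind DNQ1 could be enforced is sketched below. This is an assumption about how such a check might be written, not the script used for the contest: `run_with_timeout` is a hypothetical helper, and the command line of a team's query program is left to the caller.

```python
import subprocess
import sys
import time
from typing import List, Optional, Tuple


def run_with_timeout(cmd: List[str], timeout_s: float) -> Tuple[str, Optional[float]]:
    """Run cmd and return (status, elapsed seconds).

    status is "ok", "failed" (non-zero exit code), or "DNQ" (time limit hit).
    elapsed is None when the time limit was reached.
    """
    start = time.perf_counter()
    try:
        # subprocess.run kills the child process if the timeout expires.
        done = subprocess.run(cmd, timeout=timeout_s, capture_output=True)
    except subprocess.TimeoutExpired:
        return "DNQ", None
    elapsed = time.perf_counter() - start
    return ("ok" if done.returncode == 0 else "failed"), elapsed


if __name__ == "__main__":
    # Toy command that sleeps longer than the limit, so it is reported as DNQ.
    slow = [sys.executable, "-c", "import time; time.sleep(5)"]
    print(run_with_timeout(slow, timeout_s=0.5))
```

For a two-hour limit, `timeout_s` would be `2 * 60 * 60`.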
