New Homepage: http://www.cs.cmu.edu/~pavlo

Andy Pavlo

BitTorrent User Traces :: Andrew Pavlo - Brown University

BitTorrent User Activity Traces

The original motivation for the Graffiti Network Project was to develop a system ancillary to BitTorrent that allowed for longer-term data persistence. As part of this research, we set out to measure the user traffic on new torrents added to The Pirate Bay and Mininova during a three-week period. Our instrumented clients pulled new torrents from RSS feeds and then scraped the tracker for information every three minutes (TorrentSnapshot). For each peer, we repeatedly tried to connect to it and retrieve its download status (PeerSnapshot). Once a torrent had no peers for a six-hour period, we declared the torrent dead and stopped collecting information from it.
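The dead-torrent rule described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the collection framework: the `TorrentMonitor` class and its `record_scrape` method are hypothetical names, and only the three-minute scrape interval and the six-hour threshold come from the text.

```python
from datetime import datetime, timedelta

SCRAPE_INTERVAL = timedelta(minutes=3)  # scrape period stated in the text
DEAD_AFTER = timedelta(hours=6)         # peerless period stated in the text


class TorrentMonitor:
    """Hypothetical helper that remembers when a torrent last had peers."""

    def __init__(self, first_seen):
        self.last_with_peers = first_seen
        self.dead = False

    def record_scrape(self, when, peer_count):
        """Record one tracker scrape; return True while the torrent is tracked."""
        if self.dead:
            return False
        if peer_count > 0:
            # Any scrape that reports peers restarts the six-hour clock.
            self.last_with_peers = when
        elif when - self.last_with_peers >= DEAD_AFTER:
            # No peers for at least six hours: stop collecting.
            self.dead = True
        return not self.dead


start = datetime(2008, 10, 28)
monitor = TorrentMonitor(start)
monitor.record_scrape(start + SCRAPE_INTERVAL, peer_count=5)
alive = monitor.record_scrape(start + timedelta(hours=3), peer_count=0)
dead = not monitor.record_scrape(start + timedelta(hours=7), peer_count=0)
```

In this sketch the torrent is still tracked at the three-hour scrape, because peers were last seen under six hours earlier, and it is declared dead at the seven-hour scrape.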

We found in our experiments that we had to download a very small amount of data (roughly 128 bytes/sec) in order to appear as a legitimate client. To avoid copyright problems, we reset the client every 30 minutes and deleted any data that we may have downloaded. The IP address of each peer that we connected to has been obscured, but we do record each peer's country of origin and BitTorrent client signature.
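The text does not say how the IP addresses were obscured. One common approach, shown here purely as an assumption and not as the method used for these traces, is to replace each address with a keyed hash so that the same peer maps to the same pseudonym without the address being recoverable. The `obscure_ip` function and `SECRET_KEY` are hypothetical names.

```python
import hashlib
import hmac
import os

# Assumption: a random secret that is kept private and never published.
SECRET_KEY = os.urandom(32)


def obscure_ip(ip, key=SECRET_KEY):
    """Return a stable pseudonym for an IP address using HMAC-SHA256."""
    digest = hmac.new(key, ip.encode("ascii"), hashlib.sha256).hexdigest()
    return digest[:16]  # truncated for readability in a CSV column


a = obscure_ip("192.0.2.10")
b = obscure_ip("192.0.2.10")
c = obscure_ip("192.0.2.11")
```

With a fixed key, repeated observations of one peer share a pseudonym, which keeps per-peer analysis possible while the raw address stays out of the published data.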

Downloads

The following is the BitTorrent trace data collected from October 28th to December 9th, 2008. Each table is dumped as a CSV file.

The source code for the collection framework (written using Django) is available by anonymous SVN:

    svn checkout http://graffiti.cs.brown.edu/svn/graffiti/src/harvest/