BitTorrent User Activity Traces
The original motivation for the Graffiti Network Project was to develop a system ancillary to BitTorrent that allowed for longer-term data persistence. As part of this research, we set out to measure the user traffic on new torrents added to The Pirate Bay and Mininova during a three-week period. Our instrumented clients pulled new torrents from RSS feeds and then scraped each torrent's tracker for information every three minutes (TorrentSnapshot). We repeatedly tried to connect to each peer and retrieve its download status (PeerSnapshot). Once a torrent had no peers for a six-hour period, we declared it dead and stopped collecting information from it.
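The six-hour rule above can be sketched in a few lines of Python. This is only an illustration, not the code from the harvest framework: the function name `is_dead`, the constants, and the `(timestamp, peer_count)` tuple layout are all assumptions made for the example.

```python
from datetime import datetime, timedelta

# Values taken from the description above; the names are hypothetical.
SCRAPE_INTERVAL = timedelta(minutes=3)
DEAD_AFTER = timedelta(hours=6)


def is_dead(snapshots, now, dead_after=DEAD_AFTER):
    """Return True if the torrent has had no peers for at least `dead_after`.

    `snapshots` is a time-ordered list of (timestamp, peer_count) tuples,
    an assumed stand-in for the TorrentSnapshot records.
    """
    if not snapshots:
        # Nothing scraped yet, so there is no basis for declaring it dead.
        return False

    # Start the clock at the first scrape in case peers were never seen.
    last_active = snapshots[0][0]
    for timestamp, peer_count in snapshots:
        if peer_count > 0:
            last_active = timestamp

    return now - last_active >= dead_after
```

With scrapes every three minutes, a torrent would need roughly 120 consecutive empty snapshots before this check reports it as dead.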
We found in our experiments that we had to download a very small amount of data (roughly 128 bytes/sec) in order to appear to be a legitimate client. To avoid copyright problems, we reset the client every 30 minutes and deleted any data that we may have downloaded. The IP address of each peer that we connected to has been obscured, but we did record each peer's country of origin and its BitTorrent client's signature.
Each table is dumped as a CSV file. The source code for the collection framework (written using Django) is available via anonymous SVN:
svn checkout http://graffiti.cs.brown.edu/svn/graffiti/src/harvest/
Trace Files: October 28th - December 9th, 2008
- BitTorrent Trackers - 2,019 records (44KB)
- Peers - 3,570,587 records (46MB)
- Torrents - 36,075 records (4MB)
- Torrent Snapshots - 14,557,461 records (236MB)
- Peer Snapshots - 123,916,260 records (2.3GB)
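Because the larger dumps run to gigabytes, it is usually safer to stream them than to load them into memory. The following is a minimal sketch using Python's standard `csv` module. The function name `count_records` and the `has_header` flag are assumptions: the column layout and the presence of a header row are not documented here, so check the files before relying on either.

```python
import csv


def count_records(path, has_header=False):
    """Stream a CSV dump and count its records without loading it into memory.

    `has_header` is an assumption to verify against the real files: set it
    to True if the first row holds column names rather than data.
    """
    with open(path, newline="") as handle:
        reader = csv.reader(handle)
        if has_header:
            next(reader, None)  # skip the column-name row, if any
        return sum(1 for _ in reader)
```

For example, calling `count_records("peer_snapshots.csv")` (a placeholder file name) and comparing the result with the record counts listed above gives a quick check that a download completed intact.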