Is Ratio-Tracking the Only Answer for BitTorrent Woes?

From the department of too-many-generalizations-from-too-little-data again: a little bit of armchair theorizing about BitTorrent (especially public BitTorrent trackers) as a reliable distribution mechanism.

Let me clarify that I’m not talking about reliability in terms of whether the data arrives without errors, or even whether the data is really what it claims to be (we can thank the RIAA/MPAA for this sh*t) – merely the question of how reliable BitTorrent is over long periods of time.

As a distribution mechanism, BitTorrent has always been well suited for handling "flash crowds" – large spikes in demand for a single file. The default approach of seeding while you download ensures that even as demand builds up, a large number of partial seeds shoulder some demand. There are two questions at this point that need to be answered:

1. What happens when the partial seed has 100% of the file and is in a position to become a "real" seed?

2. What happens to the late adopters – the people who find the torrent 6 months too late?

My off-hand analysis suggests that public trackers see a seed distribution over time that is bunched up toward the early days of a torrent and has a very narrow spike to boot:

[Figure: Public Tracker Seeds - Time Series]

Seeds are hard to find in the initial stage of a public torrent as the few that exist are typically overwhelmed. At some point, a large number of people get the complete file and there is a huge spike in seeds. But almost immediately, that number drops off as people move the downloaded file around or simply delete the torrent to save on bandwidth. The result is that comment threads on most public trackers look something like this:

Seed seed seed seed....

Of course, one could argue that really popular torrents will always have a large number of seeds. Which neatly brings me to my second question – what happens if the torrent you are looking for is not so popular? I have often spent a lot of time doing deep Google searches for old versions of certain applications that were cached on download sites after the application developer disappeared. Even if I were to find a torrent for such an obscure file on a public tracker, odds are I would not get very far with the download.

Private trackers, on the other hand, combat this problem pretty well using ratio tracking. By forcing people to stick around and seed files in order to keep their ratio up, you get a seed curve that looks like this:

[Figure: Private Tracker - Seed Time Series]

It is therefore not uncommon on private trackers to see a seeder:leecher ratio in excess of 2 or 2.5 even 6 months after the initial release. A more subtle but interesting effect is that download speeds do drop as a torrent ages, but not as much as one would expect.

My guess is that this happens because people continue to seed old torrents at low speeds – they know that lots of seeds are available, but the tiny amount of data they upload still goes toward buffering their ratio. Since a whole lot of folks are seeding at slow speeds, the downloader still gets something approximating a fast download.
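To put some rough arithmetic behind that guess, here is a back-of-the-envelope sketch in Python. Everything in it is an assumption made for illustration: the function name, the speeds and the seed counts are made up, and it pretends each seed gives its full upload rate to a single downloader, which real swarms do not do.

```python
def aggregate_download_rate(seed_upload_rates_kb_s, downloader_cap_kb_s):
    """Rate one downloader could see, in KB/s.

    Simplifying assumption: every seed devotes its whole upload rate to
    this one downloader, and the only other limit is the downloader's
    own line speed.
    """
    return min(sum(seed_upload_rates_kb_s), downloader_cap_kb_s)


# Hypothetical numbers, purely for illustration.
few_fast_seeds = [200, 200]    # two seeds uploading at 200 KB/s each
many_slow_seeds = [10] * 40    # forty seeds trickling at 10 KB/s each

fast = aggregate_download_rate(few_fast_seeds, downloader_cap_kb_s=1000)
slow = aggregate_download_rate(many_slow_seeds, downloader_cap_kb_s=1000)

print(fast, slow)  # both come out to 400 KB/s
```

Under those toy assumptions, forty slow seeds deliver the same total as two fast ones, which is roughly the effect I think I am seeing.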

It’s my belief that for BitTorrent to really become a distribution mechanism for the ages (in Internet time at least), some mechanism must exist to ensure supply (seeds) that can meet demand (leechers). The answer today seems to be ratio-tracking.
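For anyone who has not run into it, the idea behind ratio tracking can be sketched in a few lines of Python. This is only a minimal illustration, not how any particular tracker implements it: the `Member` class, the `may_download` helper and the `MIN_RATIO` threshold are hypothetical names and values, and real trackers layer their own rules on top.

```python
from dataclasses import dataclass

# Hypothetical threshold; each tracker picks its own rules.
MIN_RATIO = 1.0


@dataclass
class Member:
    uploaded_bytes: int
    downloaded_bytes: int

    @property
    def ratio(self) -> float:
        """Share ratio: bytes uploaded divided by bytes downloaded."""
        if self.downloaded_bytes == 0:
            # Nothing downloaded yet, so treat the ratio as unbounded.
            return float("inf")
        return self.uploaded_bytes / self.downloaded_bytes


def may_download(member: Member, min_ratio: float = MIN_RATIO) -> bool:
    """Allow new downloads only while the member's ratio is high enough."""
    return member.ratio >= min_ratio


GB = 1024 ** 3
generous = Member(uploaded_bytes=2 * GB, downloaded_bytes=1 * GB)
stingy = Member(uploaded_bytes=GB // 2, downloaded_bytes=1 * GB)

print(generous.ratio, may_download(generous))  # 2.0 True
print(stingy.ratio, may_download(stingy))      # 0.5 False
```

The point is simply that uploading is the only way to push the number up, which is what keeps people seeding long after their own download has finished.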

There is some talk that the ratio tracking game in private trackers creates its own problems, but it seems to me that’s a price a lot of people would be willing to pay.