Unraveling the BitTorrent ecosystem

Chao Zhang, Prithula Dhungel, Di Wu, Keith W. Ross

    Research output: Contribution to journal › Article

    Abstract

    BitTorrent is the most successful open Internet application for content distribution. Despite its importance, both in terms of its footprint in the Internet and the influence it has on emerging P2P applications, the BitTorrent Ecosystem is only partially understood. We seek to provide a nearly complete picture of the entire public BitTorrent Ecosystem. To this end, we crawl five of the most popular torrent-discovery sites over a nine-month period, identifying all of the 4.6 million torrents and 38,996 trackers that the sites reference. We also develop a high-performance tracker crawler, and over a narrow window of 12 hours, crawl essentially all of the public Ecosystem's trackers, obtaining peer lists for all referenced torrents. Complementing the torrent-discovery site and tracker crawling, we further crawl Azureus and Mainline DHTs for a random sample of torrents. Our resulting measurement data are more than an order of magnitude larger (in terms of number of torrents, trackers, or peers) than any earlier study. Using this extensive data set, we study in depth the Ecosystem's torrent-discovery, tracker, peer, user behavior, and content landscapes. For peer statistics, the analysis is based on one typical snapshot obtained over 12 hours. We further analyze the fragility of the Ecosystem upon the removal of its most important tracker service.

    Original language: English (US)
    Article number: 5482574
    Pages (from-to): 1164-1177
    Number of pages: 14
    Journal: IEEE Transactions on Parallel and Distributed Systems
    Volume: 22
    Issue number: 7
    DOI: https://doi.org/10.1109/TPDS.2010.123
    State: Published - 2011

    Fingerprint

    Ecosystems
    Internet
    Statistics

    Keywords

    • BitTorrent Ecosystem
    • content distribution
    • measurement
    • peer-to-peer

    ASJC Scopus subject areas

    • Hardware and Architecture
    • Signal Processing
    • Computational Theory and Mathematics

    Cite this

    Zhang, C., Dhungel, P., Wu, D., & Ross, K. W. (2011). Unraveling the BitTorrent ecosystem. IEEE Transactions on Parallel and Distributed Systems, 22(7), 1164-1177. [5482574]. https://doi.org/10.1109/TPDS.2010.123

    Unraveling the BitTorrent ecosystem. / Zhang, Chao; Dhungel, Prithula; Wu, Di; Ross, Keith W.

    In: IEEE Transactions on Parallel and Distributed Systems, Vol. 22, No. 7, 5482574, 2011, p. 1164-1177.

    Zhang, C, Dhungel, P, Wu, D & Ross, KW 2011, 'Unraveling the BitTorrent ecosystem', IEEE Transactions on Parallel and Distributed Systems, vol. 22, no. 7, 5482574, pp. 1164-1177. https://doi.org/10.1109/TPDS.2010.123
    Zhang, Chao ; Dhungel, Prithula ; Wu, Di ; Ross, Keith W. / Unraveling the BitTorrent ecosystem. In: IEEE Transactions on Parallel and Distributed Systems. 2011 ; Vol. 22, No. 7. pp. 1164-1177.
    @article{fc49e86436ad460eb0787ed640ff94d0,
    title = "Unraveling the BitTorrent ecosystem",
    abstract = "BitTorrent is the most successful open Internet application for content distribution. Despite its importance, both in terms of its footprint in the Internet and the influence it has on emerging P2P applications, the BitTorrent Ecosystem is only partially understood. We seek to provide a nearly complete picture of the entire public BitTorrent Ecosystem. To this end, we crawl five of the most popular torrent-discovery sites over a nine-month period, identifying all of the 4.6 million torrents and 38,996 trackers that the sites reference. We also develop a high-performance tracker crawler, and over a narrow window of 12 hours, crawl essentially all of the public Ecosystem's trackers, obtaining peer lists for all referenced torrents. Complementing the torrent-discovery site and tracker crawling, we further crawl Azureus and Mainline DHTs for a random sample of torrents. Our resulting measurement data are more than an order of magnitude larger (in terms of number of torrents, trackers, or peers) than any earlier study. Using this extensive data set, we study in depth the Ecosystem's torrent-discovery, tracker, peer, user behavior, and content landscapes. For peer statistics, the analysis is based on one typical snapshot obtained over 12 hours. We further analyze the fragility of the Ecosystem upon the removal of its most important tracker service.",
    keywords = "BitTorrent Ecosystem, content distribution, measurement, peer-to-peer",
    author = "Chao Zhang and Prithula Dhungel and Di Wu and Ross, {Keith W.}",
    year = "2011",
    doi = "10.1109/TPDS.2010.123",
    language = "English (US)",
    volume = "22",
    pages = "1164--1177",
    journal = "IEEE Transactions on Parallel and Distributed Systems",
    issn = "1045-9219",
    publisher = "IEEE Computer Society",
    number = "7",

    }

    TY - JOUR

    T1 - Unraveling the BitTorrent ecosystem

    AU - Zhang, Chao

    AU - Dhungel, Prithula

    AU - Wu, Di

    AU - Ross, Keith W.

    PY - 2011

    Y1 - 2011

    N2 - BitTorrent is the most successful open Internet application for content distribution. Despite its importance, both in terms of its footprint in the Internet and the influence it has on emerging P2P applications, the BitTorrent Ecosystem is only partially understood. We seek to provide a nearly complete picture of the entire public BitTorrent Ecosystem. To this end, we crawl five of the most popular torrent-discovery sites over a nine-month period, identifying all of the 4.6 million torrents and 38,996 trackers that the sites reference. We also develop a high-performance tracker crawler, and over a narrow window of 12 hours, crawl essentially all of the public Ecosystem's trackers, obtaining peer lists for all referenced torrents. Complementing the torrent-discovery site and tracker crawling, we further crawl Azureus and Mainline DHTs for a random sample of torrents. Our resulting measurement data are more than an order of magnitude larger (in terms of number of torrents, trackers, or peers) than any earlier study. Using this extensive data set, we study in depth the Ecosystem's torrent-discovery, tracker, peer, user behavior, and content landscapes. For peer statistics, the analysis is based on one typical snapshot obtained over 12 hours. We further analyze the fragility of the Ecosystem upon the removal of its most important tracker service.

    KW - BitTorrent Ecosystem

    KW - content distribution

    KW - measurement

    KW - peer-to-peer

    UR - http://www.scopus.com/inward/record.url?scp=79957594613&partnerID=8YFLogxK

    UR - http://www.scopus.com/inward/citedby.url?scp=79957594613&partnerID=8YFLogxK

    U2 - 10.1109/TPDS.2010.123

    DO - 10.1109/TPDS.2010.123

    M3 - Article

    AN - SCOPUS:79957594613

    VL - 22

    SP - 1164

    EP - 1177

    JO - IEEE Transactions on Parallel and Distributed Systems

    JF - IEEE Transactions on Parallel and Distributed Systems

    SN - 1045-9219

    IS - 7

    M1 - 5482574

    ER -