Graffiti Networks: A Subversive, Internet-Scale File Sharing Model
Andrew Pavlo
DC401 – Rhode Island Defcon Group
October 12, 2009
Outline
Open BitTorrent Problems
A Subversive Solution
Experimental Evaluation
Aftermath
Lessons Learned
Concluding Remarks
Problem Statement
How can peers retrieve data long after the seeds have disappeared?
Requirements:
Must provide public access.
Must not allow malicious users to destroy data.
Must allow all content.
Must be free!
Case Studies
Average time a peer remains connected to the swarm after it finishes downloading: 7 hours
Izal et al., Dissecting BitTorrent: Five Months in a Torrent's Lifetime. 2004
Average lifespan of a torrent: 9-13 days
Guo et al., Measurements, Analysis, and Modeling of BitTorrent-like Systems. 2005
Faster departure rates for copyrighted content.
Pouwelse et al., The Bittorrent P2P File-sharing System: Measurements and Analysis. 2005
BitTorrent Security Problems
Swarm participation is public.
Private invite-only communities can be breached.
Pairwise communication patterns can reveal the type of traffic (see Sandvine).
Graffiti Network Project
Distributed file-sharing protocol that utilizes third-party entities as storage sites and data transfer intermediaries:
Storage sites are likely unwanted participants.
Target “abandoned” areas of the web.
Massive replication provides durability.
Strict tit-for-tat:
Peers must make a new replica for each piece that they download.
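The strict tit-for-tat rule can be pictured as a small ledger kept by the tracker. The following is only a sketch under assumptions: the class name, method names, and the `allowance` parameter are hypothetical and do not come from the talk, which does not say how the rule is enforced.

```python
from collections import defaultdict


class TitForTatLedger:
    """Hypothetical tracker-side ledger: one new replica owed per piece downloaded."""

    def __init__(self):
        self.downloaded = defaultdict(int)  # peer_id -> pieces fetched so far
        self.replicated = defaultdict(int)  # peer_id -> new replicas created so far

    def record_download(self, peer_id):
        self.downloaded[peer_id] += 1

    def record_replica(self, peer_id):
        self.replicated[peer_id] += 1

    def debt(self, peer_id):
        # Number of replicas the peer still owes the network.
        return max(0, self.downloaded[peer_id] - self.replicated[peer_id])

    def may_download(self, peer_id, allowance=1):
        # Assumption: a peer may fetch another piece only while its debt
        # is below the allowance (1 means "pay back before the next piece").
        return self.debt(peer_id) < allowance
```

With `allowance=1`, a peer that has downloaded one piece is refused further pieces until it reports a new replica.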
Graffiti Network Overview
[Figure: Tracker, Leech, and two Seeds. Labels: Binary Data, Encrypted Data, Base64 Text, and <Encryption Key, Start/Stop Sequence, Checksum>.]
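One reading of the overview figure is that each piece is encrypted, Base64-encoded into text, and described by a key, start/stop sequence, and checksum. The sketch below illustrates that reading only. The cipher is a toy SHA-256 keystream used as a stand-in (the talk does not name a cipher, and this one should not be used for real security). The bracketed marker format, the SHA-1 checksum, and all function names are assumptions.

```python
import base64
import hashlib
import os


def keystream_xor(data: bytes, key: bytes) -> bytes:
    """Toy stand-in cipher: XOR with a SHA-256 counter keystream (symmetric)."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        out.extend(b ^ p for b, p in zip(data[offset:offset + 32], pad))
    return bytes(out)


def encode_piece(piece: bytes):
    """Return (text to store on a page, metadata kept elsewhere)."""
    key = os.urandom(16)
    # Brackets are outside the Base64 alphabet, so markers cannot occur in the payload.
    start = "[" + os.urandom(4).hex() + "]"
    stop = "[" + os.urandom(4).hex() + "]"
    payload = base64.b64encode(keystream_xor(piece, key)).decode("ascii")
    meta = {
        "key": key,
        "start": start,
        "stop": stop,
        "checksum": hashlib.sha1(piece).hexdigest(),  # assumed checksum choice
    }
    return start + payload + stop, meta


def decode_piece(page_text: str, meta: dict) -> bytes:
    """Locate the payload between the markers, decrypt it, and verify it."""
    begin = page_text.index(meta["start"]) + len(meta["start"])
    end = page_text.index(meta["stop"], begin)
    piece = keystream_xor(base64.b64decode(page_text[begin:end]), meta["key"])
    if hashlib.sha1(piece).hexdigest() != meta["checksum"]:
        raise ValueError("checksum mismatch: replica damaged or altered")
    return piece
```

The start/stop sequence lets a reader find the payload even when the page contains other text, and the checksum catches replicas that were edited or truncated.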
Graffiti Network Benefits
Similar to steganographic data systems:
Plausible deniability.
Difficult to throttle HTTP traffic.
Multi-site subpoenas.
Integration with BitTorrent:
GN tracker incentivizes network participation through BT peer discovery.
Pieces obtained through BT or GN can be shared with either network.
Storage Site Alternatives
Good:
Cloud resources (Amazon S3), Usenet
Better:
File upload sites (RapidShare), free webmail, image upload services, message boards, pastebins, blogs.
Best:
Wikis!
Experimental Evaluation
Graffiti Network prototype:
Tracker: MySQL + Django
Client: Python + libtorrent
Hybrid BitTorrent + Graffiti system.
Goal: Measure the lifespan of data stored on third-party storage sites outside of our control.
Use Graffiti Network to store a 700MB Linux ISO
MediaWiki
Most widely used open-source wiki software:
Likely to be many novice users and install-then-forget installations.
Default permissions are very lax:
Anonymous edits enabled by default.
Non-trivial to install CAPTCHA plugins:
Default SimpleCAPTCHA is useless.
Discovering Target Sites
Scraped two well-known search engines looking for MediaWiki installations:
Used phrases indicative of a fresh install + a random word from a dictionary.
Example: “‘Configuration settings list’ squirrels”
Total sites found: 22,000+
Probe each site to determine security and whether it is “abandoned”.
Target Site Criteria
No Wikimedia Foundation sites:
Example: Wikipedia, Wiktionary, etc.
No commercial hosting sites:
Example: wikia.com
Last update at least 3 months ago.
No CAPTCHAs.
If login is required, registration must be open.
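The criteria above amount to a filter over the probe results. A minimal sketch, assuming a hypothetical `SiteProbe` record and treating "3 months" as 90 days; the field names and the domain list are illustrative and not taken from the project's code.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from urllib.parse import urlparse

# Assumed exclusion list built from the examples on the slide.
EXCLUDED_DOMAINS = ("wikipedia.org", "wiktionary.org", "wikimedia.org", "wikia.com")


@dataclass
class SiteProbe:
    url: str
    last_update: date
    has_captcha: bool
    requires_login: bool
    open_registration: bool


def meets_criteria(site: SiteProbe, today: date, min_idle_days: int = 90) -> bool:
    host = urlparse(site.url).hostname or ""
    # No Wikimedia Foundation or commercial hosting sites.
    if any(host == d or host.endswith("." + d) for d in EXCLUDED_DOMAINS):
        return False
    # Last update must be at least ~3 months old.
    if today - site.last_update < timedelta(days=min_idle_days):
        return False
    # No CAPTCHAs.
    if site.has_captcha:
        return False
    # If login is required, registration must be open.
    if site.requires_login and not site.open_registration:
        return False
    return True
```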
Open MediaWiki Sites
Category                 # of Sites
Anonymous Edits          8,483
Open Registration        5,983
“Puzzle” Protection      1,157
Total Open               15,623
Total Open+Abandoned     11,987
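The totals in the table can be checked with a few lines of arithmetic. The numbers are the ones on the slide; the variable names are only illustrative.

```python
# Counts of open MediaWiki sites by protection type, as listed on the slide.
open_sites = {
    "anonymous edits": 8483,
    "open registration": 5983,
    "puzzle protection": 1157,
}
total_open = sum(open_sites.values())  # should equal the "Total Open" row
open_and_abandoned = 11987

# Share of each protection type among open sites.
shares = {name: count / total_open for name, count in open_sites.items()}

# Share of open sites that also met the "abandoned" criterion.
abandoned_share = open_and_abandoned / total_open

print(total_open)
print(f"{abandoned_share:.1%}")
```

The three categories sum to the reported 15,623 open sites, and roughly three quarters of those were also classified as abandoned.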
Experiment Setup
Leave a “calling card” to inform non-network users of the system:
Link back to project website using URL containing unique tracking code.
Do not modify existing pages.
A more resilient approach would be to modify the homepage and then immediately roll back.
Only store one replica per site.
Example Replica
[Figure: example replica, with callouts: Tracking URL, Base64+Encrypted Payload, Exact Revision, Random Title+User, Calling Card.]
We turn it on…
Simulation started April 10th, 2009
Shut down on April 11th, 2009
Only stored 5,646 replicas before I got a call…
Aftermath
Reverted pages (since we couldn’t delete).
Posted apology.
Removed list of open MediaWiki sites.
Removed link to source code.
Removed Brown logo.
Constructive Feedback
You should have asked me, I would have let you store the data…
Not a real-world experiment.
You should have run it in your own local cluster/PlanetLab…
Again, not real-world conditions.
You should have run it from Starbucks/Panera…
Hiding makes it look like you know you are doing something wrong.
Replica Availability
April 10th, 2009 to October 11th, 2009
[Figure: percentage (0 to 0.6) vs. # of days since replica created (0 to 180). Series: Total Replicas Removed; Replicas with Tracking Click-Throughs.]
Replica Availability by Protection
April 10th, 2009 to October 11th, 2009
[Figure: percentage (0 to 0.7) vs. # of days since replica created (0 to 180). Series: Puzzle Protected; Registration Protected; Anonymous Edits.]
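Curves like the ones in these charts can be derived from the daily replica checks. The sketch below shows one way to compute the fraction of replicas removed within a given number of days of creation. The record layout (creation date, removal date or `None`) and the function names are assumptions, not the project's actual format.

```python
from datetime import date


def fraction_removed(records, days):
    """Fraction of replicas removed within `days` days of their creation.

    `records` is a list of (created, removed) date pairs; `removed` is None
    for replicas that were still available at the last check.
    """
    if not records:
        return 0.0
    gone = sum(
        1
        for created, removed in records
        if removed is not None and (removed - created).days <= days
    )
    return gone / len(records)


def removal_curve(records, horizon=180, step=20):
    """Sample the removal fraction every `step` days up to `horizon`."""
    return [(d, fraction_removed(records, d)) for d in range(0, horizon + 1, step)]
```

Running `removal_curve` separately on the records for each protection type would give one series per category, matching the per-protection breakdown.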
Protection Enhancements
reCAPTCHA is the answer:
Allow anonymous edits, but require verification.
Does not require server-side graphics libraries.
Throttle users creating too many pages or posting data larger than threshold.
Automatically lock-down abandoned sites:
Should be part of all open-source CMS software.
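The throttling idea above can be sketched as a sliding-window limiter combined with a size cap. This is an illustration only: the class name, default limits, and interface are hypothetical and are not part of MediaWiki or any other CMS.

```python
import time
from collections import defaultdict, deque


class EditThrottle:
    """Hypothetical limiter: cap new pages per user per window and payload size."""

    def __init__(self, max_new_pages=5, window_seconds=3600, max_bytes=64 * 1024):
        self.max_new_pages = max_new_pages
        self.window_seconds = window_seconds
        self.max_bytes = max_bytes
        self._history = defaultdict(deque)  # user -> timestamps of accepted pages

    def allow(self, user, size_bytes, now=None):
        now = time.time() if now is None else now
        # Reject any single post larger than the threshold.
        if size_bytes > self.max_bytes:
            return False
        recent = self._history[user]
        # Forget page creations that have fallen out of the window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        # Reject users who have created too many pages recently.
        if len(recent) >= self.max_new_pages:
            return False
        recent.append(now)
        return True
```

Keying the limiter on an IP address rather than a user name would cover anonymous edits as well, at the cost of penalizing users who share an address.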
Lessons Learned
Be careful how you advertise yourself.
Consider how your system will affect others.
Consult others outside of your project.
Be open and upfront about project.
Don’t pee in the pool!
Project Status
Checking replica status daily.
Two workshop submissions, two rejections:
IPTPS ’09
WOOT ’09
Planning to submit one last time:
LEET ’10 (April 5th, 2010)
Concluding Remarks
Final vindication: The system works!
Tread lightly…
Off probation at the end of this semester!