security and privacy implications of url shortening services
TRANSCRIPT
Security and Privacy Implications ofURL Shortening Services
Alexander Neumann, Johannes Barnickel, Ulrike Meyer
IT Security Research GroupRWTH Aachen University
Germany
May 26th, 2011
Alexander Neumann, Johannes Barnickel, Ulrike Meyer RWTH Aachen University1
Introduction
• Alexander NeumannI Computer science student, RWTH Aachen University, GermanyI Working as Penetration Tester for RedTeam Pentesting GmbH
• Johannes BarnickelI Research AdvisorI Ph.D. Student, IT Security Group, RWTH Aachen University
Alexander Neumann, Johannes Barnickel, Ulrike Meyer RWTH Aachen University2
Definitions
• Short URL does not exceed 35 characters
• Shortened URL belongs to a shortening service
• Corner cases:I Shortened URL but not short:
http://urlshorteningservicefortwitter.com/4czyxI Short URL but not shortened:
http://www.google.de/search?q=IEEE
Alexander Neumann, Johannes Barnickel, Ulrike Meyer RWTH Aachen University3
The Problem
• Long URLs tend to wrap (especially in mail)
• Twitter is restricted to 140 characters
• URLs in books should be short
• User tracking and statistics
• “Solution”: URL shortening services (USS)
Alexander Neumann, Johannes Barnickel, Ulrike Meyer RWTH Aachen University4
Technical Description
1. User posts long URL to servicehttp://itsec.rwth-aachen.de/research
2. Service generates short URLhttp://nvg8.it/8b896c
3. Other user requests short URL
4. Server responds HTTP 301 or 302
5. Other user is redirected to long URL
Alexander Neumann, Johannes Barnickel, Ulrike Meyer RWTH Aachen University5
Downsides
• Link destination is not transparent
• Reachability of the service is not guaranteed
• USS can silently stop working⇒ All shortened URLs defunct
• Request is delayed when connection to USS has high latency
• Service might be hacked and redirect to malware
• Service might be hostile and serve malware to vulnerable browsers
• Secret URLs submitted to USS cannot be deleted
• . . .
Alexander Neumann, Johannes Barnickel, Ulrike Meyer RWTH Aachen University6
Facts many users do not know
• URL service accumulates click statistics and creates profiles of users
• Short links might vanish after a few years
• When the USS dies, all shortened URLs are dysfunctional
• When the USS is hacked, links might point to other websites(malware, porn, violence, . . . )
• Short URLs can be enumerated⇒ All submitted URLs are public(!)
Alexander Neumann, Johannes Barnickel, Ulrike Meyer RWTH Aachen University7
Research Objectives
• Analyze risks for clients, servers, and privacy of users
• Empirical studies:
1. Determine popular USS on Twitter2. Analyze the use of USS in spam3. Test for malicious services4. Analyze user tracking abilities5. Enumerate shortened URLs6. Submit honeypot URLs to USS7. Test latency and availability of popular USS
Alexander Neumann, Johannes Barnickel, Ulrike Meyer RWTH Aachen University8
Preliminaries: List of USS
Sources:
• Firefox add-on ShortenURL
• List of USS on longurl.org (URL expansion service)
• Lists of USS on several blogs
• Hostnames of URLs in Spam E-Mails
Leads to list of 610 USS (distinct host names), includes 527 generalpurpose USS.
Alexander Neumann, Johannes Barnickel, Ulrike Meyer RWTH Aachen University9
0lv.ru0rz.tw1link.in1url.com23o.net2big.at2.gp2.ly2su.de2tu.us2ya.com2ze.us301.to301url.com307.to3.ly4ms.me4sq.com4url.cc5.gp6url.com7.ly9mp.coma2n.eua.307.toaa.cxabbrr.comabcurl.netadf.lyadjix.comad.vuafx.cca.gda.ggaisr.usall.fuseurl.comalturl.comamzn.toa.nfar.gyarm.inarst.chatu.caaurls.infoax.agax.lyazc.ccazqq.comb23.rub2l.meb65.usbacn.mebcool.bzbeam.tobinged.itbit.lybizj.usbloat.mebravo.lybsa.lybudurl.combuk.meburnurl.comcanurl.comcdn.mashable.comchilp.itchzb.grclck.rucliccami.info
clickmeter.comclickthru.cacli.gsclk.mycl.lkcl.lyclop.incoge.lac-o.inconta.cccort.ascot.agcrks.mecrum.plctvr.uscurio.uscut.imcutt.uscuturls.comdaa.lidai.lydeadsmall.comdecenturl.comdf9.netdfl8.medigbig.comdigg.comdigipills.comdisq.usdld.bzdlvr.itdn.vcdoiop.comdo.mydopen.usdurl.medurl.usdwarfurl.comdy.fieasyuri.comeasyurl.neteepurl.comelfurl.comelurl.comesyurl.comeweri.comezurl.euf1ru.netfa.byfaceto.usfav.mefb.mefbshare.mefexr.comfff.toff.imfhurl.comfire.tofirsturl.defirsturl.netflic.krflingk.comflq.usfly2.wsfon.gsfreak.tofuseurl.comfuzzy.tofwd4.me
fwib.netget.sfu.caget-shorty.comget-url.comgizmo.dogkurl.usgl.amgo2cut.comgo2.mego.9nl.comgood.lygoo.glgo.qb.bygoshrink.comgourl.grgo.usa.govgowat.chg.ro.ltgurl.esgzurl.comhao.jphex.iohhvx.comhiderefer.comhmm.phho.iohop.imhosturl.comhotredirect.comhref.inhsblinks.comhtxt.ithub.tmhuff.tohulu.comhurl.ithurl.mehurl.nohurl.wsicanhaz.comidek.netikr.meilix.inir.peirt.meiscool.netis.gdits.myix.ltj2j.dejdem.czjijr.comj.mpjmp2.netjust.asketkp.inkindurl.comkisa.chkissa.bekl.amklck.mekore.uskorta.nukots.nukrunchd.comkrz.chktzr.us
kurl.nukurzurl.netk.vul9k.netlanjut.inlat.msliip.toliltext.comlin.crlin.iolinkbee.comlinkbun.chlinkee.comlinkl.rulinkslice.comlink.toolbot.comlinxfix.comliteurl.netliurl.cnlnk.bylnkd.inlnk.gdlnk.inlnk.lylnk.mslnk.nulnk.sklnkurl.comln-s.netln-s.rulookleap.comlow.ccl.prlru.jplt.tllurl.nomacte.chmakeashorterlink.commakeitbrief.commash.tomemurl.commerky.demetamark.netmicurl.commigre.memiklos.dkmin2.meminilien.comminilink.orgminiurl.comminurl.frmiturl.commke.memoby.tomoourl.commrte.chmuhlink.orgmurl.kzmyloc.memysp.inmyurl.innanoref.comnbc.conblo.gsnbx.chndurl.comne1.netnetshortcut.comnm.ly
nn.nfnotifyurl.comnotlong.comnot.myn.prnsfw.innutshellurl.comnvg8.itnxy.innyti.msoboeyasui.comoc1.usodun.netomf.gdom.lyomoikane.neton.cnn.comon.mktw.netooqx.comorz.seow.lyo-x.frparv.uspaulding.netpd.ampendek.inpic.gdpiko.meping.fmpipes.yahoo.compiurl.compli.gsplumurl.complurl.mep.lypnt.mepoliti.copoprl.compost.lypp.ggprofile.topt2.meptiturl.compub.vitrue.compuke.itqicute.comqik.liqlnk.netqr.cxqru.ruqte.mequickurl.co.ukquip-art.comqurl.comqurlyq.comqu.tcquud.comqux.inrb6.merde.merdr.toread.bireadthis.careallytinyurl.comredir.ecredirects.caredirx.comre.p.ly
retwt.merickroll.itr.imri.msriz.gdrmse.rurnk.mernm.mert.nurubyurl.comru.lyrurl.orgrww.tws4c.ins7y.ussafe.mnsai.lysdut.usservices.digg.comsfu.cashar.essharetabs.comshim.netshink.deshmyl.comshorl.comshortenurl.comshorterlink.comshorterlink.co.ukshort.ieshortio.comshortlinks.co.ukshortn.meshort.toshorturl.comshorturl.deshoturl.usshout.toshow.myshrimpurl.comshrink2one.comshrinkify.comshrinkr.comshrinkster.comshrinkurl.usshrten.comshrt.frshrt.stshrt.wsshrunkin.comshurl.netshw.mesimurl.comsitelutions.comsiteo.ussl9.ruslate.meslki.rusl.lysmallr.comsmallr.netsmallurl.insmalur.comsmarturl.eusmfu.insmsh.mesmurl.namesn.imsnipr.com
snipurl.comsnkr.mesnurl.comsokrati.rusong.lysp2.rospedr.comsqrl.itsrnk.netsrs.listarturl.comsturly.comsu.prsurl.co.uksurl.huta.gdtbd.lyt.cotcrn.chtgr.metgr.phthesurl.comthinfi.comthnlnk.comtighturl.comtimesurl.attiniuri.comtini.ustinurl.mobitiny123.comtinyarro.wstiny.bytiny.cctinylink.comtinylink.intiny.lytiny.pltinypl.ustinyuri.catinyurl.comtinyvid.iotk.tl.gdt.lh.comtllg.nettmi.metnij.orgtny.comto.togoto.usto.jeto.lytotc.usto.vgtoysr.ustpm.lytraceurl.comtra.kztr.imtrumpink.lttrunc.ittruncurl.comtsort.ustubeurl.comturo.ustw6.ustweetburner.comtweetl.comtweet.ms
twhub.comtwip.ustwirl.attwitclicks.comtwitterpan.comtwitterurl.nettwitterurl.orgtwittu.mstwiturl.detwtr.ustwurl.cctwurl.nlu28.deu6e.deu76.orgub0.ccuforgot.meuik.inuiop.meulimit.comulu.luu.mavrev.comunfake.itu.nuupdating.meur1.caurizy.comurl360.meurl4.euurl9.comurlac.comurl.agurlao.comurl.azurlbee.comurlbit.usurlborg.comurlbrief.comurlcorta.esurl.co.ukurlcover.comurlcut.comurlcutter.comurlenco.deurlg.inurl-go.comurlhawk.comurl.ieurlin.iturli.nlurl.lotpatrol.comurlme.inurlme.ruurlot.comurlpass.comurl-press.comurlred.comurlredo.comurlshorteningservice-fortwitter.comurls.imurl-site.comurltea.comurlu.msurlx.ieurlx.orgur.ly
urlz.aturlzen.comusat.lyuse.myu.tovb.lyvdirect.comv.gdvgn.amvi.lyvl.amvoizle.comvoomr.comvtc.esw3t.orgw55.dewapo.stwapurl.co.ukwebalias.comweturl.comw.hurl.wswipi.eswp.mexaddr.comxeeurl.comxil.inxlurl.netxr.comxrl.inxrl.usx.sexsm.usxs.toxurl.esxurl.jpx.vuxxsurl.dey.ahoo.ityatuc.comye.peyep.ityfrog.comyhoo.ityiyd.comyoutu.beyuarel.comyumurl.comyweb.comz0p.dezapt.inzi.mazi.muzi.pezip.lizipmyurl.comziprl.comzipuc.comzlig.cazootit.comz.pezud.mezurl.wszzang.krzz.gd
Determine Popular USS on Twitter
• Base: Two samples of Twitter messages (24 hours each, max 10 %)
• Results:I 7.5 million / 8.7 million messagesI 1.2 million / 1.1 million URLsI 553,320 / 431,636 shortened URLs
• Top ten services (cover 96% of all USS on Twitter):
1. bit.ly / j.mp2. t.co3. tinyurl.com4. goo.gl5. ow.ly
6. dlvr.it7. is.gd8. migre.me9. dld.bz
10. lnk.ms
Alexander Neumann, Johannes Barnickel, Ulrike Meyer RWTH Aachen University12
Top ten services
0
100,000
200,000
300,000
400,000
500,000
600,000
Sample 1
Sample 2
Num
ber
of sh
ort
ened
UR
Ls
RestTop 9bit.ly
Sample 1
Sample 2
0%
20%
40%
60%
80%
100%
Pro
port
ion o
f sh
ort
ened
UR
Ls
Alexander Neumann, Johannes Barnickel, Ulrike Meyer RWTH Aachen University13
USS in Spam
• Spam e-mails from SCHNUCKI project
• 7.9 million e-mails collected since 2003
• 12.8 million URLs, 0.3 % shortened URLs
• Query shortened URLs and analyze response code
• Calculate spam detection rate for relevant services
Alexander Neumann, Johannes Barnickel, Ulrike Meyer RWTH Aachen University14
Spam detection rate of USS
0%
20%
40%
60%
80%
100%
is.gdtinyurl.com
snipurl.com
bit.ly
moourl.com
su.pr
migre.m
e
tiny.cc
redir.ec
urlpass.com
Alexander Neumann, Johannes Barnickel, Ulrike Meyer RWTH Aachen University15
Malicious USS
Attack scenario: Setup a fast and attractive USS, serve all requestsnormally, but after a while start sending vulnerable browsers to malwaresites.
Analysis:
• Query shortened URLs seen on Twitter for 187 different USS with 83different User-Agent strings
• Analyze HTTP response Location header
• Result: No malicious behaviour found
• But: One service handles browsers different
Alexander Neumann, Johannes Barnickel, Ulrike Meyer RWTH Aachen University16
User-Tracking USS
• Data from previous experiment
• Analyze HTTP response Set-Cookie header
• Results:I 65 USS set cookiesI 38 USS set persistent cookiesI 28 USS set persistent cookies with validity period 6 months or more
• Define Q as quotient: #all unique values#all values received for the cookie
Alexander Neumann, Johannes Barnickel, Ulrike Meyer RWTH Aachen University17
Validity period of cookies
10
100
1,000
10,000
100,000
1,000,000
Validity P
erio
d(in h
ours
)
one day
one week
one month
half a yearone year
ten years
bit.ly dlvr.it
0.0
0.5
1.0
url
.lotp
atr
ol.co
mb23.r
usa
fe.m
nb23.r
uy.a
hoo.it
tiny.c
car.
gy
idek
.net
fa.b
ysh
rt.s
tu.to
go.q
b.b
ykrz
.ch
2.g
p2.ly
5.g
pri
.ms
lnk.in
xrl
.us
crks.
me
politi.c
oj.m
prw
w.tw
tpm
.ly
bit.ly
sdut.us
bco
ol.bz
o-x
.fr
budurl
.com
dlv
r.it
go.u
sa.g
ov
0rz
.tw
o-x
.fr
min
ilie
n.c
om
tinylink.in
adf.ly
tnij.o
rgb23.r
ush
ort
n.m
eadf.ly
nbc.
cotiny.c
c2.g
p2.ly
5.g
pkl.am
Q
Enumerating shortened URLs
For the top ten USS:
• Analyze structure of shortened URLs in both Twitter samples⇒ character frequency analysis using heatmaps
• Select range, ca. 230k URLs per USS
• Enumerate all URLs in range
• Inspect results by hand, search for secret URLs
• Observation: Only goo.gl imposed restrictions
Alexander Neumann, Johannes Barnickel, Ulrike Meyer RWTH Aachen University19
Character frequency analysis: bit.ly
123456
abcdefghi jklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789
chara
cter
posi
tion
character frequency of 348,968 bit.ly URLs
1e-05
0.0001
0.001
0.01
0.1
1
freq
uen
cy
123456
abcdefghi jklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789
chara
cter
posi
tion
character frequency of 259,646 bit.ly URLs
1e-05
0.0001
0.001
0.01
0.1
1
freq
uen
cy
Alexander Neumann, Johannes Barnickel, Ulrike Meyer RWTH Aachen University20
Character frequency analysis: tinyurl.com
1234567
a b c d e f g h i j k l mn o p q r s t u v w x y z 0 1 2 3 4 5 6 7 8 9
chara
cter
posi
tion
character frequency of 32,831 tinyurl.com URLs
0.0001
0.001
0.01
0.1
1
freq
uen
cy
1234567
a b c d e f g h i j k l mn o p q r s t u v w x y z 0 1 2 3 4 5 6 7 8 9
chara
cter
posi
tion
character frequency of 29,541 tinyurl.com URLs
0.0001
0.001
0.01
0.1
1
freq
uen
cy
Alexander Neumann, Johannes Barnickel, Ulrike Meyer RWTH Aachen University21
Character frequency analysis: goo.gl
1
2
3
4
5
abcdefghi jklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789
chara
cter
posi
tion
character frequency of 31,065 goo.gl URLs
0.1
1
freq
uen
cy
1
2
3
4
5
abcdefghi jklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789
chara
cter
posi
tion
character frequency of 24,918 goo.gl URLs
0.1
1
freq
uen
cy
Alexander Neumann, Johannes Barnickel, Ulrike Meyer RWTH Aachen University22
Character frequency analysis: dld.bz
1
2
3
4
abcdefghi jklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789
chara
cter
posi
tion
character frequency of 3,904 dld.bz URLs
0.001
0.01
0.1
1
freq
uen
cy
1
2
3
4
abcdefghi jklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789
chara
cter
posi
tion
character frequency of 3,477 dld.bz URLs
0.001
0.01
0.1
1
freq
uen
cy
Alexander Neumann, Johannes Barnickel, Ulrike Meyer RWTH Aachen University23
Secret URLs found by enumerating
• Archives of private photos
• Several CVs
• Treasurer’s report for a company
• List of names and numbers of a kindergarten in Lindlar
• . . .
Alexander Neumann, Johannes Barnickel, Ulrike Meyer RWTH Aachen University24
Submitting secret URLs to USS
• Question: Are secret URLs submitted to USS leaked?
• Set up honeypot web server
• Generate unique URLs for each service
• Suspicious and harmless URLs
• Examples:I http://fd0.me/secret/a0df29ac/bb42ce8bI http://www.fd0.me/blog/archive/2011/01/14/index.php?
article=69e325eb#a5a6c61c
• Submit to 255 USS
• Watch for requests
Alexander Neumann, Johannes Barnickel, Ulrike Meyer RWTH Aachen University25
Results (after four weeks)
• Honeypot is found by Google, Yahoo and Baidu
• 15 URLs requested by Google
• 13 URLs requested by Yahoo
• 2 URLs requested by Baidu
• 13 URLs manually checked by USS administrators(9 transmitted the admin URL in the HTTP referrer)
• Four administrators contacted us, are interested in the research
• ⇒ Never submit private URLs to shortening services!
Alexander Neumann, Johannes Barnickel, Ulrike Meyer RWTH Aachen University26
Latency and availability measurements
• Latency and availability measured with Smokeping
• From two different servers (in Germany)
• ICMP and HTTP latency measured for ten services
• Results:I Most services have good average HTTP latencyI Some services have a very bad worst-case HTTP latencyI goo.gl USS wins
Alexander Neumann, Johannes Barnickel, Ulrike Meyer RWTH Aachen University27
0
100
200
300
400
500
600
bit.ly
t.cotinyurl.com
goo.gl
ow.ly
dlvr.it
is.gdmigre.m
e
dld.bz
lnk.ms
Late
ncy
(m
s)Average ICMP/HTTP Latency
ICMP latency (host 1)ICMP latency (host 2)HTTP latency (host 1)HTTP latency (host 2)
Alexander Neumann, Johannes Barnickel, Ulrike Meyer RWTH Aachen University28
0
500
1,000
1,500
2,000
2,500
3,000
3,500
4,000
4,500
bit.ly
t.cotinyurl.com
goo.gl
ow.ly
dlvr.it
is.gdmigre.m
e
dld.bz
lnk.ms
Late
ncy
(m
s)Maximum ICMP/HTTP Latency
ICMP latency (host 1)ICMP latency (host 2)HTTP latency (host 1)HTTP latency (host 2)
Alexander Neumann, Johannes Barnickel, Ulrike Meyer RWTH Aachen University29
Conclusion
• USS have risks
• These risks are very real
• USS leak URLs to search engines
• Do not submit private URLs to USS
• USS are used in spam e-mails
• Several services set long-running cookies and can track the user
• Shortened URLs are not completely random
• goo.gl USS dominates all others in every discipline
Alexander Neumann, Johannes Barnickel, Ulrike Meyer RWTH Aachen University30