EPL682 - PAPERS
Re: CAPTCHAs - Understanding CAPTCHA-Solving Services in an Economic Context
I Am Robot: (Deep) Learning to Break Semantic Image CAPTCHAs
Antreas Dionysiou - Department of Computer Science, University of Cyprus
February 2019
BACKGROUND

What are CAPTCHAs?
• Completely Automated Public Turing test to tell Computers and Humans Apart (CAPTCHA).
• Proposed in 2003 by von Ahn et al.
• Also referred to as Reverse Turing Tests.
• CAPTCHAs tell whether a user is human or not.
• Different versions of CAPTCHAs exist.
• Block automated bot attacks.
• Must resist automated solving.
• Must be painless for humans.

CAPTCHA Versions

Text-based CAPTCHAs
• Most widely used CAPTCHA scheme.
• CAPTCHA design reflects a trade-off between protection and usability.

Paper: “Re: CAPTCHAs - Understanding CAPTCHA-Solving Services in an Economic Context.”

What is it all about? (Summary)
• Brief explanation of CAPTCHAs.
• A CAPTCHA-solving ecosystem has emerged with 2 major categories:
1. Automated CAPTCHA solvers (software).
2. Real-time human labor.
• Evaluation of CAPTCHAs in economic terms.
• CAPTCHA’s underlying cost structure benefits the defender.
• Plenty of CAPTCHA-solving services with very low prices.
• CAPTCHAs should be viewed as an economic impediment to an attacker (not only as a technological one).
$1/1000
Motoyama, Marti, et al. "Re: CAPTCHAs - Understanding CAPTCHA-Solving Services in an Economic Context." USENIX Security Symposium. Vol. 10. 2010.

What is it all about? (Cont.)
• The overall shape of the market is poorly understood.
• Big evolution of automated solving tools…
• …but eclipsed by the emergence of the human-based solving market.
• Economic examination of the human-based solving market.
Solver types: human-based solvers, automated (software) solvers, hybrid solvers.

Related work
• The authors claim that they are the first to identify the growth of human-labor-based CAPTCHA-solving services.
• The closest related work is the study of Bursztein et al. [1], BUT it focuses on CAPTCHA difficulty rather than the underlying business models.
• No other related work (at that time).
[1] E. Bursztein, S. Bethard, J. C. Mitchell, D. Jurafsky, and C. Fabry. How good are humans at solving CAPTCHAs? A large scale evaluation. In IEEE S&P ’10, 2010.

Authors Tried to Answer Key Questions Like
• Which CAPTCHAs are most heavily targeted?
• Rough solving capacity?
• Quality of service?
• Pricing of services?
• Workforce demographics?
• Services’ adaptability to changes in CAPTCHA schemes?
Overall, this research provides reasoning about the net value of CAPTCHAs under existing threats.

CAPTCHA Economics, but why???
• A purely technical perspective on CAPTCHAs doesn’t capture the business realities of the CAPTCHA-solving ecosystem.
• The profitability of any scam is a function of 3 factors:
1. The cost of CAPTCHA solving.
2. The effectiveness of any secondary defenses.
3. The efficiency of the attacker’s business model.
• CAPTCHAs add friction to the attacker’s business model.
• CAPTCHAs minimize the cost and legitimate-user impact of heavier-weight secondary defenses (e.g. SMS, etc.).
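
The three profitability factors listed on this slide can be combined into a toy per-attempt model. This is only an illustrative sketch and not a formula from the paper: the function name, its parameters, and every number except the $1/1,000 solving price quoted earlier are assumptions.

```python
def expected_profit_per_attempt(
    solve_price_per_1000: float,
    solver_accuracy: float,
    secondary_defense_pass_rate: float,
    revenue_per_success: float,
) -> float:
    """Toy model (hypothetical): expected profit of one abusive attempt.

    The attacker pays for every CAPTCHA sent to a solving service, but
    earns revenue only if the solution is correct AND the attempt also
    gets past the site's secondary defenses.
    """
    cost = solve_price_per_1000 / 1000.0
    success_probability = solver_accuracy * secondary_defense_pass_rate
    return success_probability * revenue_per_success - cost


# Illustrative numbers only (assumptions, apart from the $1/1,000 price).
profit = expected_profit_per_attempt(
    solve_price_per_1000=1.0,
    solver_accuracy=0.9,
    secondary_defense_pass_rate=0.5,
    revenue_per_success=0.01,
)
print(f"expected profit per attempt: ${profit:.4f}")
```

Under this sketch, a stronger secondary defense (a lower pass rate) or a less valuable target pushes the expected profit below zero even when solving itself is cheap, which is the sense in which a CAPTCHA acts as economic friction.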

Economics of CAPTCHA-solving Market
• The market for CAPTCHA-solving services has expanded…
• …but the wages of workers have been declining for these reasons:
1. CAPTCHA solving is an unskilled job.
2. It can easily be sourced via the internet to the lowest-cost labor.
3. Increased competition exists on the retail side.
• Mr. E said that 50% of revenue is profit, roughly 10% is for servers and bandwidth, and the remainder is split between solving labor.
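
As a quick arithmetic check of the revenue split quoted from Mr. E, the sketch below applies the 50% and 10% shares to a revenue figure. The slide gives these shares only approximately, so the remainder (about 40%) is equally rough; the function name and the example revenue are hypothetical.

```python
def split_revenue(revenue: float) -> dict:
    """Split revenue using the approximate shares quoted on the slide."""
    profit = revenue * 0.50                  # "50% of revenue is profit"
    servers_and_bandwidth = revenue * 0.10   # "roughly 10%"
    remainder = revenue - profit - servers_and_bandwidth
    return {
        "profit": profit,
        "servers_and_bandwidth": servers_and_bandwidth,
        "remainder": remainder,
    }


# Example revenue of $1,000 is an arbitrary illustration.
for item, amount in split_revenue(1000.0).items():
    print(f"{item}: ${amount:.2f}")
```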

CAPTCHA-Solving Market Workflow

CAPTCHA-Solving Services Analysis
• Evaluated services that were well advertised at the time.
• Evaluated 8 CAPTCHA-solving services for 5 months, collecting CAPTCHAs from the most popular websites.
• Evaluated several aspects, such as:
1. Customer interface.
2. Solution accuracy.
3. Response time.
4. Availability.
5. Capacity.

Quality of Service Assessment

Quality of Service Assessment (cont.)
• Median error rate and response time (in seconds) for all services. Services are ranked top-to-bottom in order of increasing error rate.

Services Analysis Results
• Antigate and ImageToText provided the fastest service.
• Accuracy and response time varied with the type of CAPTCHA.
• The value of a particular solver depends on 3 factors, namely:
1. Accuracy.
2. Response time.
3. Price.
• DeCaptcher and CaptchaBot had the largest solving capacity, as they could solve 14–15 CAPTCHAs per second.
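
Two of the figures on this slide lend themselves to back-of-the-envelope arithmetic. The sketch below converts the 14–15 CAPTCHAs per second capacity into a daily volume, assuming (hypothetically) that the rate could be sustained around the clock, and shows one simple way to fold accuracy and price into a cost per correct solution. Response time is not modeled, and the example price and accuracy are made up.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def daily_capacity(solves_per_second: float) -> int:
    """Daily volume if the per-second rate were sustained all day (assumption)."""
    return int(solves_per_second * SECONDS_PER_DAY)


def cost_per_correct_solution(price_per_1000: float, accuracy: float) -> float:
    """Effective price of one correct answer when wrong answers are also billed."""
    if not 0 < accuracy <= 1:
        raise ValueError("accuracy must be in (0, 1]")
    return (price_per_1000 / 1000.0) / accuracy


print(daily_capacity(14), daily_capacity(15))
# Hypothetical service: $2 per 1,000 CAPTCHAs at 80% accuracy.
print(f"${cost_per_correct_solution(2.0, 0.8):.4f} per correct solution")
```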

Worker Wages
• They focused on two services, namely Kolotibablo and PixProfit.
• Kolotibablo pays workers at a variable rate (from $0.50/1,000 up to over $0.75/1,000 CAPTCHAs) depending on how many CAPTCHAs they have solved.
• PixProfit offers a somewhat higher rate of $1/1,000.
• A minimum amount of money must be accumulated before payout.
• Most services provide payment via an online e-currency system.
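
To make the wage rates concrete, the sketch below computes a worker's earnings at the quoted per-1,000 rates and the number of CAPTCHAs needed to reach a payout threshold. The slide does not state the minimum payout or how Kolotibablo's rate steps up, so the $3 threshold is a hypothetical value and each call uses a single flat rate.

```python
import math


def earnings(captchas_solved: int, rate_per_1000: float) -> float:
    """Earnings in dollars at a flat rate per 1,000 solved CAPTCHAs."""
    return captchas_solved / 1000.0 * rate_per_1000


def captchas_needed_for_payout(minimum_payout: float, rate_per_1000: float) -> int:
    """CAPTCHAs a worker must solve before reaching the payout threshold."""
    return math.ceil(minimum_payout / rate_per_1000 * 1000)


HYPOTHETICAL_MINIMUM_PAYOUT = 3.0  # assumption; not given in the source

for name, rate in [("Kolotibablo (low)", 0.50),
                   ("Kolotibablo (high)", 0.75),
                   ("PixProfit", 1.00)]:
    needed = captchas_needed_for_payout(HYPOTHETICAL_MINIMUM_PAYOUT, rate)
    print(f"{name}: ${earnings(10_000, rate):.2f} per 10,000 solved, "
          f"{needed} CAPTCHAs to reach payout")
```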

Geographic Demographics
• All services include a sizeable workforce fluent in Chinese, likely from mainland China.
• Antigate has appreciable accuracies for Russian and Hindi, presumably drawing on workforces in Russia and India.
• Similarly for CaptchaBypass and Russian.
• BeatCaptcha and Tamil, Portuguese, and Spanish.
• DeCaptcher and Tamil.
• ImageToText has appreciable accuracy across a remarkable range of languages.

Adaptability of CAPTCHA Services
• Again focused on Kolotibablo and PixProfit services.
• Tested them on the Asirra CAPTCHA.
• ImageToText displayed remarkable adaptability, solving the Asirra CAPTCHA on average 39.9% of the time.
Figure 5: ImageToText error rate for the custom Asirra CAPTCHA over time.

Most Popular Targeted CAPTCHAs

Conclusions
• CAPTCHAs’ low-impact quality makes them attractive to site operators…
• …but, at the same time, easy to outsource to the global unskilled labor market.
• The CAPTCHA-solving business is a well-developed, highly competitive industry with large capacity.
• Wholesale and retail prices for CAPTCHA solving will continue to decline.
• CAPTCHAs don’t prevent large-scale automated site access…
• …but they effectively limit automated site access.

Conclusions (Cont.)
• As the cost of CAPTCHA solving decreases, a site operator must employ secondary defenses more aggressively.
• CAPTCHAs should be regarded as an economic impediment (not only a technological one).
• CAPTCHAs are low-impact mechanisms that add friction to the attacker’s business model.
• CAPTCHAs minimize the cost and legitimate-user impact of heavier-weight secondary defenses.

Paper: “I Am Robot: (Deep) Learning to Break Semantic Image CAPTCHAs.”

What is it all about? (Summary)
• A study of the latest version of Google’s reCaptcha.
• Authors influence reCaptcha’s risk analysis process.
• Identify reCaptcha’s flaws, bypass restrictions, and deploy large-scale attacks.
• Proposal of an effective and low-cost deep-learning-based attack for the semantic annotation of images.
• Proposal of a series of safeguards and modifications for resisting their attacks.
Sivakorn, S., Polakis, I., & Keromytis, A. D. (2016, March). I am robot: (deep) learning to break semantic image captchas. In 2016 IEEE European Symposium on Security and Privacy (EuroS&P) (pp. 388-403). IEEE.

Related work
• Yan et al., “A low-cost attack on a Microsoft CAPTCHA,” in CCS ’08.
• Yan et al., “Breaking visual CAPTCHAs with naive pattern recognition algorithms,” in ACSAC ’07.
• Li et al., “Breaking e-banking CAPTCHAs,” in ACSAC ’10.
• Perez et al., “Breaking reCAPTCHAs with unpredictable collapse: Heuristic character segmentation and recognition,” in MCPR 2012.
• Many, many other papers related to automated CAPTCHA solving...

Google’s reCaptcha
• The goal of Google’s latest version of reCaptcha is to:
1. Minimize the effort for legitimate users.
2. Require tasks that are more challenging to computers than “simple” text recognition.
• reCaptcha is driven by an “advanced risk analysis system”.
• The reCaptcha widget also performs a series of browser checks.
• Most widely used CAPTCHA service.
• Leverages information about users’ activities through cookies.

How does reCaptcha work?
1. User clicks on a checkbox.
2. A request is sent containing all collected user-related information.
3. The request is analyzed by the advanced risk analysis system, which decides the type of CAPTCHA challenge to be presented to the user.
4. If the user requests multiple challenges or provides several wrong answers, the system will return increasingly harder challenges.
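
The escalation behavior in steps 3 and 4 can be pictured with a small decision function. This is a purely hypothetical sketch: Google's actual risk analysis is not public, and the risk score, the thresholds, the challenge names, and the "every two failures" rule below are all invented for illustration.

```python
# Hypothetical difficulty ladder, from least to most demanding.
CHALLENGE_LEVELS = ["checkbox_only", "easy_challenge", "hard_challenge"]


def choose_challenge(risk_score: float, failed_or_repeated_requests: int) -> str:
    """Pick a challenge type from a risk score in [0, 1] and a failure count.

    The base level comes from the risk score (step 3 on the slide); every
    two wrong answers or extra challenge requests raise it by one level,
    capped at the hardest challenge (step 4 on the slide).
    """
    if risk_score < 0.3:
        base_level = 0
    elif risk_score < 0.7:
        base_level = 1
    else:
        base_level = 2
    escalation = failed_or_repeated_requests // 2
    level = min(base_level + escalation, len(CHALLENGE_LEVELS) - 1)
    return CHALLENGE_LEVELS[level]


print(choose_challenge(0.1, 0))  # low risk, no failures
print(choose_challenge(0.1, 4))  # low risk, but repeated failures
```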

CAPTCHA Versions

Contributions
• Deployed an automation tool without being detected by the reCaptcha widget.
• Identified design flaws that allow attackers to “influence” the advanced risk analysis process.
• ML-based system for solving image-based CAPTCHAs that extracts semantic information from images.
• Highly effective and efficient system, achieving 70.78% accuracy, solving challenges in ≈19 seconds.
• Demonstrated their attack’s generic applicability.
• Evaluated their tool in terms of cost-effectiveness (offline mode).
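
The 70.78% accuracy and ≈19 seconds per challenge can be combined into an expected time per successful solve. The sketch assumes, as a simplification not stated in the source, that attempts are independent, that each takes the same time, and that a failed attempt is simply retried; the number of attempts until the first success is then geometric with mean 1/accuracy.

```python
def expected_attempts_per_success(accuracy: float) -> float:
    """Mean of a geometric distribution with success probability `accuracy`."""
    if not 0 < accuracy <= 1:
        raise ValueError("accuracy must be in (0, 1]")
    return 1.0 / accuracy


def expected_seconds_per_success(seconds_per_attempt: float, accuracy: float) -> float:
    """Expected wall-clock time until one challenge is solved correctly."""
    return seconds_per_attempt * expected_attempts_per_success(accuracy)


attempts = expected_attempts_per_success(0.7078)
seconds = expected_seconds_per_success(19.0, 0.7078)
print(f"{attempts:.2f} attempts, {seconds:.1f} seconds per success")
```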

Their CAPTCHA-solving system
• Their system is built on Selenium and Mozilla Firefox (v. 36).
• Their system is based on 2 components:
1. The 1st is responsible for creating tracking cookies that influence the risk analysis process.
2. The 2nd processes the challenges, following different techniques based on the type of challenge.

Computer Vision Algorithms and Image Annotation Services Used
• Google’s reverse image search (GRIS) for conducting an image-based search.
• Different image annotation services for assigning tags (keywords) or free-form descriptions of images.
• An ML-based classifier that can guess the content of an image based on a subset of the tags.
• A manually created labeled dataset with images and their tags from challenges they have collected (History Module).

Findings
• Google’s advanced risk analysis can be neutralized by using a 9-day-old cookie (with or without web surfing).
• Being logged in to a Google account, with or without phone verification, does not influence the risk analysis system.
• No restriction based on the country in which a cookie is created.
• The webdriver variable does not have an effect.
• The user-agent’s browser and engine versions, as well as the actual environment of the experiment, play a critical role.

Findings (Cont.)
• A user-agent that does not contain complete information, or is misformatted, receives a hard (fallback) CAPTCHA.
• The widget does not detect the underlying operating system.
• A mismatch between user-agents during a cookie’s creation and when requesting a CAPTCHA with that cookie does not have an effect.
• Screen resolution and mouse behavior do not affect the outcome of risk analysis.
• Cookies are not assigned a reputation score (according to history).
• No mechanism prohibiting the creation of a large number of cookies from a single IP address.
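
The last finding suggests an obvious defender-side control: capping how many cookies a single IP address may obtain in a given period. The sketch below is a minimal sliding-window limiter written for illustration only; the class name, the limit of 3 cookies per 60 seconds, and the example IP address are hypothetical and do not describe how reCaptcha works or was later changed.

```python
from collections import defaultdict, deque


class CookieIssuanceLimiter:
    """Sliding-window cap on cookies issued per IP address (hypothetical)."""

    def __init__(self, max_cookies: int, window_seconds: float):
        self.max_cookies = max_cookies
        self.window_seconds = window_seconds
        self._issued = defaultdict(deque)  # ip -> timestamps of issued cookies

    def allow(self, ip: str, now: float) -> bool:
        """Return True and record the issuance if `ip` is under its cap."""
        timestamps = self._issued[ip]
        # Forget issuances that have fallen out of the window.
        while timestamps and now - timestamps[0] >= self.window_seconds:
            timestamps.popleft()
        if len(timestamps) >= self.max_cookies:
            return False
        timestamps.append(now)
        return True


limiter = CookieIssuanceLimiter(max_cookies=3, window_seconds=60.0)
decisions = [limiter.allow("203.0.113.7", t) for t in (0.0, 1.0, 2.0, 3.0)]
print(decisions)
```

A per-IP cap is a blunt instrument (many legitimate users can share one address), so a real deployment would likely weigh it against usability, in line with the usability concerns the authors raise for countermeasures.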

Findings (Cont.)
• Capacity per day:
1. During weekdays, they could solve between 52,000 and 55,000.
2. During weekends, they could solve 59,000.
• This reCaptcha version suffers from significant flaws and omissions.
• In most cases (74%) the number of correct candidate images is 2; the rest contain 3, and they also found two challenges with 4.
• Challenges are not created “on-the-fly” but selected from a relatively small pool of challenges.

Findings (Cont.)
• 1,368 redundant images that belonged to 358 sets of identical images.
• Highly efficient attack, solving challenges in ≈19 seconds; the most time-consuming phase is GRIS.
• A limited variety of image categories was detected.
• Adversaries can deploy accurate and efficient attacks against the image reCaptcha without relying on external services.
• Evaluated their CAPTCHA-breaking system’s economic viability.
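
The first finding is a statement about duplicate images in a collected dataset. The slides do not say how the authors identified identical images, so the sketch below shows one simple approach as an assumption: grouping files by a SHA-256 digest of their bytes. It only catches byte-identical copies; the file names and contents are placeholders.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List


def find_identical_sets(images: Dict[str, bytes]) -> List[List[str]]:
    """Group image names whose raw bytes are identical (exact duplicates only)."""
    by_digest = defaultdict(list)
    for name, data in images.items():
        by_digest[hashlib.sha256(data).hexdigest()].append(name)
    # Keep only groups with more than one member, in a deterministic order.
    return sorted(sorted(names) for names in by_digest.values() if len(names) > 1)


# Placeholder data standing in for downloaded challenge images.
sample = {
    "a.jpg": b"cat", "b.jpg": b"cat", "c.jpg": b"dog",
    "d.jpg": b"cat", "e.jpg": b"tree", "f.jpg": b"dog",
}
identical_sets = find_identical_sets(sample)
redundant_images = sum(len(group) for group in identical_sets)
print(len(identical_sets), "sets,", redundant_images, "redundant images")
```

Images that were re-encoded or resized would not match under this scheme and would need a perceptual comparison instead.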

Findings (Cont.)
• Discuss countermeasures for defending against their attacks, and their potential impact on usability.
• reCaptcha has been updated after the authors informed Google and Facebook.

Conclusions - Future work
• Further improvement of their attack’s accuracy can be explored.
• Reassessment of reverse Turing tests (CAPTCHAs) and their design is considered critical.
• Demonstrated the feasibility of large-scale CAPTCHA-solving attacks.
• reCaptcha’s advanced risk analysis system and widget possess valuable functionality that can be incorporated into future CAPTCHA schemes for mitigating attacks.

Thanks for your attention!!!
Any questions?