Typo-Squatting: a Nuisance or a Threat to Your Traffic?
Mishari Almishari
Outline
• Introduction
• Background
• Methodology
• Parked Domain Classifier
• Measurements
• Future Work
• Related Work
• Conclusion
Introduction - Motivation
Traffic is important to web domains!
• There is no point in launching a site without incoming traffic
• Losing/gaining traffic means losing/gaining money
• One way to price ads is the pay-per-click model
Traffic diversion could be a serious threat to a domain
Introduction - Motivation
Typos may attract traffic
• Users are prone to making typos
• Users may forget about visiting the target domain
• Threat to the target domain!
Intentionally registering such typo domains is called typo-squatting
Introduction - Goal
To study how much traffic typo-squatters can get from target domains
• Are those domains attracting much traffic?
• There are many typo-squatting domains registered (Banerjee et al., 08)
• Search-engine typo corrections and browser auto-completions!
• How much traffic are target domains losing?
• Is it a negligible ratio or a serious threat?
• Do users go back to target domains or get distracted?
Introduction - Contribution
Automatic and accurate identification of typo-squatting domains (Measurement Methodology)
Bound on how much traffic target domains are losing to typo-squatting domains (Measurement Results)
Background – Domain Parking
Domain Parking is the practice of showing a temporary page for an unused domain before launching it
Background – Domain Parking
Domain parking service
• Parks and hosts unused domains
• Monetizes the traffic by showing ads
Many typo-squatting domains are parked domains (Wang et al., 06), (Keats, 07)
Methodology
• Data Collection
• Identifying Typo-Squatting Domains
Methodology - Data Collection
[Figure: a user query travels from the UCI network through the UCI resolver to the Internet; our machine records each query]
Logged fields per user query: DATE TIME HASHED-IP DOMAIN TYPE CLASS
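As a rough illustration of working with such a trace, the sketch below parses one record using the field order named on the slide (DATE TIME HASHED-IP DOMAIN TYPE CLASS). The whitespace-separated layout, the `DnsQuery` and `parse_query` names, and the sample line are assumptions made for illustration; the actual trace format is not shown in the slides.

```python
from collections import namedtuple

# Field order taken from the slide: DATE TIME HASHED-IP DOMAIN TYPE CLASS.
# The record and function names are hypothetical.
DnsQuery = namedtuple("DnsQuery", "date time hashed_ip domain qtype qclass")


def parse_query(line):
    """Split one whitespace-separated log line into a DnsQuery record."""
    fields = line.split()
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}: {line!r}")
    date, time, hashed_ip, domain, qtype, qclass = fields
    # DNS names are case-insensitive; normalize case and drop a trailing root dot.
    domain = domain.lower().rstrip(".")
    return DnsQuery(date, time, hashed_ip, domain, qtype, qclass)


# Hypothetical sample line; every value here is made up for illustration.
sample = "2008-03-01 12:00:05 9f2c1a www.Wamart.com. A IN"
record = parse_query(sample)
print(record.domain, record.hashed_ip)
```

Keeping only a hashed IP, as the slide indicates, still lets later steps group queries by user without exposing the address itself.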
Methodology – Identify Typo-squatting Domain
Identify similar domains
a. Single-error typo
• A single error accounts for 90-95% of spelling/typo errors (Pollock et al., 83)
• www.walmart.com and www.wamart.com
b. gTLD substitution
• www.amazon.com and www.amazon.org
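The two similarity rules above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a "single error" means one insertion, deletion, substitution, or adjacent transposition in the label before the gTLD, and the `GTLDS` set and all function names are hypothetical.

```python
# Assumed set of gTLDs to consider for substitution; the slides do not list them.
GTLDS = {"com", "org", "net"}


def is_single_error(a, b):
    """True if b differs from a by exactly one insertion, deletion,
    substitution, or transposition of two adjacent characters."""
    if a == b or abs(len(a) - len(b)) > 1:
        return False
    if len(a) == len(b):
        diffs = [i for i in range(len(a)) if a[i] != b[i]]
        if len(diffs) == 1:
            return True  # one substituted character
        return (len(diffs) == 2 and diffs[1] == diffs[0] + 1
                and a[diffs[0]] == b[diffs[1]]
                and a[diffs[1]] == b[diffs[0]])  # adjacent transposition
    # Lengths differ by one: deleting one character of the longer string
    # must yield the shorter one.
    longer, shorter = (a, b) if len(a) > len(b) else (b, a)
    return any(longer[:i] + longer[i + 1:] == shorter
               for i in range(len(longer)))


def split_domain(domain):
    """Split 'www.walmart.com' into ('walmart', 'com'), dropping a leading 'www.'."""
    name = domain.lower()
    if name.startswith("www."):
        name = name[4:]
    label, _, tld = name.rpartition(".")
    return label, tld


def is_similar(target, candidate):
    """Apply rule (a) single-error typo or rule (b) gTLD substitution."""
    t_label, t_tld = split_domain(target)
    c_label, c_tld = split_domain(candidate)
    single_typo = t_tld == c_tld and is_single_error(t_label, c_label)
    gtld_swap = t_label == c_label and t_tld != c_tld and c_tld in GTLDS
    return single_typo or gtld_swap


print(is_similar("www.walmart.com", "www.wamart.com"))
print(is_similar("www.amazon.com", "www.amazon.org"))
```

A check like this only establishes similarity; it says nothing about the intent behind the registration.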
Methodology – Identify Typo-squatting Domains
But a similar domain is not enough!
• www.abc.com and www.abd.com
• www.walmart.com and www.walkmart.com
• www.usps.com and www.usps.org
• Random sample
• More than 54% are not typo-squatting
Need to identify hijacking intention
Methodology – Identify Typo-squatting Domain
• Identify hijacking indicator
• Parked domain (ads listing): ~ 88%
• Forwarding to other domains: ~ 8%
• Others: inappropriate content, …
Parked domain as the indicator
Methodology – Identify Typo-squatting Domain
Similar domain + parked domain = typo-squatting domain
Methodology – Identify Typo-squatting Domain
How to identify a parked domain?
• Parked domain classifier
• 96%
• Presence of parking signatures
• Well-known parking signatures (domain names/URLs)
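The signature check mentioned above might look roughly like the sketch below, which scans a page's links for host names belonging to known parking services. It covers only the signature test, not the classifier. The entries in `PARKING_SIGNATURES` are placeholder host names, not real parking services, and the function names are hypothetical.

```python
import re
from urllib.parse import urlparse

# Hypothetical signature list: placeholder hosts standing in for
# well-known parking-service domain names.
PARKING_SIGNATURES = {"parking-service.example", "ads.parked.example"}

# Pull URLs out of href/src attributes; a rough pattern, not a full HTML parser.
URL_PATTERN = re.compile(r"""(?:href|src)\s*=\s*["']([^"']+)["']""", re.IGNORECASE)


def extract_hosts(html):
    """Return the set of host names referenced by links in the page."""
    hosts = set()
    for url in URL_PATTERN.findall(html):
        host = urlparse(url).hostname  # None for relative URLs
        if host:
            hosts.add(host.lower())
    return hosts


def has_parking_signature(html):
    """True if any linked host equals, or is a subdomain of, a signature."""
    for host in extract_hosts(html):
        for signature in PARKING_SIGNATURES:
            if host == signature or host.endswith("." + signature):
                return True
    return False


parked_page = '<a href="http://www.parking-service.example/feed?d=wamart.com">Deals</a>'
normal_page = '<a href="http://www.example.org/about">About</a> <img src="/logo.png">'
print(has_parking_signature(parked_page), has_parking_signature(normal_page))
```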
Methodology - Summary
Identify Similar Domains
Identify Parked Domains
List of Typo-squatting Domains
Parked Domain Classifier
Build Data Set
Extract Core Features
Combine Into Classifier
Data Set
Data set consists of 2,800 domains
700 are parked domains
• Collected from the MS Strider website
2,100 are non-parked domains
• Collected from the fourteen Yahoo Directory top categories
Feature Selection
• Heuristically identify common features in parked domains
• Compute the distribution of those features for verification
• Common Link Ratio Max
Combining Features Into Classifier
Tried different classifier algorithms
• Decision Tree
• SVM
• K-Nearest Neighbor
• Random Forest
• The best performance
Data Sets
DNS traces
• Four months
• ~ 30 million domains (~ 2 billion hits) (~ 30,000 users)
Target domain set
• Alexa’s Top 500 popular domains
• ~ 53,000,000 hits
Typo-Squatting Domains & Hits
1,332 typo-squatting domains
13,431 hits (~ 110 a day)
Is it large or small?
• 500 target domains
• 4-month period
• ~ 30,000 users
• Given a similar ratio, this may translate to a non-trivial number
• 30,000 users => 110 per day
• 300,000 users => 1,100 per day
• 3,000,000 users => 11,000 per day (x 365 = ~ 4,000,000 a year)
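The scaling above is plain proportional arithmetic, sketched here as a check. It assumes the four-month period is about 122 days and that typo-squatting hits grow linearly with the number of users; both are assumptions made for illustration, not claims from the slides.

```python
HITS = 13_431    # typo-squatting hits observed in the trace (from the slide)
USERS = 30_000   # approximate user population (from the slide)
DAYS = 122       # assumption: four months taken as roughly 122 days

hits_per_day = HITS / DAYS
hits_per_user_per_day = hits_per_day / USERS


def projected_hits_per_day(users):
    """Scale the observed daily rate linearly to a different population size."""
    return hits_per_user_per_day * users


for users in (30_000, 300_000, 3_000_000):
    print(f"{users:>9,} users -> {projected_hits_per_day(users):,.0f} hits/day")

# Yearly total for the largest population, matching the "x 365" on the slide.
yearly = projected_hits_per_day(3_000_000) * 365
print(f"about {yearly:,.0f} hits a year")
```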
Typo-squatting Ratio
• 0.025% of total number of queries
• (89%, ≤ 1%) (70%, ≤ 0.1%) (57%, ≤ 0.01%)
User Correction Ratio – Alexa-500
• 54% of typo-squatting queries are corrected
• For ~ 51% of squatted target domains, most typo-squatting hits are corrected
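The slides do not say how a query is judged "corrected". One plausible reading, sketched below under that assumption, counts a typo-squatting query as corrected when the same hashed IP queries the matching target domain shortly afterwards. The five-minute `WINDOW`, the function name, and the sample data are all hypothetical.

```python
from datetime import datetime, timedelta

# Assumption: a follow-up query within this window counts as a correction.
WINDOW = timedelta(minutes=5)


def corrected_fraction(queries, typo_to_target, window=WINDOW):
    """queries: list of (timestamp, hashed_ip, domain), sorted by timestamp.
    typo_to_target: maps each typo-squatting domain to its target domain.
    Returns the share of typo queries followed by a query for the target
    from the same hashed IP within the window."""
    typo_hits = 0
    corrected = 0
    for i, (ts, ip, domain) in enumerate(queries):
        target = typo_to_target.get(domain)
        if target is None:
            continue  # not a typo-squatting domain
        typo_hits += 1
        for later_ts, later_ip, later_domain in queries[i + 1:]:
            if later_ts - ts > window:
                break  # too late to count as a correction
            if later_ip == ip and later_domain == target:
                corrected += 1
                break
    return corrected / typo_hits if typo_hits else 0.0


# Made-up example: user u1 corrects the typo, user u2 does not.
t0 = datetime(2008, 3, 1, 12, 0, 0)
queries = [
    (t0, "u1", "www.wamart.com"),
    (t0 + timedelta(seconds=20), "u1", "www.walmart.com"),
    (t0 + timedelta(seconds=60), "u2", "www.wamart.com"),
    (t0 + timedelta(minutes=30), "u2", "www.walmart.com"),
]
fraction = corrected_fraction(queries, {"www.wamart.com": "www.walmart.com"})
print(fraction)
```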
Potential Hit Loss
• Potential Hit Loss Ratio = 0.012%
• (92% , ≤ 1%) (78%, ≤ 0.1%) (64%, ≤ 0.01%)
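The reported ratios are consistent with a simple calculation, shown here as a sanity check rather than as the authors' stated formula. It assumes the typo-squatting ratio is typo-squatting hits over the ~ 53,000,000 target-domain hits, and that the potential hit loss is the part of that ratio not corrected by users (54% corrected, so 46% uncorrected).

```python
TYPO_HITS = 13_431        # typo-squatting hits (from the slides)
TARGET_HITS = 53_000_000  # approximate hits on the Alexa top-500 targets (from the slides)
CORRECTED_SHARE = 0.54    # share of typo-squatting queries that users correct

# Assumed definition: typo-squatting hits relative to target-domain hits, in percent.
typo_ratio_pct = 100 * TYPO_HITS / TARGET_HITS

# Assumed definition: only uncorrected typo queries count as potentially lost hits.
potential_loss_pct = typo_ratio_pct * (1 - CORRECTED_SHARE)

print(f"typo-squatting ratio: {typo_ratio_pct:.3f}%")
print(f"potential hit loss:   {potential_loss_pct:.3f}%")
```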
Potential Money Loss
• ~ 75% do not point to target domains
• Referring typo-squatting ratio = 0.008%
• (96%, ≤ 1%) (91%, ≤ 0.1%) (81%, ≤ 0.01%)
Typo-Squatting Distribution
• 19% of all typo-squatting hits
Typo Characterization
• Most typos are single errors (95% vs. 5%)
• Most gTLD substitutions are “com” to “org” (50%)
• Additions: 37% are of non-adjacent keys
• Substitutions: 77% are of non-adjacent keys
• Substitutions: 13% are “a” and “o”
• Spelling error
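To make the adjacent versus non-adjacent distinction concrete, the sketch below classifies a single-character substitution by key position on a QWERTY keyboard. The adjacency rule (same-row neighbors plus the two nearest keys on the rows directly above and below) is an assumption; the slides do not define adjacency, and all names are hypothetical.

```python
# Letter rows of a QWERTY keyboard; each row is offset to the right of the one above.
QWERTY_ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]

# Map each letter to its (row, column) position.
POSITION = {ch: (r, c)
            for r, row in enumerate(QWERTY_ROWS)
            for c, ch in enumerate(row)}


def are_adjacent(a, b):
    """True if keys a and b touch under the assumed staggered layout."""
    if a == b or a not in POSITION or b not in POSITION:
        return False
    (r1, c1), (r2, c2) = POSITION[a], POSITION[b]
    dr, dc = r2 - r1, c2 - c1
    if dr == 0:
        return abs(dc) == 1       # left or right neighbor
    if dr == -1:
        return dc in (0, 1)       # the two keys above
    if dr == 1:
        return dc in (-1, 0)      # the two keys below
    return False


def substituted_pair(target, typo):
    """Return (expected_char, typed_char) if the labels differ in exactly one
    position, else None."""
    if len(target) != len(typo):
        return None
    diffs = [(t, y) for t, y in zip(target, typo) if t != y]
    return diffs[0] if len(diffs) == 1 else None


pair = substituted_pair("walmart", "wakmart")
print(pair, are_adjacent(*pair))
print(are_adjacent("a", "o"))
```

Under this rule "a" and "o" sit far apart, which fits the slide's note that such substitutions look like spelling errors rather than slips of the finger.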
Typo-squatting Domains – TP60
• 15,499 hits
• 0.045% of total number of queries
• (76%, ≤ 1%) (60%, ≤ 0.5%)
Future Work
• How much of the ad budget goes to squatters?
• Enhance our identification technique
• See if the results hold at other ISPs
• Typo modeling for getting traffic back
Related Work
• MS Strider Project [Wang et al., SRUTI 06]
• McAfee Study [Keats, McAfee White Paper 07]
• JAAL project [Banerjee et al., Infocom 08]
Conclusion
• Accurately and automatically identify typo-squatting domains
• How much traffic goes to typo-squatters
• Bound on how much traffic the target domain is losing to typo-squatting
• Inconsequential