Anti-Phishing Approaches
Lifeng Hu [email protected]
What is Phishing?
- A social engineering attack
- An attempt to trick individuals into revealing personal credentials (username, password, credit card information, etc.)
- Based on fake emails and websites
- A threat to Internet users
- Damages:
  - 73 million US adults received more than 50 phishing emails a year
  - $2.8 billion in losses a year
Phishing Methods
- Set up websites with an interface or URL similar to that of well-known websites
- Set up fraudulent websites to collect users' personal information
- Set up a transparent website that sits between the original website and its users
- Send emails containing malicious URLs
- Send emails containing embedded malicious Flash/picture files to evade the text checks of anti-phishing tools
False Positive/Negative Rates of Anti-Phishing Approaches
- False negative rate: the fraction of all phishing websites that are classified as good
- False positive rate: the fraction of all good websites that are classified as phishing
- The lower both rates are, the better the anti-phishing approach
$$f_p = \frac{N_{good \to phish}}{N_{good \to phish} + N_{good \to good}}$$

$$f_n = \frac{N_{phish \to good}}{N_{phish \to good} + N_{phish \to phish}}$$

Here $N_{a \to b}$ is the number of websites of true class $a$ that are classified as $b$.
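As a small worked example of the two rates, the sketch below counts classification outcomes and divides them as in the definitions. The function name `error_rates` and the sample labels are illustrative assumptions, not part of the original slides.

```python
def error_rates(labels, predictions):
    """Return (false_positive_rate, false_negative_rate).

    labels and predictions are equal-length sequences of "good" / "phish".
    """
    good_as_phish = good_as_good = phish_as_good = phish_as_phish = 0
    for truth, guess in zip(labels, predictions):
        if truth == "good":
            if guess == "phish":
                good_as_phish += 1
            else:
                good_as_good += 1
        else:
            if guess == "good":
                phish_as_good += 1
            else:
                phish_as_phish += 1
    total_good = good_as_phish + good_as_good
    total_phish = phish_as_good + phish_as_phish
    # Guard against empty classes so the sketch never divides by zero.
    fp = good_as_phish / total_good if total_good else 0.0
    fn = phish_as_good / total_phish if total_phish else 0.0
    return fp, fn


# Made-up sample: four good sites (one flagged), two phishing sites (one missed).
labels = ["good", "good", "good", "good", "phish", "phish"]
predictions = ["good", "phish", "good", "good", "good", "phish"]
fp_rate, fn_rate = error_rates(labels, predictions)
```

With this sample, one of four good sites is flagged and one of two phishing sites is missed, so the false positive rate is 1/4 and the false negative rate is 1/2.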
Anti-Phishing Approaches for Specific Websites
- Typically designed by the website's own company
- An example is the SiteKey mechanism of Bank of America online banking
- Pro:
  - False negative rate is low
  - False positive rate can be zero
- Con:
  - Not applicable to phishing emails
Anti-Phishing Approaches Based on Database
- Anti-phishing firewall: Kaspersky
- Anti-phishing toolbar: Netcraft
- Both are based on an online database
- The toolbar can provide URL statistics in advance
- Pro:
  - Applicable to both websites and emails
  - False negative rate can be low
  - False positive rate is low
- Con:
  - Needs frequent updates
  - Relatively hard to implement
  - False negative rate increases if the database is not up to date
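The database idea can be sketched as a simple blacklist lookup. This is only an illustration under stated assumptions: the `BLACKLIST` entries and the helper names `host_of` and `is_blacklisted` are hypothetical, and nothing here reflects how Kaspersky or Netcraft actually store or query their data.

```python
from urllib.parse import urlsplit

# Hypothetical entries; a real tool would sync these from an online database.
BLACKLIST = {"phishingsite.com", "paypa1-login.example"}


def host_of(url):
    """Extract the lowercase host name, tolerating URLs without a scheme."""
    if "://" not in url:
        url = "http://" + url
    return (urlsplit(url).hostname or "").lower()


def is_blacklisted(url, blacklist=BLACKLIST):
    """True if the URL's host, or any parent domain of it, is on the list."""
    parts = host_of(url).split(".")
    return any(".".join(parts[i:]) in blacklist for i in range(len(parts)))


known = is_blacklisted("http://login.phishingsite.com/account")
clean = is_blacklisted("https://www.paypal.com/")
# A phishing host that is not in the database yet is missed (a false negative).
unseen = is_blacklisted("http://brand-new-phish.example/")
```

The last call shows why the update cycle matters: a lookup can only flag what the database already contains.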
Anti-Phishing Approaches Based on Content
- PILFER: phishing email detection based on machine learning, combining 10 filters:
  - IP-based URL: 192.168.0.1/paypal.cgi?fix=account
  - Domain age from whois.net
  - Non-matching URL: <a href="phishingsite.com"> paypal.com</a>
  - HTML email: hidden URLs
  - Malicious JavaScript
  - <More>…
- Pro:
  - In practice, false positive and false negative rates are relatively low
  - Machine learning methods make it possible to improve accuracy
  - No constant update is needed
- Con:
  - Still needs updates to training data and filters to adapt to new styles of phishing emails
  - Network cost is a problem
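Two of the listed filters, the IP-based URL and the non-matching URL, can be sketched with the Python standard library. This is a minimal illustration and not PILFER's implementation: the names `host_of`, `is_ip_based`, `LinkCollector` and `has_non_matching_url` are assumptions, and the step that feeds such features into a trained classifier is omitted.

```python
import ipaddress
from html.parser import HTMLParser
from urllib.parse import urlsplit


def host_of(url):
    """Extract the lowercase host name, tolerating URLs without a scheme."""
    if "://" not in url:
        url = "http://" + url
    return (urlsplit(url).hostname or "").lower()


def is_ip_based(url):
    """True if the URL's host is a literal IP address."""
    try:
        ipaddress.ip_address(host_of(url))
        return True
    except ValueError:
        return False


class LinkCollector(HTMLParser):
    """Collect (href, visible text) pairs for every <a> element."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href") or ""
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def has_non_matching_url(html):
    """True if a link's visible text looks like a host that differs from its href."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    for href, text in parser.links:
        # Only compare when the visible text itself looks like a host name.
        if "." in text and " " not in text and host_of(text) != host_of(href):
            return True
    return False


ip_flag = is_ip_based("192.168.0.1/paypal.cgi?fix=account")
mismatch_flag = has_non_matching_url('<a href="phishingsite.com"> paypal.com</a>')
```

The host comparison is exact, so a link shown as `paypal.com` that points to `www.paypal.com` would also be flagged; a real filter would need a more forgiving domain comparison.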
Anti-Phishing Approaches Based on Content (cont.)
- CANTINA: phishing website detection based on TF-IDF weights
  - TF: the number of times a given term appears in a specific document
  - IDF: a measure of the general importance of the term across all documents
  - TF-IDF = TF × IDF; a high weight marks a term that is frequent in the given document but rare across the collection
  - Search for the five terms of the current web page with the highest TF-IDF weights in a search engine such as Google
  - The current web page should appear in the top N (30) search results to be considered legitimate
- CANTINA also uses filters similar to PILFER's to reduce false positives
- Pro:
  - False positive and false negative rates are very low
  - No constant update is needed
  - Search engine ranking is relatively hard to cheat
- Con:
  - Network cost is a problem
  - Too many searches for phishing websites may affect those websites' ranking
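The TF-IDF step can be sketched as follows. This is a toy illustration, not CANTINA's code: the smoothed IDF formula, the tiny corpus, and the names `top_tfidf_terms`, `looks_legitimate` and `fake_search` are all assumptions, and the search engine is replaced by a stand-in callable that returns a list of result domains.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase alphabetic terms."""
    return re.findall(r"[a-z]+", text.lower())


def top_tfidf_terms(page, corpus, k=5):
    """Return the k terms of `page` with the highest TF * IDF weight."""
    tf = Counter(tokenize(page))
    docs = [set(tokenize(d)) for d in corpus]
    n = len(docs)

    def idf(term):
        df = sum(term in d for d in docs)
        # Smoothed IDF (an assumption) so unseen terms do not divide by zero.
        return math.log((1 + n) / (1 + df))

    scores = {term: count * idf(term) for term, count in tf.items()}
    # Highest weight first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


def looks_legitimate(page_domain, page_text, corpus, search, n_results=30):
    """Query `search` with the top terms and check the page's own domain."""
    query = " ".join(top_tfidf_terms(page_text, corpus))
    return page_domain in search(query)[:n_results]


corpus = [
    "the bank and the account",
    "the weather and the news",
    "the login page of the shop",
]
page = "paypal paypal account login the the the"
top_terms = top_tfidf_terms(page, corpus, k=3)


def fake_search(query):
    # Stand-in for a real search engine; always returns the same domains.
    return ["paypal.com", "example.org"]


genuine = looks_legitimate("paypal.com", page, corpus, fake_search)
spoofed = looks_legitimate("phishingsite.com", page, corpus, fake_search)
```

In this toy run the common word "the" gets zero weight because it occurs in every corpus document, while "paypal" ranks first; a page hosted on a domain missing from the search results is treated as suspicious.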
Summary of Mentioned Anti-Phishing Approaches

| Anti-Phishing Approach | False Positive | False Negative | Implementation Effort | Adaptation | Update Cycle |
|---|---|---|---|---|---|
| For Specific Websites | Zero | Low | Easy | Specific Website | None |
| Firewall Based on Database | Low | Medium | Medium | General Web/Email | Very Frequently |
| Toolbar Based on Database | Low | Low | Hard | General Web/Email | Very Frequently |
| PILFER | Low | Low | Medium | General Email | Sometimes |
| CANTINA | Very Low | Low | Medium | General Websites | Few |
Thanks!