IMC 2011: Measurement and Evaluation of a Real World Deployment of a Challenge-Response Spam Filter

Jelena Isacenkova and Davide Balzarotti


8/3/2019 IMC 2011, Measurement and Evaluation of a Real World Deployment of a Challenge-Response Spam Filter

http://slidepdf.com/reader/full/imc-2011-measurement-and-evaluation-of-a-real-world-deployment-of-a-challenge-response 1/23

Measurement and Evaluation of a Real World Deployment of a Challenge-Response Spam Filter

Jelena Isacenkova and Davide Balzarotti


12/22/11 Eurecom

Background

― In 2010 spam accounted for 89.1% of the emails

― Traditional email filtering is based on:

→ Content-based classification

→ Sender's properties, IP reputation, domain authenticity

Thorsten Holz, Honeyblog b


Traditional email filtering


Challenge-Response (CR) filtering


The responsibility of the email delivery is shifted from the recipient to the sender of the message


Myths and criticism of CR systems

― Annoying to other users

― Email delivery delay

― Email backscatter

― Mail server blacklisting

― Newsletter non-delivery

― Burdens CR system email users

― Some real emails may go missing

― Deadlocks between two CR servers

Garriss et al., “RE: Reliable Email”

Erickson et al., “The Effectiveness of Whitelisting: a User-Study”


Dataset and system architecture

― 47 mail servers (13 open relays)

― Monitoring period: 6 months

― 19,426 protected users

― Total messages: 90M

― Total challenges: 4.3M

― Solved challenges: 151K
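As a quick check on these figures, the share of challenges that were ever solved follows from the two counts above. A minimal Python sketch, assuming the rounded values shown on the slide (4.3M challenges sent, 151K solved); the variable names are illustrative and not taken from the talk:

```python
# Rounded counts as shown on the slide; the exact values are not given here.
challenges_sent = 4_300_000
challenges_solved = 151_000

# Fraction of challenges that some sender actually solved.
solved_share = challenges_solved / challenges_sent
# Everything else was sent without ever being answered.
unsolved_share = 1 - solved_share

print(f"solved:   {solved_share:.1%}")
print(f"unsolved: {unsolved_share:.1%}")
```

With these rounded inputs, roughly 3.5% of the challenges are solved, which is in line with the 96% of useless challenges reported in the conclusions.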


System architecture


Our contribution

― Internet viewpoint

― Users' viewpoint

― Administrator's viewpoint

– Email backscatter phenomenon

– False negatives in CR system

– Message delivery delay

– CR server blacklisting

– Misdirected challenges


Backscattered messages

― Misdirected automated bounce messages (e.g. Non-Delivery Notification)

― The cause is the usage of forged email addresses (spoofing)

― CR systems work as reflectors: the unknown incoming messages are challenged with an automated message aiming to verify the source


Backscatter ratio evaluation

How much email, and of what kind, does a CR system pour into the Internet?

― Reflection ratio:

R = RefC / IncE = 19.3%

― Backscattered ratio (spam sent by the CR system):

β <= 8.7%
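The reflection ratio above can be sketched as a small helper. This is only an illustration: it assumes RefC means the number of challenges the system reflected back and IncE the number of incoming emails that reached the CR filter (the slide gives only the abbreviations), and the counts in the example are hypothetical, chosen to reproduce the reported percentage.

```python
def reflection_ratio(reflected_challenges: int, incoming_emails: int) -> float:
    """R = RefC / IncE: challenges sent per incoming email."""
    if incoming_emails <= 0:
        raise ValueError("incoming_emails must be positive")
    return reflected_challenges / incoming_emails


# Hypothetical counts, used only to illustrate the formula.
r = reflection_ratio(reflected_challenges=193, incoming_emails=1000)
print(f"R = {r:.1%}")
```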


19.3%: good or bad?

R → 0%    R → 89%

Harmful Useless


False negatives in CR systems

Do CR systems provide 100% spam protection in a real world deployment?

― To answer, we clustered challenged emails by their subject and sender similarity
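The slide does not say which clustering algorithm or thresholds were used. As a rough illustration of the idea, the sketch below groups messages greedily using the standard-library `difflib.SequenceMatcher`; the function names, the 0.8 thresholds, the rule that both subject and sender must match, and the sample messages are all assumptions made for this example.

```python
from difflib import SequenceMatcher


def similar(a: str, b: str, threshold: float) -> bool:
    """True if the two strings are at least `threshold` similar (0..1)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def cluster(emails, subject_thr=0.8, sender_thr=0.8):
    """Greedy single-pass clustering of (sender, subject) pairs.

    Each message joins the first cluster whose representative (its first
    member) is similar in both subject and sender; otherwise it starts a
    new cluster.
    """
    clusters = []
    for sender, subject in emails:
        for members in clusters:
            rep_sender, rep_subject = members[0]
            if similar(subject, rep_subject, subject_thr) and similar(
                sender, rep_sender, sender_thr
            ):
                members.append((sender, subject))
                break
        else:
            clusters.append([(sender, subject)])
    return clusters


# Made-up sample messages.
emails = [
    ("promo1@shop.example", "Cheap watches 50% off"),
    ("promo2@shop.example", "Cheap watches 60% off"),
    ("alice@uni.example", "Meeting tomorrow"),
]
clusters = cluster(emails)
print(len(clusters))
```

Large clusters of near-identical challenged messages would then be candidates for spam campaigns, while singletons are more likely to be ordinary correspondence.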


False Negatives

― The requirements for High-Volume (HV) spam to reach a CR-protected user's mailbox are high:

1. Pass reverse DNS, DNSBL, AntiVirus checks

2. Spoof an existing user address (HV spam often uses non-existing email addresses)

3. The innocent user needs to solve a CAPTCHA

― In our experiment, an innocent user solved a CAPTCHA in 1 of every 10,000 challenges sent


Email delivery delay

What is the actual email delivery delay for new communications?

→ 94% of the performed communications are between known contacts

→ 50% of the senders that are ever whitelisted get whitelisted in the first 30 minutes, mostly by senders solving the CAPTCHA

→ 0.6% of white emails are delayed more than 1 day


Server blacklisting

― Spam traps

→ Designed to lure spam

→ BL services use them to update their blacklists

→ If a challenge hits a spam trap, the server may get blacklisted

― Two experiments:

→ Monitoring mail server logs for 'blacklisted' error messages

→ Tracking the MTA-OUT IPs in 8 known blacklisting services
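The slides do not describe how the blacklist tracking was implemented. A common way to check an IPv4 address against a DNS-based blacklist is to reverse its octets, append the blacklist zone, and resolve the resulting name: an answer means the address is listed. The sketch below follows that convention; `dnsbl.example.org` is a placeholder zone and both function names are hypothetical.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNSBL lookup name: reversed IPv4 octets + blacklist zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip: str, zone: str) -> bool:
    """Return True if the lookup name resolves, i.e. the IP appears listed.

    Simplification: any resolution failure is treated as "not listed".
    """
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        return False


# Placeholder address and zone; no network access is needed for this line.
print(dnsbl_query_name("192.0.2.10", "dnsbl.example.org"))
```

Running `is_listed` periodically for each outgoing server address and each monitored zone would yield a timeline of listings similar in spirit to the second experiment.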


Server blacklisting

In our 3-month experiment, 75% of the servers never appeared in any blacklist

― But some servers were blacklisted:

→ During the holidays, when it stayed unnoticed

→ Several times in a row

→ For most of the monitoring period

― No correlation found between the number of challenges sent and server blacklisting


Conclusions

― CR systems are capable of providing 99.99% spam protection (on top of other deployed anti-spam filters)

― The filter has a number of side effects:

→ Email traffic increase (~ 0.62%)

→ Useless challenges sent (96%)

→ User email delays:

° 3% delayed for 30 minutes

° 0.6% delayed for more than one day

― CR servers need to be configured and maintained very carefully

→ To lower the probability of blacklisting (and act to get the IP out of the blacklists)

→ To reduce the effect of the backscatter phenomenon


Thank you!


CAPTCHA and filter performance


Burden on users

What is the burden on users who maintain their whitelists?

→ 51% of the users have no more than 10 changes in their whitelists during a 2-month period

→ The number of very active users is very low, around 1-2% of the users

→ Average daily digest size varies greatly between users, thus burdening some of them

→ Users are responsible for checking their digest


Histograms and correlations