Page 1: A Technical Approach to Minimizing Spam Mallory J. Paine

A Technical Approach to Minimizing Spam

Mallory J. Paine

The Spam Epidemic

By June 2003, unsolicited commercial email, or spam, accounted for nearly 55% of total email traffic on the Internet.

Spam is on the rise! On 2/1/05, the New York Times reported that spam now accounts for “80% or more” of all email traffic.

Why is spam a problem? Can’t users just ignore it?

Current Anti-Spam Technologies

Protection of Email Address

Keyword- and Rule-based Filters

Verification Filters

Bayesian Analytical Filters

Keyword- and Rule-Based Filters

Consists of a series of rules that use Boolean logic and the textual contents of a message to determine its legitimacy.

Good because it’s simple to implement.

Bad because the analytical process is ‘dumb’: it matches literal text only.

Example: Suppose your filter contains a rule that blocks messages containing both the word ‘viagra’ and a URL. Messages containing a URL and a slight variation of ‘viagra’, such as ‘vi@gra’ or ‘v i a g r a’, still pass through the filter as legitimate messages.
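The example rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the regular expressions, the function name `is_blocked`, and the sample messages are hypothetical and are not taken from any particular filter product.

```python
import re

# Hypothetical rule from the example: block a message that contains
# both the word "viagra" and a URL. Both patterns are assumptions.
URL_PATTERN = re.compile(r"https?://\S+", re.IGNORECASE)
KEYWORD_PATTERN = re.compile(r"\bviagra\b", re.IGNORECASE)


def is_blocked(message: str) -> bool:
    """Return True when the message matches the keyword rule AND the URL rule."""
    has_keyword = KEYWORD_PATTERN.search(message) is not None
    has_url = URL_PATTERN.search(message) is not None
    return has_keyword and has_url


messages = [
    "Cheap viagra at http://example.com",
    "Cheap vi@gra at http://example.com",
    "Cheap v i a g r a at http://example.com",
]
results = [is_blocked(m) for m in messages]
for message, blocked in zip(messages, results):
    print("blocked" if blocked else "passed ", message)
```

Only the first message is blocked. The two obfuscated variants contain a URL but never match the literal keyword, so the Boolean rule lets them through.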

Verification Filters

Message classification is based on the sender of the email.

If message is from a trusted sender, it’s marked as legitimate.

If message is from a known spammer, message is immediately classified as spam.

If message is from an unknown sender, then message is placed in quarantine until successful completion of a verification process.

Failure to complete the verification process results in the sender being labeled a spammer and the message being classified as spam.
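The classification rules above can be sketched as a small state machine. The class and method names here (`VerificationFilter`, `receive`, `resolve`) are hypothetical, and adding a sender to the trusted list after a successful verification is an assumption; the slides only say that the message leaves quarantine.

```python
from enum import Enum


class Verdict(Enum):
    LEGITIMATE = "legitimate"
    SPAM = "spam"
    QUARANTINED = "quarantined"


class VerificationFilter:
    """Classifies mail by sender only; the message text is never inspected."""

    def __init__(self, trusted=(), spammers=()):
        self.trusted = set(trusted)
        self.spammers = set(spammers)
        self.quarantine = {}  # sender -> list of held messages

    def receive(self, sender: str, message: str) -> Verdict:
        if sender in self.trusted:
            return Verdict.LEGITIMATE
        if sender in self.spammers:
            return Verdict.SPAM
        # Unknown sender: hold the message until verification finishes.
        self.quarantine.setdefault(sender, []).append(message)
        return Verdict.QUARANTINED

    def resolve(self, sender: str, passed: bool) -> Verdict:
        """Record the outcome of the verification challenge for a sender."""
        self.quarantine.pop(sender, None)
        if passed:
            self.trusted.add(sender)  # assumption: verified senders become trusted
            return Verdict.LEGITIMATE
        self.spammers.add(sender)
        return Verdict.SPAM


mail_filter = VerificationFilter(trusted={"friend@example.com"})
first = mail_filter.receive("friend@example.com", "hi")
second = mail_filter.receive("stranger@example.com", "hello")
outcome = mail_filter.resolve("stranger@example.com", passed=False)
third = mail_filter.receive("stranger@example.com", "hello again")
print(first, second, outcome, third)
```

In this sketch the decision depends entirely on the sender string, so the message body plays no part in the verdict.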

Verification Filters: The Verification Process

Many different implementations of the Verification Process.

Ideally, verification process includes a task that is difficult or impossible for a computer to complete without human assistance.

Two examples: simple image verification, and a more complex form of image verification, the CAPTCHA.

Verification Filters: Advantages and Disadvantages

Good because it is nearly 100% effective at catching spam.

Bad because:

Friends have to “jump through a hoop” to send email to someone who uses a verification filter.

Verification filter ignores text contents of a message. Senders of messages that are obviously spam are still asked to complete verification. This means lots of unnecessary verification emails and wasted bandwidth.

Spam with a forged sender address that matches a trusted sender is incorrectly classified as legitimate.

Bayesian Filters: The Best Real-World Anti-Spam Technique

Calculates probability that a given message is spam by examining the words (or phrases) contained in the message.

By examining the words found in a message, and noting how many times each word has occurred in the user’s legitimate messages versus in junk emails, the Bayesian filter computes the probability that the entire message is junk.
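A minimal sketch of this word-counting idea follows. It is a plain naive Bayes classifier, not the author's implementation: the add-one smoothing, the equal prior for spam and legitimate mail, the tokenizer, and all names (`BayesianFilter`, `train`, `spam_probability`) are assumptions chosen to keep the example short.

```python
import math
import re
from collections import Counter


class BayesianFilter:
    def __init__(self):
        self.spam_words = Counter()  # word -> occurrences in junk mail
        self.ham_words = Counter()   # word -> occurrences in legitimate mail

    @staticmethod
    def tokenize(text):
        return re.findall(r"[a-z0-9']+", text.lower())

    def train(self, text, is_spam):
        target = self.spam_words if is_spam else self.ham_words
        target.update(self.tokenize(text))

    def spam_probability(self, text):
        vocabulary = len(set(self.spam_words) | set(self.ham_words))
        spam_total = sum(self.spam_words.values())
        ham_total = sum(self.ham_words.values())
        log_odds = 0.0  # assumption: spam and legitimate mail are equally likely a priori
        for word in self.tokenize(text):
            # Add-one smoothing keeps unseen words from producing zero probabilities.
            p_spam = (self.spam_words[word] + 1) / (spam_total + vocabulary + 1)
            p_ham = (self.ham_words[word] + 1) / (ham_total + vocabulary + 1)
            log_odds += math.log(p_spam / p_ham)
        log_odds = max(-700.0, min(700.0, log_odds))  # avoid overflow in exp()
        return 1.0 / (1.0 + math.exp(-log_odds))


untrained = BayesianFilter().spam_probability("cheap pills")

bayes = BayesianFilter()
bayes.train("buy cheap pills now", is_spam=True)
bayes.train("cheap pills online", is_spam=True)
bayes.train("lunch meeting tomorrow", is_spam=False)
bayes.train("see you at lunch tomorrow", is_spam=False)

junk_score = bayes.spam_probability("cheap pills")
legit_score = bayes.spam_probability("lunch tomorrow")
print(untrained, junk_score, legit_score)
```

With empty word counts the sketch returns exactly 0.5 for any message, which is another way of saying an untrained filter cannot tell the two classes apart. After the four training messages, "cheap pills" scores well above 0.5 and "lunch tomorrow" well below it.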

Bayesian Filters: Advantages and Disadvantages

Good because a good implementation will classify messages with >99% accuracy and close to zero false positives.

Bad because the filter must be trained. The filter maintains databases of words from legit emails and of words from junk emails. Initially, these databases are empty. Until the databases contain a substantial amount of information about a user’s email, the filter will be unable to classify messages with any accuracy whatsoever.

Now what?

Given that every anti-spam technique presented has significant disadvantages, it’s clear that none will effectively solve the spam epidemic.

The current email system, which is based on the POP3 and SMTP protocols, has fundamental flaws. These protocols are very “open”: SMTP, which handles sending, was designed without verifying who the sender is, so spammers can send millions of messages without restriction.

The email system must be redesigned entirely.

Email and the Future

Two Variations of a Viable Overhaul of the Email System

Implement an email system where the sender of a message is charged a small monetary fee for every message sent ($0.001, for example). This fee is negligible to the average user, but translates into a cost-prohibitive fee of $1000 for every million messages sent.

Implement a similar system where the sender is instead charged a computational fee for each message sent. This computational fee requires perhaps 0.1 seconds of processing time to complete, which translates into ~28 hours of computing time for every million messages.
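The slides do not say how a computational fee would be charged. One well-known way to do it is a hashcash-style proof of work, sketched below as an assumption rather than as the author's proposal: the sender must find a nonce whose SHA-256 hash, combined with a per-message header, starts with a given number of zero bits. The header format, the function names, and the 12-bit difficulty are all hypothetical; a real deployment would tune the difficulty toward the target cost per message.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count how many zero bits the digest starts with."""
    value = int.from_bytes(digest, "big")
    return len(digest) * 8 - value.bit_length()


def mint_stamp(header: str, difficulty_bits: int) -> int:
    """Sender side: brute-force search; expected work doubles with each extra bit."""
    for nonce in count():
        digest = hashlib.sha256(f"{header}:{nonce}".encode()).digest()
        if leading_zero_bits(digest) >= difficulty_bits:
            return nonce


def verify_stamp(header: str, nonce: int, difficulty_bits: int) -> bool:
    """Receiver side: a single hash is enough to check the stamp."""
    digest = hashlib.sha256(f"{header}:{nonce}".encode()).digest()
    return leading_zero_bits(digest) >= difficulty_bits


# Hypothetical header: recipient, date, and a message identifier.
header = "alice@example.com:2005-02-01:msg-0001"
nonce = mint_stamp(header, difficulty_bits=12)
print(nonce, verify_stamp(header, nonce, 12))
```

The asymmetry is the point of the design: minting takes thousands of hash attempts on average at this difficulty, while verifying takes one, so the cost falls on the sender and not on the recipient.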

Given that spammers rely on email being free and easy to send in order to send tens of millions of messages per day, either of these two solutions is more than adequate to solve the spam epidemic.

It is unlikely that either system will be implemented, however, because the current email system is deeply ingrained in all sorts of mainstream technology.
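The per-million figures quoted for the two schemes can be checked with a few lines of arithmetic. The fee values come from the slides; the variable names are only illustrative.

```python
MESSAGES = 1_000_000

monetary_fee_per_message = 0.001  # dollars per message, from the slides
compute_fee_per_message = 0.1     # seconds per message, from the slides

total_dollars = MESSAGES * monetary_fee_per_message
total_hours = MESSAGES * compute_fee_per_message / 3600  # seconds -> hours

print(f"${total_dollars:,.0f} per million messages")
print(f"{total_hours:.1f} hours of computing per million messages")
```

This gives $1,000 and about 27.8 hours per million messages, which matches the rounded figure of ~28 hours. At ten million messages per day, the same fees come to $10,000 or roughly 278 hours of computing every day.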