Understanding the Network-Level Behavior of Spammers
Authors: Anirudh Ramachandran, Nick Feamster
SIGCOMM ’06, September 11-16, 2006, Pisa, Italy
Presenter: Tao Li
Questions
What IP ranges send the most spam?
What are the common spamming modes?
How much spam comes from botnets versus other techniques (open relays, short-lived route announcements)?
How persistent is each spamming host across time?
What are the characteristics of spamming botnets?
Motivation
17-month trace of over 10 million spam messages collected at a “spam sinkhole”
Joint analysis with IP-based blacklist lookups, passive TCP fingerprinting information, routing information, and botnet “C&C” traces
Goal: find network-level properties that help design more robust network-level spam filters
Outline
Background Information
Data Collection
Data Analysis
  Network-level Characteristics of Spammers
  Spam from Botnets
  Spam from Transient BGP Announcements
Discussion
Spamming Methods
Direct spamming
  Buy connectivity from “spam-friendly” ISPs
Open relays and proxies
  Allow unauthenticated hosts to relay email
Botnets
  Use infected hosts as mail relays
BGP Spectrum Agility
  Hijack a route, send spam, then withdraw the route
Mitigation techniques
Content filters
  Filtering rules must be updated continually
  Large corpora are needed for training
  Spammers can easily change content
Blacklist lookups
  Spammers can send spam from stolen IP addresses
  Many bot IP addresses are short-lived
Spam Email Traces
“Sinkhole” corpus domain, 8/5/2004 to 1/6/2006
  No legitimate email addresses
  Has a DNS Mail Exchange (MX) record
  Runs Mail Avenger, an SMTP server
Collected for each message:
  IP address of the relay
  A traceroute to that IP address
  A passive “p0f” TCP fingerprint, used to identify the OS
  Results of DNS blacklist (DNSBL) lookups
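As a minimal sketch of how a DNSBL lookup is formed (not the sinkhole's actual code): the octets of the relay's IPv4 address are reversed and prepended to the blacklist's DNS zone, and the resulting name is then resolved. The function name `dnsbl_query_name` and the zone `dnsbl.example.org` are placeholders chosen for illustration.

```python
import ipaddress


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name to resolve when checking an IPv4 address against a DNSBL zone."""
    # Validate the input; raises ValueError for anything that is not an IPv4 address.
    addr = ipaddress.IPv4Address(ip)
    # DNSBLs are keyed by the address octets in reverse order.
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


# Hypothetical relay address and blacklist zone.
print(dnsbl_query_name("192.0.2.99", "dnsbl.example.org"))
```

If the name resolves (conventionally to an address in 127.0.0.0/8), the relay is listed; a name that does not resolve means it is not listed.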
Spam Email Traces
The number of spam messages and the number of distinct IP addresses are both rising
Data Collection
Legitimate Email Traces
  700,000 legitimate emails from a large email provider
Botnet Command and Control Data
  A trace of hosts infected by “Bobax”
  The authoritative DNS server for the domain running the botnet's C&C was hijacked and redirected to a honeypot
BGP Routing Measurements
  A BGP monitor colocated in the same network as the “sinkhole”
Network-level Characteristics of Spammers
Distribution Across Networks
  Distribution across IP address space
  Distribution across ASes
  Distribution by country
The Effectiveness of Blacklists
Distribution Across Networks
Distribution across IP address space
  The majority of spam comes from a relatively small fraction of the IP address space, and the distribution is persistent.
Distribution Across Networks
About 85% of client IP addresses sent fewer than 10 emails to the sinkhole.
This is important for spam filter design: a filter that relies on per-IP history sees most senders only a handful of times.
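A statistic of this kind can be computed from a list of sender addresses, one entry per received message. This is a small sketch under that assumption; the function name `fraction_low_volume` and the sample data are illustrative and not taken from the paper.

```python
from collections import Counter


def fraction_low_volume(sender_ips, threshold=10):
    """Fraction of distinct sender IPs that sent fewer than `threshold` messages."""
    counts = Counter(sender_ips)  # messages per distinct IP
    if not counts:
        return 0.0
    low = sum(1 for n in counts.values() if n < threshold)
    return low / len(counts)


# Toy trace: one heavy sender and three light ones (documentation addresses).
trace = ["192.0.2.1"] * 12 + ["192.0.2.2"] * 3 + ["198.51.100.7"] + ["203.0.113.9"] * 2
print(fraction_low_volume(trace))
```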
Distribution Across Networks
Distribution across ASes
  Over 10% of spam came from 2 ASes; 36% came from 20 ASes
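The share of spam attributable to the top k ASes can be sketched as follows, assuming each message has already been mapped to an origin AS number (that mapping step is not shown). The function name `top_k_share` and the AS numbers are hypothetical.

```python
from collections import Counter


def top_k_share(as_per_message, k):
    """Fraction of all messages that originate from the k most active ASes."""
    counts = Counter(as_per_message)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # most_common(k) returns the k ASes with the highest message counts.
    return sum(n for _, n in counts.most_common(k)) / total


# Toy data using AS numbers reserved for documentation.
origins = [64500] * 4 + [64501] * 2 + [64502, 64503, 64504, 64505]
print(top_k_share(origins, 2))
```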
Distribution Across Networks
Distribution by country
  Although the top 2 ASes from which spam was received were in Asia, 11 of the top 20 were in the USA, accounting for 40% of all the spam received from the top 20.
  Assigning a higher level of suspicion according to an email’s country of origin may be effective in filtering.
The Effectiveness of Blacklists
Nearly 80% of relays appear in at least one of the 8 blacklists
The Effectiveness of Blacklists
Spamcop lists only 50% of the spam received
Blacklists have a high false positive rate
Blacklists are ineffective against IP addresses that use more sophisticated cloaking techniques
Spam from Botnets
Bobax Topology
  Spamming hosts and Bobax drones have similar distributions across the IP address space, so much of the spam may be due to botnets
Spam from Botnets
Operating Systems of Spamming Hosts
  Only 4% of spamming hosts are not running Windows, but they sent 8% of the spam
Spam from Botnets
Spamming Bot Activity Profile
  Over 65% of bots are single-shot, and 75% of those were active for less than 2 minutes
Spam from Botnets
Spamming Bot Activity Profile
Regardless of persistence, 99% of bots sent fewer than 100 pieces of spam
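An activity profile like this can be derived from (IP, timestamp) pairs. The sketch below treats a bot's lifetime as the time between its first and last observed message, which is a simplifying assumption; the function names and the 120-second cutoff parameter are illustrative.

```python
def activity_profile(events):
    """Map each IP to (message_count, lifetime_seconds) from (ip, unix_timestamp) pairs."""
    first, last, count = {}, {}, {}
    for ip, ts in events:
        first[ip] = min(ts, first.get(ip, ts))
        last[ip] = max(ts, last.get(ip, ts))
        count[ip] = count.get(ip, 0) + 1
    # Lifetime is the span between the first and last message seen from the IP.
    return {ip: (count[ip], last[ip] - first[ip]) for ip in count}


def fraction_short_lived(profile, max_lifetime=120):
    """Fraction of IPs whose observed lifetime is at most `max_lifetime` seconds."""
    if not profile:
        return 0.0
    short = sum(1 for _, lifetime in profile.values() if lifetime <= max_lifetime)
    return short / len(profile)


# Toy events: two short-lived senders and one that persists for 1000 seconds.
events = [("a", 0), ("a", 60), ("b", 0), ("b", 1000), ("c", 5)]
profile = activity_profile(events)
print(profile, fraction_short_lived(profile))
```

The same profile also gives the per-bot message counts needed for a volume statistic such as the share of bots that sent fewer than 100 messages.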
Spam from Transient BGP Announcements
BGP Spectrum Agility
A small but persistent group of spammers appear to send spam by:
  Advertising (hijacking) large blocks of IP address space (i.e., /8s)
  Sending spam from IP addresses scattered throughout that space
  Withdrawing the route for the IP address space shortly after the spam is sent
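The announce, send, withdraw pattern above suggests a simple check: flag spam whose source address falls inside a prefix that was announced only briefly, and whose arrival time lies within that announcement window. The sketch below assumes route events have already been reduced to (prefix, announce time, withdraw time) tuples; the function name, the tuple layout, and the one-day default lifetime are all assumptions, and it ignores longest-prefix matching against other, longer-lived routes.

```python
import ipaddress


def spam_from_short_lived_routes(routes, spam, max_lifetime=86400):
    """Return the (ip, ts) spam records sent from within a short-lived announcement.

    routes: iterable of (prefix, announce_ts, withdraw_ts)
    spam:   iterable of (ip, ts)
    """
    # Keep only announcements that were withdrawn within max_lifetime seconds.
    short = [
        (ipaddress.ip_network(prefix), announced, withdrawn)
        for prefix, announced, withdrawn in routes
        if withdrawn - announced <= max_lifetime
    ]
    hits = []
    for ip, ts in spam:
        addr = ipaddress.ip_address(ip)
        if any(addr in net and a <= ts <= w for net, a, w in short):
            hits.append((ip, ts))
    return hits


# Toy data: one route that lasts 10 minutes and one that is long-lived.
routes = [("61.0.0.0/8", 1000, 1600), ("82.0.0.0/8", 0, 10_000_000)]
spam = [("61.1.2.3", 1200), ("61.1.2.3", 5000), ("82.5.5.5", 100)]
print(spam_from_short_lived_routes(routes, spam, max_lifetime=3600))
```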
Spam from Transient BGP Announcements
Announcement, withdrawal and spam from 61.0.0.0/8 and 82.0.0.0/8
Spam from Transient BGP Announcements
Prevalence of BGP Spectrum Agility
  Only 1% of spam comes from short-lived routes, but the share sometimes reaches 10%
Contribution
Suggests using network-level properties of spammers as an addition to spam mitigation techniques
Quantifies and documents spammers' use of BGP route announcements for the first time
Presents the first study examining the interplay between spam, botnets, and the Internet routing infrastructure
Provides many useful findings about the network-level properties of spam
Weakness
Uses only a small sample, so it cannot provide general conclusions about Internet-wide characteristics
Only studied spam sent by Bobax drones
The Botnet Command and Control data collection assumes that hosts were not patched and did not use dynamic addressing during the course of the trace
How to improve
Design a better notion of host identity
Develop detection techniques based on aggregate behavior
Secure the Internet routing infrastructure
Incorporate some network-level properties of spam into spam filters
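One way to read "aggregate behavior" is to count spam per network prefix instead of per individual address, so that many low-volume senders in the same range add up to a visible signal. The sketch below is only an illustration of that idea and is not a technique described in the paper; the function name `spam_by_prefix` and the /24 default are assumptions.

```python
import ipaddress
from collections import Counter


def spam_by_prefix(sender_ips, prefix_len=24):
    """Count messages per enclosing prefix of length `prefix_len`."""
    counts = Counter()
    for ip in sender_ips:
        # strict=False masks the host bits, giving the enclosing network.
        net = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
        counts[str(net)] += 1
    return counts


# Toy data: two senders share a /24, one is alone (documentation addresses).
senders = ["192.0.2.1", "192.0.2.200", "198.51.100.7"]
print(spam_by_prefix(senders))
```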