1
WhoWas: A Platform for Measuring Web Deployments on IaaS Clouds
Liang Wang*, Antonio Nappa+, Juan Caballero+, Thomas Ristenpart*, Aditya Akella*
* University of Wisconsin-Madison
+ IMDEA Software Institute
2
Motivation
An increasing number of services are using clouds
Understanding cloud usage patterns is important
What is the usage pattern of a website?
How many instances are used by a website?
Do tenants leverage elasticity?
Is piratebay using EC2?
Are there OpenVPN servers in EC2?
- Design new services & applications
- Design provisioning & scaling algorithms
3
Motivation
We need more measurement tools
Little research about how tenants use public clouds
Deepfield, 2012: 1/3 of daily users, 1% of Internet traffic are associated with AWS
He et al., IMC 2013: 4% of the Alexa top million are in EC2/Azure
- Answer the question: Who is using public clouds?
- Technique: Investigate DNS entries for Alexa top websites and network packet capture data.
- No insight into changes to deployment patterns over time
Bermudez et al., INFOCOM 2013: Exploring the cloud from passive measurements: The Amazon AWS case
4
Contributions
We develop a new measurement platform, WhoWas, to facilitate measurement studies of public cloud services
WhoWas
• High churn rates of IPs used by services each day
• Most web services use a single IP
• New software adopted slowly. Outdated software popular
• Quantify growth in usage of EC2 & Azure
• Small number of malicious websites in clouds
The WhoWas Platform
Analysis
Clustering Engine
VPC Map
Feature Generator
IP ranges
TCP SYN Probes
At most 3 probes for an IP per day
At most two GET requests for an IP per day
HTTP GET: http(s)://1.1.1.1/
IP=1.1.1.1
Lightweight probing to associate content to IPs over time
5
WhoWasDB
Analysis APIs
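The probing limits on this slide can be illustrated with a minimal sketch. It is not the WhoWas implementation: the function `probe_ip`, the injected callables `syn_probe` and `http_get`, and the choice to spend the two GET requests on one `http://` and one `https://` attempt are all assumptions made for illustration.

```python
from typing import Callable, Optional

MAX_SYN_PROBES = 3    # slide: at most 3 probes for an IP per day
MAX_GET_REQUESTS = 2  # slide: at most two GET requests for an IP per day


def probe_ip(
    ip: str,
    round_number: int,
    syn_probe: Callable[[str], bool],            # hypothetical: True if the IP answers
    http_get: Callable[[str], Optional[str]],    # hypothetical: page body or None
) -> Optional[dict]:
    """Return an <IP, round, content> record, or None if the IP never answered."""
    # any() short-circuits, so probing stops at the first answered SYN.
    if not any(syn_probe(ip) for _ in range(MAX_SYN_PROBES)):
        return None

    # Assumed reading of "http(s)://1.1.1.1/": try HTTP first, then HTTPS.
    urls = [f"http://{ip}/", f"https://{ip}/"][:MAX_GET_REQUESTS]
    for url in urls:
        body = http_get(url)
        if body is not None:
            return {"ip": ip, "round": round_number, "url": url, "content": body}

    # The IP is alive but served no page on the default ports.
    return {"ip": ip, "round": round_number, "url": None, "content": None}
```

Passing the network operations in as callables keeps the budget logic separate from the I/O, so the per-IP limits can be checked without sending any traffic.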
6
Ethical Measurement Design
• Lightweight, low-frequency probing
• Robots.txt checking
• Note in the User-Agent
• IP exclusion list
• Collected data kept private
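A minimal sketch of how the robots.txt check, the User-Agent note, and the IP exclusion list could fit together, using Python's standard `urllib.robotparser`. The User-Agent string, the excluded address, the function name `may_fetch`, and the decision to treat a missing robots.txt as permission are all assumptions, not details from the talk.

```python
from typing import Optional
from urllib.robotparser import RobotFileParser

# Hypothetical User-Agent carrying a note that identifies the measurement.
USER_AGENT = "example-measurement-bot (research probe; contact: admin@example.org)"

# Hypothetical exclusion list of IPs whose owners asked not to be probed.
EXCLUDED_IPS = {"203.0.113.7"}


def may_fetch(ip: str, robots_txt: Optional[str], path: str = "/") -> bool:
    """Decide whether a page on this IP may be requested."""
    if ip in EXCLUDED_IPS:
        return False
    if robots_txt is None:
        # Assumption: no robots.txt means no stated restriction.
        return True
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(USER_AGENT, f"http://{ip}{path}")
```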
Data Collection & DataSets
EC2: 4,702,208 IPs, Oct 2013 – Dec 2013, 51 rounds
Azure: 495,872 IPs, Nov 2013 – Dec 2013, 46 rounds
About 900 GB data in total
[Figure: two time-series panels, EC2 and Azure. x-axis: Date (EC2: 10/1/2013 to 12/30/2013; Azure: 10/31/2013 to 12/30/2013). y-axis labeled "No. of clusters" (EC2 ticks: 1.02M to 1.16M; Azure ticks: 106K to 122K). Annotations: 24.4%, 22.6%, 22.6%, and 24.3% of all IPs.]
Overall growth of No. of IPs responding to probes: 4.9% in EC2 and 7.7% in Azure
7
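The percentages on this slide are simple ratios. As a minimal sketch, the two helpers below show the arithmetic; the function names are hypothetical, and any counts passed to them here are toy values, not figures from the dataset.

```python
def share_of_range(responding_ips: int, total_ips: int) -> float:
    """Percentage of the probed IP range that responded in one round."""
    return 100.0 * responding_ips / total_ips


def growth_percent(first_round: int, last_round: int) -> float:
    """Relative change between the first and last measurement round."""
    return 100.0 * (last_round - first_round) / first_round
```

For example, a toy series that starts at 1,000 responsive IPs and ends at 1,049 grows by 4.9%.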
WhoWas Engines--Clustering
WhoWas offers a new clustering heuristic
…
How to find IPs being operated by the same website?
…
Webpage Clustering
8
9
WhoWas Engines--Clustering
Feature Extractor
• Title
• Keywords
• Template
• Google Analytics ID
• Simhash of HTML textual content
• Server version
Fingerprint (six-item tuple)
• Title
• Keywords
• Template
• Google Analytics ID
• Server version
• Simhash of HTML textual content
HTML contents
For two fingerprints, check whether title1 = title2 & keyword1 = keyword2 & template1 = template2 & server1 = server2 & GID1 = GID2
No: different clusters
Yes: same top-level cluster
<IP, Round Number, Fingerprint>
<IP, Round Number, Fingerprint>
Clusters
Unsupervised clustering + Elbow method
Use simhash
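The two-level idea on this slide can be sketched as follows. The first level groups records whose five exact-match features agree, as the slide describes. The second level here is a deliberately simplified stand-in: it greedily groups simhashes by Hamming distance under a fixed threshold, rather than the unsupervised clustering with the elbow method that the slide names. The `Fingerprint` type, the `cluster` function, and the threshold value are assumptions.

```python
from collections import defaultdict
from typing import Iterable, List, NamedTuple, Tuple


class Fingerprint(NamedTuple):
    title: str
    keywords: str
    template: str
    ga_id: str
    server: str
    simhash: int


Record = Tuple[str, int, Fingerprint]  # <IP, round number, fingerprint>


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two simhash values."""
    return bin(a ^ b).count("1")


def cluster(records: Iterable[Record], max_distance: int = 3) -> List[List[Record]]:
    # Level 1: exact match on title, keywords, template, GA ID, and server.
    top_level = defaultdict(list)
    for rec in records:
        top_level[rec[2][:5]].append(rec)

    # Level 2 (simplified stand-in): greedy grouping by simhash distance
    # to the first member of each group. The threshold is arbitrary.
    clusters: List[List[Record]] = []
    for members in top_level.values():
        groups: List[List[Record]] = []
        for rec in members:
            for group in groups:
                if hamming(group[0][2].simhash, rec[2].simhash) <= max_distance:
                    group.append(rec)
                    break
            else:
                groups.append([rec])
        clusters.extend(groups)
    return clusters
```

Each returned cluster is a list of `<IP, round number, fingerprint>` records, so the IPs used by one website in a given round can be read directly from its cluster.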
10
WhoWas Engines--Clustering
The No. of clusters increased by 3.3% in EC2 and 6.2% in Azure
EC2: 1,767,072 simhashes, 243,164 clusters
Azure: 210,418 simhashes, 31,728 clusters
11
WhoWas Engines--Clustering
About 80% use 1 IP, 0.1% use more than 50 IPs
Large clusters tend to leverage cloud elasticity

| Total #IP | Mean #IP/Round | Min #IP | Max #IP |
|---|---|---|---|
| 51,211 | 33,145 | 30,624 | 34,509 |
| 15,283 | 5,597 | 5,435 | 5,785 |
| 3,869 | 2,029 | 1,724 | 2,228 |
| 22,226 | 1,167 | 179 | 2,501 |
| 8,488 | 617 | 57 | 1,837 |

Top 5 clusters by average number of IP addresses used per round (EC2)
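The four columns of the table can be derived from `<IP, round number, cluster>` records. The sketch below is one way to compute them; the function name `cluster_ip_stats` and the record layout are assumptions, and the mean is taken only over rounds in which the cluster was observed.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def cluster_ip_stats(records: Iterable[Tuple[str, int, str]]) -> Dict[str, dict]:
    """records: (ip, round_number, cluster_id) triples."""
    all_ips = defaultdict(set)                         # cluster -> every IP ever seen
    per_round = defaultdict(lambda: defaultdict(set))  # cluster -> round -> IPs
    for ip, round_number, cluster_id in records:
        all_ips[cluster_id].add(ip)
        per_round[cluster_id][round_number].add(ip)

    stats = {}
    for cluster_id, rounds in per_round.items():
        counts = [len(ips) for ips in rounds.values()]
        stats[cluster_id] = {
            "total_ips": len(all_ips[cluster_id]),     # Total #IP
            "mean_per_round": sum(counts) / len(counts),
            "min_ips": min(counts),
            "max_ips": max(counts),
        }
    return stats
```

A total that is much larger than the per-round maximum indicates that the cluster keeps changing which IPs it uses across rounds.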
12
More Results from WhoWas
1. Feature Adoption
2. Malicious Activity
3. Cloud Availability
4. Software Adoption
14
Virtual Private Cloud Mapping
Host A, Public IP=a
Host B, Public IP=b
DNS
Resolve Host A: get a private IP != a
Resolve Host B: always get public IP b
VPC networks
Classic network
Default DNS hostname = region-specific string + IP
EC2 Data Center
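The DNS-based mapping in this diagram can be sketched as a small classifier. Several things here are assumptions rather than facts from the slide: the hostname layout `ec2-a-b-c-d.<suffix>`, the function names, and above all the reading that a private address (Host A) indicates a classic instance while the public address (Host B) indicates VPC. The lookup is assumed to be issued from a machine inside the EC2 data center.

```python
import ipaddress
import socket
from typing import Callable


def default_hostname(public_ip: str, region_suffix: str) -> str:
    """Build the default DNS hostname (assumed layout: ec2-a-b-c-d.<suffix>)."""
    return "ec2-" + public_ip.replace(".", "-") + "." + region_suffix


def classify_network(
    public_ip: str,
    region_suffix: str,
    resolve: Callable[[str], str] = socket.gethostbyname,
) -> str:
    """Classify an IP as 'classic', 'vpc', or 'unknown' from one DNS lookup."""
    resolved = resolve(default_hostname(public_ip, region_suffix))
    if resolved == public_ip:
        return "vpc"       # assumed: the public IP always comes back for VPC
    if ipaddress.ip_address(resolved).is_private:
        return "classic"   # assumed: a private IP comes back for classic
    return "unknown"
```

The `resolve` parameter defaults to a real lookup but can be replaced by a stub, which makes the classification rule easy to check offline.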
15
EC2 VPC usage increases whereas classic usage decreases
Change over time in classic-only, VPC-only, and mixed clusters in EC2
classic-only VPC-only mixed clusters
16
More Results from WhoWas
1. Feature Adoption
2. Malicious Activity
3. Cloud Availability
4. Software Adoption
Lifetimes of malicious IPs are long
[Figure: CDF of malicious IP lifetime. x-axis: Lifetime (days) on EC2, 1 to 91; y-axis: CDF, 0 to 1. Annotation: 90+ days!]
60% up for 7+ days
WhoWasDB → webpage from an IP → URLs in webpage → Safe Browsing API → IP is malicious / IP is benign
EC2: 1,393 malicious URLs, 196 malicious IPs
Azure: 14 malicious URLs, 13 malicious IPs
17
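The labeling flow on this slide (stored page, extracted URLs, URL check, IP label) can be sketched as below. The URL checker is passed in as a callable named `is_malicious`, a hypothetical stand-in for a Safe Browsing lookup; no real Safe Browsing client is shown. The class and function names, and the choice to extract only absolute `href` and `src` URLs, are also assumptions.

```python
from html.parser import HTMLParser
from typing import Callable, Dict, Iterable, List, Tuple


class LinkExtractor(HTMLParser):
    """Collect absolute URLs found in href and src attributes."""

    def __init__(self) -> None:
        super().__init__()
        self.urls: List[str] = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value and value.startswith(("http://", "https://")):
                self.urls.append(value)


def flagged_urls(html: str, is_malicious: Callable[[str], bool]) -> List[str]:
    extractor = LinkExtractor()
    extractor.feed(html)
    return [url for url in extractor.urls if is_malicious(url)]


def label_ips(
    pages: Iterable[Tuple[str, str]],       # (ip, html) pairs from stored rounds
    is_malicious: Callable[[str], bool],    # hypothetical URL checker
) -> Dict[str, str]:
    """An IP is 'malicious' if any of its stored pages links to a flagged URL."""
    labels: Dict[str, str] = {}
    for ip, html in pages:
        if flagged_urls(html, is_malicious) or labels.get(ip) == "malicious":
            labels[ip] = "malicious"
        else:
            labels[ip] = "benign"
    return labels
```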
18
File hosting services are used for distributing malicious content

| Domain | # of URLs flagged as malicious |
|---|---|
| dl.dropboxusercontent.com | 993 |
| dl.dropbox.com | 936 |
| download-instantly.com | 295 |
| tr.im | 268 |
| www.wishdownload.com | 223 |

IP ranges → VirusTotal API → malicious activity history
EC2: 2,070 malicious IPs, 13,752 malicious URLs
Azure: No malicious IPs!
19
Cloud Measurement Challenge and Future
[Diagram: deployments that probing by IP is able to find and fails to find. Legend: Able to find / Fail to find.]
• Frontend VM, public IP = 1.1.1.1; backend VM, no public IP
• VPC VM
• VM with no default HTTP(S) port
• VM behind a firewall
• VM hosting a default website and other websites
• VM whose website denies IP access
Only see a portion of web servers
Only see a portion of web sites’ pages
Lower bound on number of IPs used by web services
20
Other results are in the paper!
Visit our website www.cloudwhowas.org to get more information!
21
Conclusion
WhoWas: new measurement platform
Lightweight probing to associate content to IPs over time
Used WhoWas for several first-of-their-kind measurements:
• Growth rates of IP usage
• Identification of malicious websites
• Software adoption rate in clouds
• …
Questions?
www.cloudwhowas.org