SNS Analysis using Cloud Computing Services
Keywords: SNS, Cloud, AWS, S3, Hadoop, MapReduce
![Page 1: SNS Analysis using Cloud Computing Services](https://reader033.vdocuments.us/reader033/viewer/2022052505/55500cf7b4c90535638b47d7/html5/thumbnails/1.jpg)
SNS Analysis using Cloud Computing Services
DHT-based Key-Value Storage and MapReduce-based Analysis
DongWoo [email protected]
SocialFlowOikoLabDSOiko
Laboratory 2CloudKR
PlatformDay2009
![Page 2: SNS Analysis using Cloud Computing Services](https://reader033.vdocuments.us/reader033/viewer/2022052505/55500cf7b4c90535638b47d7/html5/thumbnails/2.jpg)
Agenda
‣ Introduction
  • Social Network Service
  • Motivation : Visualization, Social Network Analysis
  • SocialFlow
  • Scale Out Technologies : Cloud Computing
‣ SNS Analysis Architecture based on Cloud
  • Overall Process
  • Crawling
  • DHT Storage (CouchDB)
  • MapReduce
  • Pair-Wise Similarity
‣ Cloud Computing Service
  • Amazon Web Service
  • EC2 / S3 / Elastic MapReduce
  • Tips
‣ References
![Page 3: SNS Analysis using Cloud Computing Services](https://reader033.vdocuments.us/reader033/viewer/2022052505/55500cf7b4c90535638b47d7/html5/thumbnails/3.jpg)
Introduction
Mobile Device
Cloud Computing
Social Network
![Page 4: SNS Analysis using Cloud Computing Services](https://reader033.vdocuments.us/reader033/viewer/2022052505/55500cf7b4c90535638b47d7/html5/thumbnails/4.jpg)
Social Network Service
“Social Applications = Social Networks”
“A social network is a collection of people bound together through a specific set of social relations.”
“A collection of people is a social network if and only if it is possible for something to spread virally through that collection.”
![Page 5: SNS Analysis using Cloud Computing Services](https://reader033.vdocuments.us/reader033/viewer/2022052505/55500cf7b4c90535638b47d7/html5/thumbnails/5.jpg)
Social Network Services : Twitter, Facebook
![Page 6: SNS Analysis using Cloud Computing Services](https://reader033.vdocuments.us/reader033/viewer/2022052505/55500cf7b4c90535638b47d7/html5/thumbnails/6.jpg)
Social Applications
![Page 7: SNS Analysis using Cloud Computing Services](https://reader033.vdocuments.us/reader033/viewer/2022052505/55500cf7b4c90535638b47d7/html5/thumbnails/7.jpg)
Social Networks
http://www.vincos.it/world-map-of-social-networks/
![Page 8: SNS Analysis using Cloud Computing Services](https://reader033.vdocuments.us/reader033/viewer/2022052505/55500cf7b4c90535638b47d7/html5/thumbnails/8.jpg)
Social Network Analysis
‣ Social Graph Analysis
‣ Visualization
‣ Person-to-Person Relationship
‣ Temporal Mind Mining (Content Clustering)
‣ Post-Mortem Log Processing
![Page 9: SNS Analysis using Cloud Computing Services](https://reader033.vdocuments.us/reader033/viewer/2022052505/55500cf7b4c90535638b47d7/html5/thumbnails/9.jpg)
Social Network Analysis : Visualization
‣ Social Graph (50 People)
![Page 10: SNS Analysis using Cloud Computing Services](https://reader033.vdocuments.us/reader033/viewer/2022052505/55500cf7b4c90535638b47d7/html5/thumbnails/10.jpg)
Social Network Analysis : Visualization
‣ Social Graph (100 People)
![Page 11: SNS Analysis using Cloud Computing Services](https://reader033.vdocuments.us/reader033/viewer/2022052505/55500cf7b4c90535638b47d7/html5/thumbnails/11.jpg)
Social Network Analysis : Visualization
‣ Social Graph (200 People)
‣ Limitations
  • Visualization
  • Computational Complexity
![Page 12: SNS Analysis using Cloud Computing Services](https://reader033.vdocuments.us/reader033/viewer/2022052505/55500cf7b4c90535638b47d7/html5/thumbnails/12.jpg)
Social Network Analysis : Visualization
‣ Social 3D Graph
![Page 13: SNS Analysis using Cloud Computing Services](https://reader033.vdocuments.us/reader033/viewer/2022052505/55500cf7b4c90535638b47d7/html5/thumbnails/13.jpg)
SocialFlow
‣ Thoughts, Feelings, Interests, Relationships and Information in SNS
‣ Real-time Massive Social Data Streams
‣ Difficult to follow the Social Streams
‣ Need a way to get a summary or clustered information based on Common Interests
![Page 14: SNS Analysis using Cloud Computing Services](https://reader033.vdocuments.us/reader033/viewer/2022052505/55500cf7b4c90535638b47d7/html5/thumbnails/14.jpg)
SocialFlow
‣ Getting Common Flows of people through Content Similarities
‣ Reflecting Short-Term Interests of People
‣ Extracting Hot Issues
‣ Revealing Relationships among In/Out Resources
‣ Implementing Scale-Out Technologies
‣ Evolving toward Recommendation System based on Collective Intelligence
![Page 15: SNS Analysis using Cloud Computing Services](https://reader033.vdocuments.us/reader033/viewer/2022052505/55500cf7b4c90535638b47d7/html5/thumbnails/15.jpg)
Scale Out Technologies : Cloud Computing
![Page 16: SNS Analysis using Cloud Computing Services](https://reader033.vdocuments.us/reader033/viewer/2022052505/55500cf7b4c90535638b47d7/html5/thumbnails/16.jpg)
Why Cloud Computing?
‣ SPOF (Single Point of Failure)
‣ Cluster Administration (Who does this?)
‣ Initial Infrastructure Investment (Risk Management)
‣ Focus on Main Thing (Intelligence)
‣ Enable Highly Scalable Services
New resource provision paradigms for Grid Infrastructures: Virtualization and Cloud / ISGC 2009
http://tinyurl.com/nacgu7
![Page 17: SNS Analysis using Cloud Computing Services](https://reader033.vdocuments.us/reader033/viewer/2022052505/55500cf7b4c90535638b47d7/html5/thumbnails/17.jpg)
Cloud Computing: e.g. Storage Failure
Failure Trends in a Large Disk Drive Population, by Eduardo Pinheiro, Wolf-Dietrich Weber and Luiz André Barroso, Google Inc.
![Page 18: SNS Analysis using Cloud Computing Services](https://reader033.vdocuments.us/reader033/viewer/2022052505/55500cf7b4c90535638b47d7/html5/thumbnails/18.jpg)
SNS Analysis Architecture based on Cloud
![Page 19: SNS Analysis using Cloud Computing Services](https://reader033.vdocuments.us/reader033/viewer/2022052505/55500cf7b4c90535638b47d7/html5/thumbnails/19.jpg)
Experimental Project
‣ Python / Django / Boto
‣ ML / Data Mining
‣ DHT / CouchDB
‣ Cloud / AWS S3, EC2, Hadoop MapReduce
![Page 20: SNS Analysis using Cloud Computing Services](https://reader033.vdocuments.us/reader033/viewer/2022052505/55500cf7b4c90535638b47d7/html5/thumbnails/20.jpg)
Workflow
[Diagram labels: SNS, Crawler, MapReduce, Post-Processing, CDN, User; split between the In-house Cluster (Local DataCenter) and the Cloud Service]
![Page 21: SNS Analysis using Cloud Computing Services](https://reader033.vdocuments.us/reader033/viewer/2022052505/55500cf7b4c90535638b47d7/html5/thumbnails/21.jpg)
Technologies : Before
‣ Crawler: several crawler instances
‣ Key-Value Storage: CouchDB
‣ Consistent DHT: Hash_ring
‣ MapReduce: CouchJS
‣ Machine Learning: HomeMade
![Page 22: SNS Analysis using Cloud Computing Services](https://reader033.vdocuments.us/reader033/viewer/2022052505/55500cf7b4c90535638b47d7/html5/thumbnails/22.jpg)
Technologies : After
‣ Crawler: several crawler instances
‣ Key-Value Storage: CouchDB
‣ Consistent DHT: Hash_ring
‣ MapReduce: EC2 Hadoop
‣ Machine Learning: HomeMade
‣ Storage: S3
![Page 23: SNS Analysis using Cloud Computing Services](https://reader033.vdocuments.us/reader033/viewer/2022052505/55500cf7b4c90535638b47d7/html5/thumbnails/23.jpg)
Crawling
[Diagram labels: four Crawlers, four DBs, DHT Replication, Indexer (Mapper), Index File of [ term, doc ] pairs]
‣ Fetching recent postings from SNS
‣ Storing fetched postings in CouchDB storage through the DHT layer (which selects a server)
‣ Pushing raw data into the Cloud to process it with MapReduce
![Page 24: SNS Analysis using Cloud Computing Services](https://reader033.vdocuments.us/reader033/viewer/2022052505/55500cf7b4c90535638b47d7/html5/thumbnails/24.jpg)
Consistent DHT (Distributed Hash Table)
‣ Uniform key distribution and load balancing with a good hash function
‣ Minimizing the effects of a storage crash or temporary downtime
‣ High availability with a replication scheme
[Diagram: hash ring of nodes 0, 1, 2, ..., k-1, k, k+1, ..., N-1; Replicate(k, k-1, k+1) marks the replicas of node k]
‣ Notice: A real node has non-linear portions of the total key space.
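The ring described on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the Hash_ring library used in the project: the `HashRing` class, the node names `couch-0` to `couch-3`, the choice of MD5, and the 64 virtual nodes per server are all assumptions made for the example. It shows the property the slide relies on: when one storage node goes down, only the keys that lived on that node are reassigned.

```python
import bisect
import hashlib


class HashRing:
    """Minimal consistent hash ring with virtual nodes (illustrative only)."""

    def __init__(self, nodes, vnodes=64):
        self.vnodes = vnodes
        self._ring = []  # sorted list of (position, node) tuples
        for node in nodes:
            self.add(node)

    @staticmethod
    def _position(label):
        # MD5 is used only to spread labels over the ring, not for security.
        return int(hashlib.md5(label.encode("utf-8")).hexdigest(), 16)

    def add(self, node):
        # Each real node owns several scattered positions on the ring.
        for i in range(self.vnodes):
            bisect.insort(self._ring, (self._position(f"{node}#{i}"), node))

    def remove(self, node):
        self._ring = [(pos, n) for pos, n in self._ring if n != node]

    def lookup(self, key):
        """Return the node owning the first ring position at or after the key."""
        idx = bisect.bisect_left(self._ring, (self._position(key), ""))
        return self._ring[idx % len(self._ring)][1]


ring = HashRing(["couch-0", "couch-1", "couch-2", "couch-3"])
keys = [f"posting:{i}" for i in range(1000)]
before = {k: ring.lookup(k) for k in keys}

ring.remove("couch-2")  # simulate a storage crash
after = {k: ring.lookup(k) for k in keys}

moved = [k for k in keys if before[k] != after[k]]
# Only keys that lived on the failed node are reassigned.
assert all(before[k] == "couch-2" for k in moved)
```

The virtual nodes are one common way to get a more even key distribution; they also mean a real node owns several scattered portions of the key space rather than one contiguous arc.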
![Page 25: SNS Analysis using Cloud Computing Services](https://reader033.vdocuments.us/reader033/viewer/2022052505/55500cf7b4c90535638b47d7/html5/thumbnails/25.jpg)
Consistent DHT (Distributed Hash Table)
[Diagram labels: DHT ring (nodes 0, 1, 2, k-1, k, k+1, N-1), Memory Cache, DHT Front End, AWS S3 (html, image), SNS Analysis, SNS Crawler, Admin View, User View, Admin Traffic, Anonymous User Traffic, Generated Contents]
![Page 26: SNS Analysis using Cloud Computing Services](https://reader033.vdocuments.us/reader033/viewer/2022052505/55500cf7b4c90535638b47d7/html5/thumbnails/26.jpg)
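Consistent DHT : Replication
[Diagram: four nodes A, B, C, D on a ring; each node stores its own partition plus replicas of its two neighbours' partitions (node A: A, B, D; node B: B, C, A; node C: C, D, B; node D: D, A, C), so partition B is held by nodes A, B and C]
* Replica = 2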
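The replica layout in the diagram can be reproduced with a short sketch. It assumes the Replicate(k, k-1, k+1) rule from the previous slide: the data of node k is also copied to its two ring neighbours. The function names `placement`, `stored_partitions` and `read_from` are hypothetical, chosen for this example only.

```python
def placement(nodes, k):
    """Nodes holding partition k: the primary, then neighbours k-1 and k+1."""
    n = len(nodes)
    return [nodes[k], nodes[(k - 1) % n], nodes[(k + 1) % n]]


def stored_partitions(nodes):
    """Map each node to the partitions it holds (its own plus two replicas)."""
    held = {node: [] for node in nodes}
    for k, partition in enumerate(nodes):
        for holder in placement(nodes, k):
            held[holder].append(partition)
    return held


def read_from(nodes, k, alive):
    """Pick the first live holder of partition k, or None if all are down."""
    for holder in placement(nodes, k):
        if holder in alive:
            return holder
    return None


nodes = ["A", "B", "C", "D"]
held = stored_partitions(nodes)
# Node A keeps its own data plus replicas of B and D.
# Partition B (index 1) survives a crash of node B: it is read from A.
fallback = read_from(nodes, 1, alive={"A", "C", "D"})
```

With two replicas per partition, a partition becomes unavailable only if its primary and both neighbours are down at the same time.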
![Page 27: SNS Analysis using Cloud Computing Services](https://reader033.vdocuments.us/reader033/viewer/2022052505/55500cf7b4c90535638b47d7/html5/thumbnails/27.jpg)
CouchDB (Key-Value Storage)
‣ Erlang-based Key-Value Storage
‣ Storage Engine (MVCC, B-tree)
‣ RESTful API
‣ Server-side JavaScript Engine (MapReduce)
‣ View Engine
‣ Futon Web UI
![Page 28: SNS Analysis using Cloud Computing Services](https://reader033.vdocuments.us/reader033/viewer/2022052505/55500cf7b4c90535638b47d7/html5/thumbnails/28.jpg)
CouchDB: Server-side Javascript
‣ Purpose
  • Local Computations on Local Data Sets
‣ Features
  • Mozilla’s SpiderMonkey
  • MapReduce Framework with JavaScript
  • Fork External Process (couchjs)
‣ Performance Enhancements Expected
  • Google’s V8 (Chrome’s JavaScript Engine / JIT)
http://tinyurl.com/m76sx3
![Page 29: SNS Analysis using Cloud Computing Services](https://reader033.vdocuments.us/reader033/viewer/2022052505/55500cf7b4c90535638b47d7/html5/thumbnails/29.jpg)
CouchDB: MapReduce
doc = (d1, d2, fq)
dx: { di }
![Page 30: SNS Analysis using Cloud Computing Services](https://reader033.vdocuments.us/reader033/viewer/2022052505/55500cf7b4c90535638b47d7/html5/thumbnails/30.jpg)
Map & Reduce : Pair-Wise Similarity
[Diagram: DBs → Indexer → Index File [ term, doc ] → Group File [ term, { docs } ] → Candidate File ([ term, { docs } ] => [ doc1, doc2 ]) → Result File [ freq, doc1, doc2 ]; components shown: Indexer, DocGrouper, DocCombinator, DocPairCounter, running as Mapper and Reducer stages, plus a Doc File]
‣ Indexer and Grouper for processing Korean.
‣ No NLP and no structural analysis.
‣ Produces a pairwise similarity between two postings.
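The data flow on this slide can be followed end to end with a small in-memory sketch. It is not the project's Hadoop code: the function names are made up for the example, and whitespace tokenization stands in for the Korean indexer, which the slides do not describe. The similarity score of a pair is taken here as the number of terms the two postings share, matching the [ freq, doc1, doc2 ] result format.

```python
from collections import Counter, defaultdict
from itertools import combinations


def index_map(doc_id, text):
    """Indexer: emit one [term, doc] pair per distinct term in a posting."""
    for term in set(text.lower().split()):
        yield term, doc_id


def group_reduce(pairs):
    """Grouping step: collect [term, doc] pairs into [term, {docs}]."""
    groups = defaultdict(set)
    for term, doc_id in pairs:
        groups[term].add(doc_id)
    return groups


def combine_map(groups):
    """Combination step: expand each [term, {docs}] into [doc1, doc2] pairs."""
    for docs in groups.values():
        for pair in combinations(sorted(docs), 2):
            yield pair


def count_reduce(candidates):
    """Counting step: count shared terms per pair, as [freq, doc1, doc2]."""
    counts = Counter(candidates)
    return sorted(((freq, d1, d2) for (d1, d2), freq in counts.items()),
                  reverse=True)


postings = {
    "p1": "hadoop mapreduce on ec2",
    "p2": "running hadoop mapreduce jobs",
    "p3": "coffee on a rainy day",
}
pairs = [kv for doc_id, text in postings.items()
         for kv in index_map(doc_id, text)]
result = count_reduce(combine_map(group_reduce(pairs)))
# p1 and p2 share two terms; p1 and p3 share only "on".
```

Because candidate pairs are generated per term, two postings that share no term never appear in the result, which keeps the comparison far below all-pairs cost for sparse data.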
![Page 31: SNS Analysis using Cloud Computing Services](https://reader033.vdocuments.us/reader033/viewer/2022052505/55500cf7b4c90535638b47d7/html5/thumbnails/31.jpg)
Map & Reduce : Optimization
‣ Concerns
  • Consider Key Group Size Distribution
  • Data Load Balancing
  • Barrier Point
‣ Sample Data
  • Two months of postings from my friends
  • Reachable graph: 4,060 People
  • Total Postings: 206,115
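Why the key group size distribution matters can be seen from simple arithmetic: a term shared by n postings produces n(n-1)/2 candidate pairs, so one very common term can dominate the whole job. The group sizes below are invented for illustration, and pruning oversized groups is only one possible mitigation, not something the slides state the project did.

```python
def candidate_pairs(group_size):
    """Number of [doc1, doc2] pairs produced by a group of this size."""
    return group_size * (group_size - 1) // 2


def prune(group_sizes, max_size):
    """Drop groups larger than max_size (hypothetical mitigation)."""
    return {t: n for t, n in group_sizes.items() if n <= max_size}


# Hypothetical sizes: one very common term and three ordinary ones.
group_sizes = {"hot-term": 2000, "term-a": 20, "term-b": 20, "term-c": 10}

work = {term: candidate_pairs(n) for term, n in group_sizes.items()}
total = sum(work.values())
share = work["hot-term"] / total  # fraction of all pairs from one key

pruned_total = sum(candidate_pairs(n)
                   for n in prune(group_sizes, 100).values())
```

In this example a single key accounts for more than 99% of the pairs, so the reducer that receives it becomes the barrier point for the whole step.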
![Page 32: SNS Analysis using Cloud Computing Services](https://reader033.vdocuments.us/reader033/viewer/2022052505/55500cf7b4c90535638b47d7/html5/thumbnails/32.jpg)
Pair-Wise Similarity and its TreeMap
Postings: 110,008
Users: 2,691
Score >= 6
![Page 33: SNS Analysis using Cloud Computing Services](https://reader033.vdocuments.us/reader033/viewer/2022052505/55500cf7b4c90535638b47d7/html5/thumbnails/33.jpg)
Pair-Wise Similarity and its Cluster
➡ One issue and different opinions among people
![Page 34: SNS Analysis using Cloud Computing Services](https://reader033.vdocuments.us/reader033/viewer/2022052505/55500cf7b4c90535638b47d7/html5/thumbnails/34.jpg)
Pair-Wise Similarity and its Cluster
➡ Common Interest / Hot Issue
![Page 35: SNS Analysis using Cloud Computing Services](https://reader033.vdocuments.us/reader033/viewer/2022052505/55500cf7b4c90535638b47d7/html5/thumbnails/35.jpg)
Pair-Wise Similarity and its Cluster
➡ One person and the similar contents pattern (specialty)
![Page 36: SNS Analysis using Cloud Computing Services](https://reader033.vdocuments.us/reader033/viewer/2022052505/55500cf7b4c90535638b47d7/html5/thumbnails/36.jpg)
Pair-Wise Similarity and its Cluster
➡ Similar Structure of Sentences (trendy, parody)
![Page 37: SNS Analysis using Cloud Computing Services](https://reader033.vdocuments.us/reader033/viewer/2022052505/55500cf7b4c90535638b47d7/html5/thumbnails/37.jpg)
Deployment
www
Flickr
S3/CloudFront
EC2
![Page 38: SNS Analysis using Cloud Computing Services](https://reader033.vdocuments.us/reader033/viewer/2022052505/55500cf7b4c90535638b47d7/html5/thumbnails/38.jpg)
Cloud Computing Service
![Page 39: SNS Analysis using Cloud Computing Services](https://reader033.vdocuments.us/reader033/viewer/2022052505/55500cf7b4c90535638b47d7/html5/thumbnails/39.jpg)
Before the Cloud Age
‣ Smart Shell Guru’s Daily Work : Parallel Sort
$ wc -l data
$ split -l 1000k data
(distribute the chunks to the worker machines: scp / NFS)
$ nohup ./work.sh data1 > data1.processed
$ nohup sort -r data1.processed > data1.sorted
(collect the sorted chunks: scp / NFS)
$ sort -rm data*.sorted > data.sorted
➡ Need to prepare/maintain physical machines and resources
➡ Need to monitor job progress (wait and see job’s status)
➡ Need to cope with machine failure (slave nodes / storages / networks)
➡ Need to schedule multiple jobs
Complexity
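The split, sort and merge steps of this manual workflow map directly onto a few lines of Python. This is only a single-machine sketch: the worker threads stand in for the separate machines reached over scp or NFS, and the function names are made up for the example. `heapq.merge` plays the role of `sort -rm`, merging chunks that are already sorted in reverse order.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def split(lines, chunk_size):
    """Like `split -l`: cut the input into chunks of chunk_size lines."""
    return [lines[i:i + chunk_size] for i in range(0, len(lines), chunk_size)]


def sort_chunk(chunk):
    """Like `sort -r` on one chunk (one worker machine in the slide)."""
    return sorted(chunk, reverse=True)


def parallel_sort(lines, chunk_size, workers=4):
    chunks = split(lines, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        sorted_chunks = list(pool.map(sort_chunk, chunks))
    # Like `sort -rm`: merge chunks that are already reverse-sorted.
    return list(heapq.merge(*sorted_chunks, reverse=True))


data = ["pear", "apple", "fig", "kiwi", "banana", "cherry", "date"]
merged = parallel_sort(data, chunk_size=3)
```

Everything the slide lists as a burden (provisioning machines, watching jobs, handling failures, scheduling) is missing from this sketch too, which is the point of the comparison with a managed MapReduce service.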
![Page 40: SNS Analysis using Cloud Computing Services](https://reader033.vdocuments.us/reader033/viewer/2022052505/55500cf7b4c90535638b47d7/html5/thumbnails/40.jpg)
Amazon Web Service : Overview
[Diagram of AWS components]
‣ EC2 (Elastic Compute Cloud): instances launched from an AMI (Machine Image); Admin access over SSH; Clients over HTTP
‣ EBS (Elastic Block Store): 1 GB to 1 TB, mounted on EC2
‣ S3 (Simple Storage Service): Buckets, Objects, Permissions, key-value; Clients over HTTP (Header)
‣ SimpleDB
‣ CloudFront: Edges
‣ SQS (Simple Queue Service): Messages
‣ Elastic MapReduce: Instant EC2 Hadoop Cluster
‣ CloudWatch (Monitoring), Auto Scaling, Elastic Load Balancing
‣ Import/Export: Offline, eSATA/USB
‣ Access: API, Mgmt Console, EC2 CLI; Access Key ID, Secret Access Key, Key Pair
![Page 41: SNS Analysis using Cloud Computing Services](https://reader033.vdocuments.us/reader033/viewer/2022052505/55500cf7b4c90535638b47d7/html5/thumbnails/41.jpg)
Amazon Web Service
‣ Amazon Management Console
![Page 42: SNS Analysis using Cloud Computing Services](https://reader033.vdocuments.us/reader033/viewer/2022052505/55500cf7b4c90535638b47d7/html5/thumbnails/42.jpg)
AWS : AMI
AMI: Amazon Machine Image
![Page 43: SNS Analysis using Cloud Computing Services](https://reader033.vdocuments.us/reader033/viewer/2022052505/55500cf7b4c90535638b47d7/html5/thumbnails/43.jpg)
AWS : Paid AMI / The Cloud Market
AMI: Amazon Machine Image
Paid AMI
![Page 44: SNS Analysis using Cloud Computing Services](https://reader033.vdocuments.us/reader033/viewer/2022052505/55500cf7b4c90535638b47d7/html5/thumbnails/44.jpg)
AWS : How to make an AMI (1)
Loopback File
# dd if=/dev/zero of=new_image.fs bs=1M count=1024
Make ext3 file system
# mke2fs -F -j new_image.fs
# mkdir /mnt/ec2-fs
# mount -o loop new_image.fs /mnt/ec2-fs
# mkdir /mnt/ec2-fs/dev
# /sbin/MAKEDEV -d /mnt/ec2-fs/dev -x console
# /sbin/MAKEDEV -d /mnt/ec2-fs/dev -x null
# /sbin/MAKEDEV -d /mnt/ec2-fs/dev -x zero
# mkdir /mnt/ec2-fs/etc
Create /mnt/ec2-fs/etc/fstab (Add /dev/sda1 --> /, /dev/pts, shm, /proc, /sys)
Create yum-xen.conf
# mkdir /mnt/ec2-fs/proc
# mount -t proc none /mnt/ec2-fs/proc
# yum -c yum-xen.conf --installroot=/mnt/ec2-fs -y groupinstall Base
Edit /mnt/ec2-fs/etc/sysconfig/network-scripts/ifcfg-eth0
Edit /mnt/ec2-fs/etc/sysconfig/network
Edit /mnt/ec2-fs/etc/fstab (Add /dev/sda2 --> /mnt, /dev/sda3 --> swap)
# chroot /mnt/ec2-fs /bin/sh
Edit services
![Page 45: SNS Analysis using Cloud Computing Services](https://reader033.vdocuments.us/reader033/viewer/2022052505/55500cf7b4c90535638b47d7/html5/thumbnails/45.jpg)
AWS : How to make an AMI (2)
Building an AMI
# yum install ruby
# rpm -i ec2-ami-tools-noarch.rpm (Download from public s3 bucket)
# ec2-bundle-image -i new_image.fs -k my-private-key.key -u aws-user-id
Local Machine Root File System
# ec2-bundle-vol -k my-private-key.key -s 1000 -u aws-user-id
Upload to S3
# ec2-upload-bundle -b my-bucket -m image.manifest -a my-aws-access-key-id -s my-secret-key-id
Register AMI
# ec2-register my-bucket/image.manifest
IMAGE ami-xxxx
Testing
# ec2-describe-images ami-xxxx
Deregister AMI
# ec2-deregister ami-xxxx
Running AMI
# ec2-run-instances ami-xxxx -n 1
http://docs.amazonwebservices.com/AWSEC2/2006-06-26/DeveloperGuide/
![Page 46: SNS Analysis using Cloud Computing Services](https://reader033.vdocuments.us/reader033/viewer/2022052505/55500cf7b4c90535638b47d7/html5/thumbnails/46.jpg)
AWS : EC2 Running Instance
‣ AWS Management Console
![Page 47: SNS Analysis using Cloud Computing Services](https://reader033.vdocuments.us/reader033/viewer/2022052505/55500cf7b4c90535638b47d7/html5/thumbnails/47.jpg)
AWS : EC2 Running Instance
![Page 48: SNS Analysis using Cloud Computing Services](https://reader033.vdocuments.us/reader033/viewer/2022052505/55500cf7b4c90535638b47d7/html5/thumbnails/48.jpg)
Amazon Web Service: Access Methods
‣ Access Key ID / Secret Access Key / Key Pairs
‣ Amazon Management Console
‣ EC2 API (WSDL) / EC2 CLI (Command Line Interface)
‣ SSH
‣ Firefox Extensions
  • S3 Firefox Organizer
  • Elasticfox
‣ S3
  • DNS: s3 CNAME s3.amazonaws.com.
    e.g. Bucket Name: s3.xyz.com, so http://s3.xyz.com ---> S3’s s3.xyz.com bucket
‣ s3cmd (python)
‣ s3cmd.rb / s3sync.rb (ruby)
‣ S3Hub (Mac)
![Page 49: SNS Analysis using Cloud Computing Services](https://reader033.vdocuments.us/reader033/viewer/2022052505/55500cf7b4c90535638b47d7/html5/thumbnails/49.jpg)
Amazon Web Service: Elasticfox
‣ Firefox Extension: Elasticfox
![Page 50: SNS Analysis using Cloud Computing Services](https://reader033.vdocuments.us/reader033/viewer/2022052505/55500cf7b4c90535638b47d7/html5/thumbnails/50.jpg)
Amazon Web Service: Elasticfox
‣ Key Pairs
‣ Private Key
‣ SSH
![Page 51: SNS Analysis using Cloud Computing Services](https://reader033.vdocuments.us/reader033/viewer/2022052505/55500cf7b4c90535638b47d7/html5/thumbnails/51.jpg)
Amazon Web Service: Elasticfox
‣ Security Groups
‣ Open Network Ports
![Page 52: SNS Analysis using Cloud Computing Services](https://reader033.vdocuments.us/reader033/viewer/2022052505/55500cf7b4c90535638b47d7/html5/thumbnails/52.jpg)
AWS: Elastic MapReduce
‣ EC2 + Hadoop
‣ Tools
  • Management Console
  • elastic-mapreduce CLI
‣ Preparation
  • Code --> S3
  • Data --> S3
  • Log Folder
  • Output Folder
‣ Job Flow
  • Streaming
  • Custom Jar
  • Sample Applications
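For the Streaming job flow type, the mapper and reducer are ordinary programs that read lines and write tab-separated key/value lines. The sketch below shows that contract with a term count; it is an assumed example, not the SocialFlow code. In a real streaming job each function would read `sys.stdin` and print its output, and Hadoop would sort the mapper output by key before the reducer sees it; here `sorted` simulates that shuffle locally.

```python
from itertools import groupby


def map_lines(lines):
    """Mapper: emit 'term<TAB>1' for every whitespace-separated term."""
    for line in lines:
        for term in line.split():
            yield f"{term}\t1"


def reduce_lines(sorted_lines):
    """Reducer: sum the counts of consecutive lines that share a key."""
    keyed = (line.rstrip("\n").split("\t", 1) for line in sorted_lines)
    for term, group in groupby(keyed, key=lambda kv: kv[0]):
        yield f"{term}\t{sum(int(value) for _, value in group)}"


# Local simulation of map -> shuffle/sort -> reduce.
lines = ["hadoop on ec2", "hadoop on s3"]
shuffled = sorted(map_lines(lines))
totals = list(reduce_lines(shuffled))
```

The reducer depends on its input being sorted by key, which is why a local dry run with `sorted` is a useful check before uploading code and data to S3.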
![Page 53: SNS Analysis using Cloud Computing Services](https://reader033.vdocuments.us/reader033/viewer/2022052505/55500cf7b4c90535638b47d7/html5/thumbnails/53.jpg)
AWS: Elastic MapReduce
![Page 54: SNS Analysis using Cloud Computing Services](https://reader033.vdocuments.us/reader033/viewer/2022052505/55500cf7b4c90535638b47d7/html5/thumbnails/54.jpg)
AWS: Elastic MapReduce : Web UI
![Page 55: SNS Analysis using Cloud Computing Services](https://reader033.vdocuments.us/reader033/viewer/2022052505/55500cf7b4c90535638b47d7/html5/thumbnails/55.jpg)
AWS: Elastic MapReduce : CLI for Workflow
jobflow #id
input/* → Step1 → output1/part-000** → Step2 → output2/part-000** → Step3 → output3/part-000**
![Page 56: SNS Analysis using Cloud Computing Services](https://reader033.vdocuments.us/reader033/viewer/2022052505/55500cf7b4c90535638b47d7/html5/thumbnails/56.jpg)
AWS: Elastic MapReduce
‣ Failed tasks will be rescheduled on other Hadoop slaves.
‣ When a task finishes, any other running instance of the same task is killed by the tracker.
![Page 57: SNS Analysis using Cloud Computing Services](https://reader033.vdocuments.us/reader033/viewer/2022052505/55500cf7b4c90535638b47d7/html5/thumbnails/57.jpg)
AWS: Elastic MapReduce
![Page 58: SNS Analysis using Cloud Computing Services](https://reader033.vdocuments.us/reader033/viewer/2022052505/55500cf7b4c90535638b47d7/html5/thumbnails/58.jpg)
AWS: SocialFlow Automation
[Diagram labels: DHT; Home IDC, Amazon, Wild World; Users, Admin; Read Only, Read/Write; Local, Global; S3; boto (python) launching an EC2 pool; Results; Renderer]
![Page 59: SNS Analysis using Cloud Computing Services](https://reader033.vdocuments.us/reader033/viewer/2022052505/55500cf7b4c90535638b47d7/html5/thumbnails/59.jpg)
AWS: EC2, EMR Price Model

| Service | Type | Size | Per Instance Hour | 1 Week (7 Days), USD | 1 Week (7 Days), KRW |
|---|---|---|---|---|---|
| EC2 | On-Demand | S | $0.10 | $16.80 | 20,865 |
| EC2 | On-Demand | L | $0.40 | $67.20 | 83,462 |
| EC2 | On-Demand | E | $0.80 | $134.40 | 166,924 |
| EC2 | Reserved (1yr $325, 3yr $500) | S | $0.03 | $5.04 | 6,259 |
| EC2 | Reserved (1yr $325, 3yr $500) | L | $0.12 | $20.16 | 25,038 |
| EC2 | Reserved (1yr $325, 3yr $500) | E | $0.24 | $40.32 | 50,077 |
| Elastic MapReduce | On-Demand | S | $0.10 + $0.015 | $19.32 | 23,995 |
| Elastic MapReduce | On-Demand | L | $0.40 + $0.06 | $77.28 | 95,981 |
| Elastic MapReduce | On-Demand | E | $0.80 + $0.12 | $154.56 | 191,963 |

1 USD = 1242 KRW; (S) = Small, (L) = Large, (E) = Extra Large
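The weekly figures follow from the hourly prices by simple arithmetic: hourly rate (plus the Elastic MapReduce surcharge where it applies) times 168 hours, converted at the stated 1 USD = 1242 KRW. The sketch below reproduces them; the function name is made up, and truncating the KRW amount is an assumption inferred from the slide's numbers.

```python
from decimal import Decimal

HOURS_PER_WEEK = 24 * 7  # 168
KRW_PER_USD = Decimal("1242")


def weekly_cost(hourly_usd, surcharge_usd="0"):
    """Return (USD per week, KRW per week) for one instance running 24x7."""
    usd = (Decimal(hourly_usd) + Decimal(surcharge_usd)) * HOURS_PER_WEEK
    krw = int(usd * KRW_PER_USD)  # truncated, which matches the slide
    return usd, krw


ec2_small = weekly_cost("0.10")               # On-Demand Small
ec2_reserved_large = weekly_cost("0.12")      # Reserved Large (fee excluded)
emr_extra_large = weekly_cost("0.80", "0.12")  # EC2 price plus EMR surcharge
```

The Reserved rows exclude the one-time reservation fee, so they only compare favourably if the instance runs for a large share of the reserved term.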
![Page 60: SNS Analysis using Cloud Computing Services](https://reader033.vdocuments.us/reader033/viewer/2022052505/55500cf7b4c90535638b47d7/html5/thumbnails/60.jpg)
AWS: Performance
http://tinyurl.com/qj6ao7
![Page 61: SNS Analysis using Cloud Computing Services](https://reader033.vdocuments.us/reader033/viewer/2022052505/55500cf7b4c90535638b47d7/html5/thumbnails/61.jpg)
AWS: Performance
![Page 62: SNS Analysis using Cloud Computing Services](https://reader033.vdocuments.us/reader033/viewer/2022052505/55500cf7b4c90535638b47d7/html5/thumbnails/62.jpg)
AWS: Performance
http://tinyurl.com/p9jsyz
![Page 63: SNS Analysis using Cloud Computing Services](https://reader033.vdocuments.us/reader033/viewer/2022052505/55500cf7b4c90535638b47d7/html5/thumbnails/63.jpg)
AWS: Performance
http://tinyurl.com/cqqxgl
![Page 64: SNS Analysis using Cloud Computing Services](https://reader033.vdocuments.us/reader033/viewer/2022052505/55500cf7b4c90535638b47d7/html5/thumbnails/64.jpg)
10 Cent Tips
‣ AWS EC2
  • Minimizing set-up time with prepared shell scripts
  • Use Boto for automating deployments
  • Use S3 (Free of Charge between S3 and EC2 in the same region)
  • $0.030 per GB through June 30, 2009 ($0.1 per GB normal price)
‣ AWS Elastic MapReduce
  • Enabling the SSH port (22) and Hadoop related ports (9100, 9101)
  • Access to Master Node: ssh -i keypair hadoop@public_dns_name
  • Double Check (PATH, etc)
  • Debug, Debug, Debug
  • Use EC2 for Hadoop (e.g. Cloudera’s Hadoop AMI) (No extra cost for Hadoop!)
![Page 65: SNS Analysis using Cloud Computing Services](https://reader033.vdocuments.us/reader033/viewer/2022052505/55500cf7b4c90535638b47d7/html5/thumbnails/65.jpg)
10 Cent Tips
‣ AWS S3
  • Setting HTTP header for images and static resources
    Cache-Control: max-age=31536000
  • Block Search Bots: robots.txt at the root of a Bucket
    User-agent: *
    Disallow: /
  • Using BitTorrent for large files
    http://s3.xyz.com/xfile.zip?torrent
  • Compress Rendered HTML with gzip
    Content-Encoding: gzip
$ s3cmd put index.html s3://s3.xyz.com/www \
    --mime-type "text/html" \
    --add-header "Content-Encoding: gzip" \
    --acl-public
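The gzip and caching tips can be prepared on the uploading side before any S3 call is made. The sketch below only builds the compressed body and the header values; it does not talk to S3, and the function names are hypothetical. Note that S3 serves the object exactly as stored, so the body must already be gzip-compressed when `Content-Encoding: gzip` is set.

```python
import gzip

ONE_YEAR_SECONDS = 365 * 24 * 60 * 60  # 31536000, as on the slide


def prepare_html(html_text):
    """Gzip rendered HTML and return (body, headers) ready for upload."""
    body = gzip.compress(html_text.encode("utf-8"))
    headers = {
        "Content-Type": "text/html",
        "Content-Encoding": "gzip",
    }
    return body, headers


def static_headers():
    """Long-lived cache header for images and other static resources."""
    return {"Cache-Control": f"max-age={ONE_YEAR_SECONDS}"}


page = "<html><body>" + "SocialFlow " * 200 + "</body></html>"
body, headers = prepare_html(page)
saved_bytes = len(page.encode("utf-8")) - len(body)
```

A one-year max-age suits files whose names change when their content changes; for pages that are re-rendered under the same key, a shorter lifetime is the safer choice.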
![Page 66: SNS Analysis using Cloud Computing Services](https://reader033.vdocuments.us/reader033/viewer/2022052505/55500cf7b4c90535638b47d7/html5/thumbnails/66.jpg)
Amazon Web Service : Limitations
![Page 67: SNS Analysis using Cloud Computing Services](https://reader033.vdocuments.us/reader033/viewer/2022052505/55500cf7b4c90535638b47d7/html5/thumbnails/67.jpg)
References
‣ 10 MapReduce Tips, Cloudera, http://tinyurl.com/pxuqup
‣ Christian Charras, Thierry Lecroq, Handbook of Exact String-Matching Algorithms
‣ Dan Pritchett (eBay), BASE: An ACID Alternative, p.48-55, ACM Queue May/June 2008
‣ Edward Chang (Google Research), Mining Large Scale Social Networks, MMDS ’08
‣ Edward Walker, Benchmarking Amazon EC2 for high-performance scientific computing
‣ Matei Zaharia et al, Improving MapReduce Performance in Heterogeneous Environments, OSDI ’08
‣ Following Twitter
  • http://twitter.com/AmazonEC2
  • http://twitter.com/AmazonS3S3