cloud computing and mobile platforms
Post on 14-Apr-2018
228 Views
Preview:
TRANSCRIPT
-
7/27/2019 Cloud Computing and Mobile Platforms
1/103
Lecture 1 Overview andPerspectives
922EU3870 Cloud Computing andMobile Platforms, Autumn 2009
2009/9/14
Ping Yeh ( ), Google, Inc.
-
7/27/2019 Cloud Computing and Mobile Platforms
2/103
2
The Morakot story
-
7/27/2019 Cloud Computing and Mobile Platforms
3/103
3
Morakot
2000mm of rain poured in southern Taiwan in 1.5 days
during holidays.
Flood, mudslide.
People crying for help, emergency phone line is jammed.
3 web sites were created for information exchange. typhoon.adct.org.tw, typhoon.oooo.tw, disastertw.com.
Other people used existing tools to create web pages.
, PTT Emergency,
PTT ( ) ,
Other people post to microblogging sites: twitter, plurk.
http://maps.google.com.tw/maps/ms?ie=UTF8&hl=zh-TW&msa=0&msid=116386460682638203042.000470a33fd5b4fcc5768&ll=23.110049,120.684814&spn=4.278449,8.366089&z=8&brcurrent=3,0x3471e089bb4338c9:0xfdddeea4a2da6d2d&iwloc=000470a383e25bfc5758bhttps://sites.google.com/site/emergencyptt/https://sites.google.com/site/emergencyptt/http://spreadsheets.google.com/pub?key=tMsw3oN65abGkmGzEXZZkeg&output=htmlhttp://spreadsheets.google.com/pub?key=tMsw3oN65abGkmGzEXZZkeg&output=htmlhttp://maps.google.com/maps/ms?ie=UTF8&hl=zh-TW&msa=0&msid=104575551371452297710.000470b5cd56d17528fce&z=8http://maps.google.com/maps/ms?ie=UTF8&hl=zh-TW&msa=0&msid=104575551371452297710.000470b5cd56d17528fce&z=8http://spreadsheets.google.com/pub?key=tMsw3oN65abGkmGzEXZZkeg&output=htmlhttps://sites.google.com/site/emergencyptt/http://maps.google.com.tw/maps/ms?ie=UTF8&hl=zh-TW&msa=0&msid=116386460682638203042.000470a33fd5b4fcc5768&ll=23.110049,120.684814&spn=4.278449,8.366089&z=8&brcurrent=3,0x3471e089bb4338c9:0xfdddeea4a2da6d2d&iwloc=000470a383e25bfc5758b -
7/27/2019 Cloud Computing and Mobile Platforms
4/103
4
typhoon.adct.org.tw
-
7/27/2019 Cloud Computing and Mobile Platforms
5/103
5
disastertw.com
-
7/27/2019 Cloud Computing and Mobile Platforms
6/103
6
typhoon.oooo.tw
-
7/27/2019 Cloud Computing and Mobile Platforms
7/1037
Problems!
Two sites went down soon after people flock there.
One site spent hours to move to a telecom's data center.
The other went up and down several times.
One site stayed functional all the time.
-
7/27/2019 Cloud Computing and Mobile Platforms
8/1038
Why?Why?
-
7/27/2019 Cloud Computing and Mobile Platforms
9/1039
Load
A server can handle limited load
Each request requires resources to handle: process, thread,CPU, memory, etc.
There is a maximum load one server can handle
The database can handle limited connections
Beyond the maximum:
can't accept new requests:timeout, intermittent service,no more connections, etc.
program crash
kernel panic
http://en.wikipedia.org/wiki/File:Ubuntu-linux-kernel-panic-by-jpangamarca.JPG
-
7/27/2019 Cloud Computing and Mobile Platforms
10/10310
How did one site survive?
One keyword: Scalability
Key points on scalability:
Simple web site (not complicated CMS): increases QPS
Deployed on Heroku: up to 3 dynos for 400k pageviews/day
peak qps estimated: > 100. quote http://heroku.com/how/dynos: 4 dynos are equivalent to
the compute power of one CPU-core on other systems.
Use content delivery network (CDN) for static data
Pay as you go: about US$62 for heroku and CDN combined. The developer's blog: http://blog.xdite.net/?p=1369
-
7/27/2019 Cloud Computing and Mobile Platforms
11/10311
Heroku: http://heroku.com/
A fast deploymentenvironment forRuby on Rails.
Quote: new dynoscan be launched in
under 2 seconds formost apps.
Built on top ofAmazon's EC2
service.
-
7/27/2019 Cloud Computing and Mobile Platforms
12/10312
What would people do if Morakot hit
us in 1989? 1999?
-
7/27/2019 Cloud Computing and Mobile Platforms
13/10313
1989
Internet in Taiwan was in the labs of III and ITRI
E-mails were mostly sent with Bitnet, only limited amountof universities have access.
The only way to spread call for help and looking forloved one messages would be via telephones and
newspapers.
-
7/27/2019 Cloud Computing and Mobile Platforms
14/10314
1999
Web was popular, about 4.8 million people were
connected.
Internet was prevalent in all universities, BBS was thedominant use.
Portals were the center of attention on the web.
The best way to spread call for help and looking forloved one messages would be through BBS.
requires manual data organization
could not handle too many connections at the same time
-
7/27/2019 Cloud Computing and Mobile Platforms
15/10315
Since ~1999: Web 2.0
An old buzzword, means different thing to different people
Self publishing: LiveJournal/Blogger.com (1999), YouTube(2005)
Metadata & feeds: RSS (1999), Atom (2003)
Collaboration: Wikpedia (2001), Google docs (2006)
Tagging: del.icio.us (2003), digg.com (2004), flickr (2004)
Ajax: Gmail (2004), Google Maps (2005)
Long Tail: Amazon (1994), Google AdSense (2005)
API: Yahoo!, GData, Amazon (who doesn't?) Aggregation: Google News (2002), Google Reader (2005)
Social: LinkedIn (2003), MySpace (2003), Facebook (2004)
Microblogging: Twitter (2006), Plurk (2008)
-
7/27/2019 Cloud Computing and Mobile Platforms
16/10316
No matter if you call it Web 2.0,
you can't deny that
more people are spending more
time on the web.
-
7/27/2019 Cloud Computing and Mobile Platforms
17/103
17
Web sites are becoming applications
Search: ease of finding information
Tags: making things easier to find without hierarchy
Authoring: ease of writing things
Rich interaction: intuitive and streamlined interface
Aggregation: data, application
Social: discover friends, stay in contact, organize events
And developers/users can mix them up!
-
7/27/2019 Cloud Computing and Mobile Platforms
18/103
18
5 > 2.0
Emerging new technologies: HTML5
canvas tag (demo at http://htmlfive.appspot.com/)
video tag (demo)
Geolocation API (demo)
App caching and databases Workers (demo)
Document editing, drag and drop, browser historymanagement, ...
Other new stuff: 3D (demo at 23:20 of Google IO Keynote)
(Re)newed stuff:
Native SVG support
-
7/27/2019 Cloud Computing and Mobile Platforms
19/103
19
The Web as a Platform
-
7/27/2019 Cloud Computing and Mobile Platforms
20/103
20
Lessons Learned
Having a group of capable web developers is not enough
for building a successful web site.
Scalability is key, because Not serving = Useless
Reverse proxy
HTTP cache
Database
Memory cache
Alternative delivery mechanism for static contents
etc
Sometimes using existing web tools is better no need torecreate the wheels.
-
7/27/2019 Cloud Computing and Mobile Platforms
21/103
21
LiveJournal circa 2007
From Brad Fitzpatrick's USENIX '07 talk:"LiveJournal: Behind the Scenes"
-
7/27/2019 Cloud Computing and Mobile Platforms
22/103
22
LiveJournal circa 2007
Frontends StorageApplicationServers
Memcache
Static File Servers
From Brad Fitzpatrick's USENIX '07 talk:"LiveJournal: Behind the Scenes"
-
7/27/2019 Cloud Computing and Mobile Platforms
23/103
23
Basic LAMP
LAMP
Linux + Apache + MySQL +
Programming Language
(perl, Python, PHP, ...)
Scalable?
Shared machine for database and webserver
Reliable?
Single point of failure (SPOF)
-
7/27/2019 Cloud Computing and Mobile Platforms
24/103
24
Database running on a separate
server
Requirements:
Another machine plus
additional management
Scalable?
Up to one web server
Reliable?
Two single points of failure
Dedicated Database
-
7/27/2019 Cloud Computing and Mobile Platforms
25/103
25
Benefits:
Grow traffic beyond the capacity of one webserver
Requirements:
More machines
Set up load balancing
Multiple Web Servers
-
7/27/2019 Cloud Computing and Mobile Platforms
26/103
26
Load Balance: DNS Round Robin
Register list of IPs with DNS
Statistical load balancing
DNS record is cached with a Time To Live (TTL)
TTL may not be respected
-
7/27/2019 Cloud Computing and Mobile Platforms
27/103
27
Register list of IPs with DNS
Statistical load balancing
DNS record is cached with a Time To Live (TTL)
TTL may not be respected
Now wait for DNS changes to propagate :-(
Load Balance: DNS Round Robin
-
7/27/2019 Cloud Computing and Mobile Platforms
28/103
28
Load Balance: DNS Round Robin
Scalable?
Add more webservers as necessary
Still I/O bound on one database
Reliable?
Cannot redirect traffic quickly Database still SPOF
-
7/27/2019 Cloud Computing and Mobile Platforms
29/103
29
Benefits: Custom Routing
Specialization
Application-level load balancing
Requirements:
More machines Configuration for reverse proxies
Reverse Proxy
-
7/27/2019 Cloud Computing and Mobile Platforms
30/103
30
Scalable?
Add more web servers
Bound by
Routing capacity of reverse proxy
One database server
Reliable?
Agile application-level routing
Specialized components are more robust
Multiple reverse proxies requires network-level routing
Fancy network routing hardware
Database is still SPOF
Reverse Proxy
-
7/27/2019 Cloud Computing and Mobile Platforms
31/103
31
Master-Slave Database
Benefits:
Better read throughput
Invisible to application
Requirements:
Even more machines Changes to MySQL, additional maintenance
-
7/27/2019 Cloud Computing and Mobile Platforms
32/103
32
Master-Slave Database
Scalable?
Scales read rate with # of servers
But not writes
But what happens eventually?
-
7/27/2019 Cloud Computing and Mobile Platforms
33/103
33
Master-Slave Database
Reliable?
Master is SPOF for writes
Master may die before replication
-
7/27/2019 Cloud Computing and Mobile Platforms
34/103
34
Partitioned Database
Benefits:
Increase in both read andwrite throughput
Requirements:
Even more machines
Lots of management
Re-architect data model
G ?
-
7/27/2019 Cloud Computing and Mobile Platforms
35/103
35
How about Google App Engine?
-
7/27/2019 Cloud Computing and Mobile Platforms
36/103
36
How do people describe Cloud
Computing in public?
Th I t ti l W k h Cl d
-
7/27/2019 Cloud Computing and Mobile Platforms
37/103
37
The International Workshop on CloudComputing - 2009
Cloud Computing is defined as a pool of virtualizedcomputer resources. Based on this virtualization the CloudComputing paradigm allows workloads to be deployed andscaled-out quickly through the rapid provisioning of virtualmachines. A Cloud Computing platform supports redundant,self-recovering, highly scalable programming models thatallow workloads to recover from many inevitablehardware/software failures and monitoring resource use inreal time for providing physical and virtual servers on whichthe applications can run. A Cloud Computing platform ismore than a collection of computer resources because itprovides a mechanism to manage those resources. In aCloud Computing platform software is migrating from thedesktop into the "clouds" of the Internet, promising usersanytime, anywhere access to their programs and data.What challenges does it present to users, programmers,and telecommunications providers?
http://www.icumt.org/w-35.html
-
7/27/2019 Cloud Computing and Mobile Platforms
38/103
38
Wikipedia
Cloud computing is a style of computing in whichdynamically scalable and often virtualized resources areprovided as a service over the Internet.[1][2] Users neednot have knowledge of, expertise in, or control over thetechnology infrastructure in the "cloud" that supports them.[3]
The concept generally incorporates combinations of thefollowing:
infrastructure as a service (IaaS)
platform as a service (PaaS) software as a service (SaaS)
Other recent (ca. 200709)[4][5] technologies that rely onthe Internet to satisfy the computing needs of users.
http://en.wikipedia.org/wiki/Cloud_computing
-
7/27/2019 Cloud Computing and Mobile Platforms
39/103
39
SearchCloudComputing.com
Cloud computing is a general term for anything thatinvolves delivering hosted services over the Internet. Theseservices are broadly divided into three categories:Infrastructure-as-a-Service (IaaS), Platform-as-a-Service(PaaS) and Software-as-a-Service (SaaS). The name cloudcomputing was inspired by the cloud symbol that's oftenused to represent the Internet in flow charts and diagrams.
A cloud service has three distinct characteristics thatdifferentiate it from traditional hosting. It is sold on demand,typically by the minute or the hour; it is elastic -- a user can
have as much or as little of a service as they want at anygiven time; and the service is fully managed by the provider(the consumer needs nothing but a personal computer andInternet access). Significant innovations in virtualization anddistributed computing, as well as improved access to high-
speed Internet and a weak economy, have accelerated
http://searchcloudcomputing.com/sDefinition/0,,sid201_gci1287881,00.html
-
7/27/2019 Cloud Computing and Mobile Platforms
40/103
40
Information Weeks' John Foley
Cloud computing is on-demand access to virtualized ITresources that are housed outside of your own data center,shared by others, simple to use, paid for via subscription,and accessed over the Web.
http://www.informationweek.com/cloud-computing/blog/archives/2008/09/a_definition_of.html
-
7/27/2019 Cloud Computing and Mobile Platforms
41/103
41
What is cloud?
internet splat map: http://flickr.com/photos/jurvetson/916142/, CC-by 2.0
baby picture: http://flickr.com/photos/cdharrison/280252512/, CC-by-sa 2.0
User
internet splat map: http://flickr.com/photos/jurvetson/916142/, CC-by 2.0
baby picture: http://flickr.com/photos/cdharrison/280252512/, CC-by-sa 2.0
The Cloud
Any Applicationon the Web
http://flickr.com/photos/jurvetson/916142/http://flickr.com/photos/cdharrison/280252512/http://flickr.com/photos/cdharrison/280252512/http://flickr.com/photos/jurvetson/916142/http://flickr.com/photos/cdharrison/280252512/http://flickr.com/photos/cdharrison/280252512/http://flickr.com/photos/cdharrison/280252512/http://flickr.com/photos/jurvetson/916142/http://flickr.com/photos/cdharrison/280252512/http://flickr.com/photos/jurvetson/916142/ -
7/27/2019 Cloud Computing and Mobile Platforms
42/103
42
Cloud = Internet(without the details)
-
7/27/2019 Cloud Computing and Mobile Platforms
43/103
43
Cloud Computing:
Applications and data are in the cloud,
accessible with any deviceconnected to the cloud
with a browser
-
7/27/2019 Cloud Computing and Mobile Platforms
44/103
44
Cloud Computing...
is about the paradigm
web
application, data
anyone with any device with a browser
focuses on internet scale
easy to use
elastic, on-demand
is less about the implementation virtualization
outside your own data center
shared, subscription
-
7/27/2019 Cloud Computing and Mobile Platforms
45/103
45
Three Types of Cloud Services
Software as a Service: Any web site that handles largescale of users and computing. Examples: Google search,Amazon, Facebook, Twitter, Salesforce, etc.
Infrastructure as a Service: Computing resources thatprovide elastic on-demand resource for rent. Examples:
Amazon EC2, GoGrid, 3tera, etc.
Platform as a Service: Software development anddeployment environment on top of a Cloud infrastructure.Example: Google App Engine, Heroku, Aptana, etc.
-
7/27/2019 Cloud Computing and Mobile Platforms
46/103
46
Evolution of Computing with the Network
Cluster Computing
Network Computing
Grid Computing
Utility Computing
Network is computer (client - server)
Tightly coupled computing resources:
CPU, storage, data, etc
Usually connected within a LAN
Managed as a single resource
Don't buy computers, lease computing powerUpload, run, download
Resource sharing acrossadministrative domains
Decentralized, open standards,non-trivial service
Separation of Functionalities
Commodity, Open Source
Global Resource Sharing
Ownership Model
Cluster and grid images are from Fermilab and CERN, respectively.
-
7/27/2019 Cloud Computing and Mobile Platforms
47/103
47
The Cloud
How many users do you want to serve?
Your CoolestWeb Application
-
7/27/2019 Cloud Computing and Mobile Platforms
48/103
48
Nov. '98: 10,000 queries on 25 computers
Apr. '99: 500,000 queries on 300 computers
Sep. '99: 3,000,000 queries on 2100 computers
Google Growth
-
7/27/2019 Cloud Computing and Mobile Platforms
49/103
49
Scalability Matters
for developers and service providers
-
7/27/2019 Cloud Computing and Mobile Platforms
50/103
50
What does Cloud Computing mean to
ordinary users?
U i
-
7/27/2019 Cloud Computing and Mobile Platforms
51/103
51 51
User-centric
If your data are in the cloud...
It "follows" you
Only need a browser, usernameand password to get your data.
Once logged in, the devicebecomes "yours"
You can get your data when you
want it, where you want it.
Easy to share, controlled by you.
E thi h URL
-
7/27/2019 Cloud Computing and Mobile Platforms
52/103
52 52
Everything has a URL
URL is a presenceof the task acentral point forrelated data andapplication in the
cloud.
C ll b ti th b
-
7/27/2019 Cloud Computing and Mobile Platforms
53/103
53 53
Collaboration on the web
Discuss yourideas whileediting thespreadsheet
I t lli t d P f l
-
7/27/2019 Cloud Computing and Mobile Platforms
54/103
54 54
Intelligent and Powerful
Example:Search
vs. Your typical desktopdocument application
Speed ~ 0.25s ~ 1s
Relevance high low
Data size O(100TB) O(1MB)
-
7/27/2019 Cloud Computing and Mobile Platforms
55/103
55
What does Cloud Computing mean to
web application developers?
U d t d I t t S l C ti
-
7/27/2019 Cloud Computing and Mobile Platforms
56/103
56 56
Understand Internet Scale Computing
Serving: dynamic content vs. static content, caching,
latency, load balancing, etc. Database: partitioned data (?), replication, data model,
memory cache, etc.
Offline processing: how to process internet-scale of input
data to produce interesting output data for every user?
Don't build all the wheels! Use what's available.
If a PaaS supports your favorite language, check it out.
Else, check out the IaaS providers for your web app.
Hosting your web app on your own servers at home oroffice: not a wise move unless you predict there will not bemany users.
Cl d S ft
-
7/27/2019 Cloud Computing and Mobile Platforms
57/103
57
Cloud Software
Google: GFS, BigTable, MapReduce. Not directly
available publicly. Hadoop: open source implementation of GFS and
MapReduce in Java.
HBase, Hypertable: open source implementation of
BigTable in Java, C++. Thrift: open source RPC framework by Facebook.
Protocol Buffers: open source data serialization and RPCservice specification by Google.
Sector: storage system in C++, claims to be 2x Hadoop.
Eucalyptus: build-your-own-infrastructure open sourcesoftware with compatible interface with Amazon EC2.
etc.
C ti th b
-
7/27/2019 Cloud Computing and Mobile Platforms
58/103
58 58
Counting the numbers
Client / Server
One : Many
Personal Computer
One : One
Cloud Computing
Many : Many
Developer transition
Wh P G l ' Cl d C i ?
-
7/27/2019 Cloud Computing and Mobile Platforms
59/103
59
What Powers Google's Cloud Computing?
Performance: single machine not interestingReliability:
Most reliable hardware will still fail:
fault-tolerant software needed Fault-tolerant software enables use
of commodity componentsStandardization: use standardized machines
to run all kinds of applications
Commodity Hardware Infrastructure Software
Distributed storage:Google File System (GFS)
Distributed semi-structured datasystem: BigTable
Distributed data processingsystem: MapReduce
chunk ...chunk ...chunk ...chunk ...
/foo/bar
google stanford edu (circa 1997)
-
7/27/2019 Cloud Computing and Mobile Platforms
60/103
60
google.stanford.edu (circa 1997)
google com (1999)
-
7/27/2019 Cloud Computing and Mobile Platforms
61/103
61
google.com (1999)
corkboards"
-
7/27/2019 Cloud Computing and Mobile Platforms
62/103
62
Google Data Center (circa 2000)
google com (new data center 2001)
-
7/27/2019 Cloud Computing and Mobile Platforms
63/103
63
google.com (new data center 2001)
google com (3 days later)
-
7/27/2019 Cloud Computing and Mobile Platforms
64/103
64
google.com (3 days later)
Current Design
-
7/27/2019 Cloud Computing and Mobile Platforms
65/103
65
Current Design
In-house rack design PC-class motherboards
Low-end storage andnetworking hardware
Linux + in-house software
How to develop a web application that scales?
-
7/27/2019 Cloud Computing and Mobile Platforms
66/103
66 66
How to develop a web application that scales?
Storage Database Serving
Google's
solution/replacement
GoogleFile
SystemBigTableMapReduce
GoogleAppEngine
Publishedpapers
Opened on2008/5/28
DataProcessing
hadoop: open source
implementation
Google File System
-
7/27/2019 Cloud Computing and Mobile Platforms
67/103
67
Google File System
GFS Client
Application
Replicas
Masters
GFS Master
GFS Master
C0 C1
C2C5
Chunkserver
C0
C2
C5
Chunkserver
C1
Chunkserver
File namespace chunk 2ef7chunk ...
chunk ...
chunk .../foo/bar
GFS Client
Application
C5 C3
Files broken into chunks(typically 64 MB)
Chunks triplicated acrossthree machines for safety(tunable)
Master manages metadata Data transfers happen
directly between clientsand chunkservers
GFS Usage @ Google
-
7/27/2019 Cloud Computing and Mobile Platforms
68/103
68
GFS Usage @ Google
200+ clusters Filesystem clusters of up
to 5000+ machines
Pools of 10000+ clients
5+ PB FilesystemsAll in the presence of
frequent HW failures
BigTable
-
7/27/2019 Cloud Computing and Mobile Platforms
69/103
69
BigTable
www.cnn.com
contents:
Rows
Columns
Timestamps
t3t11t17
Distributed multi-level sparsemap: fault-tolerant, persistent
Scalable:
Thousands of servers
Terabytes of in-memory data,petabytes of disk-based data
Self-managing
Servers can beadded/removed dynamically
Servers adjust to loadimbalance
Data model: (row, column, timestamp)
cell contents
Why not just use commercial DB?
-
7/27/2019 Cloud Computing and Mobile Platforms
70/103
70
Why not just use commercial DB?
Scale is too large or cost is too high for most commercialdatabases
Low-level storage optimizations help performancesignificantly
Much harder to do when running on top of a database layer
Also fun and challenging to build large-scale systems :)
System Structure
-
7/27/2019 Cloud Computing and Mobile Platforms
71/103
71
System Structure
Lock service
Bigtable master
Bigtable tablet server Bigtable tablet serverBigtable tablet server
GFSCluster scheduling system
holds metadata,handles master-electionholds tablet data, logshandles failover, monitoring
performs metadata ops +load balancing
serves data serves dataserves data
Bigtable client
Bigtable clientlibrary
Open()read/write
metadata ops
BigTable Cell
BigTable Summary
-
7/27/2019 Cloud Computing and Mobile Platforms
72/103
72
BigTable Summary
Data model applicable to broad range of clients
Actively deployed in many of Googles services
System provides high performance storage system on alarge scale
Self-managing Thousands of servers
Millions of ops/second
Multiple GB/s reading/writing
Currently ~500 BigTable cells
Largest bigtable cell manages ~3PB of data spread overseveral thousand machines (larger cells planned)
Distributed Data Processing
-
7/27/2019 Cloud Computing and Mobile Platforms
73/103
73
Distributed Data Processing
How do you process 1 month of apache logs to find the
usage pattern numRequest[minuteOfTheWeek]?
Input files: N rotated logs
Size: O(TB) for popular sites multiple physical disks
Processing phase 1: launch M processes input: N/M log files
output: one file of numRequest[minuteOfTheWeek]
Processing phase 2: merge M output files of step 1
Pseudo Codes for Phase 1 and 2
-
7/27/2019 Cloud Computing and Mobile Platforms
74/103
74
Pseudo Codes for Phase 1 and 2
def findBucket(requestTime):
# return minute of the week
numRequest = zeros(1440*7) # an array of 1440*7 zeros
for filename in sys.argv[2:]:
for line in open(filename):
minuteBucket = findBucket(findTime(line))
numRequest[minuteBucket] += 1
outFile = open(sys.argv[1], 'w')for i in range(1440*7):
outFile.write("%d %d\n" % (i, numRequest[i]))
outFile.close()
numRequest = zeros(1440*7) # an array of 1440*7 zerosfor filename in sys.argv[2:]:
for line in open(filename):
col = line.split()
[i, count] = [int(col[0]), int(col[1])]
numRequest[i] += count
# write out numRequest[] like phase 1
Task Management
-
7/27/2019 Cloud Computing and Mobile Platforms
75/103
75
Task Management
Logistics:
Decide which computers to run phase 1, make sure the logfiles are accessible (NFS-like or copy)
Similar for phase 2
Execution:
Launch the phase 1 programs with appropriate commandline flags, re-launch failed tasks until phase 1 is done
Similar for phase 2
Automation: build task scripts on top of existing batch
system (PBS, Condor, GridEngine, LoadLeveler, etc)
Technical Issues
-
7/27/2019 Cloud Computing and Mobile Platforms
76/103
76
Technical Issues
File management: where to store files?
Store all logs on the same file server Bottleneck!
Distributed file system: opportunity to run locally
Granularity: how to decide N and M?
Performance when M until M == N if no I/O contention
Can M > N? Yes! Careful log splitting. Is it faster?
Job allocation: assign which task to which node?
Prefer local job: knowledge of file system
Fault-recovery: what if a node crashes? Redundancy of data a must
Crash-detection and job re-allocation necessary
Performance
Robustness
Reusability
MapReduce A New Model and System
-
7/27/2019 Cloud Computing and Mobile Platforms
77/103
77
MapReduce A New Model and System
Map: (in_key, in_value) { (keyj, valuej) |j= 1, ..., K}
Reduce: (key, [value1, ... valueL]) (key, f_value)
Data store1 Data store nmap
(key 1,
values...)
(key 2,
values...)(key 3,
values...)
map
(key 1,
values...)
(key 2,
values...)(key 3,
values...)
Input key*value
pairs
Input key*value
pairs
== Barrier== : Aggregates intermediate values by output key
reduce reduce reduce
key 1,
intermediate
values
key 2,
intermediate
values
key 3,
intermediate
values
final key1values
final key2values
final key3values
...
Two phases of data processing
MapReduce Programming Model
-
7/27/2019 Cloud Computing and Mobile Platforms
78/103
78
MapReduce Programming Model
f f f f f f
f f f f f returned
initial
Borrowed from functional
programmingmap(f, [x1, x2, ...]) = [f(x1), f(x2), ...]
reduce(f, x0, [x1, x2, x3,...])= reduce(f, f(x0, x1), [x2,...])= ...
(continue until the list is exausted)
Users implement twofunctions:
map (in_key, in_value) (keyj, valuej) list
reduce [value1, ... valueL] f_value
MapReduce Version of Pseudo Code
-
7/27/2019 Cloud Computing and Mobile Platforms
79/103
79
MapReduce Version of Pseudo Code
def findBucket(requestTime):
# return minute of the week
class LogMinuteCounter(MapReduction):
def Map(key, value, output): # key is location
minuteBucket = findBucket(findTime(value))
output.collect(str(minuteBucket), "1")
def Reduce(key, iter, output):
sum = 0while not iter.done():
sum += 1
output.collect(key, str(sum))
See, no file I/O! Only data processing logic...
... and gets much more than that!
MapReduce Framework
-
7/27/2019 Cloud Computing and Mobile Platforms
80/103
80
MapReduce Framework
For certain classes of problems, the MapReduce frameworkprovides:
Automatic & efficient parallelization/distribution
I/O scheduling: Run mapper close to input data (samenode or same rack when possible, with GFS)
Fault-tolerance: restart failed mapper or reducer tasks onthe same or different nodes
Robustness: tolerate even massive failures, e.g. large-scale network maintenance: once lost 1800 out of 2000machines
Status/monitoring
Task Granularity And Pipelining
-
7/27/2019 Cloud Computing and Mobile Platforms
81/103
81
Task Granularity And Pipelining
Fine granularity tasks: many more map tasks thanmachines
Minimizes time for fault recovery
Can pipeline shuffling with map execution
Better dynamic load balancing
Often use 200,000 map/5000 reduce tasks w/ 2000machines
-
7/27/2019 Cloud Computing and Mobile Platforms
82/103
82
-
7/27/2019 Cloud Computing and Mobile Platforms
83/103
83
-
7/27/2019 Cloud Computing and Mobile Platforms
84/103
84
-
7/27/2019 Cloud Computing and Mobile Platforms
85/103
85
-
7/27/2019 Cloud Computing and Mobile Platforms
86/103
86
-
7/27/2019 Cloud Computing and Mobile Platforms
87/103
87
-
7/27/2019 Cloud Computing and Mobile Platforms
88/103
88
-
7/27/2019 Cloud Computing and Mobile Platforms
89/103
89
-
7/27/2019 Cloud Computing and Mobile Platforms
90/103
90
-
7/27/2019 Cloud Computing and Mobile Platforms
91/103
91
-
7/27/2019 Cloud Computing and Mobile Platforms
92/103
92
MapReduce: Adoption at Google
-
7/27/2019 Cloud Computing and Mobile Platforms
93/103
93
0
1000
2000
3000
4000
5000
6000
Jan-03 Jul-03 Jan-04 Jul-04 Jan-05 Jul-05 Jan-06 Jul-06
MapReduce Programs in Googles Source Tree
0
100
200
300
400
500
600
Jan-03 Jul-03 Jan-04 Jul-04 Jan-05 Jul-05 Jan-06 Jul-06
Summer intern effectNew MapReducePrograms PerMonth
MapReduce: Uses at Google
-
7/27/2019 Cloud Computing and Mobile Platforms
94/103
94
Typical configuration: 200,000 mappers, 500 reducers on2,000 nodes
Broad applicability has been a pleasant surprise
Quality experiments, log analysis, machine translation, ad-hoc data processing,
Production indexing system: rewritten w/ MapReduce
~10 MapReductions, much simpler than old code
2004/08 2005/03 2006/03
Number of jobs 29,423 72,229 171,834
Avg completion time (sec) 634 934 874
Machine years used 217 981 2,002
Input data read (TB) 3,288 12,571 52,254
Intermediate data (TB) 758 2,756 6,743
Output data written (TB) 193 941 2,970
Avg workers per job 157 232 268
Avg worker deaths per job 1.2 1.9 5
MapReduce Summary
-
7/27/2019 Cloud Computing and Mobile Platforms
95/103
95
MapReduce has proven to be a useful abstraction
Greatly simplifies large-scale computations at Google
Fun to use: focus on problem, let library deal with messydetails
Published
A Data Playground
-
7/27/2019 Cloud Computing and Mobile Platforms
96/103
96
MapReduce + BigTable + GFS = Data playground
Substantial fraction of internet available for processing
Easy-to-use teraflops/petabytes, quick turn-around
Cool problems, great colleagues
Query Frequency Over Time
-
7/27/2019 Cloud Computing and Mobile Platforms
97/103
97
Learning From Data
-
7/27/2019 Cloud Computing and Mobile Platforms
98/103
98
Searching for Britney Spears...
GoogleCloud Offerings
-
7/27/2019 Cloud Computing and Mobile Platforms
99/103
99
App Engine
Lets you run yourweb application on top of Googlesscalable infrastructure.
MapReduce
Google File System, MapReduce, BigTable
Are data centers environment-unfriendly?
-
7/27/2019 Cloud Computing and Mobile Platforms
100/103
100
http://www.youtube.com/watch?v=zRwPSFpLX8I
Clean Energy Initiatives
-
7/27/2019 Cloud Computing and Mobile Platforms
101/103
101
Power Usage Effectiveness (PUE) =
Total power consumption /
power consumed by IT equipments
In 2007, the EPA of the U.S. sent a report to the congress
about the estimated PUE of data centers in year 2011:
Scenario PUE
Current Trends 1.9
Improved Operations 1.7
Best Practices 1.3
State-of-the-Art 1.2
Google's data centers
-
7/27/2019 Cloud Computing and Mobile Platforms
102/103
102
mean PUE =
1.19
Details: http://www.google.com/corporate/green/datacenters/measuring.html
6 Data centers with> 5MW of power
Summary
-
7/27/2019 Cloud Computing and Mobile Platforms
103/103
More and more applications and data are moving to
web servers
More and more user rely on web access to findinformation, do their jobs, talk with their friends,publish their articles, pictures, video, etc.
and they want to do so, not limited to his desk
and that's Cloud Computing
top related