Living on a Cloud, Dr Keith Marlow
From presentation "Gaining leverage without the costs"
Agenda
• What cloud computing actually is…
• The anatomy of a cloud..
• What does it really do for you?
• Being cheap and leveraged
• Yahoo APIs and Web Services
• The future
• Questions…
What Cloud Computing actually is…
• Really ancient history – way back in the early 1990s…
• All ‘in-house’
• All risks yours to manage
[Diagram labels: YOUR BUSINESS; YOUR DATA; Control, Processing, Storage]
What Cloud Computing actually is…
• Then in the mid to late 90s – The Internet…
[Diagram labels: YOUR BUSINESS; YOUR DATA; Control, Processing, Storage; The Internet: not business critical, full of non-business ‘stuff’]
What Cloud Computing actually is…
• Around 2000, the Internet became critical.
[Diagram labels: YOUR BUSINESS; YOUR DATA; Control, Processing, Storage; The Internet: full of businesses & customers, critical to your business]
What Cloud Computing actually is…
• But now with cloud computing we have..
[Diagram labels: YOUR BUSINESS; YOUR DATA; Control, Processing, Storage; HOSTED SERVICES; YOUR DATA?; The Internet]
What Cloud Computing actually is…
• Cloud Computing is either:
– Remotely hosted data processing services, or
– Remotely hosted web services
• Which are:
– A highly distributed and flexible computing environment
– With high availability
• Basically, it’s a remote self-scaling computing ‘resource’
– Fire and forget
• Thanks, in part, to:
– Cheaper bandwidth and hardware
– Much faster machines
– Abstraction and standards
– Businesses, the research community & Open Source
The anatomy of a cloud..
• 1st case – The Processing Cloud (Hadoop)
[Diagram labels: MANAGEMENT SERVICES; PROCESSING CLOUD (1000s of nodes & disks); DATA I/O; CLIENT API; ACCOUNTING & AUDITING]
How does Hadoop scale? Map/Reduce
[Diagram: Input → Map (×4) → Transient Data → Reduce (×4) → Results]
• Split into ‘bits’
• Process the ‘bits’ on each node
• Shuffle into ‘bins’
• Collate each ‘bin’ on each node
• Join it all together
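The split / map / shuffle / reduce / join flow on this slide can be sketched in a single process. What follows is a toy word count in plain Python, not Hadoop's API: every function name is hypothetical, and the "nodes" are simply list entries handled one after another.

```python
from collections import defaultdict
from itertools import chain


def split_input(text, n_splits):
    """Split the input lines into 'bits', one per (pretend) node."""
    lines = text.splitlines()
    return [lines[i::n_splits] for i in range(n_splits)]


def map_words(lines):
    """Map step: emit a (word, 1) pair for every word in one 'bit'."""
    return [(word.lower(), 1) for line in lines for word in line.split()]


def shuffle(pairs, n_bins):
    """Shuffle step: route each pair to a 'bin' chosen by its key."""
    bins = [defaultdict(list) for _ in range(n_bins)]
    for key, value in pairs:
        bins[hash(key) % n_bins][key].append(value)
    return bins


def reduce_counts(one_bin):
    """Reduce step: collate one 'bin' by summing the values per key."""
    return {key: sum(values) for key, values in one_bin.items()}


def word_count(text, n_splits=4, n_bins=4):
    bits = split_input(text, n_splits)
    # The mapped pairs play the role of the transient data.
    mapped = chain.from_iterable(map_words(bit) for bit in bits)
    bins = shuffle(mapped, n_bins)
    results = {}
    for partial in map(reduce_counts, bins):
        results.update(partial)  # join it all together
    return results


print(word_count("the cloud is the internet\nthe internet is critical"))
```

Because a given key always lands in the same bin, each reduce call sees every value for its keys, which is what lets the bins be processed independently of one another.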
Hadoop – what is it?
• Open Source Apache project - http://hadoop.apache.org/core/
• Hadoop Core includes:
– Distributed File System - distributes data between nodes
– Map/Reduce - distributes application
• Written in Java
• Runs on
– Linux, Mac OS X, Windows, and Solaris
– Commodity hardware
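As a rough illustration of "distributes data between nodes", the sketch below cuts a byte string into fixed-size blocks and assigns each block to more than one node. The block size, the replica count and the round-robin placement are assumptions made for illustration; they are not taken from the slides and do not describe Hadoop's actual placement policy.

```python
def place_blocks(data, nodes, block_size=4, replicas=2):
    """Cut `data` into blocks and assign each block to `replicas` nodes.

    Round-robin placement is an illustrative assumption only.
    """
    if replicas > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    placement = {}
    for index, start in enumerate(range(0, len(data), block_size)):
        block = data[start:start + block_size]
        holders = [nodes[(index + r) % len(nodes)] for r in range(replicas)]
        placement[index] = (block, holders)
    return placement


layout = place_blocks(b"abcdefghij", ["node1", "node2", "node3"])
for index, (block, holders) in layout.items():
    print(index, block, holders)
```

Keeping each block on more than one node means a single machine failure does not lose data, which matters when the nodes are commodity hardware.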
Hadoop – How do we use it?
• Example: Web Search
– BIG graph: 100 billion nodes and 1 trillion edges
– Largest shuffle is 450 TB (or 643,000 CDs’ worth!)
– Final output is 300 TB compressed
– Runs on 10,000 cores
– Written in C++
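The CD comparison and the graph density can be checked with simple arithmetic. The sketch assumes a 700 MB CD and decimal units (1 TB = 10^12 bytes); neither assumption is stated in the slides.

```python
SHUFFLE_BYTES = 450 * 10**12   # 450 TB in decimal units (assumption)
CD_BYTES = 700 * 10**6         # 700 MB per CD (assumption)

NODES = 100 * 10**9            # 100 billion graph nodes
EDGES = 10**12                 # 1 trillion graph edges

cds = SHUFFLE_BYTES / CD_BYTES
edges_per_node = EDGES / NODES

print(round(cds))        # 642857, i.e. roughly 643,000 CDs
print(edges_per_node)    # 10.0 edges per node on average
```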
Hadoop – How do we use it?
What 20,0000 nodes look like
Hadoop – Real life usage…
NY TIMES
• Needed offline conversion of public domain articles from 1851-1922.
• Used Hadoop to convert scanned images to PDF
• Ran 100 Amazon EC2 instances for around 24 hours
• 4 TB of input
• 1.5 TB of output
Published 1892, copyright New York Times
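To put the NY Times job in pay-per-use terms, the figures above can be turned into instance-hours. The slide says "around 24 hours", so the results are approximate, and decimal units (1 TB = 1000 GB) are an assumption.

```python
INSTANCES = 100
HOURS = 24          # "around 24 hours", so treat results as approximate
INPUT_TB = 4
OUTPUT_TB = 1.5

instance_hours = INSTANCES * HOURS
# Decimal TB -> GB conversion is an assumption.
input_gb_per_instance_hour = INPUT_TB * 1000 / instance_hours

print(instance_hours)                        # 2400
print(round(input_gb_per_instance_hour, 2))  # 1.67
print(OUTPUT_TB / INPUT_TB)                  # 0.375
```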
Hadoop – Who else uses it?
• Amazon/A9
• IBM
• Joost
• Last.fm
• New York Times
• PowerSet (now Microsoft)
• Quantcast
• Veoh
• Yahoo!
• Basically proven to be fit for purpose
• More information at: http://developer.yahoo.net/blogs/hadoop/
The anatomy of a cloud..
• 2nd case – The Web Services Cloud
[Diagram labels: MANAGEMENT SERVICES; DATA RETENTION; PROCESSING CLOUD; DATA I/O; CLIENT API; ACCOUNTING & AUDITING]
The anatomy of a cloud..
• 2nd case – The Web Services & Application Cloud
[Diagram labels: MANAGEMENT SERVICES; DATA RETENTION; DATA I/O; CLIENT API; ACCOUNTING & AUDITING; PROCESSING GRID (1000s of nodes); PROCESSING CLOUD (1000s of nodes)]
Being cheap and leveraged…
• Cloud Computing allows time-of-usage outsourcing
– Only pay for exactly what you use
– Lower CAPEX costs, greater ROI
– It’s green too!
• Needs a different approach to systems design
– New views on which data to use in the cloud, and on privacy protection
– Decoupling around remote APIs
– Training on remote hosting SDKs
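One way to read "decoupling around remote APIs" is that application code depends on a small interface of its own rather than on any one provider. The sketch below shows that idea with a local stand-in. The `Geocoder` interface, the class names and the coordinates are all hypothetical, and no real remote service is called.

```python
from typing import Protocol, Tuple


class Geocoder(Protocol):
    """What the application needs, independent of who provides it."""

    def geocode(self, address: str) -> Tuple[float, float]: ...


class InMemoryGeocoder:
    """Local stand-in; a remote-backed class would expose the same method."""

    def __init__(self, table):
        self._table = table

    def geocode(self, address):
        return self._table[address]


def coordinate_gap(geocoder: Geocoder, a: str, b: str) -> float:
    """Application code that only knows about the Geocoder interface."""
    lat_a, lon_a = geocoder.geocode(a)
    lat_b, lon_b = geocoder.geocode(b)
    return abs(lat_a - lat_b) + abs(lon_a - lon_b)


# Made-up coordinates, used only to exercise the code.
stub = InMemoryGeocoder({"A": (1.0, 2.0), "B": (4.0, 6.0)})
print(coordinate_gap(stub, "A", "B"))  # 7.0
```

Swapping providers, or testing without network access, then means supplying a different object with the same `geocode` method.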
Being cheap and leveraged…
• Use the services of Yahoo! to implement and improve your services and offerings
– Low (or nil) set up and operational costs
• Yahoo provides the following:
– APIs and Web Services
– RSS content feeds
– Developer kits and GUI libraries
– BrowserPlus
– YAP
• To you they all operate as ‘clouds’
Yahoo APIs and Web Services
• Maps
– Include a map on a website or intranet
• GeoPlanet™
– Geocode any address into latitude, longitude and WoeID
• Mail
– Send/read email, list folders etc.
– Zimbra – completely hosted mail service
• BOSS – Build your own Search Service
• Search Monkey – Enriched search results for your sites
• OpenID – share user IDs between sites
• YAP – Yahoo Application Platform – going Open on Yahoo!
• http://developer.yahoo.com/
BOSS
• APIs into Yahoo! Search
– Unlimited queries a day
– No restrictions on presentation
– Re-ordering allowed
– Blending of Proprietary and Yahoo! Search Content Allowed
– White-Label
• http://developer.yahoo.com/search/boss/
• Use to implement your own site search!
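The re-ordering and blending points can be illustrated without calling any search API. In the sketch below both result lists are hypothetical lists of dictionaries with `url` and `score` keys. The data shape and the `boost` parameter are assumptions for illustration, not the format BOSS returns.

```python
def blend(own_results, web_results, boost=0.2):
    """Merge two result lists, de-duplicate by URL, and re-order by score.

    `boost` is added to the score of the site's own results.
    """
    best = {}
    for result in web_results:
        best[result["url"]] = dict(result)
    for result in own_results:
        boosted = dict(result, score=result["score"] + boost)
        current = best.get(boosted["url"])
        if current is None or boosted["score"] > current["score"]:
            best[boosted["url"]] = boosted
    return sorted(best.values(), key=lambda r: r["score"], reverse=True)


# Hypothetical data: one proprietary result and two web results.
own = [{"url": "https://example.com/a", "score": 0.5}]
web = [
    {"url": "https://example.com/b", "score": 0.6},
    {"url": "https://example.com/a", "score": 0.4},
]
for result in blend(own, web):
    print(result["url"])
```

With these inputs the proprietary page is listed first: its boosted score beats both the duplicate web entry and the other web result.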
Search Monkey
• http://developer.yahoo.com/searchmonkey/
YAP – Yahoo! Application Platform
• Allows you to write application modules that can potentially ‘run’ on any Yahoo! web site (or supporting 3rd-party web site)
– i.e. Yahoo! itself becomes a ‘cloud’
– The user selects modules from a gallery to ‘paint’ onto their page canvas.
– We take care of blending the modules with the content of the site to make the final page.
• What does this mean to you??
– Ability to put your dynamic products and services onto Yahoo!
– Closer relationship with users/customers in general
The Future of Cloud Computing
• Smaller/Bigger/Faster/Cheaper
• Hosting ‘in the cloud’ will become the norm
– Easier & cheaper than not doing so.
• The whole will be greater than the sum of the parts
– More ‘on the fly’ services aggregation & customization
• The user will ‘combine’ services to meet their needs.
• The desktop PC/TV/Mobile will become the ‘presentation & personalization gateway’ into the Internet Cloud
Questions??