Oracle Database in the Cloud
Bill Hodak, Oracle
Jamie Kinney, Amazon Web Services
Joseph Adler, Recombinant Data
Paul Parsons, The Server Labs
Databases in the Cloud
Jamie Kinney - Amazon Web Services
Cloud Computing Attributes
Abstract Resources: Focus on your needs, not on hardware specifications. As your needs change, so should your resources.
On-Demand Provisioning: Ask for what you need, exactly when you need it. Pay only for what you use.
Scalability: Scale up or down depending on usage needs.
No Up-Front Costs: No contracts or long-term commitments. Pay only for what you use.
Efficiency of Experts: Utilize the skills, knowledge and resources of experts.
The AWS Cloud
[Figure: how effort is split. On-premise infrastructure: 30% your business, 70% managing all the “heavy lifting”. AWS cloud infrastructure: 30% configuring cloud assets, 70% more time to focus on your business.]
The AWS cloud provides reliable and dependable on-demand infrastructure that frees time and expense for you to focus on your business.
Predictions Cost Money
[Figure: infrastructure cost ($) over time. Curves: predicted demand, actual demand, traditional hardware, automated virtualization. Annotations: large capital expenditure, opportunity cost, “You just lost customers”.]
Amazon Web Services
Simple Storage Service (S3), Elastic Compute Cloud (EC2), Elastic Block Store (EBS),
SimpleDB, CloudFront, Simple Queue Service (SQS)
Oracle Software on EC2
A partnership between Oracle and AWS that allows you to develop and deliver your applications on the Amazon Elastic Compute Cloud.
Easy to use. Start developing your applications on Oracle software on Amazon EC2 in minutes
No barriers. Oracle is providing software at no charge for development of commercial applications on Amazon EC2. Pay only infrastructure charges - as little as $0.10/hour.
Pay as you go. Run production versions of leading Oracle software products and pay only for what you need, when you need it using the new “Oracle SaaS for ISVs” monthly licensing model.
Portability. Use your existing Oracle licenses for most Oracle software products in the cloud or on premise - it’s now your choice.
Products. Currently: Oracle Database 9i-11gR2, TimesTen, Oracle Coherence and all Fusion Middleware. Oracle Enterprise Linux and Enterprise Manager Grid Control are fully supported on AWS. The Oracle Secure Backup Cloud Module allows you to back up your databases directly to Amazon S3.
We are currently working on support for Oracle’s Enterprise Applications and Real Application Clusters.
Oracle on AWS Use Cases
Proof-of-Concept/Development: Many projects involving a new technology stack begin by creating development and test environments. Prior to the cloud, these environments could take months to build out and might require executive approval for the associated CapEx. With AWS, creating a new development environment takes minutes and only costs pennies per hour.
Steady State Usage: Migrate your existing Oracle software licenses to Amazon EC2 Reserved Instances and pay even less per hour for non-stop production servers.
On-Demand Usage: Oracle’s cost-effective monthly “SaaS for ISVs” licensing model combined with the Amazon Elastic Compute Cloud allows you to easily scale up or down the number of instances as your application’s workload changes over time. This model works well for unpredictable or variable workloads.
Backup and Recovery: Use the Oracle Secure Backup Cloud Module to back up your production databases directly to Amazon S3 using your existing RMAN scripts.
Disaster Recovery: Create an in-sync Oracle Data Guard standby database on EC2.
Oracle Cloud Computing Center
http://www.oracle.com/tech/cloud/index.html
EC2 Is…
HW as a Service: Root/Admin access to Linux, Windows, and OpenSolaris servers
On Demand: Provision custom servers in minutes
Elastic: Scale up and scale down as needed
Web Scale: 1000’s of cores, multiple availability zones, EU and US locations
Utility: Pay for only what you use. No minimums
Why EC2?
Business Agility: React in real time to market and customer demands
Lower Risk: Remove under/over investment in infrastructure for new initiatives
Reliability: Easy to support highly available, highly scalable applications
Core Competency: Amazon knows datacenter scale, security, and reliability. You can focus on your business
ROI: Spend for average utilization vs. peak. Economies of scale. No Upfront $
1:Many Relationship Between AMIs and Instances
[Figure: Amazon Machine Images (AMIs), each launched as several instances.]
EC2 Instance Lifecycle
AMI
• RunInstances call to cloud
  – Specify which AMI to launch
  – Provide parameters (# instances, security group, etc)
Instance (Pending)
• Instance launch initiated
  – Copy AMI from S3
  – Assign parameters
Instance (Running)
• Attach EBS Storage once running
• Assign Elastic IP Address
Instance (Shutting Down)
• Resources automatically detached (IP, storage)
• Can also be initiated as normal operating system shutdown
Instance (Terminated)
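The lifecycle above can be read as a small state machine. The sketch below is a hypothetical model written for illustration only: the class and method names are invented, and it does not call any AWS API. It assumes the states follow one another strictly in the order shown on the slide.

```python
from enum import Enum


class State(Enum):
    PENDING = "pending"
    RUNNING = "running"
    SHUTTING_DOWN = "shutting-down"
    TERMINATED = "terminated"


# Allowed transitions, in the order the slide presents them (an assumption).
TRANSITIONS = {
    State.PENDING: {State.RUNNING},
    State.RUNNING: {State.SHUTTING_DOWN},
    State.SHUTTING_DOWN: {State.TERMINATED},
    State.TERMINATED: set(),
}


class Instance:
    """Toy instance launched from an AMI; starts in the pending state."""

    def __init__(self, ami_id):
        self.ami_id = ami_id
        self.state = State.PENDING
        self.ebs_volumes = []
        self.elastic_ip = None

    def advance(self, new_state):
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(
                f"cannot go from {self.state.value} to {new_state.value}"
            )
        self.state = new_state
        if new_state is State.SHUTTING_DOWN:
            # Resources (IP, storage) are detached automatically on shutdown.
            self.ebs_volumes.clear()
            self.elastic_ip = None

    def attach_volume(self, volume_id):
        # EBS storage is attached once the instance is running.
        if self.state is not State.RUNNING:
            raise RuntimeError("EBS storage can only be attached once running")
        self.ebs_volumes.append(volume_id)
```

The point of the model is that storage and Elastic IP are properties of a running instance only: they are attached after launch and released on shutdown.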
AWS Security White Paper
Available to the public: aws.amazon.com/security
Amazon EC2 Instance Types

Standard On-Demand Instances
Instance      On-Demand Hourly Price   1 Year Reserved Instance Price   Memory   Virtual Cores   Storage
Small         $0.10                    $227.50 + $0.03/hour             1.7 GB   1 @ 1 ECU       160 GB
Large         $0.40                    $910 + $0.12/hour                7.5 GB   2 @ 2 ECU       850 GB
Extra Large   $0.80                    $1820 + $0.24/hour               15 GB    4 @ 2 ECU       1690 GB

High-CPU On-Demand Instances
Instance      On-Demand Hourly Price   1 Year Reserved Instance Price   Memory   Virtual Cores   Storage
Medium        $0.20                    $455 + $0.06/hour                1.7 GB   2 @ 2.5 ECU     350 GB
Extra Large   $0.80                    $1820 + $0.24/hour               7 GB     8 @ 2.5 ECU     1690 GB
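To compare the two pricing columns, it helps to work out how many hours an instance must run before the reserved price becomes cheaper. The sketch below uses only the 2009 prices from the table; the function names are made up for this example.

```python
# (on-demand $/hour, reserved up-front $, reserved $/hour), copied from the table.
PRICES = {
    "Small": (0.10, 227.50, 0.03),
    "Large": (0.40, 910.00, 0.12),
    "Extra Large": (0.80, 1820.00, 0.24),
    "High-CPU Medium": (0.20, 455.00, 0.06),
    "High-CPU Extra Large": (0.80, 1820.00, 0.24),
}


def on_demand_cost(instance, hours):
    """Total cost of running on demand for the given number of hours."""
    hourly, _, _ = PRICES[instance]
    return hourly * hours


def reserved_cost(instance, hours):
    """Up-front fee plus the discounted hourly rate."""
    _, upfront, reserved_hourly = PRICES[instance]
    return upfront + reserved_hourly * hours


def break_even_hours(instance):
    """Hours after which the reserved instance is the cheaper option."""
    hourly, upfront, reserved_hourly = PRICES[instance]
    return upfront / (hourly - reserved_hourly)
```

With the listed prices, every instance type breaks even at roughly 3,250 hours, about 135 days of continuous running within the one-year term. Below that, on-demand is cheaper.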
Coming Soon!
In response to requests from customers running memory and I/O-intensive applications in the cloud, we are planning additions to our EC2 instance family.
Aimed at database, memory caching and other high-throughput applications, these instances would offer much larger memory sizes and significantly more network I/O bandwidth.
Stay tuned over the next few weeks for an announcement!
© The Server Labs S.L. 2009, Images Courtesy of ESA. 29-Oct-09
Who are The Server Labs
European specialised, niche consultancy
IT architects
Extensive experience
Hands-on
Agile execution
Passion for technology
About this presentation
Results of a Feasibility Study to move part of ESA’s Gaia Data Processing to Amazon’s EC2 Cloud
Study Objectives
Two main objectives
Evaluate the Technical Feasibility of using Amazon EC2 to run scientific data processing applications
Evaluate the Financial Viability of using pay on demand compute power vs. traditional in-house data processing
A Stereoscopic Census of our Galaxy
(based on slides from Jos de Bruijne and William O’Mullane)
Seminar, IAC, 30th April 2008. Dr. Ralf Kohley, European Space Astronomy Centre
The Gaia Mission
The primary goal of the Gaia mission is to create an astrometric catalogue of 1 billion stars (approx. 1% of our Galaxy) with microarcsecond precision.
Gaia satellite to be launched in 2011.
Observations done until 2017.
Catalogue ready around 2019.
The Gaia Mission
Credit: Images ESA
If it took 1 millisecond to process one image, just one pass through the data (on a single processor) would take 30 years.
Obviously the adopted solution is much faster: distributed/parallel processing.
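A quick back-of-the-envelope check of what the 30-year figure implies. This is only a sketch: the image count is derived from the two numbers on the slide rather than taken from mission documentation, and the multi-processor estimate assumes ideal linear speed-up, which real systems will not reach.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def implied_image_count(years=30.0, seconds_per_image=1e-3):
    """Number of images implied by a single-processor pass of `years`."""
    return years * SECONDS_PER_YEAR / seconds_per_image


def ideal_pass_time_days(processors, years=30.0):
    """Duration of one pass in days, assuming perfect linear speed-up."""
    return years * 365.25 / processors
```

At 1 ms per image, 30 years corresponds to roughly 9.5 x 10^11 images. Spread perfectly over a hypothetical 1000 processors, one pass would take about 11 days.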
AGIS: Astrometric Global Iterative Solution
[Figure: sky scans (highest accuracy along scan); scan width: 0.7°.]
1. Objects are matched in successive scans
2. Attitude and calibrations are updated
3. Object positions etc. are solved
4. Higher-order terms are solved
5. More scans are added
6. Whole system is iterated
AGIS Architecture
Datatrains drive through the AGIS Database, passing observations to algorithms. There can be as many Datatrains in parallel as we wish.
The algorithm does not access data directly.
[Figure: Calibration, Global, Attitude and Source algorithms; Elementary Takers; AstroElementary; Data Access Layer; Optimised AGIS Database.]
AGIS Architecture - detailed
[Figure: components shown are RunManager, ConvergenceMonitor, AGIS DB, Store, GaiaTable, Object Factory, AstroElementary, ElementaryDataTrain (requests AstroElementaries between a range (x,y)), Calibration Collector, Attitude Collector, Source Collector, Global Collector, Attitude Update Server and Global Update Server.]
Scheduling
Very simple: keep all machines busy all the time!
Busy = CPU ~90%
Post jobs on whiteboard
Trains/Workers mark jobs and do them
Mark finished, then repeat until done
The previous attempt had much more general scheduling. It was also ~1000 times slower.
[Figure: whiteboard listing Job 1, Job 2, Job 3, … Job N.]
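The whiteboard idea can be sketched in a few lines. This is an illustrative stand-in using Python threads, not the AGIS implementation (which is Java and distributed across machines); the class and function names are invented for the example.

```python
import threading


class Whiteboard:
    """Shared list of jobs; workers mark a job before doing it."""

    def __init__(self, jobs):
        self._lock = threading.Lock()
        self._status = {job: "posted" for job in jobs}

    def claim(self):
        """Mark one posted job as taken and return it, or None if none are left."""
        with self._lock:
            for job, status in self._status.items():
                if status == "posted":
                    self._status[job] = "marked"
                    return job
        return None

    def finish(self, job):
        with self._lock:
            self._status[job] = "finished"

    def all_finished(self):
        with self._lock:
            return all(s == "finished" for s in self._status.values())


def worker(board, do_job, results):
    # Mark a job, do it, mark it finished, repeat until the board is empty.
    while True:
        job = board.claim()
        if job is None:
            return
        results.append(do_job(job))  # list.append is thread-safe in CPython
        board.finish(job)


def run(jobs, do_job, n_workers=4):
    board = Whiteboard(jobs)
    results = []
    threads = [
        threading.Thread(target=worker, args=(board, do_job, results))
        for _ in range(n_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return board, results
```

No central scheduler assigns work: each worker pulls the next job as soon as it is free, which is what keeps every machine busy.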
The problem
Data centre cost
AGIS run times decrease as more processors are added. Note that the data volume increased from 18 months to 5 years between 2005 and 2006; the processor power also increased, but the run time went up. This was dramatically improved in 2007. The normalised column shows throughput per processor in the system (total observations/processors/hours), i.e. an indication of the real performance.
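The normalised figure described above is a simple ratio. A minimal sketch follows; the function name is an assumption made for this example, and any inputs passed to it would have to come from the actual run table.

```python
def normalised_throughput(total_observations, processors, hours):
    """Observations processed per processor per hour."""
    if processors <= 0 or hours <= 0:
        raise ValueError("processors and hours must be positive")
    return total_observations / processors / hours
```

Because the result is divided by both processor count and run time, it allows runs on different hardware and data volumes to be compared directly.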
Current estimation for in-house data processing for AGIS is around 1.2 million euros
Economics of Cloud Computing
[Figure: demand versus capacity over time for a static data center (showing unused resources) and for a data center in the cloud.]
AGIS Peaks
Iterative processing – 6 month Data Reduction Cycles
At current estimates AGIS will run 2 weeks every 6 months
Amount of data increases over the 5 year mission
[Figure: “AGIS Peak Processing (Hours)”. Y axis: Hours (0 to 2500). X axis: Date (Nov-11 to Nov-17). Series: AGIS 6 monthly processing, Cap Ex.]
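The peak pattern above translates directly into a utilisation figure for dedicated hardware. The sketch below assumes a 6-month cycle of 26 weeks and a constant 2-week run; since the slide says data volume grows over the mission, treat the result as a rough current estimate only.

```python
def utilisation(run_weeks=2.0, cycle_weeks=26.0):
    """Fraction of each cycle during which AGIS hardware is actually busy."""
    return run_weeks / cycle_weeks


def idle_weeks_per_cycle(run_weeks=2.0, cycle_weeks=26.0):
    """Weeks per cycle in which dedicated hardware would sit unused by AGIS."""
    return cycle_weeks - run_weeks
```

Under these assumptions, hardware bought only for AGIS would be busy less than 8% of the time, which is the motivation for looking at pay-on-demand capacity.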
The Study: Running AGIS in Amazon EC2
Technical Feasibility:
Can AGIS run in the cloud?
What are the restrictions?
What modifications do we have to make?
Financial Viability:
What would be the cost of using EC2 for AGIS?
Can we do a hybrid solution using a local data centre followed by a mix of local/EC2?
EC2 Images
64 bit images
Large, Extra Large and High CPU Large
Oracle ASM Image based on Oracle Database 11g Release 1 Enterprise Edition - 64 Bit (Large instance) - ami-7ecb2f17
AGIS Self configuring Image based on Ubuntu 8.04 LTS Hardy Server 64-Bit (Large, Extra Large and High CPU Large Instances) - ami-e257b08b
Architecture in the Cloud
[Figure: the AGIS components (RunManager, ConvergenceMonitor, Attitude Update Server, Store, GaiaTable, Object Factory, AstroElementary, ElementaryDataTrain requesting AstroElementaries between a range (x,y), Calibration / Attitude / Source / Global Collectors, Data Trains, AGIS DB) mapped onto EC2. Instance groups shown: 1 x Large instance (AGIS AMI, Elastic IP); <n> x Extra Large or High CPU Large instances (AGIS AMI); 1 x Large instance (Oracle AMI, Elastic IP); 3 x Extra Large instances (AGIS AMI).]
Oracle Image
EC2 Large instance (m1.large)
Oracle Enterprise Edition 11g 64 bit (11.0.6)
Oracle ASM
Elastic Block Storage: 5 x 100GB disks, /dev/sdg - /dev/sdk
[Figure: AGIS DB on an ASM disk group (EBS) made up of /dev/sdg, /dev/sdh, /dev/sdi, /dev/sdj and /dev/sdk; / on /dev/sda1 and /mnt on /dev/sdb. Labels: “ORACLE”, “EXTERNAL redundancy best”.]
Configuring Oracle
• Launch an m1.large instance of ami-7ecb2f17
• Attach the /mnt partition properly so it has enough space
• Create 5 EBS vols of 100GB each and attach them to the instance
• Set up Oracle ASMLib
  – Install drivers
  – Run oracleasm_debug_link
  – Run oracleasm configure, createdisk
• Copy a pre-recorded Oracle response file up to create an ASM instance
• Run Oracle installer to create the ASM instance
• Copy a pre-recorded Oracle response file up to create the AGIS DB instance
• Run Oracle installer to create the AGIS DB instance
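For the EBS step, it can help to write the volume layout down before creating anything. The sketch below only computes a plan (device name and size per volume) from the figures on the slides; it creates no volumes and calls no AWS API. The function name and defaults are assumptions for illustration.

```python
import string


def ebs_device_plan(first_letter="g", count=5, size_gb=100):
    """Return (device, size in GB) pairs, e.g. /dev/sdg to /dev/sdk."""
    start = string.ascii_lowercase.index(first_letter)
    letters = string.ascii_lowercase[start:start + count]
    if len(letters) < count:
        raise ValueError("not enough device letters after " + first_letter)
    return [(f"/dev/sd{letter}", size_gb) for letter in letters]
```

The default plan gives five 100 GB devices, 500 GB in total, matching the ASM disk group described for the Oracle image.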
Configuring Oracle cont.
• Create an Elastic IP and associate it with the instance
• Change the hostname to be the new public DNS name
    hostname ec2-174-129-223-59.compute-1.amazonaws.com
• Run localconfig remove followed by localconfig add
  This will run forever unless you edit /etc/inittab and change the following line
    h1:35:respawn:/etc/init.d/init.cssd run >/dev/null 2>&1 </dev/null
  to
    h1:345:respawn:/etc/init.d/init.cssd run >/dev/null 2>&1 </dev/null
• Start the ASM instance
• Start the AGIS DB instance
Don’t forget to make a new image!!!
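If the image is rebuilt often, the inittab change is worth scripting. The sketch below is a hypothetical helper that rewrites the runlevel field of the init.cssd entry in a string. It deliberately works on text rather than touching /etc/inittab, so the result can be reviewed before it is written back.

```python
def widen_cssd_runlevels(inittab_text, old="35", new="345"):
    """Change the runlevels of the h1 init.cssd entry from `old` to `new`."""
    out = []
    for line in inittab_text.splitlines():
        # inittab entries have the form id:runlevels:action:process
        fields = line.split(":", 3)
        if (
            len(fields) == 4
            and fields[0] == "h1"
            and fields[1] == old
            and "init.cssd" in fields[3]
        ):
            fields[1] = new
            line = ":".join(fields)
        out.append(line)
    trailing = "\n" if inittab_text.endswith("\n") else ""
    return "\n".join(out) + trailing
```

Only the matching entry is altered; every other line passes through unchanged.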
AGIS Image
Instances (m1.large and c1.xlarge)
Java version 1.6.0_13
    Java(TM) SE Runtime Environment (build 1.6.0_13-b03)
    Java HotSpot(TM) 64-Bit Server VM (build 11.3-b02, mixed mode)
Apache Tomcat 6.0.14
AGIS software
Creation of agis user account
rc.local script modified to run the AGIS process
Self configuring using user-data
Configuring AGIS Image
• Launch an m1.large instance of ami-e257b08b
• Check out the source code from the svn server
• Create an agis user to run the process
• Set up /etc/rc.local to execute runAgis.sh
• Create a generic runAgis.sh script that reads data passed to the AMI at boot time (using AWS user-data)
• The data contains JVM parameters and specific application parameters (depending on the DataTrain that will be executed at that node)
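The self-configuration step could look roughly like the sketch below. The real runAgis.sh is a shell script whose format is not shown in the slides, so everything here is an assumption: the key=value layout, the `jvm` key, the `example.Main` class name and the way options are passed. On EC2 the user data itself can be fetched from inside the instance at http://169.254.169.254/latest/user-data; the sketch takes it as a string so it runs anywhere.

```python
import shlex


def parse_user_data(text):
    """Split hypothetical key=value user data into JVM args and app parameters."""
    jvm_args, app_params = [], {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got: {line!r}")
        key, value = key.strip(), value.strip()
        if key == "jvm":
            # JVM flags are space separated, e.g. "-Xmx4g -server".
            jvm_args.extend(shlex.split(value))
        else:
            app_params[key] = value
    return jvm_args, app_params


def build_command(jvm_args, app_params, main_class="example.Main"):
    """Assemble the java command line; main_class is a placeholder name."""
    cmd = ["java", *jvm_args, main_class]
    cmd += [f"--{k}={v}" for k, v in sorted(app_params.items())]
    return cmd
```

Keeping one generic image and pushing the per-node differences into user data means a new DataTrain configuration does not require building a new AMI.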
Problems Encountered along the way
Creating new images takes a long time, so make sure you get it right
No ASMLib drivers for 2.6.18-53.1.13.9.1.el5xen
Oracle is very fussy about its IP address, hence the Elastic IP
Oracle instance hostname changed to be the public DNS name
Some work still needed on the startup script to make sure the ASM boots first time
The Attitude Servers took a long time to start up (20 mins). This was due to a race condition caused by spin locks in the type of Java thread pool we were using.
Oracle Performance in the Cloud
I/O Transfer from disk of up to 50 MB/sec
Conclusions
AGIS and Oracle can be run in the cloud!
The economics work out: EC2 may work out cheaper than buying the hardware!
With the knowledge that EC2 is an option we can delay buying more machines until the middle of the mission (2014) and decide then.
Now running a new feasibility study with 60 million primary stars (1/3 of the final data)
Aim is to try and scale out to 1000 High CPU instances
Contact us!
Dolores Saiz, CEO
Paul Parsons, CTO
Spain: The Server Labs S.L., C/Pinar, 5, 28006 Madrid, Spain. Tel: (+34) 91 745 68 77
UK: The Server Labs Ltd., Aston Court, Kingsmead Business Park, Frederick Place, High Wycombe, HP11 1LA. Tel: (+44) 20 8133 1620
Germany: Trianon, Mainzer Landstraße 16, 60325 Frankfurt. Tel: (+49) (0) 69 971 68 428
Copyright Notice
This presentation contains images and videos which have been released publicly from ESA. You may use ESA images or videos for educational or informational purposes.
The publicly released ESA images may be reproduced without fee, on the following conditions:
* Credit ESA as the source of the images. Examples: Photo: ESA; Photo: ESA/Cluster; Image: ESA/NASA - SOHO/LASCO
* ESA images may not be used to state or imply the endorsement by ESA or any ESA employee of a commercial product, process or service, or used in any other manner that might mislead.
* If an image includes an identifiable person, using that image for commercial purposes may infringe that person's right of privacy, and separate permission should be obtained from the individual.
If these images are to be used in advertising or any commercial promotion, layout and copy must be submitted to ESA beforehand for approval to:
Some images contained in this presentation have come from other sources, and this is indicated in the Copyright notice. For re-use of non-ESA images contact the designated authority.
Use of ESA videos
The use of ESA video images in streaming and downloadable format is limited to direct viewing and/or file storage on a single computer per stream and/or download. Forwarding of files or streams to other computers, or use on any non-ESA Web is prohibited. For the authorisation of any such use, please contact:
Copyright © 2009 Recombinant Data Corp. All rights reserved.
Joseph Adler, Solutions Architect
Recombinant Data Corp.
About me
• Joseph Adler
  – 12 years' experience in data warehousing and data mining
  – Multiple patents on cryptography and computer security
  – Shamelessly plugs books at conferences
About Recombinant
We are a startup from Newton, MA focused on secondary uses of clinical data.
Core Competencies
• Clinical data warehousing & integration services
• Translational research & quality reporting solutions
• Data strategy, governance & compliance consulting
• Open Source implementations & extensions
Representative Clients
• Partners HealthCare
• The Dartmouth Institute
• Health Sciences South Carolina
• UCSF
• UC Davis Health System
• Boston University
• Massachusetts General Hospital
• University of Washington
• UMass Medical School
• Morehouse School of Medicine
• Piedmont Healthcare
• UMass Memorial Health Care
• Maine Health
• UC Irvine
• Moses Cone Health System
• Department of Veterans Affairs
• Cincinnati Children’s Hospital
Healthcare and Life Sciences
• What’s unique about healthcare and life sciences data warehousing?
  – Large volume of data
    • Long data (thousands of patients)
    • Wide data (thousands of measurements)
  – Many workers
    • Thousands of employees, multiple locations, all of whom need to access the same data
  – Diverse skill sets
    • Research scientists, statisticians, medical doctors, etc.
    • Lots of people with advanced degrees…
Case Study
We were hired by a major pharmaceutical company to build a data warehouse to house pre‐clinical, clinical, and third party data and publications
• Helps different types of users (scientists, clinicians, CDTLs, biostatisticians, executives) find and view results
• Allows queries of data across different stages of a study or multiple studies relating to the same subject
• Provides a view of all projects across the organization
Case Study
• Operational and technical challenges
  – Limited budget
    • Time, Money
  – Security concerns
    • Proprietary corporate information
      – Clinical trial results, Research focus, Experimental data
    • Patient data
      – HIPAA, European privacy laws, etc.
  – Agile development process
    • Short development cycles, working software, collaboration with customers
  – Scalability
    • Thousands of users, terabytes of data (eventually)
• We chose Amazon Web Services for our development, testing, and production systems.
Case Study
Why AWS?
  – Lowest cost
    • Lowest storage cost
    • Lowest overhead
  – Highest flexibility
    • Add or remove instances at any time
    • Start with a little storage space, scale up over time
  – Easiest deployment
    • Fast, reliable access from our development office in MA, customer sites across US and in Europe
Case Study
• We used these AWS tools and services:
  – S3 Storage
    • Approximately 2 TB in use right now, scaling to 30+
  – EC2 Instances
    • Started with Oracle AMI
      – Oracle Enterprise Linux Release 5 Update 1
      – Oracle Database 11g Release 1 Enterprise Edition
    • 6 Primary instances
      – Separate database and application servers
      – Development, Testing, and Production Systems
    • Additional instances
      – Backup
      – Performance testing
Case Study
• Architecture diagram
[Figure: pharmaceutical company network (internal databases, internal documents, internal data files) and the Amazon Web Services “Cloud”, which hosts Development, Testing/QA and Production environments, each with its own DB and application server. Actors shown: developers (App, ETL) and users (researchers).]
Case Study
• Amazon Web Services is great for data warehouses
  – Data warehouses are different from transactional systems
    • Loading data in batches
    • Querying data across large tables
  – AWS performance characteristics are good for data warehouses
    • Good I/O operations per second, but excellent bandwidth
    • Ample CPU, Memory
  – Unlimited, low cost storage
Lessons Learned
• Optimizing Oracle on AWS:
  – Observations
    • Minimize I/O queries
    • Load large blocks of data
    • Leverage CPU and memory
  – Recommendations
    • Table and index compression
    • Bitmap indexes, bitmap join indexes
    • Disable logging to the redo file (NOLOGGING)
    • “Extra‐Large” EC2 instances (15 GB RAM, 4 cores, 64 bit)
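As a rough illustration of the compression, bitmap index and NOLOGGING recommendations, the sketch below builds Oracle DDL strings for a made-up warehouse table. The table, column and index names are hypothetical, and the statements are only generated as text here, not run against a database. Whether these options suit a given table depends on the workload.

```python
def warehouse_ddl(table, columns, bitmap_column):
    """Return DDL for a compressed NOLOGGING table with one bitmap index."""
    cols = ", ".join(f"{name} {sqltype}" for name, sqltype in columns)
    index_name = f"{table}_{bitmap_column}_bix"
    return [
        # Table compression, with redo logging disabled for bulk loads.
        f"CREATE TABLE {table} ({cols}) COMPRESS NOLOGGING",
        # Bitmap indexes suit low-cardinality columns queried across large tables.
        f"CREATE BITMAP INDEX {index_name} ON {table} ({bitmap_column}) NOLOGGING",
    ]
```

For example, `warehouse_ddl("lab_result", [("patient_id", "NUMBER"), ("test_code", "VARCHAR2(20)")], "test_code")` yields one CREATE TABLE and one CREATE BITMAP INDEX statement for a hypothetical results table.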
Lessons Learned
• Increasing availability on AWS
  – EC2 instances can die
    • Actual story: We lost one day of development work when I changed a configuration parameter and rebooted an AMI instance, locking us out.
  – Recommendations
    • Make frequent backups
      – Daily backups (to another AMI instance)
      – Backup to S3
    • Build your own AMI
      – Build a machine image with your app
      – Make sure you can start, stop the AMI
Lessons Learned
• AWS security
  – Remember that each EC2 instance looks like a public host on the internet
  – Recommendations
    • Use IP filtering (“AWS security groups”) to restrict access during development
    • Get your organization's security staff involved early
    • Start with a secure AMI
    • Treat the EC2 instance like a Windows/Linux server on the public internet
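IP filtering amounts to an allow-list of source ranges and ports. The sketch below models that check locally with Python's standard `ipaddress` module; it does not call AWS. The rules are hypothetical: documentation-only address ranges are used, with 22 for SSH and 1521 as the usual Oracle listener port.

```python
import ipaddress


def is_allowed(source_ip, port, rules):
    """True if some (CIDR range, port) rule admits this source and port."""
    addr = ipaddress.ip_address(source_ip)
    return any(
        port == rule_port and addr in ipaddress.ip_network(cidr)
        for cidr, rule_port in rules
    )


# Hypothetical development rules: only the office range may reach SSH and Oracle.
DEV_RULES = [
    ("203.0.113.0/24", 22),
    ("203.0.113.0/24", 1521),
]
```

Anything not explicitly listed is refused, which is the posture to aim for when every instance is reachable from the public internet.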
Thank You
• Recombinant Data Corp.
  255 Washington Street, Suite 235
  Newton, MA 02458
  Tel: (617) 243‐3700 Fax: (617) 243‐[email protected]
Additional References and Contacts
• Oracle Cloud Computing Center (OTN)
  • http://www.oracle.com/technology/tech/cloud/index.html
  • Provide feedback and ask questions using the “Cloud Computing Discussion Forum”
• Amazon Web Services Website
  • http://aws.amazon.com