Data Virtualization: Revolutionizing data cloning
a.k.a. copy data management
kylehailey.com [email protected] @datavirt
DevOps movement
• Goals Clarify • Metrics Define • Constraints Identify • Priorities Set • Iterations Fast
• Continuous Integration • Cloud • Agile • Kanban • Kata
“IT is the factory floor of this century”
The Goal : Theory of Constraints
Improvement not made at the constraint is an illusion
factory floor optimization
Put your energy into the constraint
Top 5 constraints in IT
1. Dev environments setup
2. QA setup
3. Code Architecture
4. Development
5. Product management
- Gene Kim
“One of the most powerful things that organizations can do is to enable development and testing to get environment they need when they need it”
Data is the constraint
CIO Magazine Survey:
• 60% of projects over schedule
• 85% delayed waiting for data
only getting worse
Gartner: Data Doomsday, by 2017 one third of IT in crisis
Typical Architecture
[Diagram: Production, Dev/QA/UAT, Reporting, and Backup environments, each holding its own copy of the database on its own file system]
Triple Tax
Copies
• Oracle customers : 8-12 copies per db
• Fortune 2K: 1000s multi-TB db
• Downstream storage staggering
- 3 petabytes at just one client
• Hardware
  – storage, systems, network
  – rack space, power, cooling
• People
  – 1000s of hours per year just for DBAs
  – DBAs, Sys Admins, Storage Admins, Backup Admins, Network Admins
• $10s Millions for data center modernizations
Copies require People & Time
Metrics
– Time
– Old Data
– Storage
Other
– Analysts
– Audits
– Data Center Modernization
companies unaware
"we say no, no, no until we can't say no anymore" response when IT asked for copies of prod DB
1. Waiting to check in code
2. Production Bugs
3. Expensive Slow QA
Biggest problem in Application Development
QA : Long setup times
[Chart: Cost To Correct (y-axis, 0 to 70) against Delay in Fixing the bug (x-axis, 1 to 7)]
Software Engineering Economics – Barry Boehm (1981)
QA : destructive tests refresh time
[Timeline: 20 MIN TEST runs separated by 8 Hrs refreshes]
• EMC Symmetrix
  – 16 snapshots
  – Write performance impact
  – No snapshots of snapshots
• Netapp & EMC VNX
  – 255 snapshots
• ZFS
  – Compression
  – Unlimited snapshots
  – Snapshots of Snapshots
• DxFS
  – Compression
  – Unlimited snapshots
  – Snapshots of Snapshots
  – Shared cache in memory
Technology Core : file system snapshots
Also check out new SSD storage such as: Pure Storage, EMC XtremIO
Snapshot 1 – full backup once only at link time
Jonathan Lewis © 2013 Virtual DB
a b c d e f g h i
We start with a full backup, analogous to a level 0 rman backup. It includes
the archived redo log files needed for recovery. Run in archivelog mode.
Snapshot 2 (from SCN)
b' c'
a b c d e f g h i
The "backup from SCN" is analogous to a level 1
incremental backup (which includes the relevant
archived redo logs). It is sensible to enable BCT (block change tracking).
Delphix executes standard rman scripts
Apply Snapshot 2
a b c d e f g h i b' c'
The Delphix appliance unpacks the rman backup and "overwrites" the
initial backup with the changed blocks - but DxFS makes new copies of
the blocks
Drop Snapshot 1
b' c' a d e f g h i
The call to rman leaves us with a new level 0 backup, waiting for recovery.
But we can pick the snapshot root block. We have EVERY level 0 backup
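The snapshot sequence above can be modelled with a toy copy-on-write store. This is only an illustrative sketch: `SnapshotStore` and its methods are hypothetical names, not a Delphix, DxFS, or rman API, and a real file system tracks blocks very differently.

```python
class SnapshotStore:
    """Toy copy-on-write store: each snapshot maps block names to stored versions."""

    def __init__(self):
        self.blocks = {}      # block id -> contents
        self.snapshots = {}   # snapshot name -> {block name: block id}
        self._next_id = 0

    def _store(self, contents):
        block_id = self._next_id
        self._next_id += 1
        self.blocks[block_id] = contents
        return block_id

    def full_backup(self, name, data):
        """Snapshot 1: every block is written once."""
        self.snapshots[name] = {k: self._store(v) for k, v in data.items()}

    def incremental(self, name, parent, changed):
        """Snapshot 2: share unchanged blocks, write new copies of changed ones."""
        layout = dict(self.snapshots[parent])
        for k, v in changed.items():
            layout[k] = self._store(v)   # the parent's block is left untouched
        self.snapshots[name] = layout

    def read(self, name):
        """Every snapshot reads as a complete (level 0 style) image."""
        return {k: self.blocks[b] for k, b in self.snapshots[name].items()}

    def drop(self, name):
        """Drop a snapshot and free blocks no remaining snapshot refers to."""
        del self.snapshots[name]
        live = {b for layout in self.snapshots.values() for b in layout.values()}
        for block_id in list(self.blocks):
            if block_id not in live:
                del self.blocks[block_id]


if __name__ == "__main__":
    store = SnapshotStore()
    store.full_backup("snap1", {name: name for name in "abcdefghi"})
    store.incremental("snap2", "snap1", {"b": "b'", "c": "c'"})
    print(len(store.blocks), "blocks stored for two full images")
    store.drop("snap1")
    print(len(store.blocks), "blocks left after dropping snap1")
```

In this model both snapshots stay readable as full images until one is dropped, and dropping the first frees only the two superseded blocks.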
Creating a vDB
b' c' a d e f g h i
The first step in creating a vDB is to take a snapshot of the filesystem as at
the backup you want (then roll it forward)
My vDB (filesystem)
Your vDB (filesystem)
b' c' a d e f g h i
[Diagram: one vDB writes its own block i', while both vDBs still share b' c' a d e f g h i]
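A minimal sketch of how two vDBs can share one snapshot while keeping their own changes private. `VirtualDB` is a hypothetical name for illustration, and the overlay approach shown here is an assumption about the general technique, not a description of how Delphix implements it.

```python
from types import MappingProxyType


class VirtualDB:
    """Toy vDB: a shared read-only snapshot plus a private map of changed blocks."""

    def __init__(self, snapshot):
        self._base = snapshot   # shared by every vDB, never modified
        self._delta = {}        # blocks this vDB has written

    def write(self, name, contents):
        self._delta[name] = contents

    def read(self, name):
        # Prefer the private copy, fall back to the shared snapshot.
        return self._delta.get(name, self._base[name])

    def private_blocks(self):
        return len(self._delta)


if __name__ == "__main__":
    # The snapshot as it stands after the incremental: b and c replaced.
    blocks = {name: name for name in "adefghi"}
    blocks.update({"b": "b'", "c": "c'"})
    snapshot = MappingProxyType(blocks)   # read-only view

    mine = VirtualDB(snapshot)
    yours = VirtualDB(snapshot)
    mine.write("i", "i'")

    print("mine reads", mine.read("i"), "and yours reads", yours.read("i"))
    print("private blocks:", mine.private_blocks(), yours.private_blocks())
```

Each vDB costs only the blocks it changes, which is why many vDBs can be created from a single snapshot.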
Bureaucracy
• Developer: asks for DB, gets access
• Manager: approves
• DBA: requests system, sets up DB
• System Admin: requests storage, sets up machine
• Storage Admin: allocates storage (takes snapshot)
Technical Challenge
[Diagram: database LUNs on the production filer, with snapshot clones serving instances on Target A, Target B, and Target C]
Technical Challenge
[Diagram: source database LUNs on the production filer copied to a development filer, where snapshot clones serve instances on Target A, Target B, and Target C]
Technical Challenge
1. Production (instance, file system): Copy, Time Flow, Purge
2. Storage: Clone (snapshot), Compress, Share Cache
3. Target (instance): Provision (mount, recover, rename); Self Service, Roles & Security
How to get Data Virtualization?
1. Source sync
2. Storage snapshots
3. Target deploy
Product | Source Sync | Storage Snapshots | Deploy automation
ZFS | - | Yes (unlimited) | -
EMC | SRDF | Yes (16 or 255) | -
Netapp | SMO | Yes (255) | -
Oracle EM 12c | Data Guard | Netapp, ZFS | Yes (oracle only, no branching)
Actifio | Yes | Yes | Yes (no branching)
Delphix | Yes | Yes | Yes
Actifio
[Diagram: production instances, an Actifio appliance on the source side, an Actifio appliance on the target side, and target instances]
Oracle Snap Clone
[Diagram: production instances, EM 12c, ZFSSA or NetApp storage, and target instances]
Oracle Snap Clone
[Diagram: production instances, Data Guard instances, EM 12c, ZFSSA or NetApp storage, and target instances]
Oracle Snap Clone
[Diagram: production instances, EM 12c, Data Guard instance on Solaris ZFS, and target instances]
[Diagram: production instances on any storage; changes are collected incremental-forever into a Time Flow; target instances connect over NFS]
Before Virtual Data
[Diagram: Production, Dev/QA/UAT, Reporting, and Backup, each with its own file system and database copy (the “triple data tax”)]
With Virtual Data
[Diagram: Production, Dev & QA, Reporting, and Backup instances share one file system and database copy held on the Data Virtualization Appliance (DVA)]
[Diagram: Prod instance feeding the DVA, which serves Dev and QA instances]
• Eliminate build time
• Find bugs Fast
• Run Parallel QA
QA Virtual Data : Parallel
Production Time Flow
QA Virtual Data : Fast Refresh
[Timeline: 20 MIN TEST runs and 8 Hrs refreshes]
• Fast
• Full
• Fresh
• Efficient
Data Version Control
1/30/2015
[Diagram: Prod instance feeds the Production Time Flow on the DVA; Dev and QA instances are provisioned for versions 2.1 and 2.2]
Storage requirements: 9 TB database, 1 TB change per day, 30 days of backups
[Chart: storage required (y-axis, 0 to 70) over week 1 to week 4 for three series: original, Oracle, Delphix]
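A back-of-the-envelope sketch of the storage arithmetic for the slide's figures (9 TB database, 1 TB of change per day, 30 days of backups). The two retention policies and the compression ratio are assumptions made for illustration. They are not taken from the chart and are not meant to reproduce its exact series.

```python
import math


def weekly_full_plus_incrementals(db_tb, change_tb_per_day, days):
    """Assumed policy: one full copy per started week plus a daily incremental."""
    fulls = math.ceil(days / 7)
    return fulls * db_tb + days * change_tb_per_day


def incremental_forever(db_tb, change_tb_per_day, days, compression_ratio=1.0):
    """Assumed policy: a single full copy, then only changed blocks, optionally compressed."""
    return (db_tb + days * change_tb_per_day) / compression_ratio


if __name__ == "__main__":
    db_tb, change_tb, days = 9, 1, 30   # figures from the slide
    print("weekly fulls + incrementals:",
          weekly_full_plus_incrementals(db_tb, change_tb, days), "TB")
    print("incremental forever:",
          incremental_forever(db_tb, change_tb, days), "TB")
    # A compression ratio of 3 is a hypothetical value, not a vendor figure.
    print("incremental forever, compressed:",
          incremental_forever(db_tb, change_tb, days, compression_ratio=3), "TB")
```

Whatever the exact policy, the repeated full copies dominate the total, so collecting only changes is where the saving comes from.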
• Collect only Changes
• Refresh in minutes
[Diagram: Prod instance feeding the DVA, which serves BI and DW; ETL 24x7]
Virtual Data: Fast Refreshes
Production Time Flow
Modernization: Federated
[Diagram: Source1 and Source2 instances feed one DVA, which keeps Production Time Flow 1 and Production Time Flow 2]
Faster
• Financial Close
• BI refreshes
• Surgical recovery
• Projects
How expensive is the Data Constraint?
• Projects: “12 months to 6 months.” – New York Life
• Insurance product: “about 50 days ... to about 23 days” – Presbyterian Health
• “Can't imagine working without it” – State of California
Virtual Data Quotes
• Problem: Data is the constraint
• Solution: Virtualize Data
• Results:
  – Half the time for projects
  – Higher quality
  – Increase revenue
Summary
Thank you!
• Kyle Hailey | Oracle ACE and Technical Evangelist, Delphix
– [email protected]
– kylehailey.com
– slideshare.net/khailey
– @datavirt