CERN/IT/DB
DB US Visit
Oracle Visit, August 20 – 24, 2001 [plus related news]
Introduction
Oracle Strawman Model
Issues & Concerns
Current Status & Future Directions
AOB
LHC Datatypes & Oracle
RAW: 1PB/yr, ~1 ‘DB’ / month
ESD: ~100TB/yr, ~1 ‘DB’ / year
AOD: ~10TB/yr, ~1 ‘DB’
TAG: ~100GB-1TB/yr, ~1 ‘DB’ combined with AOD
• Maybe possible to soften these to ~1 ‘DB’ for all ESD
• Would there be a strong advantage?
• Different ‘DB’s have different access patterns, access control, schema, etc.
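The ‘DB’ counts above can be checked with simple arithmetic. The sketch below assumes a ‘DB’ holds roughly 100TB (the "~100TB databases" figure from the Building Blocks slide), decimal units, and the upper end of the TAG range; the names `DB_SIZE_TB`, `yearly_volume_tb` and `dbs_per_year` are illustrative only.

```python
# Back-of-envelope: how many ~100TB 'DB's does each data type fill per year?
# Assumptions (not from the slide): decimal units, 1 PB = 1000 TB,
# and a nominal 'DB' size of 100 TB.

DB_SIZE_TB = 100

yearly_volume_tb = {
    "RAW": 1000,  # 1 PB/yr
    "ESD": 100,   # ~100 TB/yr
    "AOD": 10,    # ~10 TB/yr
    "TAG": 1,     # upper end of ~100 GB - 1 TB/yr
}


def dbs_per_year(volume_tb, db_size_tb=DB_SIZE_TB):
    """Number of nominal 'DB's filled per year by a given yearly volume."""
    return volume_tb / db_size_tb


for name, volume in yearly_volume_tb.items():
    print(f"{name}: {volume} TB/yr -> {dbs_per_year(volume):.2f} 'DB's per year")
```

Under these assumptions RAW fills about ten ‘DB’s a year, which is close to the "~1 ‘DB’ / month" on the slide, while AOD and TAG together stay well inside a single ‘DB’ for several years.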
Oracle Deployment
[Diagram labels]
DAQ cluster: current data – no history
export tablespaces to RAW cluster
to/from MSS
ESD cluster: 1/year? 1?
AOD/TAG: 1 total?
to RCs, to/from RCs
reconstruct, ‘shift’, analysis
[Figure: data types RAW (1PB/yr), ESD (100TB/yr), AOD (10TB/yr) and TAG (1TB/yr); other labels: random, seq., Data, Users, Tier0, Tier1]
Building Blocks
~100TB “databases” – clusters?
OCCI / OTT
Transportable tablespaces & other techniques for data import / export / exchange
BT Visit – July 2001
Oracle VLDB site: Enormous Proof of Concept test in 1999
• 80TB disk, 40TB mirrored, 37TB usable
• Performed using Oracle 8i, EMC storage
• “Single instance” – i.e. not cluster
Based on same techniques as identified on paper by IT-DB
Demonstrated > 2 years ago!
No concerns for building 100TB today!
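As a rough illustration of what the quoted figures imply, the sketch below scales the 1999 test's usable-to-raw ratio up to a 100TB database. It assumes, purely for illustration, that the same ratio would hold on a larger system; the function names are hypothetical.

```python
# Figures quoted for the 1999 proof-of-concept test.
RAW_DISK_TB = 80   # total disk
MIRRORED_TB = 40   # capacity left after mirroring
USABLE_TB = 37     # capacity usable for data


def usable_fraction(raw_tb=RAW_DISK_TB, usable_tb=USABLE_TB):
    """Fraction of raw disk that ended up usable in the test."""
    return usable_tb / raw_tb


def raw_disk_needed(target_usable_tb, fraction=None):
    """Raw disk needed for a target usable size, assuming the same ratio."""
    if fraction is None:
        fraction = usable_fraction()
    return target_usable_tb / fraction


print(f"usable fraction of raw disk: {usable_fraction():.4f}")
print(f"raw disk for 100 TB usable: {raw_disk_needed(100):.0f} TB")
```

With mirroring and overheads included, a 100TB usable database would need a little over twice that amount of raw disk if the ratio carried over unchanged.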
Size of the Largest RDBMS in Commercial Use for DSS
Source: Database Scalability Program 2000
[Chart: size in Terabytes by year; labels 3, 50, 100 and 1996, 2000, 2005; annotated “Projected By Respondents”]
Issues & Concerns
VLDB support
Cluster issues (RAC on Linux etc.)
C++ binding / object model definition
Storage issues
VLDB Support
RAW DB model revised to active partitions as part of DB (catalog) + offline partitions (not part of DB) + “historical data” (maybe in separate DB?)
Such a strategy is used by many Data Warehouse sites in production today
Does not require any special features
But Oracle likes the suggestion of extending “resumable statements” to provide “automatic but controlled” access to offline data
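The active / offline partition model and the idea of "automatic but controlled" access can be sketched as a toy simulation. This is not Oracle's resumable-statement interface; it only mimics the suspend / recall / resume idea. All names (`Partition`, `read_partition`, `approve_recall`, the partition names) are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    ACTIVE = "active"    # part of the DB (known to the catalog, on disk)
    OFFLINE = "offline"  # not part of the DB, e.g. archived in mass storage


@dataclass
class Partition:
    name: str
    state: State


class RecallDenied(Exception):
    """Raised when the policy refuses to bring an offline partition back."""


def read_partition(catalog, name, approve_recall):
    """Read a partition, suspending and resuming if it is offline.

    approve_recall is the "controlled" part: a policy callback that
    decides whether the offline data may be brought back.
    """
    part = catalog[name]
    events = []
    if part.state is State.OFFLINE:
        events.append(f"suspend: {name} is offline")
        if not approve_recall(name):
            raise RecallDenied(name)
        # Stands in for restoring the data and re-attaching it to the DB.
        part.state = State.ACTIVE
        events.append(f"resume: {name} is active again")
    events.append(f"read: {name}")
    return events


catalog = {
    "raw_2001_08": Partition("raw_2001_08", State.ACTIVE),
    "raw_2001_01": Partition("raw_2001_01", State.OFFLINE),
}
print(read_partition(catalog, "raw_2001_08", approve_recall=lambda n: True))
print(read_partition(catalog, "raw_2001_01", approve_recall=lambda n: True))
```

The policy callback is where a site could limit how much offline data is recalled at once, or restrict recalls to certain users.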
VLDB support cont.
Oracle addressing limits of current architecture
• Already permits 2EB databases
Limits on e.g. # files, partitions etc are expected to be significantly increased beyond Oracle 9i
An area of work, but not concern…
Cluster Issues
Real Application Clusters = RAC = significant advance over previous OPS
Should be good fit for HEP read-mostly data
Supported on Linux by COMPAQ, FastTango
Not critical to overall model, but could simplify deployment significantly, e.g. small number of clusters: 1 / data type
COMPAQ Oracle competency centre in Valbonne…
OCCI / OTT
On-going work with Oracle developers to fix bugs / provide enhancements to meet HEP requirements
• Fixes prior to start of β, during β & post β
• …
Currently using HEP data models
• More examples welcome…
Enhancements via pre-releases, dot releases and 9i R2
Storage Issues
Oracle number format provides greater precision than IEEE double
Solutions being investigated to allow efficient storage of floats / doubles / ints without user specifying precision / range
Target: next major Oracle release?
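The precision gap can be illustrated without a database. The sketch below uses Python's `decimal` module with 38 significant digits as a stand-in for the Oracle number format (which holds up to 38 decimal digits) and compares it with an IEEE double; the sample value is arbitrary.

```python
import sys
from decimal import Decimal, getcontext

# 38 significant decimal digits, used here as a stand-in for Oracle NUMBER.
getcontext().prec = 38

# An arbitrary value with 28 significant digits.
exact = Decimal("0.1234567890123456789012345678")

# Round to the nearest IEEE 754 double, then recover its exact decimal value.
as_double = float(exact)
stored = Decimal(as_double)

error = abs(stored - exact)

print("decimal digits guaranteed by a double:", sys.float_info.dig)
print("value kept as a 38-digit decimal:     ", exact)
print("value after a trip through a double:  ", stored)
print("absolute error:                       ", error)
```

The double keeps only the leading digits, so values that originate as floats or doubles gain nothing from the wider format, which is why a more compact way of storing them is of interest.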
Side Visits
FNAL
Objectivity
IBM
COMPAQ
FastTango / NetAppliance
Summary
Oracle demonstrably interested in continuing to work with HEP on VLDB / LHC issues, both regarding 9i and also future Oracle versions
Strong support from Oracle server group
Strong interest from Oracle Grid team