
CERN/IT/DB

A Strawman Model for using Oracle for LHC Physics Data

Jamie Shiers, IT-DB, CERN


Overview

Focus on scalability & deployment aspects

Implicit assumption that OCCI / OTT can provide needed functionality

Learn from experience with Objectivity/DB deployment in LAN & WAN


Basic Concepts

Oracle Database refers to datafiles & server processes on a single system or cluster

User applications can access as many Oracle Databases as required

Different roles / schema / transaction boundaries etc. all supported out of the box

Oracle deployed today at 1-100TB level


LHC Datatypes / Volumes

RAW: 1PB / year

ESD: ~100TB / year

AOD: ~10TB / year

TAG: ~100GB-1TB / year


LHC Datatypes & Oracle

RAW: 1PB/yr           ~1 ‘DB’ / month
ESD: ~100TB/yr        ~1 ‘DB’ / year
AOD: ~10TB/yr         ~1 ‘DB’
TAG: ~100GB-1TB/yr    ~1 ‘DB’ combined with AOD

• Maybe possible to soften these to ~1 ‘DB’ for all ESD
• Would there be a strong advantage?
• Different ‘DB’s have different access patterns, access control, schema, … etc.
• Navigation between DBs fully supported (links)
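
The mapping of data volumes to ‘DB’s above implies a size per ‘DB’, which can be checked with simple arithmetic. The sketch below is illustrative only: it assumes 1PB = 1000TB, takes the upper bound of the TAG range, and uses hypothetical names (`TB_PER_YEAR`, `DBS_PER_YEAR`, `tb_per_db`, `combined_aod_tag_tb`) that do not come from the slides.

```python
# Yearly volumes from the slide, in terabytes.
# Assumptions: 1PB = 1000TB; TAG uses the upper bound of the 100GB-1TB range.
TB_PER_YEAR = {"RAW": 1000.0, "ESD": 100.0, "AOD": 10.0, "TAG": 1.0}

# New 'DB's opened per year under the strawman. AOD and TAG have no entry
# because the slide proposes a single 'DB' for them (TAG combined with AOD).
DBS_PER_YEAR = {"RAW": 12, "ESD": 1}


def tb_per_db(datatype: str) -> float:
    """Approximate size in TB of one 'DB' once its period has been filled."""
    return TB_PER_YEAR[datatype] / DBS_PER_YEAR[datatype]


def combined_aod_tag_tb(years: int) -> float:
    """Size in TB of the single AOD+TAG 'DB' after the given number of years."""
    return years * (TB_PER_YEAR["AOD"] + TB_PER_YEAR["TAG"])


if __name__ == "__main__":
    for datatype in DBS_PER_YEAR:
        print(f"{datatype}: ~{tb_per_db(datatype):.0f} TB per 'DB'")
    print(f"AOD+TAG after 5 years: ~{combined_aod_tag_tb(5):.0f} TB")
```

Under these assumptions both the monthly RAW ‘DB’ and the yearly ESD ‘DB’ land close to 100TB, and the combined AOD/TAG ‘DB’ stays below that for several years of running.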


A 100TB Oracle DB

Single machine or cluster?

Oracle stress “Real Application Clusters” with Oracle 9i – set of commodity systems vs ‘datacenter’ style server

Today’s Objy servers have ~1TB / disk accessible through 1 network connection

Scale to cluster of O(10) systems with O(100TB) disk?

Seems plausible…
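
The scaling question above can be put in numbers. This is a rough sketch, not a sizing study: it takes O(10) and O(100TB) as exactly 10 and 100, treats the disk as evenly divided across nodes (in a shared-disk cluster this is only an average ratio), and the names `disk_per_node_tb` and `TODAY_SERVER_TB` are hypothetical.

```python
def disk_per_node_tb(total_tb: float, nodes: int) -> float:
    """Average disk in TB per node if the total is divided evenly."""
    if nodes <= 0:
        raise ValueError("nodes must be positive")
    return total_tb / nodes


# ~1TB behind one network connection on today's Objy servers (from the slide).
TODAY_SERVER_TB = 1.0

# Assumption: O(100TB) and O(10) are taken as exactly 100 and 10.
per_node = disk_per_node_tb(100.0, 10)
growth_factor = per_node / TODAY_SERVER_TB

if __name__ == "__main__":
    print(f"~{per_node:.0f} TB per node, {growth_factor:.0f}x today's server")
```

With these round numbers each node would sit in front of roughly ten times the disk of a current server, which is the jump the "Seems plausible…" judgement rests on.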


Cluster Architecture

[Figure: cluster architecture diagram. Labels: Users; Network; Centralized Management Console; Hub or Switch Fabric; Clustered Database Servers; High Speed Switch or Interconnect; Low Latency Interconnect (VIA or Proprietary); Storage Area Network; Mirrored Disk Subsystem; No Single Point Of Failure]

Drive and Exploit Industry Advances in Clustering


Cache Fusion

Full Cache Fusion

Cache-to-cache data shipping

Shared cache eliminates slow I/O

Enhanced IPC

Allows Flexible and Transparent Deployment

[Figure labels: Users; Shared Cache; Cache Fusion]


O.R.A.C.

Certified Intel configurations from a number of vendors…

COMPAQ: PIII Xeon 700MHz, 4P, 4GB

FastTango: Oracle 9i cluster on Linux

Obtaining information from these and other vendors on suitable evaluation configurations…


Oracle Deployment

[Diagram; recoverable labels follow]

DAQ cluster: current data – no history

export tablespaces to RAW cluster

to/from MSS

ESD cluster: 1/year? 1?

AOD/TAG: 1 total?

to RCs; to/from RCs

reconstruct ‘shift’ analysis


100TB cluster testbed

BT have ~80TB Oracle DB today

Visit arranged for July 31

Other VLDB sites will also be visited e.g. Deutsche Telekom (DB2), DOCOMO, …


Why Cluster?

Separate DBs
• Simple, no cluster h/w or s/w
• Individual nodes (DBs) can be maintained independently
• Need additional layer to find DB
• Machines serving inactive data idle
• Each node is a single point of failure

Cluster
• Additional complexity, cost
• Entire cluster must be upgraded together
• No additional s/w layer
• All nodes used all of the time(?)
• Shared cache
• Reliability increases with additional nodes
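
The "additional layer to find DB" needed in the separate-DBs model could be as small as a lookup table from data period to database name. The following is a minimal sketch under stated assumptions: the class `DbCatalogue`, its methods, and the database names and years in the usage example are all hypothetical and are not part of the strawman or of any Oracle API.

```python
from typing import Dict, Optional, Tuple

# Key: (datatype, year, month); month is None for a 'DB' covering a whole year.
Key = Tuple[str, int, Optional[int]]


class DbCatalogue:
    """Hypothetical lookup layer mapping a data period to the 'DB' holding it."""

    def __init__(self) -> None:
        self._entries: Dict[Key, str] = {}

    def register(self, datatype: str, year: int,
                 month: Optional[int], db_name: str) -> None:
        """Record which 'DB' holds the data for the given period."""
        self._entries[(datatype, year, month)] = db_name

    def locate(self, datatype: str, year: int,
               month: Optional[int] = None) -> str:
        """Return the 'DB' name, trying the monthly entry before the yearly one."""
        for key in ((datatype, year, month), (datatype, year, None)):
            if key in self._entries:
                return self._entries[key]
        raise KeyError(f"no 'DB' registered for {datatype} {year}-{month}")


if __name__ == "__main__":
    catalogue = DbCatalogue()
    catalogue.register("RAW", 2007, 3, "raw_2007_03")   # one RAW 'DB' per month
    catalogue.register("ESD", 2007, None, "esd_2007")   # one ESD 'DB' per year
    print(catalogue.locate("RAW", 2007, 3))
    print(catalogue.locate("ESD", 2007, 6))
```

In the cluster model this layer disappears, since applications connect to the cluster as a whole; that is the "No additional s/w layer" entry in the comparison.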


Size of the Largest RDBMS in Commercial Use for DSS

Source: Database Scalability Program 2000

[Chart: size in Terabytes (values shown: 3, 50, 100) by year (1996, 2000, 2005); labelled "Projected By Respondents"]


Decision Support (2000)

Company                  DB Size* (TB)  DBMS Partner  Server Partner  Storage Partner
SBC                      10.50          NCR           NCR             LSI
First Union Nat. Bank    4.50           Informix      IBM             EMC
Dialog                   4.25           Proprietary   Amdahl          EMC
Telecom Italia (DWPT)    3.71           IBM           IBM             Hitachi
FedEx Services           3.70           NCR           NCR             EMC
Office Depot             3.08           NCR           NCR             EMC
AT & T                   2.83           NCR           NCR             LSI
SK C&C                   2.54           Oracle        HP              EMC
NetZero                  2.47           Oracle        Sun             EMC
Telecom Italia (DA)      2.32           Informix      Siemens         TerraSystems

*Database size = sum of user data + summaries and aggregates + indexes


Transaction Processing (2000)

Company                  DB Size* (TB)  DBMS Partner  Server Partner  Storage Partner
Telstra                  10.36          IBM           IBM, Hitachi    IBM
British Telecom          8.45           CA            IBM             EMC
United Parcel Service    7.88           IBM           IBM             EMC
Experian                 3.14           IBM           Hitachi         EMC
US Customs Service       2.70           CA            IBM             Hitachi
Korea Telecom (KT ICIS)  2.26           Oracle        Compaq          StorageTek
Dacom System Tech.       1.80           Oracle        Pyramid         Seagate
CheckFree                1.35           IBM           IBM             IBM
Centrelink               1.27           CCA           IBM             IBM
LG TelCom                1.13           Oracle        HP              EMC

*Database size = sum of user data + summaries and aggregates + indexes


Summary

~100TB DBs (in Oracle sense) will be fully supported by mainstream vendors on LHC timescales

The gap between our requirements & those of commercial firms is narrowing fast