RAC Internals

Julian Dyke, Independent Consultant

CERN, Geneva - November 2008

juliandyke.com © 2008 Julian Dyke



About me...

20 years Oracle experience as DBA, developer and consultant

Independent Consultant specializing in
  Kernel Performance Tuning
  RAC and High Availability

Chair of UKOUG RAC & HA SIG

Regular presenter at conferences, seminars and user group meetings in UK, Europe and USA

Member of Oak Table Network

Website http://www.juliandyke.com specializing in Oracle internals

About the book...

Pro Oracle Database 10g RAC on Linux

Co-authored with Steve Shaw of Intel Corporation

Published by Apress

Available August 2006

ISBN: 1-59059-524-6

New edition planned for 2009 (Oracle 11gR2)


Agenda

  Interconnect
  RAC Background Processes
  Global Cache Services

RAC - 4-node cluster

[Diagram: four nodes (Node 1 to Node 4), each running one instance (Instance 1 to Instance 4). The nodes are attached to the public network, to the private network (interconnect), and through the storage network to shared storage.]

Interconnect - Overview

Instances communicate with each other over the interconnect (network)

Information transferred between instances includes
  data blocks
  locks
  SCNs

Typically 1Gb Ethernet
  UDP protocol
  Often teamed in pairs to avoid SPOFs

Can also use Infiniband
  Fewer levels in stack

Other proprietary protocols are available

Interconnect - TCP/IP Five Layer Model

All messages travel down through layers, across physical layer then up again

[Diagram: two protocol stacks side by side, each with the layers 5 Application, 4 Transport, 3 Network, 2 Data Link, 1 Physical, joined at the physical layer]

Interconnect - TCP/IP Five Layer Model

TCP/IP has a four or five layer model
Five-layer model shown below

Layer           TCP/IP Suite
5 Application   DHCP, DNS, FTP, HTTP, SSH, NFS, NTP, SMTP, SNMP, TELNET, RPC, SOAP
4 Transport     TCP, UDP
3 Network       IP (IPv4, IPv6), ICMP, ARP, RARP
2 Data Link     Ethernet, Token Ring, 802.11, Wi-Fi, FDDI, PPP
1 Physical      10BASE-T, 100BASE-T, 1000BASE-T, Optical Fibre, Twisted Pair

Four-layer model combines data link and physical layers

Interconnect - TCP/IP Transport Layer

Transport Layer
  Connection-oriented (TCP)
  Connectionless (UDP)

[Diagram: protocol stack with the physical layer at the bottom, then Ethernet, then IP, then TCP and UDP side by side. Clusterware is shown above TCP and RAC above UDP.]

Interconnect - Encapsulation

[Diagram: the data is wrapped in successive headers on its way down the stack: Data, then UDP Header + Data, then IP Header + UDP Header + Data, then Ethernet Header + IP Header + UDP Header + Data + Ethernet Trailer. The diagram also marks the MTU size.]

Ethernet Header    14 bytes
IP Header          20 bytes
UDP Header          8 bytes
Ethernet Trailer    4 bytes
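The header sizes above make the per-packet overhead easy to work out. The following is a minimal sketch, assuming the MTU covers the IP header, UDP header and data (the usual Ethernet convention) and an IPv4 header without options. The function names and the 9000-byte jumbo-frame MTU are illustrative assumptions, not taken from the slides.

```python
# Header and trailer sizes in bytes, as listed on the encapsulation slide.
ETHERNET_HEADER = 14
IP_HEADER = 20
UDP_HEADER = 8
ETHERNET_TRAILER = 4


def udp_payload_per_packet(mtu: int) -> int:
    """Largest UDP payload that fits in one unfragmented IP packet.

    Assumes the MTU covers IP header + UDP header + data.
    """
    return mtu - IP_HEADER - UDP_HEADER


def frame_size_on_wire(mtu: int) -> int:
    """Size of a full Ethernet frame carrying an MTU-sized IP packet."""
    return ETHERNET_HEADER + mtu + ETHERNET_TRAILER


if __name__ == "__main__":
    # 1500 is the standard Ethernet MTU; 9000 is a common jumbo-frame
    # setting and is used here purely as an assumption for comparison.
    for mtu in (1500, 9000):
        print(mtu, udp_payload_per_packet(mtu), frame_size_on_wire(mtu))
```

Under these assumptions a database block larger than the per-packet payload cannot travel in a single standard frame, which is why block transfers later in the deck appear as several packets.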

Oracle Clusterware - Node Heartbeat Messages

Sent to each node in cluster every second in both directions
  Checks nodes are still members of cluster

Sent by ocssd.bin using TCP well-known port 49895
  Outgoing message is 134 bytes (80 byte payload)
  Incoming message is 66 bytes (12 byte payload)

[Diagram: Node 1 to Node 4 exchanging outgoing and incoming heartbeat messages]

Oracle Clusterware - Node Status Messages

Number of packets exchanged by a node is determined by number of nodes in cluster

Number of packets per node per hour is

(#nodes - 1) * 4 messages * 3600 seconds

Number of nodes   Packets per hour
 2                 14,400
 3                 28,800
 4                 43,200
 5                 57,600
 6                 72,000
 7                 86,400
 8                100,800
16                216,000
32                446,400
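The packet counts for each cluster size follow directly from the formula on this slide. A minimal sketch that evaluates it is shown here; the function name is an illustrative assumption.

```python
def status_packets_per_hour(nodes: int) -> int:
    """Packets exchanged by one node per hour.

    Implements the slide's formula: (#nodes - 1) * 4 messages * 3600 seconds.
    """
    if nodes < 2:
        raise ValueError("a cluster needs at least two nodes")
    return (nodes - 1) * 4 * 3600


if __name__ == "__main__":
    for n in (2, 3, 4, 5, 6, 7, 8, 16, 32):
        print(f"{n:2d} nodes: {status_packets_per_hour(n):,} packets per hour")
```

The growth is linear per node, so the cluster-wide total grows roughly with the square of the node count.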

Global Services - Overview

Resource
  Object to which access must be controlled at instance level

Enqueue
  Memory structure that serializes access to a resource

Global Resources
  Object to which access must be controlled at cluster level

Global Enqueue
  Locks and enqueues which need to be consistent between all instances

Global Services - Overview

Global Resource Directory (GRD)
  Records current state and owner of each resource
  Contains convert and write queues
  Distributed across all instances in cluster
  Maintained by GCS and GES

Global Cache Services (GCS)
  Implements cache coherency for database
  Coordinates access to database blocks for instances

Global Enqueue Services (GES)
  Controls access to other resources (locks) including library cache and dictionary cache
  Performs deadlock detection

RAC Background Processes - Overview

[Diagram: two nodes, each running one instance (Instance 1 on Node 1, Instance 2 on Node 2). Each instance has a shared pool and a buffer cache, the background processes PMON, SMON, DBWR, LGWR, CKPT and ARCn, and the RAC background processes LMON, LMD0, LMSn, LCK0 and DIAG. Redo logs are shown for each instance, together with the datafiles and controlfiles.]

RAC Background Processes - LMSn

LMSn - Global Cache Service Process

Manage requests for data access across cluster

Up to 20 in Oracle 10.1
  LMS0-LMS9 LMSa-LMSj

Up to 36 in Oracle 10.2
  LMS0-LMS9 LMSa-LMSz

In Oracle 10.1 and above, number of GCS server processes can be configured using gcs_server_processes parameter
  Default value is 1 (single CPU system)
  Can also be configured using _lm_lms parameter
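The naming scheme (ten digits followed by lower-case letters) can be sketched as follows. This is an illustration of the ranges listed on the slide only; the function name and its parameters are assumptions.

```python
import string
from typing import List


def lms_process_names(count: int, max_processes: int = 36) -> List[str]:
    """Return the first `count` LMS process names.

    Suffixes run 0-9 and then a, b, c, ... as on the slide:
    20 names end at LMSj (Oracle 10.1), 36 names end at LMSz (Oracle 10.2).
    """
    suffixes = (string.digits + string.ascii_lowercase)[:max_processes]
    if not 1 <= count <= len(suffixes):
        raise ValueError(f"count must be between 1 and {len(suffixes)}")
    return [f"LMS{suffix}" for suffix in suffixes[:count]]


if __name__ == "__main__":
    print(lms_process_names(20, max_processes=20))  # 10.1 limit
    print(lms_process_names(36))                    # 10.2 limit
```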

RAC Background Processes - LMSn

In Oracle 10.2 and above LMS processes run in real-time mode
  Remaining processes run in time-share mode

Check using:

    [oracle@server3 ~]$ ps -eo pid,user,opri,cmd | grep ora_lm
    8596 oracle 75 ora_lmon_TEST1
    8598 oracle 75 ora_lmd0_TEST1
    8601 oracle 58 ora_lms0_TEST1

58 is real time; 75 or 76 is time share

You can also check process scheduling policies using chrt

    [oracle@server3 ~]$ chrt -p 8601 # lms0 - Real Time
    pid 8601's current scheduling policy: SCHED_RR
    pid 8601's current scheduling priority: 1

    [oracle@server3 ~]$ chrt -p 8596 # lmon - Time Share
    pid 8596's current scheduling policy: SCHED_OTHER
    pid 8596's current scheduling priority: 0
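The priority check above can be scripted. The sketch below classifies lines in the `pid user opri cmd` layout shown on the slide, using only the mapping the slide gives (58 is real time; 75 or 76 is time share). Other priority values are reported as unknown, since the slide does not cover them. The function name is an assumption.

```python
from typing import Tuple

# Priority values taken from the slide; anything else is not covered there.
REAL_TIME_PRIORITIES = {58}
TIME_SHARE_PRIORITIES = {75, 76}


def scheduling_class(ps_line: str) -> Tuple[str, str]:
    """Classify one line of `ps -eo pid,user,opri,cmd` output.

    Returns (command, "real time" | "time share" | "unknown").
    """
    _pid, _user, opri, cmd = ps_line.split(None, 3)
    priority = int(opri)
    if priority in REAL_TIME_PRIORITIES:
        label = "real time"
    elif priority in TIME_SHARE_PRIORITIES:
        label = "time share"
    else:
        label = "unknown"
    return cmd.strip(), label


if __name__ == "__main__":
    sample = [
        "8596 oracle 75 ora_lmon_TEST1",
        "8598 oracle 75 ora_lmd0_TEST1",
        "8601 oracle 58 ora_lms0_TEST1",
    ]
    for line in sample:
        print(scheduling_class(line))
```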

RAC Background Processes - LCK0

LCK0 - Instance Enqueue Process

Part of KCL (Kernel Cache Library)

Manages
  instance resource requests
  cross-instance call operations

Assists LMS processes

Formerly known as lock process

One LCK0 process per instance

In 9.0.1 and below, number of lock processes may be configurable using _gc_lck_procs parameter

RAC Background Processes - LMD0

LMD0 - Global Enqueue Service Daemon

Manages requests for global enqueues

Updates status of enqueues when granted to / revoked from an instance

Responsible for deadlock detection

One LMD0 process per instance

In 8.1.7 and below number of lock daemons may be configurable using _lm_dlmd_processes parameter

RAC Background Processes - LMON

LMON - Global Enqueue Service Monitor

One LMON process per instance

Monitors cluster to maintain global enqueues and resources

Manages
  instance and process expirations
  recovery processing for cluster enqueues

RAC Background Processes - DIAG

DIAG - Diagnosability Process

Collects diagnostic data in the event of a failure

Creates subdirectories in BACKGROUND_DUMP_DEST directory

In Oracle 9.0.1 and above can be disabled using _diag_daemon parameter

Do not try this on a production system

Global Cache Services - Introduction

Global Cache Services exist to implement Cache Fusion

Cache Fusion allows blocks to be updated by multiple instances

Only one instance can have the updatable (current) version of a block

GCS must ensure that only one instance can update a block at any time

Many instances can have read-only (consistent read) versions of a block

Instances can have multiple copies of same block at different SCNs

Global Cache Services - 2-way Consistent Read

Instance 2 requests current read on block

[Diagram: Instances 1 to 4, one of which is the resource master. Numbered steps: 1 Request shared resource, 2 Request granted, 3 Read request, 4 Block returned. Resource modes shown: S and N. Block SCN shown: 1318.]

Global Cache Services - 3-way Current Read

Instance 1 requests exclusive read on block

[Diagram: Instances 1 to 4, one of which is the resource master. Numbered steps 1 to 4 with the labels: Request exclusive resource; Transfer block to Instance 1 for exclusive access; Block and resource status; Resource status. Resource modes shown: S, N and X. Block SCNs shown: 1318 and 1320.]

Global Cache Services - 3-way Current Read (Dirty Block)

Instance 4 requests exclusive read on block

[Diagram: Instances 1 to 4, one of which is the resource master. Numbered steps 1 to 4 with the labels: Request block in exclusive mode; Transfer block to Instance 4 in exclusive mode; Block and resource status; Resource status. Resource modes shown: S, N and X. Block SCNs shown: 1318, 1320 and 1323.]

Note that Instance 1 will create a past image (PI) of the dirty block

Global Cache Services - 3-way Current (Without Downgrade)

Instance 2 requests current read on block

[Diagram: Instances 1 to 4, one of which is the resource master. Numbered steps 1 to 4 with the labels: Request block in shared mode; Transfer block to Instance 2 in shared mode; Block and resource status; Resource status. Resource modes shown: N and X. Block SCNs shown: 1318, 1320 and 1323.]

In Oracle 8.1.5 and above _fairness_threshold is used to avoid unnecessary lock conversions

Global Cache Services - 3-way Current (With Downgrade)

Instance 2 requests current read on block

[Diagram: Instances 1 to 4, one of which is the resource master. Numbered steps 1 to 4 with the labels: Request block in shared mode; Transfer block to Instance 2 in shared mode; Block and resource status; Resource status. Resource modes shown: N, X and S. Block SCNs shown: 1318, 1320 and 1323.]

In Oracle 8.1.5 and above _fairness_threshold is used to avoid unnecessary lock conversions

Global Cache Services - Wait Events

Wait events show reads where messages have been exchanged with other instances
Can include:

  gc cr grant 2-way
  gc cr block 2-way
  gc cr block 3-way
  gc cr multi block request
  gc current grant 2-way
  gc current block 2-way
  gc current block 3-way
  gc current multi block request
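The event names follow a regular pattern: read type (cr or current), what was received (grant, block, or a multi block request), and for single-block events the number of instances involved. A minimal sketch that splits a name into these parts is given here; it handles only the eight names listed on this slide, and the `GcWait` type and `classify` function are illustrative assumptions.

```python
import re
from typing import NamedTuple, Optional


class GcWait(NamedTuple):
    read_type: str       # "cr" (consistent read) or "current"
    what: str            # "grant", "block" or "multi block request"
    ways: Optional[int]  # 2 or 3; None when the name carries no "n-way" suffix


_PATTERN = re.compile(
    r"^gc (cr|current) (grant|block|multi block request)(?: ([23])-way)?$"
)


def classify(event: str) -> GcWait:
    """Split a global cache wait event name into its components."""
    match = _PATTERN.match(event.strip())
    if match is None:
        raise ValueError(f"not a recognised gc wait event: {event!r}")
    read_type, what, ways = match.groups()
    return GcWait(read_type, what, int(ways) if ways else None)


if __name__ == "__main__":
    for name in ("gc cr grant 2-way", "gc current block 3-way",
                 "gc cr multi block request"):
        print(name, "->", classify(name))
```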

Global Cache Services - Cache Fusion Example

[Diagram: RAC1 to RAC4 with a Resource Master label. The block is shown at SCN 1318 with row values (1,40) (2,44), then (1,42) (2,44), then (1,42) (2,50), as the two statements below are executed.]

1 UPDATE t1 SET c2 = 42 WHERE c1 = 1;

2 UPDATE t1 SET c2 = 50 WHERE c1 = 2;

Global Cache Services - Cache Fusion Example

RAC4 executes

UPDATE t1 SET c2 = 42 WHERE c1 = 2;

No statistics so dynamic sampling required
No indexes so full table scan required
Steps are:

Phase              Block            Read type         Transfer
Dynamic Sampling   Table block 15   Consistent Read   3-way
                   Undo block 89    Consistent Read   2-way
                   Undo block 239   Consistent Read   2-way
Consistent Read    Table block 15   Consistent Read   3-way
                   Undo block 89    Consistent Read   2-way
Current Read       Table block 15   Current Read      3-way

Global Cache Services - Cache Fusion Example

Dynamic Sampling - 10046/8

PARSING IN CURSOR #4 len=433 dep=1 uid=55 oct=3 lid=55 hv=574971495 ad='2b8da360'
SELECT /* OPT_DYN_SAMP */ /*+ ALL_ROWS IGNORE_WHERE_CLAUSE NO_PARALLEL(SAMPLESUB) opt_param('parallel_execution_enabled', 'false') NO_PARALLEL_INDEX(SAMPLESUB) NO_SQL_TUNE */ NVL(SUM(C1),:"SYS_B_0"), NVL(SUM(C2),:"SYS_B_1") FROM (SELECT /*+ IGNORE_WHERE_CLAUSE NO_PARALLEL("T7") FULL("T7") NO_PARALLEL_INDEX("T7") */ :"SYS_B_2" AS C1, CASE WHEN "T7"."C1"=:"SYS_B_3" THEN :"SYS_B_4" ELSE :"SYS_B_5" END AS C2 FROM "T7" "T7") SAMPLESUB
END OF STMT
PARSE #4:c=0,e=423,p=0,cr=0,cu=0,mis=1,r=0,dep=1,og=1
EXEC #4:c=1999,e=10615,p=0,cr=0,cu=0,mis=1,r=0,dep=1,og=1

WAIT #4: nam='gc cr block 3-way' ela= 836 p1=8 p2=15 p3=1 obj#=51836
WAIT #4: nam='gc cr block 2-way' ela= 442 p1=6 p2=89 p3=67 obj#=51836
WAIT #4: nam='gc cr block 2-way' ela= 453 p1=6 p2=239 p3=68 obj#=51836

FETCH #4:c=0,e=2540,p=0,cr=10,cu=0,mis=0,r=1,dep=1,og=1
STAT #4 id=1 cnt=1 pid=0 pos=1 obj=0 op='SORT AGGREGATE (cr=10 pr=0 pw=0 time=3903 us)'
STAT #4 id=2 cnt=32 pid=1 pos=1 obj=51836 op='TABLE ACCESS FULL T7 (cr=10 pr=0 pw=0 time=2650 us)'

Global Cache Services - Cache Fusion Example

UPDATE statement - 10046/8

PARSING IN CURSOR #1 len=34 dep=0 uid=55 oct=6 lid=55 tim=1168417842291309 hv=3829255502 ad='2b8d04dc'
UPDATE t7 SET c2 = 20 WHERE c1 = 5
END OF STMT
PARSE #1:c=10998,e=61121,p=0,cr=11,cu=0,mis=1,r=0,dep=0,og=1

WAIT #1: nam='gc cr block 3-way' ela= 702 p1=8 p2=15 p3=1 obj#=51836
WAIT #1: nam='gc cr block 2-way' ela= 447 p1=6 p2=89 p3=67 obj#=0

WAIT #1: nam='gc current block 3-way' ela= 650 p1=8 p2=15 p3=33554433 obj#=51836

EXEC #1:c=0,e=2931,p=0,cr=10,cu=1,mis=0,r=1,dep=0,og=1
WAIT #1: nam='SQL*Net message to client' ela= 5 driver id=1650815232 #bytes=1 p3=0 obj#=51836
WAIT #1: nam='SQL*Net message from client' ela= 7807082 driver id=1650815232 #bytes=1 p3=0 obj#=51836
STAT #1 id=1 cnt=0 pid=0 pos=1 obj=0 op='UPDATE T7 (cr=10 pr=0 pw=0 time=2875 us)'
STAT #1 id=2 cnt=1 pid=1 pos=1 obj=51836 op='TABLE ACCESS FULL T7 (cr=10 pr=0 pw=0 time=1665 us)'
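The global cache WAIT lines in these 10046 traces can be picked apart with a small parser. This is a sketch that handles only the `key=value` layout of the gc lines shown above (it does not attempt lines such as the SQL*Net waits, whose parameter names contain spaces). Reading `ela` as microseconds and `p1`/`p2` as file and block number is an interpretation based on the matching block numbers elsewhere in the deck, not something the trace itself states. The function name is an assumption.

```python
import re
from typing import Dict

_WAIT = re.compile(
    r"^WAIT #(?P<cursor>\d+): nam='(?P<name>[^']+)' ela= *(?P<ela>\d+) (?P<rest>.*)$"
)


def parse_gc_wait(line: str) -> Dict[str, object]:
    """Parse one WAIT line into cursor number, event name, elapsed time and parameters."""
    match = _WAIT.match(line.strip())
    if match is None:
        raise ValueError(f"not a WAIT line: {line!r}")
    params = {}
    for token in match.group("rest").split():
        key, sep, value = token.partition("=")
        if sep:  # keep only well-formed key=value tokens
            params[key] = value
    return {
        "cursor": int(match.group("cursor")),
        "name": match.group("name"),
        "ela": int(match.group("ela")),  # assumed to be microseconds
        "params": params,
    }


if __name__ == "__main__":
    lines = [
        "WAIT #1: nam='gc cr block 3-way' ela= 702 p1=8 p2=15 p3=1 obj#=51836",
        "WAIT #1: nam='gc cr block 2-way' ela= 447 p1=6 p2=89 p3=67 obj#=0",
        "WAIT #1: nam='gc current block 3-way' ela= 650 p1=8 p2=15 p3=33554433 obj#=51836",
    ]
    waits = [parse_gc_wait(line) for line in lines]
    print(sum(w["ela"] for w in waits), "total elapsed across gc waits")
```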

Global Cache Services - gc cr block 3-way wait event

Source          Destination     Description                    Bytes
RAC4 - Server   RAC2 - LMS1     Request file 8 block 15          456
RAC2 - LMS1     RAC4 - Server   OK                               212
RAC2 - LMS1     RAC3 - LMS1     Send file 8 block 15 to RAC4     480
RAC3 - LMS1     RAC2 - LMS1     OK                               212
RAC3 - LMS1     RAC4 - Server   Block file 8 block 15 part 1    1500
RAC3 - LMS1     RAC4 - Server   Block file 8 block 15 part 2    1500
RAC3 - LMS1     RAC4 - Server   Block file 8 block 15 part 3    1500
RAC3 - LMS1     RAC4 - Server   Block file 8 block 15 part 4    1500
RAC3 - LMS1     RAC4 - Server   Block file 8 block 15 part 5    1500
RAC3 - LMS1     RAC4 - Server   Block file 8 block 15 part 6     868
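To get a feel for the network cost of a single gc cr block 3-way wait, the message sizes in this table can be totalled. The sketch below simply re-enters the rows of the table (source, destination, description, bytes) and sums them; the variable and function names are illustrative assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Message = Tuple[str, str, str, int]  # (source, destination, description, bytes)

# Rows copied from the gc cr block 3-way message table.
MESSAGES: List[Message] = [
    ("RAC4 - Server", "RAC2 - LMS1", "Request file 8 block 15", 456),
    ("RAC2 - LMS1", "RAC4 - Server", "OK", 212),
    ("RAC2 - LMS1", "RAC3 - LMS1", "Send file 8 block 15 to RAC4", 480),
    ("RAC3 - LMS1", "RAC2 - LMS1", "OK", 212),
] + [
    ("RAC3 - LMS1", "RAC4 - Server", f"Block file 8 block 15 part {part}", size)
    for part, size in enumerate([1500, 1500, 1500, 1500, 1500, 868], start=1)
]


def total_bytes(messages: List[Message]) -> int:
    """Total bytes across all messages in the exchange."""
    return sum(size for _src, _dst, _desc, size in messages)


def bytes_by_source(messages: List[Message]) -> Dict[str, int]:
    """Bytes sent, grouped by the sending process."""
    totals: Dict[str, int] = defaultdict(int)
    for src, _dst, _desc, size in messages:
        totals[src] += size
    return dict(totals)


if __name__ == "__main__":
    print(total_bytes(MESSAGES))
    print(bytes_by_source(MESSAGES))
```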

Global Cache Services - gc cr block 3-way wait event

[Diagram: RAC1 to RAC4 with a Resource Master label, showing the message flow for this wait event as numbered steps 1 to 10 while the statement below is executed. Block shown at SCN 1318 with row values (1,40) (2,44) and (1,42) (2,44).]

UPDATE t1 SET c2 = 50 WHERE c1 = 2;

Global Cache Services - gc cr block 2-way wait event

2-way Consistent Read

Source          Destination     Description                    Bytes
RAC4 - Server   RAC3 - LMS1     Request file 6 block 69          400
RAC3 - LMS1     RAC4 - Server   OK                               212
RAC3 - LMS1     RAC4 - Server   Block file 6 block 69 part 1    1500
RAC3 - LMS1     RAC4 - Server   Block file 6 block 69 part 2    1500
RAC3 - LMS1     RAC4 - Server   Block file 6 block 69 part 3    1500
RAC3 - LMS1     RAC4 - Server   Block file 6 block 69 part 4    1500
RAC3 - LMS1     RAC4 - Server   Block file 6 block 69 part 5    1500
RAC3 - LMS1     RAC4 - Server   Block file 6 block 69 part 6     868

Global Cache Services - gc cr block 2-way wait event

[Diagram: RAC1 to RAC4 with a Resource Master label, showing the message flow for this wait event as numbered steps 1 to 8 while the statement below is executed. Block shown at SCN 1318 with row values (1,40) (2,44).]

UPDATE t1 SET c2 = 50 WHERE c1 = 2;

Global Cache Services - gc current block 3-way wait event

3-way Current Read

Source          Destination     Description                    Bytes
RAC4 - Server   RAC2 - LMS1     Request file 8 block 15          456
RAC2 - LMS1     RAC4 - Server   OK                               212
RAC2 - LMS1     RAC3 - LMS1     Send file 8 block 15 to RAC4     480
RAC3 - LMS1     RAC2 - LMS1     OK                               212
RAC3 - LMS1     RAC4 - Server   Block file 8 block 15 part 1    1500
RAC3 - LMS1     RAC4 - Server   Block file 8 block 15 part 2    1500
RAC3 - LMS1     RAC4 - Server   Block file 8 block 15 part 3    1500
RAC3 - LMS1     RAC4 - Server   Block file 8 block 15 part 4    1500
RAC3 - LMS1     RAC4 - Server   Block file 8 block 15 part 5    1500
RAC3 - LMS1     RAC4 - Server   Block file 8 block 15 part 6     868
RAC4 - LMS1     RAC2 - LMS1     Received file 8 block 15         244
RAC2 - LMS1     RAC4 - LMS1     OK                               212

Global Cache Services - gc current block 3-way wait event

[Diagram: RAC1 to RAC4 with a Resource Master label, showing the message flow for this wait event as numbered steps 1 to 12 while the statements below are executed. Block shown at SCN 1318 with row values (1,40) (2,44), then (1,42) (2,44), then (1,42) (2,50).]

UPDATE t1 SET c2 = 42 WHERE c1 = 1;

UPDATE t1 SET c2 = 50 WHERE c1 = 2;

RAC3 saves past image of the dirty block until RAC4 writes the block to disk

Global Cache Services - Past Images

When an instance passes a dirty block to another instance it
  Flushes redo buffer to redo log
  Retains past image (PI) of block in buffer cache

PI is retained until another instance writes block to disk
  Used to reduce recovery times

Recorded in V$BH.STATUS as PI
  Based on X$BH.STATE (value 8 in Oracle 10.2)

Global Cache Services - Past Images

[Animated diagram: Instance 1 and Instance 2, each with a buffer cache and its own redo log (Redo Log 1 and Redo Log 2), and block 42 on disk holding the value 7123. The values 7124 to 7129 appear in the buffer caches and redo logs as the steps below are played.]

Assume table t1 contains a single row in block 42

Instance 1 updates column to 7124 (UPDATE t1 SET c1 = 7124; COMMIT;)
  Block 42 is read from disk
  Undo/Redo written to Redo Log 1
  Block 42 is updated in buffer cache

Instance 1 updates column to 7125 (UPDATE t1 SET c1 = 7125; COMMIT;)
  Undo/Redo written to Redo Log 1
  Block 42 is updated in buffer cache

Instance 1 updates column to 7126 (UPDATE t1 SET c1 = 7126; COMMIT;)
  Undo/Redo written to Redo Log 1
  Block 42 is updated in buffer cache

Instance 1 updates column to 7127 (UPDATE t1 SET c1 = 7127; COMMIT;)
  Undo/Redo written to Redo Log 1
  Block 42 is updated in buffer cache

Instance 1 updates column to 7128 (UPDATE t1 SET c1 = 7128; COMMIT;)
  Undo/Redo written to Redo Log 1
  Block 42 is updated in buffer cache

Instance 2 updates column to 7129 (UPDATE t1 SET c1 = 7129; COMMIT;)
  GCS transfers block from Instance 1 to Instance 2
  Instance 1 makes block 42 a Past Image block
  Undo/redo written to Redo Log 2
  Block 42 is updated in buffer cache

Instance 2 crashes
  Contents of buffer cache are lost
  DBWR has not written changes to block 42 back to disk yet

Instance 1 must perform recovery for Instance 2
  Block 42 needs recovery
  Instance 1 uses Past Image
  Undo/redo is applied from Redo Log 2
  Block 42 is subsequently written back to disk by DBWR
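The benefit of the past image in this walkthrough can be illustrated with a toy model. The sketch below is not how Oracle implements recovery: it treats each redo record as simply "the new column value", and it assumes that without a past image recovery would have to start from the on-disk block and replay both redo logs. That second assumption goes beyond the slide, which only states that past images reduce recovery times. All names are illustrative.

```python
from typing import Dict, List, Tuple

DISK_VALUE = 7123  # value of the row in block 42 on disk (never rewritten here)


def apply_redo(start_value: int, redo_logs: List[List[int]]) -> Tuple[int, int]:
    """Replay redo records in order; return (final value, records applied)."""
    value, applied = start_value, 0
    for log in redo_logs:
        for new_value in log:
            value = new_value
            applied += 1
    return value, applied


def run_scenario() -> Dict[str, object]:
    redo_log_1: List[int] = []  # written by Instance 1
    redo_log_2: List[int] = []  # written by Instance 2

    # Instance 1 updates the column to 7124 .. 7128.
    current = DISK_VALUE
    for new_value in range(7124, 7129):
        redo_log_1.append(new_value)
        current = new_value

    # GCS transfers the block to Instance 2; Instance 1 keeps a past image.
    past_image = current

    # Instance 2 updates the column to 7129, then crashes before DBWR
    # writes block 42 to disk. Its buffer cache is lost; its redo log is not.
    redo_log_2.append(7129)

    return {
        "past_image": past_image,
        # Recovery starting from Instance 1's past image:
        "with_pi": apply_redo(past_image, [redo_log_2]),
        # Toy comparison: recovery starting from the on-disk block:
        "without_pi": apply_redo(DISK_VALUE, [redo_log_1, redo_log_2]),
    }


if __name__ == "__main__":
    print(run_scenario())
```

In this model both paths arrive at the same final value; the difference is only in how many redo records have to be applied.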

Global Cache Services - gc cr grant 2-way wait event

2-way Consistent Read

Source          Destination     Description                  Bytes
RAC4 - Server   RAC3 - LMS1     Request file 6 block 69        400
RAC3 - LMS1     RAC4 - Server   OK                             212
RAC3 - LMS1     RAC4 - Server   Grant read file 6 block 69     276
RAC4 - Server   RAC3 - LMS1     OK                             212

Global Cache Services - gc cr grant 2-way wait event

[Diagram: RAC1 to RAC4 with a Resource Master label, showing the message flow for this wait event as numbered steps 1 to 6 while the statement below is executed. Block shown at SCN 1318 with row values (1,40) (2,44).]

SELECT c2 FROM t1 WHERE c1 = 1;

Global Cache Services - gc cr multi block request wait event

Source          Destination     Description                         Bytes
RAC4 - Server   RAC3 - LMS1     Request file 8 blocks 69-73          1872
RAC3 - LMS1     RAC4 - Server   OK                                    212
RAC3 - LMS1     RAC4 - Server   Grant file 8 blocks 69-73 to RAC4     772
RAC4 - Server   RAC3 - LMS1     OK                                    212

Global Cache Services - gc cr multi block request wait event

[Diagram: RAC1 to RAC4 with a Resource Master label, showing the message flow for this wait event as numbered steps 1 to 6 while the statement below is executed. Several blocks are shown at SCN 1318, each with row values (1,40) (2,44).]

SELECT c2 FROM t1 WHERE c1 = 1;

Global Cache Services - gc cr multi block request wait event

The following 10046/8 trace is for a gc cr multi block request

WAIT #2: nam='gc cr multi block request' ela= 722 file#=4 block#=248 class#=1 obj#=51866 tim=1169728375495574

WAIT #2: nam='db file scattered read' ela= 10437 file#=4 block#=244 blocks=5 obj#=51866 tim=1169728375506092

This trace can be misleading because:
  the gc cr multi block request specifies the LAST block in the range
  the gc cr multi block request does not specify how many blocks should be read
  the gc cr multi block request does not specify how many blocks have been returned from another instance
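The first point can be checked against the two trace lines: the scattered read starts at block 244 and covers 5 blocks, so its last block is 248, which is the block number reported by the gc cr multi block request. A small sketch of that arithmetic, with helper names that are illustrative assumptions:

```python
import re
from typing import Dict

GC_LINE = ("WAIT #2: nam='gc cr multi block request' ela= 722 file#=4 "
           "block#=248 class#=1 obj#=51866 tim=1169728375495574")
READ_LINE = ("WAIT #2: nam='db file scattered read' ela= 10437 file#=4 "
             "block#=244 blocks=5 obj#=51866 tim=1169728375506092")


def numeric_params(line: str) -> Dict[str, int]:
    """Extract the key=number pairs (file#, block#, blocks, ...) from a WAIT line."""
    return {key: int(value) for key, value in re.findall(r"([a-z]+#?)=(\d+)", line)}


def last_block_of_read(read_params: Dict[str, int]) -> int:
    """Last block covered by a multiblock read starting at block# for `blocks` blocks."""
    return read_params["block#"] + read_params["blocks"] - 1


if __name__ == "__main__":
    gc = numeric_params(GC_LINE)
    read = numeric_params(READ_LINE)
    print("gc request block#:", gc["block#"])
    print("scattered read range:", read["block#"], "to", last_block_of_read(read))
```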

Global Cache Services - UDP Messages

There are two types of message exchanged within RAC
These are PROBABLY defined as follows

Synchronous
  These messages require an acknowledgement for each packet
  In some cases the acknowledgement packet can be larger than the original request
  e.g. SCN synchronization

Asynchronous
  These messages do not require an individual acknowledgement for each packet
  e.g. block transfers between instances

Global Cache Services - Lock Modes

Lock modes can be:
  Null
    Another instance can hold an exclusive or shared lock
  Shared
    Another instance can hold a shared lock but not an exclusive lock
  Exclusive
    No other instances can hold shared or exclusive locks

Locks can also be:
  Local
    No other instance has held an exclusive lock
  Global
    Another instance has held an exclusive lock in the past
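The three lock modes amount to a small compatibility rule: a null lock is compatible with anything, two shared locks can coexist, and an exclusive lock excludes shared and exclusive locks elsewhere. A minimal sketch of that rule follows; treating null as compatible with an exclusive holder is an inference from the earlier diagrams, which show N alongside X, and the names used are illustrative.

```python
from itertools import product

MODES = ("N", "S", "X")  # Null, Shared, Exclusive


def can_coexist(mode_a: str, mode_b: str) -> bool:
    """True if two instances may hold these modes on the same block at once."""
    if mode_a not in MODES or mode_b not in MODES:
        raise ValueError("modes must be one of N, S, X")
    if "N" in (mode_a, mode_b):
        return True               # a null lock never conflicts
    return mode_a == mode_b == "S"  # only shared/shared remains compatible


if __name__ == "__main__":
    for a, b in product(MODES, repeat=2):
        print(a, b, can_coexist(a, b))
```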

Global Cache Services - Fairness Threshold

Intended to prevent unnecessary lock downgrades when other instances only require read-only copies

For write to read transfers
  Writing instance retains X lock
  Reading instance retains null lock

If _fairness_threshold reached then
  Writing instance downgrades X lock to S lock
  Reading instance receives S lock

_fairness_threshold default value is 4
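The behaviour described here can be pictured with a toy counter. This is a sketch under stated assumptions, not Oracle's algorithm: it assumes the holder counts read requests served while it holds the X lock and downgrades on the request that brings the count up to the threshold. The slides do not say exactly what is counted or when the count resets, so the class and its counting rule are hypothetical.

```python
class BlockHolder:
    """Toy model of an instance holding a block in exclusive (X) mode."""

    def __init__(self, fairness_threshold: int = 4) -> None:
        self.fairness_threshold = fairness_threshold  # slide: default value is 4
        self.mode = "X"
        self.reads_served = 0  # hypothetical counter, see assumptions above

    def serve_read_request(self) -> str:
        """Serve one read request; return the lock mode the reader ends up with."""
        if self.mode == "S":
            return "S"  # already downgraded, readers share the block
        self.reads_served += 1
        if self.reads_served >= self.fairness_threshold:
            # Threshold reached: writer downgrades X to S, reader receives S.
            self.mode = "S"
            self.reads_served = 0
            return "S"
        # Below threshold: writer retains X, reader retains a null lock.
        return "N"


if __name__ == "__main__":
    holder = BlockHolder()
    print([holder.serve_read_request() for _ in range(5)], holder.mode)
```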

Global Cache Services - Lock Elements

Lock elements are externalized in the V$LOCK_ELEMENT dynamic performance view

Based on X$LE

Additional information is available in the X$LE view

Past image buffers do not have a lock element

In OPS one lock element could manage a contiguous range of blocks

Still can in RAC using GC_FILES_TO_LOCKS parameter
  Disables Cache Fusion

Global Cache Services - Lock Elements

Contain embedded GCS Client structures (KJBL)

[Diagram: three lock elements, each containing a GCS client, linked to buffer headers; one of the lock elements is shown with two buffer headers]


Global Cache Services: Memory Structures

[Diagram: GCS resources (KJBR), GCS clients and GCS shadows (KJBL), lock elements (LE) and block headers (BH)]

GCS Shadow describes blocks held by other instances, but mastered locally


Global Cache Services: Memory Structures

GCS Resources (KJBR)
    Stored in segmented array
    Number of GCS resource structures determined by _gcs_resources parameter
    Externalized in X$KJBR
    Number of free GCS resource structures in X$KJBRFX

GCS Enqueues (Clients / Shadows) (KJBL)
    GCS clients embedded in lock elements
    GCS shadows stored in segmented array
    Number of GCS shadow structures determined by _gcs_shadow_locks parameter
    Externalized in X$KJBL
    Number of free GCS shadow structures in X$KJBLFX


Global Cache Services: Dumps

To dump the contents of the global cache use:
    ALTER SESSION SET EVENTS 'IMMEDIATE TRACE NAME GC_ELEMENTS LEVEL 1';

GLOBAL CACHE ELEMENT DUMP (address: 0x21fecd18):
  id1: 0x3591 id2: 0x10000 obj: 181 block: (1/13713)
  lock: SL rls: 0x0000 acq: 0x0000 latch: 0
  flags: 0x41 fair: 0 recovery: 0 fpin: 'kdswh05: kdsgrp'
  bscn: 0x0.18a9c bctx: (nil) write: 0 scan: 0x0 xflg: 0 xid: 0x0.0.0

GCS CLIENT 0x21fecd60,1 sq[(nil),(nil)] resp[(nil),0x3591.10000] pkey 181
  grant 1 cvt 0 mdrole 0x21 st 0x20 GRANTQ rl LOCAL
  master 1 owner 0 sid 0 remote[(nil),0] hist 0x7c
  history 0x3c.0x1.0x0.0x0.0x0.0x0. cflag 0x0 sender 2 flags 0x0 replay# 0
  disk: 0x0000.00000000 write request: 0x0000.00000000
  pi scn: 0x0000.00000000
  msgseq 0x1 updseq 0x0 reqids[1,0,0] infop 0x0
  pkey 181
  hv 107 [stat 0x0, 1->1, wm 32767, RMno 0, reminc 6, dom 0]
  kjga st 0x4, step 0.0.0, cinc 8, rmno 10, flags 0x0
  lb 0, hb 0, myb 178, drmb 178, apifrz 0
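In the element dump above, id1 and id2 appear to encode the same block that is printed as (1/13713). A minimal sketch of that decoding follows; the function name is hypothetical, and the assumption that the file number sits in the upper 16 bits of id2 is inferred from this one dump rather than from Oracle documentation.

```python
def decode_element_ids(id1, id2):
    """Return (file_number, block_number) for a lock element's id1/id2 pair."""
    block_number = id1
    file_number = id2 >> 16   # assumption: file number is held in the upper 16 bits
    return file_number, block_number


# id1/id2 values taken from the element dump above
print(decode_element_ids(0x3591, 0x10000))   # (1, 13713)
```

The result agrees with the block: (1/13713) field printed on the same line of the dump.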


Global Cache Services: Dumps

Continued

GLOBAL CACHE ELEMENT DUMP (address: 0x237f4358):
  id1: 0x6a39 id2: 0x10000 obj: 74 block: (1/27193)
  lock: SL rls: 0x0000 acq: 0x0000 latch: 0
  flags: 0x41 fair: 0 recovery: 0 fpin: 'kdswh05: kdsgrp'
  bscn: 0x0.26992 bctx: (nil) write: 0 scan: 0x0 xflg: 0 xid: 0x0.0.0

GCS SHADOW 0x237f43a0,1 sq[0x2ee64e8c,0x2eff3858] resp[0x2ee64e74,0x6a39.10000] pkey 74
  grant 1 cvt 0 mdrole 0x21 st 0x40 GRANTQ rl LOCAL
  master 0 owner 0 sid 0 remote[(nil),0] hist 0x12a5
  .....

GCS RESOURCE 0x2ee64e74 hashq [0x2ee61894,0x2ff57390] name[0x6a39.10000] pkey 74
  grant 0x2eff3858 cvt (nil) send (nil),0 write (nil),0@65535
  flag 0x0 mdrole 0x1 mode 1 scan 0 role LOCAL
  .....

GCS SHADOW 0x2eff3858,1 sq[0x237f43a0,0x2ee64e8c] resp[0x2ee64e74,0x6a39.10000] pkey 74
  grant 1 cvt 0 mdrole 0x21 st 0x40 GRANTQ rl LOCAL
  master 0 owner 1 sid 0 remote[0x23fea160,1] hist 0x65f
  .....

GCS SHADOW 0x237f43a0,1 sq[0x2ee64e8c,0x2eff3858] resp[0x2ee64e74,0x6a39.10000] pkey 74
  grant 1 cvt 0 mdrole 0x21 st 0x40 GRANTQ rl LOCAL
  master 0 owner 0 sid 0 remote[(nil),0] hist 0x12a5
  .....


Global Cache Services: Block Mastering

Each block is mastered on one instance
    Block DBA is reported by X$KJBR.KJBRNAME

Names have the format:
    [<block_number>][<file_number>][BL]

For example
    [0x137][0x40000][BL]
is file# 4, block# 311

Ordering by X$KJBR.KJBRNAME is difficult because the resource names do not collate when sorted e.g.:
    [0x12E][0x40000][BL]
    [0x12F][0x40000][BL]
    [0x13][0x40000][BL]
    [0x130][0x40000][BL]
    [0x131][0x40000][BL]
    etc...


Global Cache Services: Block Mastering

Some useful functions

CREATE OR REPLACE FUNCTION get_file_number (p_resource_name VARCHAR2)
RETURN INTEGER
IS
  -- File field is the text between the second 'x' and the second ']'
  pos1 INTEGER := INSTR (p_resource_name,'x',1,2);
  pos2 INTEGER := INSTR (p_resource_name,']',1,2);
  s VARCHAR2(30) := SUBSTR (p_resource_name,pos1+1,pos2-pos1-1);
BEGIN
  -- The file field is the file number multiplied by 65536
  RETURN TO_NUMBER (s,'XXXXXXXX') / 65536;
END;
/

CREATE OR REPLACE FUNCTION get_block_number (p_resource_name VARCHAR2)
RETURN INTEGER
IS
  -- Block field is the text between the first 'x' and the first ']'
  pos1 INTEGER := INSTR (p_resource_name,'x',1,1);
  pos2 INTEGER := INSTR (p_resource_name,']',1,1);
  s VARCHAR2(30) := SUBSTR (p_resource_name,pos1+1,pos2-pos1-1);
BEGIN
  RETURN TO_NUMBER (s,'XXXXXXXX');
END;
/


Global Cache Services: Block Mastering

In Oracle 10.2 block mastering is determined by _lm_contiguous_res_count
    Specifies number of contiguous blocks that will hash to the same HV bucket
    Defaults to 128

For example

Instance 0          Instance 1
Start    End        Start    End
0x000    0x07F      0x080    0x0FF
0x100    0x17F      0x180    0x1FF
0x200    0x27F      0x280    0x2FF
0x300    0x37F      0x380    0x3FF
0x400    0x47F      0x480    0x4FF
0x500    0x57F      0x580    0x5FF
etc      etc        etc      etc
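The Start and End values in the example follow directly from the contiguous count. As a small sketch (the function name is hypothetical, and it only computes range boundaries; it makes no claim about which instance masters a given range):

```python
CONTIGUOUS_RES_COUNT = 128   # default _lm_contiguous_res_count quoted on the slide


def contiguous_range(block_number, count=CONTIGUOUS_RES_COUNT):
    """Return (start, end) of the contiguous block range containing block_number."""
    start = (block_number // count) * count
    return start, start + count - 1


start, end = contiguous_range(0x137)
print(f"0x{start:03X}-0x{end:03X}")   # 0x100-0x17F
```

Every block in the returned range shares one master, so a small segment that fits inside a single range is mastered entirely on one instance.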


Global Cache Services: Block Mastering

In Oracle 10.1 and below block mastering is determined by a hash function

Algorithm applied to groups of 1289 contiguous blocks
    In two node cluster
        Instance 0 has 645 blocks
        Instance 1 has 644 blocks
        etc
    In three node cluster
        Instance 0 has 430 blocks
        Instance 2 has 215 blocks
        Instance 1 has 430 blocks
        Instance 2 has 214 blocks
        etc

Beware of small hot tables and indexes....


Global Cache Services: Block Mastering

The following table shows that masters are still assigned to ranges of 128 contiguous blocks in a four-node cluster

Start Block  End Block  Master
0            127        1
128          255        2
256          383        2
384          511        3
512          639        3
640          767        3
768          895        1
896          1023       0
1024         1279       2
1280         1407       1


Global Cache Services: Dynamic Remastering

In Oracle 9.2
    documentation describes dynamic remastering
    not implemented in code

In Oracle 10.1
    works at data file level
    very high threshold so difficult to test
    does occur on some customer sites

In Oracle 10.2
    works at segment level
    thresholds are relatively low


Global Cache Services: Dynamic Remastering

Example
    SELECT data_object_id FROM dba_objects
    WHERE owner = 'US01'
    AND object_name = 'T1';

    DATA_OBJECT_ID
    --------------
             52084

To remaster object at current instance use:
    ORADEBUG LKDEBUG -m pkey 52084

All blocks now mastered by the current instance

To redistribute masters to all available instances use:
    ORADEBUG LKDEBUG -m dpkey 52084

Blocks mastered by both (all) instances again


Global Cache Services: Dynamic Remastering

Object remastering is recorded in V$GCSPFMASTER_INFO
    Instances are internally numbered 0, 1 etc
    Initially contains no rows

SELECT object_id, current_master, previous_master
FROM v$gcspfmaster_info;

After remastering object 52084 to instance 0

Object ID  Current Master  Previous Master
52084      0               32767

After remastering object 52084 to instance 1

Object ID  Current Master  Previous Master
52084      1               0


Global Cache Services: Dynamic Remastering

In Oracle 10.2 and above, information about Dynamic Remastering operations is also reported in the following fixed views

X$KJDRMREQ        Dynamic Remastering Requests
X$KJDRMAFNSTATS   File Remastering Statistics
X$KJDRMHVSTATS    Hash Value Statistics


Global Cache Services: Dynamic Remastering

In Oracle 11.1 and above, Dynamic Remastering statistics are reported in V$DYNAMIC_REMASTER_STATS

Column Name              Data Type
REMASTER_OPS             NUMBER
REMASTER_TIME            NUMBER
REMASTERED_OBJECTS       NUMBER
QUIESCE_TIME             NUMBER
FREEZE_TIME              NUMBER
CLEANUP_TIME             NUMBER
REPLAY_TIME              NUMBER
FIXWRITE_TIME            NUMBER
SYNC_TIME                NUMBER
RESOURCES_CLEANED        NUMBER
REPLAYED_LOCKS_SENT      NUMBER
REPLAYED_LOCKS_RECEIVED  NUMBER
CURRENT_OBJECTS          NUMBER


Global Cache Services: Dynamic Remastering

Dynamic remastering is coordinated by the LMD0 background process
    The LMD0 background process trace file includes limited details of dynamic remastering operations

Excessive dynamic remastering can cause instance freezes
    Observed in both Oracle 10.1 and 10.2
    Oracle Support occasionally recommends that dynamic remastering is disabled using the following parameters:
        _gc_affinity_time = 0
        _gc_undo_affinity = FALSE


Global Cache Services: System Change Number

In RAC clusters SCN must be maintained across all nodes in cluster

SCN propagation scheme differs according to version

In Oracle 10.1 and below defaults to Lamport algorithm
    Lamport in alert.log
    SCN piggy-backed on GCS/GES messages
    Recorded in redo log
    Default delay of 7 seconds

In Oracle 10.2 and above defaults to Broadcast on Commit algorithm
    SCN negotiated immediately
    Apparently no delay


Global Cache Services: System Change Number

System Change Number algorithm is determined by the MAX_COMMIT_PROPAGATION_DELAY parameter

In Oracle 10.1 and below
    Initialization parameter specified in centiseconds
    Default value is 700 centiseconds (7 seconds)
    Specifies maximum time taken for a COMMIT on one node to be reflected on other nodes in the cluster
    For some applications performing rapid updates and queries of the same data from different instances, value must be set to 0 (Broadcast on commit)
    Examples include:
        E-Business Suite
        SAP


Global Cache Services: System Change Number

In Oracle 10.2 and above
    Default value of MAX_COMMIT_PROPAGATION_DELAY parameter is 0
    SCN broadcast on commit method is used
    SCN updates are synchronized immediately

SCN is synchronized after current read, before block updated

This ensures correct SCN is written to block


Global Cache Services: Broadcast on Commit

Ethernet broadcast is not used

SCN is synchronized by updating instance
    Sends UDP SCN synchronization message to each remote instance
    Remote instances respond with their current SCN

Another round of messages may be required if remote SCNs are more recent than local SCN

Synchronization occurs every time an instance needs a new SCN
    Synchronization is always performed by the updating instance
    Number of messages = 4 x (number of instances - 1)
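The message-count formula above grows linearly with cluster size. A small sketch of the arithmetic follows; the function name is hypothetical, and the formula is simply the one quoted on the slide, not something measured here.

```python
def scn_sync_messages(instances):
    """Messages exchanged per SCN synchronization, using the slide's formula."""
    if instances < 1:
        raise ValueError("a cluster has at least one instance")
    return 4 * (instances - 1)


for n in (2, 3, 4, 8):
    print(n, scn_sync_messages(n))
```

For a four-instance cluster the function returns 12, and each additional instance adds four messages to every synchronization.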


Global Cache Services: Broadcast on Commit

In a 4-node cluster 12 messages are exchanged

Source     Destination  Description       Bytes
RAC4-LMS0  RAC1-LMS0    Send current SCN  192
RAC1-LMS0  RAC4-LMS0    OK                212
RAC4-LMS0  RAC2-LMS0    Send current SCN  192
RAC2-LMS0  RAC4-LMS0    OK                212
RAC4-LMS0  RAC3-LMS0    Send current SCN  192
RAC3-LMS0  RAC4-LMS0    OK                212
RAC1-LMS0  RAC4-LMS0    Send current SCN  192
RAC4-LMS0  RAC1-LMS0    OK                212
RAC2-LMS0  RAC4-LMS0    Send current SCN  192
RAC4-LMS0  RAC2-LMS0    OK                212
RAC3-LMS0  RAC4-LMS0    Send current SCN  192
RAC4-LMS0  RAC3-LMS0    OK                212


Global Cache Service: Read Consistency

When a read consistent version of a block is requested it may be necessary to apply undo to a more recent version of that block

Undo can be applied by LMSn background process in
    Remote instance
    Local instance

If undo applied by remote instance, any outstanding redo must first be flushed from redo buffer of remote instance to redo log

Can have significant performance impact on consistent reads
    Particularly on extended clusters


Global Cache Service: Read Consistency

Statistics on inter-instance consistent reads are reported in V$CR_BLOCK_SERVER

Reports statistics for blocks served by local instances to remote instances including
    Number of consistent reads served
    Number of current reads served
    Number of data blocks served
    Number of undo blocks served
    Number of undo headers served
    Number of fairness down converts
    Number of log flushes
    Number of times light works rule invoked


Global Cache Service: Read Consistency

In theory, once a block has been written to disk, the LMS process will not attempt to read it again when responding to a consistent read request

Light Works Rule
    Prevents LMS processes from going to disk when responding to CR requests for data, undo or undo segment blocks
    Can prevent LMS process from completing its response to a CR request


Global Cache Service: Read Consistency

Uncommitted changes MUST be flushed to the redo log before the LMS process can ship a consistent block to another instance

Reading process must wait until redo log changes have been written to redo log by LMS process

Bad for standard RAC databases
    Reads must wait for redo log writes

Worse for extended / stretch RAC clusters
    Increased latency of cross site disk communications


Global Cache Service: Read Consistency

For each block on which a consistent read is performed, a redo log flush must first be performed

Number of redo log flushes is recorded in the FLUSHES column of V$CR_BLOCK_SERVER

Redo log flush time is recorded in the gc cr block flush time statistic for the LMS process
    will increase time taken to serve consistent block
    will increase time taken to perform consistent read

If LMS processes become very busy, consistent reads will experience high wait times e.g. gc cr multi block request for a full table scan


Global Cache Services: Read Consistency

Committed transaction on RAC2 - All blocks still in buffer cache

[Diagram: RAC1 and RAC2, each with a buffer cache and a redo buffer, plus the redo log; block versions 108, 109 and 110; numbered steps 1 to 3]


Global Cache Services: Read Consistency

Committed transaction on RAC2 - Some blocks written to disk

[Diagram: RAC1 and RAC2, each with a buffer cache and a redo buffer, plus the redo log; block versions 108, 109 and 110; numbered steps 1 to 4]


Global Cache Services: Read Consistency

Uncommitted transaction on RAC2 - All blocks still in buffer cache

[Diagram: RAC1 and RAC2, each with a buffer cache and a redo buffer, plus the redo log; block versions 108, 109 and 110; numbered steps 1 to 6]


Global Cache Services: Read Consistency

Uncommitted transaction on RAC2 - Some blocks written to disk

[Diagram: RAC1 and RAC2, each with a buffer cache and a redo buffer, plus the redo log; block versions 108, 109 and 110; numbered steps 1 to 8]


Global Cache Services: Jumbo Frames

By default Maximum Transmission Unit (MTU) is 1500

MTU includes
    IP header
    UDP header
    Data

Requires six packets to transmit one 8192 byte block

On some adapters MTU can be increased to around 9000
    e.g. Intel PRO/1000

At command line
    ifconfig eth1 mtu 9000 up

or in /etc/sysconfig/ifcfg-eth<x>
    MTU=9000


Global Cache Services: Jumbo Frames

Example - cost of sending one 8192 byte block

MTU=1500 (default)

Frame#  Ethernet Header  IP Header  UDP Header  Data  Ethernet Trailer  Total
1       14               20         8           1472  4                 1518
2       14               20         8           1472  4                 1518
3       14               20         8           1472  4                 1518
4       14               20         8           1472  4                 1518
5       14               20         8           1472  4                 1518
6       14               20         8           840   4                 886
Total   84               120        48          8200  24                8476

MTU=9000

Frame#  Ethernet Header  IP Header  UDP Header  Data  Ethernet Trailer  Total
1       14               20         8           8200  4                 8246
Total   14               20         8           8200  4                 8246
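The totals in the two tables can be reproduced with a few lines of arithmetic. This is a sketch that follows the slide's own accounting, in which every frame carries an Ethernet header, IP header, UDP header and Ethernet trailer, and the data column sums to 8200 bytes; the function name is hypothetical, and real IP fragmentation may distribute headers differently.

```python
import math

ETH_HEADER, IP_HEADER, UDP_HEADER, ETH_TRAILER = 14, 20, 8, 4


def frame_cost(data_bytes, mtu):
    """Return (frames, bytes_on_wire) using the slide's per-frame accounting."""
    # The MTU covers the IP header, UDP header and data, so this is the data per frame
    data_per_frame = mtu - IP_HEADER - UDP_HEADER
    frames = math.ceil(data_bytes / data_per_frame)
    overhead = ETH_HEADER + IP_HEADER + UDP_HEADER + ETH_TRAILER
    return frames, data_bytes + frames * overhead


print(frame_cost(8200, 1500))   # (6, 8476)
print(frame_cost(8200, 9000))   # (1, 8246)
```

On these figures jumbo frames save 230 bytes of header overhead per block, and more significantly reduce six frames to one.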


Global Cache Services: Jumbo Frames

Not all network adapter drivers support jumbo frames
    Particularly cheap ones....

All network adapters in private interconnect must have same MTU size

Switch must also be configured to support jumbo frames

Lots of bugs and compatibility issues e.g.
    Bug 4447620: RAC UDP MTU size restricted to 1500 or 9000
        affects 10.1.0.5, 10.2.0.1
        fixed in 10.2.0.2 and above


Thank you for listening

Any questions?
