troubleshooting complex performance issues - oracle seg$ contention

43
1 Troubleshooting Another Complex Performance Issue (Part 4) Tanel Põder Gluent, Inc. http://gluent.com http://blog.tanelpoder.com

Upload: tanel-poder

Post on 16-Apr-2017

38.404 views

Category:

Software


6 download

TRANSCRIPT

Page 1: Troubleshooting Complex Performance issues - Oracle SEG$ contention

gluent.com 1

Troubleshooting  Another  Complex  Performance  Issue  (Part  4)

Tanel  PõderGluent,  Inc.

http://gluent.comhttp://blog.tanelpoder.com

Page 2: Troubleshooting Complex Performance issues - Oracle SEG$ contention

gluent.com 2

Intro:  About  me

• Tanel  Põder• Oracle  Database  Performance  geek  (18+  years)• Exadata  Performance  geek• Linux  Performance  geek• Hadoop  Performance  geek

• CEO  &  co-­‐founder:

Expert  Oracle  Exadata  book

(2nd edition  is  out  now!)

Instant  promotion

Page 3: Troubleshooting Complex Performance issues - Oracle SEG$ contention

gluent.com 3

Gluent:  All  Enterprise  Data  Available  in  Hadoop!

GluentHadoop

Gluent

MSSQL

Tera-­‐data

IBM  DB2

Big  Data  Sources

Oracle

App  X

App  Y

App  Z

Page 4: Troubleshooting Complex Performance issues - Oracle SEG$ contention

gluent.com 4

A  refresher

Page 5: Troubleshooting Complex Performance issues - Oracle SEG$ contention

gluent.com 5

Your  task  is  slow  because

It  is  doing  too  many  operations

or

The  operations  complete  too  slowly

or  both

Page 6: Troubleshooting Complex Performance issues - Oracle SEG$ contention

gluent.com 6

Measure  &  Profile!

1. Top  SQL  &  wait  events

2. Performance  counters

3. Oracle  kernel  functions

ASH,  sqlmonsql_tracesnappersqlmonpstackperf

Page 7: Troubleshooting Complex Performance issues - Oracle SEG$ contention

gluent.com 7

Let's  get  started!

Page 8: Troubleshooting Complex Performance issues - Oracle SEG$ contention

gluent.com 8

A  Data  Warehouse  data  loading/preparation

• A  large  Exadata  /  RAC  /  Oracle  11.2.0.3  reporting  environment• Lots  of  parallel  CTAS  and  direct  path  loads• High  concurrency,  high  parallelism• Throughput  bad,  all  kinds  of  waits  showing  up:

• Other,  Cluster,  Configuration,  Application,  Concurrency,  User  I/O,  CPU

Page 9: Troubleshooting Complex Performance issues - Oracle SEG$ contention

gluent.com 9

Parallel  Degree  48,  96+,  inter-­‐instance  PX

• Mostly  CREATE  TABLE  AS  SELECT  statements!• Creating  pre-­‐aggregated  tables,  etc

Page 10: Troubleshooting Complex Performance issues - Oracle SEG$ contention

gluent.com 10

Drilldown  with  ASH  data  – instance-­‐wide

• Wait  event  names  are  more  informative  than  wait  classes:

SQL> @ashtop session_state,wait_class,event session_type='FOREGROUND'sysdate-1/24 sysdate

%This SESSION WAIT_CLASS EVENT AAS Seconds------ ------- -------------- ------------------------------ ------ -------

32% ON CPU 40.6 14619713% WAITING Cluster gc buffer busy acquire 16.8 6055911% WAITING Configuration enq: TX - allocate ITL entry 13.9 501159% WAITING Cluster gc buffer busy release 11.3 407398% WAITING Scheduler resmgr:cpu quantum 9.9 357114% WAITING Application enq: TX - row lock contention 5.2 187983% WAITING User I/O direct path read temp 4.3 154003% WAITING User I/O cell smart table scan 3.7 132592% WAITING Concurrency buffer busy waits 3 107352% WAITING Other enq: CF - contention 2.1 74431% WAITING Application enq: TM - contention 1.7 6109

Page 11: Troubleshooting Complex Performance issues - Oracle SEG$ contention

gluent.com 11

Drilldown  with  ASH  data  – one  SQL:

• Let's  pick  one  of  the  slow  CTAS  statements:

SQL> @ash/dashtop session_state,event,p1text,p1 top_level_sql_id='89apzfvjba3gu' sysdate-2 sysdate

%This SESSION EVENT P1TEXT P1------ ------- ------------------------------ ------------------ ----------

36% WAITING gc buffer busy acquire file# 120% WAITING gc buffer busy release file# 110% WAITING enq: TX - allocate ITL entry name|mode 14150533168% WAITING gc buffer busy acquire file# 14% WAITING buffer busy waits file# 14% WAITING enq: TX - allocate ITL entry name|mode 14150533162% WAITING buffer busy waits file# 12% WAITING gc buffer busy release file# 12% WAITING enq: TX - row lock contention name|mode 14150533161% WAITING latch: ges resource hash list address 3.3055E+101% ON CPU sleeptime/senderid 268501512

...

Looks  like  the  CTAS  statements  do  

directly  experience  these  waits

Page 12: Troubleshooting Complex Performance issues - Oracle SEG$ contention

gluent.com 12

Wait  event  explanation

• buffer  busy  waits• Buffer  is  physically  available  in  local  instance,  but  is  pinned  by  some  

other  local  session  (in  incompatible  mode)

• gc  buffer  busy  acquire• Someone  has  already  requested  the  remote  block  into  local  instance

• gc  buffer  busy  release• Block  is  local,  but  has  to  be  shipped  out  to  a  remote  instance  first  (as  

someone  in  the  remote  instance  had  requested  it  first)

• enq:  TX  -­‐ allocate  ITL  entry• Can't  change  block  as  all  the  block's  ITL  entries  are  held  by  others• Can't  dynamically  add  more  ITLs  (no  space  in  block)

Page 13: Troubleshooting Complex Performance issues - Oracle SEG$ contention

gluent.com 13

Looks  like  write-­‐write  contention  across  RAC  nodes!

Page 14: Troubleshooting Complex Performance issues - Oracle SEG$ contention

gluent.com 14

Contention  on  what  objects,  blocks,  what  SQL?

• Let's  dig  into  the  top  wait  event:

SQL> @ashtop session_state,wait_class,p1,p2 "event='gc buffer busy acquire'" sysdate-1/24 sysdate

%This SESSION WAIT_CLASS P1 P2 AAS TotalSeconds------ ------- -------------- ---------- ---------- ---------- ------------

8% WAITING Cluster 1 279634 1.3 4760 6% WAITING Cluster 1 279635 1.1 3916 6% WAITING Cluster 1 279629 1.1 3838 5% WAITING Cluster 1 279632 .9 3146 4% WAITING Cluster 1 279638 .7 2503

...

SQL> @sed "gc buffer busy"

EVENT_NAME WAIT_CLASS PARAMETER1 PARAMETER2 PARAMETER3 ----------------------- ----------- ----------- ----------- -----------gc buffer busy acquire Cluster file# block# class#gc buffer busy release Cluster file# block# class#

Page 15: Troubleshooting Complex Performance issues - Oracle SEG$ contention

gluent.com 15

Which  object?

• Translate  file#,  block#  into  segment  names/numbers:• Assumes  the  blocks  are  in  buffer  cache

SQL> SELECT objd data_object_id, COUNT(*) FROM v$bh WHERE file#=1 AND block# IN( 279634,279635,279629,279632,279638,279636,279613,279662,279628,279608,279653,279627,279642,279637,279643,279631,279646,279622,279582,279649

) GROUP BY objd ORDER BY 2 DESC;

DATA_OBJECT_ID COUNT(*)-------------- ----------

8 113 All  blocks  belong   to  a  segment  with  data_object_id  =  8

Page 16: Troubleshooting Complex Performance issues - Oracle SEG$ contention

gluent.com 16

What  is  segment  #8?

• Look  up  the  object  names:• Using  DBA_OBJECTS.DATA_OBJECT_ID  -­‐>  OBJECT_ID

SQL> @doid 8

object_id owner object_name O_PARTITION object_type--------- --------- ----------------- ------------- -----------

8 SYS C_FILE#_BLOCK# CLUSTER14 SYS SEG$ TABLE13 SYS UET$ TABLE

This  segment  #8  is  a  cluster  which  contains  SEG$  (DBA_SEGMENTS)  and  UET$  (DBA_EXTENTS),  but  UET$  isn't  used  anymore  thanks  to  LMT  

tablespaces.  That  leaves  SEG$

Page 17: Troubleshooting Complex Performance issues - Oracle SEG$ contention

gluent.com 17

Write  contention  on  SEG$?

• SEG$ is  modified  when:

1. A  new  segment  is  created  (table,  index,  partition,  etc)2. An  existing  segment  extends3. A  segment  is  dropped  /  moved

4. Parallel  direct  path  loads  (CTAS)  can  also  create  many  temporary  segments  (one  per  PX  slave)• …and  merge  these  into  final  segment  in  the  end  of  loading

More  about  this  later…

Page 18: Troubleshooting Complex Performance issues - Oracle SEG$ contention

gluent.com 18

More  evidence!

Page 19: Troubleshooting Complex Performance issues - Oracle SEG$ contention

gluent.com 19

AWR  1

• No  CPU  starvation  evident  (checked  all  RAC  nodes)

Top 5 Timed Foreground Events~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Event Waits Time(s) (ms) time Wait Class---------------------------- ----------- -------- ----- ------ ----------DB CPU 165,199 37.0gc buffer busy acquire 1,260,128 68,399 54 15.3 Clusterenq: TX - allocate ITL entry 354,496 40,583 114 9.1 Configuratdirect path write temp 4,632,946 37,455 8 8.4 User I/Ogc buffer busy release 213,750 22,683 106 5.1 Cluster

Host CPU (CPUs: 160 Cores: 80 Sockets: 8)~~~~~~~~ Load Average

Begin End %User %System %WIO %Idle--------- --------- --------- --------- --------- ---------

5.20 42.76 26.0 3.3 0.0 69.8

Page 20: Troubleshooting Complex Performance issues - Oracle SEG$ contention

gluent.com 20

AWR  2

• AWR  also  listed  a  SEG$  insert  as  a  top  SQL:Elapsed Elapsed Time

Time (s) Executions per Exec (s) %Total %CPU %IO SQL Id----------- ------------- ------------- ------ ----- ----- -------------

100,669.1 4,444 22.65 22.6 .5 .0 g7mt7ptq286u7insert into seg$ (file#,block#,type#,ts#,blocks,extents,minexts,maxexts,extsize,extpct,user#,iniexts,lists,groups,cachehint,hwmincr, spare1, scanhint, bitmapranges) values (:1,:2,:3,:4,:5,:6,:7,:8,:9,:10,:11,:12,:13,:14,:15,:16,DECODE(:17,0,NULL,:17),:18,:19)

SNAP_ID NODE SQL_ID EXECS AVG_ETIME AVG_LIO AVG_PIO------- ---- ------------- ------ ---------- -------- --------

24263 3 g7mt7ptq286u7 552 70.227 880.6 .0

• AWR  on  another  node  (same  insert  into  seq$):

Super-­‐slow  single  row  inserts  into  SEG$  ???

Page 21: Troubleshooting Complex Performance issues - Oracle SEG$ contention

gluent.com 21

SQL  Trace  confirms  SEG$  insert  problem:

insert into seg$ (file#,block#,type#,ts#,blocks,extents,minexts,maxexts,extsize,extpct,user#,iniexts,lists,groups,cachehint,hwmincr, spare1,scanhint, bitmapranges)

values(:1,:2,:3,:4,:5,:6,:7,:8,:9,:10,:11,:12,:13,:14,:15,:16,DECODE(:17,0,NULL,:17),:18,:19)

call count cpu elapsed disk query current rows------- ------ -------- ---------- -------- -------- --------- -------Parse 0 0.00 0.00 0 0 0 0 Execute 1 0.82 75.14 3 773 1508 1 Fetch 0 0.00 0.00 0 0 0 0 ------- ------ -------- ---------- -------- -------- --------- -------total 1 0.82 75.14 3 773 1508 1

Row Source Operation---------------------------------------------------LOAD TABLE CONVENTIONAL (cr=773 pr=3 pw=0 time=75139696 us)

Page 22: Troubleshooting Complex Performance issues - Oracle SEG$ contention

gluent.com 22

SEG$  insert  wait  events:

Elapsed times include waiting on following events:Event waited on Times Max. Wait Total Waited------------------------------------ Waited ---------- ------------gc buffer busy release 199 1.01 10.76buffer busy waits 560 1.03 7.02resmgr:cpu quantum 54 0.04 0.08enq: TX - row lock contention 93 0.35 4.25latch: ges resource hash list 400 0.01 0.98gc buffer busy acquire 464 8.96 36.64buffer deadlock 158 0.00 0.00Disk file operations I/O 5 0.00 0.00gc cr block 2-way 101 0.00 0.01gc current block 2-way 2 0.00 0.00cell single block physical read 3 0.00 0.00gc current block busy 10 0.05 0.14gc current grant 2-way 2 0.00 0.00enq: TX - allocate ITL entry 316 0.77 13.18latch: cache buffers chains 334 0.01 1.45...

Why  does  a  single   row  insert  cause  so  many waits?

Page 23: Troubleshooting Complex Performance issues - Oracle SEG$ contention

gluent.com 23

Checkpoint:  What  do  we  know?

Page 24: Troubleshooting Complex Performance issues - Oracle SEG$ contention

gluent.com 24

Checkpoint

1. Lots  of  parallel  CTAS  statements,  seem  to  wait  for  various  RAC  gc  buffer  busy  events  and  enq:  TX  – ITL  contention

2. CTAS  statements  create  new  segments,  new  segments  cause  inserts  into  SEG$

3. AWR  and  SQL  Trace  report  super-­‐long  elapsed  times  for  single-­‐row  SEG$  inserts

4. It's  actually  the  recursive  SEG$  inserts  that  wait  for  the  gc  buffer  busy and  enq:  TX  – ITL  contention  events

Page 25: Troubleshooting Complex Performance issues - Oracle SEG$ contention

gluent.com 25

"Hypothesis"

Could  we  be  in  a  cache  +  transaction  layer  nested  locking  ping-­‐pong  loop?

Page 26: Troubleshooting Complex Performance issues - Oracle SEG$ contention

gluent.com 26

Multiple  layers  of  locking  &  coherency

ITL  2ITL  1

data2data1ccLH c2c1

ITL  4ITL  3

KCBH  -­‐ block  common  headerKTBBH    -­‐ transaction  common  header

KDBH  -­‐ Data  header

Data  block

Row

Level  0:  VOS/service  layer-­‐ Cache  buffers  chains  latch

Level  1:  Cache  layerBuffer  pin-­‐ local  pin-­‐ global  cache  pin

Level  2:  Transaction  layer-­‐ ITL  entry-­‐ row  lock  byte  

You  must  pin  (and  retrieve)  a  buffer  before  you  can  

change  or  examine  it  

What  if  you  pin  a  buffer  and  find  all  its  ITL  slots  busy?  

(with  no  space  to  add  mode)

The  session  will  release  the  pinand  start  waiting  for  enq:  TX  – ITL  contention event

And  re-­‐get/pin  again  when  TX  –ITL  wait  is  over!

Page 27: Troubleshooting Complex Performance issues - Oracle SEG$ contention

gluent.com 27

Raw  trace  file  (edited):

*** 2013-11-18 21:39:09.760WAIT #1: nam='buffer busy waits' ela= 1023321 file#=1 block#=279627 class#=1 obj#=14WAIT #1: nam='gc buffer busy release' ela= 25304 file#=1 block#=279627 class#=1 obj#=14WAIT #1: nam='buffer busy waits' ela= 4315 file#=1 block#=279627 class#=1 obj#=14WAIT #1: nam='Disk file operations I/O' ela= 10 FileOperation=2 fileno=141 filetype=2 WAIT #1: nam='gc cr block 2-way' ela= 134 p1=141 p2=5456 p3=3447 obj#=0 WAIT #1: nam='enq: TX - allocate ITL entry' ela= 29295 name|mode=1415053316WAIT #1: nam='latch: ges resource hash list' ela= 958 address=33046235872 number=91WAIT #1: nam='gc buffer busy acquire' ela= 3384 file#=1 block#=279627 class#=1 obj#=14 WAIT #1: nam='buffer busy waits' ela= 817 file#=1 block#=279627 class#=1 obj#=14WAIT #1: nam='gc buffer busy acquire' ela= 344 file#=1 block#=192 class#=4 obj#=14WAIT #1: nam='gc buffer busy acquire' ela= 2096 file#=1 block#=279656 class#=1 obj#=14 WAIT #1: nam='buffer busy waits' ela= 3922 file#=1 block#=279656 class#=1 obj#=14 WAIT #1: nam='gc cr block 2-way' ela= 124 p1=141 p2=28888 p3=3895 obj#=0 WAIT #1: nam='enq: TX - allocate ITL entry' ela= 9260 name|mode=1415053316 WAIT #1: nam='latch: ges resource hash list' ela= 683 address=33043220640 WAIT #1: nam='gc buffer busy acquire' ela= 13696 file#=1 block#=279656 class#=1 obj#=14WAIT #1: nam='buffer busy waits' ela= 1312 file#=1 block#=279656 class#=1 obj#=14...*** 2013-11-18 21:39:10.933WAIT #1: nam='gc buffer busy acquire' ela= 1023271 file#=1 block#=277294 class#=1

obj#=9 WAIT #1: nam='buffer deadlock' ela= 15 dba=4471598 class*10+mode=12 flag=8192 obj#=9

What's  obj#  9?

What's  class#  4?

obj#  14  is  SEG$

Page 28: Troubleshooting Complex Performance issues - Oracle SEG$ contention

gluent.com 28

OBJ  #9  and  #14

• Look  up  the  object  names:• Using  DBA_OBJECTS.DATA_OBJECT_ID  -­‐>  OBJECT_ID

SQL> @oid 9,14

owner object_name object_type--------------- --------------------- ------------------SYS I_FILE#_BLOCK# INDEX SYS SEG$ TABLE

The  object  #9  is  a  cluster  index  for  cluster  

C_FILE#_BLOCK#  (oid #8  on  a  previous  slide

SQL> @bclass 4

CLASS UNDO_SEGMENT_ID------------------ ---------------segment header Bclass  #4  is  a  segment  

header  block  that  also  stores  freelist  information

Page 29: Troubleshooting Complex Performance issues - Oracle SEG$ contention

gluent.com 29

So  what  did happen?

Page 30: Troubleshooting Complex Performance issues - Oracle SEG$ contention

gluent.com 30

Conclusion  from  gathered  evidence

1. Lots  of  concurrent  +  highly  parallel  CTAS  statements

2. Each  PX  slave  created  its  own  temporary  segments  (in  spikes  when  PX  query  started  up)

3. Spikes  of  concurrent SEG$  inserts  simultaneously  on  4  RAC  nodes

4. SEG$  cluster  blocks  out  of  ITL  entries

5. This  caused  global  cache/TX  locking  trashing  – long  loops  of  unsuccessful  attempts  to  pin  and  insert  into  SEG$  blocks

Page 31: Troubleshooting Complex Performance issues - Oracle SEG$ contention

gluent.com 31

Also  worth  noting

1. Resource  manager  a  possible  factor• Plenty  of  resmgr:cpu quantumwaits• Lock  holders,  buffer  pin  holders  might  have  gone  to  sleep  – while  

holding  the  pins/locks  under  contention

1. Freelist-­‐based  block  space  allocation• SEG$  is  a  SYS-­‐owned  freelist-­‐managed  index  cluster• Many  cluster  table  blocks  walked  through  when  searching  for  space

2. ASH  doesn't  always  show  recursive  SQL  IDs• Attributes  waits  to  parent  (top  level)  statement  SQL  ID  instead• ASH  p1,  p2,  p3,  current_obj#  columns  are  useful  for  drilldown

Page 32: Troubleshooting Complex Performance issues - Oracle SEG$ contention

gluent.com 32

How  to  fix  it?

• Do  it  less!• What?

• Do  not  insert/update  SEG$  entries  so  frequently!• How?

• Do  not  allow  parallel  direct  load  insert  to  create  a  temporary  loading  segment  for  each  slave (and  inserted  partition)!• How?

• Make  sure  that  High  Water  Mark  Brokering  is  used!• One  temporary  segment  -­‐ thus  one  SEG$  insert/update/delete  -­‐ per  

query  (and  inserted  partition)  =  much  less  SEG$  contention.• More  in  following  slides  …

Page 33: Troubleshooting Complex Performance issues - Oracle SEG$ contention

gluent.com 33

Why  is  High-­‐Water-­‐Mark  Brokering  needed?

• A  historical  problem  with  large  uniform  extent  sizes:1. Each  data  loading  PX  slave  allocated   its  own  extents  (to  its  private  

temporary  segment)  when  loading  data2. When  the  load  was  completed  the  private  temporary  segments  were  

merged  to  one  final  table  segment3. The  last extent  of  each  private  segment  possibly  ended  up  not  fully  used

• Some  were  almost  full,  some  almost  empty  – half-­‐empty  on  average4. Wastage  =  0.5  x  extent_size x  PX_slaves

• The  more  PX  slaves,  the  more  wastage• The  bigger  extent,  the  bigger  wastage

• References:• https://blogs.oracle.com/datawarehousing/entry/parallel_load_uniform

_or_autoallocate• http://www.oracle.com/technetwork/database/bi-­‐

datawarehousing/twpdwbestpractices-­‐for-­‐loading-­‐11g-­‐404400.pdf

Problem  Solved  by  using  High-­‐Water-­‐Mark  

Brokering!

But…

Page 34: Troubleshooting Complex Performance issues - Oracle SEG$ contention

gluent.com 34

Parallel  Data  Loading  and  Large  Extents  – space  wastage?SQL> @seg sales_parallel_ctasSEG_MB OWNER SEGMENT_NAME SEG_TABLESPACE_NAME

------- ------- ------------------------- --------------------9472 TANEL SALES_PARALLEL_CTAS TANEL_DEMO_LARGE 8896 TANEL SALES_PARALLEL_CTAS_FIX TANEL_DEMO_LARGE

SQL>SELECT * FROM TABLE(space_tools.get_space_usage('TANEL', 'SALES_PARALLEL_CTAS','TABLE'));---------------------------------------------------------------------------------------Full blocks /MB 1134669 8865Unformatted blocks/MB 0 0Free Space 0-25% 0 0Free Space 25-50% 0 0Free Space 50-75% 0 0Free Space 75-100% 72963 570

SQL>SELECT * FROM TABLE(space_tools.get_space_usage('TANEL', 'SALES_PARALLEL_CTAS_FIX’,'TABLE'))----------------------------------------------------------------------------------------Full blocks /MB 1134669 8865Unformatted blocks/MB 0 0Free Space 0-25% 0 0Free Space 25-50% 0 0Free Space 50-75% 0 0Free Space 75-100% 0 0

An  identical  table  PX  CTAS  in  the  same  tablespace  with  

64MB  uniform  extent  (upper  table  6.4%  bigger)

570  MB  worth  of  blocks  unused   in  the  segment?PX  DOP  16  x  64  *  ~0.5  =  

512MB

Same  table  loaded  with  INSERT  /*+  APPEND  */  PX  DOP  16.  No  empty  blocks

Page 35: Troubleshooting Complex Performance issues - Oracle SEG$ contention

gluent.com 35

Parallel  Data  Loading  and  Large  Extents  – HWM  brokering

• Instead  of  each  PX  slave  working  on  their  own  separate  segment…1. Allocate  and  extend  only  one  segment2. Slaves  allocate  space  within  the  single  segment  as  needed3. No  “holes”  in  tables,  no  space  wastage  problem  anymore!4. Serialized  via  the  HV  -­‐ Direct  Loader  High  Water  Mark enqueue

• Parameter:• _insert_enable_hwm_brokered =  true  (default)• “during  parallel  inserts  high  water  marks  are  brokered”

• Problem:• In  Oracle  11.2  (including  11.2.0.4)  this  only  works  with  INSERT  APPENDs• …and  not  for  CTAS  nor  ALTER  TABLE  MOVE• (in  one  case  it  worked  for  CTAS,  only  if  the  target  table  was  partitioned)  

Page 36: Troubleshooting Complex Performance issues - Oracle SEG$ contention

gluent.com 36

Parallel  Data  Loading  and  Large  Extents  – HWM  brokering

SQL> @fix 6941515

BUGNO VALUE DESCRIPTION IS_DEFAULT------- ------ ------------------------------------------------------------ ----------6941515 0 use high watermark brokering for insert into single segment 1

SQL> alter session set "_fix_control"='6941515:ON';

Session altered.

• Actually  this  is  a  bug,  and  Oracle  has  fixed  it  long  time  ago• But  bugfix 6941515  is  not  enabled  by  default!  

• After  enabling  the  bug-­‐fix,  parallel  CTAS  and  parallel  ALTER  TABLE  MOVE  also  use  HWM  brokering…• Unfortunately  it  didn’t  work  with  ALTER  TABLE  MOVE  if  moving  the  

whole  table  if  the  table  was  partitioned.DBMS_SPACE  /  

space_tools helps  to  measure  this!

Thanks  to  Alex  Fatkulin for  spotting  

this!

Page 37: Troubleshooting Complex Performance issues - Oracle SEG$ contention

gluent.com 37

Additional  slides(we  are  probably  out  of  time)

Page 38: Troubleshooting Complex Performance issues - Oracle SEG$ contention

gluent.com 38

Case  Study:  Data  Loading  Performance  – Example  1

• Parallel  Create  Table  As  Select  from  Hadoop  to  Exadata• Buffer  busy  waits  dominating  the  response  time  profile  (SQL  Monitoring)

http://blog.tanelpoder.com/2013/11/06/diagnosing-­‐buffer-­‐busy-­‐waits-­‐with-­‐the-­‐ash_wait_chains-­‐sql-­‐script-­‐v0-­‐2/

Page 39: Troubleshooting Complex Performance issues - Oracle SEG$ contention

gluent.com 39

Data  Loading  Performance:  ashtop.sql

SQL> @ash/ashtop session_state,event,p2text,p2,p3text,p3 sql_id='3rtbs9vqukc71' "timestamp'2013-10-05 01:00:00'" "timestamp'2013-10-05 03:00:00'"

%This SESSION EVENT P2TEXT P2 P3TEXT P3 ------ ------- --------------------------------- --------- -------- -------- -------

57% WAITING buffer busy waits block# 2 class# 1331% ON CPU file# 0 size 524288 1% WAITING external table read file# 0 size 524288 1% ON CPU block# 2 class# 13 0% ON CPU consumer 12573 0 0% WAITING cell smart file creation 0 0 0% WAITING DFS lock handle id1 3 id2 2 0% ON CPU file# 41 size 41 0% WAITING cell single block physical read diskhash# 4695794 bytes 8192 0% WAITING control file parallel write block# 1 requests 2 0% WAITING control file parallel write block# 41 requests 2 0% WAITING change tracking file synchronous blocks 1 0 0% WAITING control file parallel write block# 42 requests 2 0% WAITING db file single write block# 1 blocks 1 0% ON CPU 0 0

• Break  the  (buffer  busy)  wait  events  down  by  block#/class#  

block  #2  ?

Page 40: Troubleshooting Complex Performance issues - Oracle SEG$ contention

gluent.com 40

Case  Study:  Data  Loading  Performance  – Example  2

• Lots  of  serial  sessions  doing  single  row  inserts  – on  multiple  RAC  nodes• “buffer  busy  waits”  and  “gc  buffer  busy  acquire”  waits• file#  =  6• block#  =  2• class#  =  13

SQL> @bclass 13

CLASS UNDO_SEGMENT_ID------------------ ---------------file header block

SQL> select file#, block#, status from v$bh where class# = 13;

FILE# BLOCK# STATUS---------- ---------- ----------

5 2 xcur4 2 xcur

...

Block dump from disk:buffer tsn: 7 rdba: 0x00000002 (1024/2)scn: 0x0000.010b6f9b seq: 0x02 flg: 0x04 tail:

0x6f9b1d02frmt: 0x02 chkval: 0xd587 type: 0x1d=KTFB

Bitmapped File Space HeaderHex dump of block: st=0, typ_found=1

A  single  space  allocation  contention  point  per  LMT  file.

Bigfile  tablespaces  have  only  one  file!

Page 41: Troubleshooting Complex Performance issues - Oracle SEG$ contention

gluent.com 41

Parallel  Data  Loading  Performance:  ash_wait_chains.sql

SQL> @ash/ash_wait_chains event2 sql_id='3rtbs9vqukc71' "timestamp'2013-10-05 01:00:00'" "timestamp'2013-10-05 03:00:00'”

%This SECONDS WAIT_CHAIN------ ---------- ----------------------------------------------------------------------------35% 43735 -> ON CPU15% 18531 -> buffer busy waits [file header block] -> ON CPU7% 9266 -> buffer busy waits [file header block] -> control file parallel write7% 8261 -> buffer busy waits [file header block] -> cell smart file creation6% 6959 -> direct path write5% 6707 -> buffer busy waits [file header block] -> DFS lock handle4% 4658 -> buffer busy waits [file header block] -> local write wait4% 4610 -> buffer busy waits [file header block] -> cell single block physical read3% 4282 -> buffer busy waits [file header block] -> local write wait -> db file par2% 2801 -> buffer busy waits [file header block]2% 2676 -> buffer busy waits [file header block] -> ASM file metadata operation2% 2092 -> buffer busy waits [file header block] -> change tracking file synchronous2% 2050 -> buffer busy waits [file header block] -> control file sequential read

The  waiting  session(s) The  blocking  session(s)Block  type  if  buffer  busy  wait

• Display  top  wait  chains  from  ASH  (who’s  waiting  for  whom)

Page 42: Troubleshooting Complex Performance issues - Oracle SEG$ contention

gluent.com 42

Case  Study:  Parallel  Data  Loading  Performance

• Reduce  demand  on  the  LMT  bitmap  blocks• By  allocating  bigger  extents  at  a  time

• Use  large  uniform  extents  for  fast-­‐growing  tables(paces)  • The  customer  went  with  64MB  uniform  extent  size

• The  autoallocate extent  management  is  suboptimal  for  very  large  segments• As  there’d  be  1  LMT  space  management  bit  per  64kB  regardless  of  

your  INITIAL  extent  size  at  the  segment  level• The  _partition_large_extents=  TRUE  doesn’t  change  this  either

• Large  uniform  extents  are  better  for  data  loading  andscanning!  

Page 43: Troubleshooting Complex Performance issues - Oracle SEG$ contention

gluent.com 43

Thanks!!!

http://blog.tanelpoder.com/seminar

[email protected]@tanelpoder