bulgarian oracle user group conference plovdiv (bulgaria ... · bulgarian oracle user group...

44
Bulgarian Oracle User Group conference Plovdiv (Bulgaria), June 2015 The importance of applying an appropriate data partitioning (Real case studies from the ATLAS experiment DB applications) Gancho Dimitrov (CERN) On behalf of the ATLAS collaboration

Upload: lynhu

Post on 13-Apr-2018

227 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Bulgarian Oracle User Group conference Plovdiv (Bulgaria ... · Bulgarian Oracle User Group conference Plovdiv (Bulgaria), June 2015 ... When Oracle 11g creates a histogram

Bulgarian Oracle User Group conference Plovdiv (Bulgaria), June 2015

The importance of applying an appropriate data partitioning

(Real case studies from the ATLAS experiment DB applications)

Gancho Dimitrov (CERN)

On behalf of the ATLAS collaboration

Page 2: Bulgarian Oracle User Group conference Plovdiv (Bulgaria ... · Bulgarian Oracle User Group conference Plovdiv (Bulgaria), June 2015 ... When Oracle 11g creates a histogram

2

About me

§  Working with Oracle RDBMS since 2000.

§  Certified Oracle DBA §  Oracle application DBA in the ATLAS experiment at CERN since 2006.

§  With a liaison role between the CERN/IT database services and support group and the ATLAS database developers community.

§  Main focus is on database schemes design, data management and performance tuning.

Page 3: Bulgarian Oracle User Group conference Plovdiv (Bulgaria ... · Bulgarian Oracle User Group conference Plovdiv (Bulgaria), June 2015 ... When Oracle 11g creates a histogram

3

Outline

§  ATLAS databases – main roles

§  Specific applications, requirements and technical solutions Used techniques :

1.  Partitioning types : a.  Range (plus “automatic interval” option) b.  Reference c.  List d.  Composite

2.  Data sliding windows

§  Challenges 1.  Execution plans stability 2.  Adequate statistics gathering

§  Conclusions

Page 4: Bulgarian Oracle User Group conference Plovdiv (Bulgaria ... · Bulgarian Oracle User Group conference Plovdiv (Bulgaria), June 2015 ... When Oracle 11g creates a histogram

CERN and the LHC experiments

4

CERN - European Organization for Nuclear Research founded in 1954. Situated on the French-Swiss border near Geneva, "21 Member States, 7 Observer States + UNESCO and UE."60 Non-member States collaborate with CERN."2400 staff members work at CERN as personnel, 10 000 more researchers from institutes world-wide."

Page 5: Bulgarian Oracle User Group conference Plovdiv (Bulgaria ... · Bulgarian Oracle User Group conference Plovdiv (Bulgaria), June 2015 ... When Oracle 11g creates a histogram

5

LHC @ CERN

§  LHC (Large Hadron Collider) is the World’s Largest Particle Accelerator.

§  27km ring of superconducting magnets.

§  Restarted in April 2015 after undergone upgrades.

Page 6: Bulgarian Oracle User Group conference Plovdiv (Bulgaria ... · Bulgarian Oracle User Group conference Plovdiv (Bulgaria), June 2015 ... When Oracle 11g creates a histogram

ATLAS detector

6

ATLAS is a particle physics experiment at the Large Hadron Collider at CERN that is searching for new discoveries in the head-on collisions of protons of extraordinarily high energy."Over 3000 scientists from 38 countries are taking part in the ATLAS experiment alone."

Page 7: Bulgarian Oracle User Group conference Plovdiv (Bulgaria ... · Bulgarian Oracle User Group conference Plovdiv (Bulgaria), June 2015 ... When Oracle 11g creates a histogram

ATLAS production database clusters

7

Database / info

DB 1 DB 2 DB 3 DB 4

Main database role

Online data taking

Post data taking

analysis

Grid jobs and file

management

Events metadata

Oracle version 11.2.0.4 11.2.0.4 11.2.0.4 12.1.0.1

# DB nodes 2 3 4 2 DB volume 10 TB 23.3 TB 24.5 TB 18 TB

# DB schemes

73 164 16 55

HW specs CPU Intel E5-2650@ 2GHz - 16 cores per node RAM 128 or 256 GB (depends on the DB role)

10 GigE for storage and cluster access NetApp NAS storage with 1.5 TB SSD cache

Page 8: Bulgarian Oracle User Group conference Plovdiv (Bulgaria ... · Bulgarian Oracle User Group conference Plovdiv (Bulgaria), June 2015 ... When Oracle 11g creates a histogram

8

ATLAS applications

List of ATLAS applications that will be referred often in the slides

§  PanDA (Production and Distributed Analysis system) is data-driven workload management system for distributed analysis processing. PanDA is processing up to 2 million jobs per day with 100k to 300k jobs active at any given time.

§  NICOS is the ATLAS Nightly Build System. In the Oracle DB backend are stored results from ATLAS software tests, package tags, nightly jobs timestamps, status information.

§  Rucio is the ATLAS Distributed Data Management (DDM) system in service of the ATLAS collaboration to manage large volumes of data (currently ~ 140 PBs in 700 million files), both taken by the ATLAS detector as well as generated in the ATLAS distributed computing system. Rucio manages accounts, files, datasets and distributed storage systems.

Page 9: Bulgarian Oracle User Group conference Plovdiv (Bulgaria ... · Bulgarian Oracle User Group conference Plovdiv (Bulgaria), June 2015 ... When Oracle 11g creates a histogram

Used DB partitioning options in ATLAS

§  Oracle provides many types of partitioning:

Range List Hash Interval Reference Composite

§  In ATLAS we use all of them except the Hash option.

§  We use the partitioning mostly because of operations flexibility but also because the appropriate data segmentation provides some key advantages to the ATLAS DB applications in terms of performance.

9

Page 10: Bulgarian Oracle User Group conference Plovdiv (Bulgaria ... · Bulgarian Oracle User Group conference Plovdiv (Bulgaria), June 2015 ... When Oracle 11g creates a histogram

Case study 1

PanDA

Partitioning features in place: Range partitioning type ( in certain cases with automatic interval option) Data sliding windows policies – in house written PLSQL procedures and partition exchange operations.

10

Page 11: Bulgarian Oracle User Group conference Plovdiv (Bulgaria ... · Bulgarian Oracle User Group conference Plovdiv (Bulgaria), June 2015 ... When Oracle 11g creates a histogram

11

PanDA system

§  The PanDA system relies on the Oracle RDBMS since 2008. Proper schema design, data management and index strategy are significant factors due to the importance of the DB layer for the ATLAS workload management system.

§  DB system has to deal efficiently with two different workloads: transactional (PanDA server) and data warehouse load (PanDA monitor).

§  Due to the different workloads the PanDA data are logically split into ‘operational’ (400 GB) and ‘archive’ (8 TB) types and respectively stored into separate DB schemes.

Page 12: Bulgarian Oracle User Group conference Plovdiv (Bulgaria ... · Bulgarian Oracle User Group conference Plovdiv (Bulgaria), June 2015 ... When Oracle 11g creates a histogram

12

Why split PanDA data?

§  PanDA server interacts with small manageable ‘operational’ data and slice of the very recent history (~ 400GB). The recent history is defined to be 3 days but can be easily changed to any other period that could be considered appropriate. "

§  Different indices are defined on the PanDA ‘operational’ data than on the ‘archive’ (8 TB) as the workload is different."

§  The ‘archive’ partitions are with a compression setting and for good reasons are set to reside in yearly based tablespaces."

This makes easy and with minimum resources the partition exchanges to dedicated tables on the production server and further move to different DB (with the TTS method) for long-term data preservation. "

Page 13: Bulgarian Oracle User Group conference Plovdiv (Bulgaria ... · Bulgarian Oracle User Group conference Plovdiv (Bulgaria), June 2015 ... When Oracle 11g creates a histogram

13

The PanDA ‘operational’ and ‘archive’ data

§  All information relevant to a single job is stored in four fact tables. "

§  The ‘operational’ data is kept in a separate schema which hosts all active jobs plus finished jobs of the most recent 3 days. "

§  Jobs that get status ‘finished’ or ‘failed’ and are older than 3 days are moved to an archive PANDA schema ( PanDA => PanDAARCH )."

§  In the first implementation the data move to the archive schema was done by the PanDA server by transactional rows insert and delete statements. !"However the negative impacts from that approach were: ""- increased transactional workload on the database !"- highly fragmented PanDA tables due to the row deletion from the ‘operational’ tables.""- the hot operational data and indexes could not fit into the buffer pool "

""

Page 14: Bulgarian Oracle User Group conference Plovdiv (Bulgaria ... · Bulgarian Oracle User Group conference Plovdiv (Bulgaria), June 2015 ... When Oracle 11g creates a histogram

Better PanDA data segments organization

Tables are range partitioned on a column of type “Date” Each partition covers a time range of a day.

Tables are range partitioned on a column of type “Date” Some tables are with 3 days interval partitions others with monthly partitions

A scheduler job daily inserts the data of the oldest complete day

Filled partitions

Empty partitions relevant to the future. A job is scheduled to run every Monday for creating seven new partitions

Partitions that can be dropped. An Oracle scheduler job is dealing with that daily.

Panda server DB sessions

Time line

14

Partitions with basic compression"

‘Operational’ data ‘Archive’ data

Page 15: Bulgarian Oracle User Group conference Plovdiv (Bulgaria ... · Bulgarian Oracle User Group conference Plovdiv (Bulgaria), June 2015 ... When Oracle 11g creates a histogram

15

PanDA => PanDA archive data flow machinery

§  The PanDA => PanDAARCH data flow is sustained by a set of scheduler jobs on the Oracle server which execute a logic encoded in PL/SQL procedures"

""

Daily job verifies that all rows of certain partition have been copied to PanDAARCH and drops the PanDA partition if the above is true.

Daily job copies data of a set of partitions from PanDA to PanDAARCH

Weekly job creates daily partitions relevant to the near future days

Page 16: Bulgarian Oracle User Group conference Plovdiv (Bulgaria ... · Bulgarian Oracle User Group conference Plovdiv (Bulgaria), June 2015 ... When Oracle 11g creates a histogram

16

Benefits from the data removal based on partitions

"§  Scalability: the PanDA bulk data copy and deletion is done on table

partition level instead on row level.!"§  Not IO demanding: Removing the already copied data is not IO

demanding as this is a simple DDL operation over a table segment and its relevant index segments (alter table … drop partition)."

§  Avoids fragmentation in the PanDA ‘operational’ table. Much better space utilization and caching in the buffer pool."

§  No need for indices rebuild or coalesce operations for these partitioned tables. "

"

Page 17: Bulgarian Oracle User Group conference Plovdiv (Bulgaria ... · Bulgarian Oracle User Group conference Plovdiv (Bulgaria), June 2015 ... When Oracle 11g creates a histogram

17

Applied policies on the ‘archive’ data

§  Table partitions by design are set to reside into dedicated yearly based Oracle tablespaces.!

§  Activated basic compression !"With this option on, we can fit 50% to 60% more rows within a data "block of 8KB into the ‘jobs’ tables."

"§  Sustained data sliding window of 12 months performing partition

exchange operations from the primary tables to new yearly based tables."

"It does not require IO resources (and resp. CPU) as it is a simple Oracle "data dictionary change."

§  The new yearly based tables can be easily moved to another database cluster as they are self-contained in dedicated tablespaces. "

""

"

Page 18: Bulgarian Oracle User Group conference Plovdiv (Bulgaria ... · Bulgarian Oracle User Group conference Plovdiv (Bulgaria), June 2015 ... When Oracle 11g creates a histogram

18

Archive partitions management

§  The PanDA archive data segmentation makes very simple the data archiving by performing partition exchanges from the current tables to new dedicated yearly based tables: results in fast operations (avoids physical row movement). !""

Source table partition exchange with an auxiliary table :"ALTER TABLE FILESTABLE_ARCH EXCHANGE PARTITION FILESTABLE_ARCH_JAN_2009 WITH TABLE AUX_FILESTABLE_ARCH INCLUDING INDEXES WITHOUT VALIDATION;!"Destination table partition exchange with the auxiliary table :"ALTER TABLE Y2009_FILESTABLE_ARCH EXCHANGE PARTITION FILESTABLE_ARCH_JAN_2009 WITH TABLE AUX_FILESTABLE_ARCH INCLUDING INDEXES WITHOUT VALIDATION;!

""

Page 19: Bulgarian Oracle User Group conference Plovdiv (Bulgaria ... · Bulgarian Oracle User Group conference Plovdiv (Bulgaria), June 2015 ... When Oracle 11g creates a histogram

19

Snapshot of the partitions states

Before the partition exchange of the FILESTABLE

After the partitions exchange of the FILESTABLE

Page 20: Bulgarian Oracle User Group conference Plovdiv (Bulgaria ... · Bulgarian Oracle User Group conference Plovdiv (Bulgaria), June 2015 ... When Oracle 11g creates a histogram

20

Statistics gathering policies

§  Defined different policies for the stats gathering on the ‘operational’ and ‘archive’ data.""- incremental stats gathering on the ‘operational’ data – slide 21""- histograms are not collected for all columns in the ‘archive’ tables "and for well selected list on the ‘operational’ data – slide 22"

"§  Statistics gathering needs less resources and time on the PanDA

data due to the fact that a sliding window is sustained."

§  Daily advancement by 24 hours of the max value of the ‘modificationtime’ columns on global table level in the ‘archive’ tables via defined scheduler job – slide 23"

( using the DBMS_STATS. SET_COLUMN_STATS method ) """

Page 21: Bulgarian Oracle User Group conference Plovdiv (Bulgaria ... · Bulgarian Oracle User Group conference Plovdiv (Bulgaria), June 2015 ... When Oracle 11g creates a histogram

21

Statistics gathering settings: incremental option

We take advantage from statistics collection on partitioned tables called incremental statistics gathering. "

""Oracle spends time and resources on collecting statistics only on partitions which are transactional active and computes the global table statistics using the previously ones in an incremental way. "

""exec DBMS_STATS.SET_TABLE_PREFS (‘owner', ’table_name', 'INCREMENTAL', 'TRUE'); "

Page 22: Bulgarian Oracle User Group conference Plovdiv (Bulgaria ... · Bulgarian Oracle User Group conference Plovdiv (Bulgaria), June 2015 ... When Oracle 11g creates a histogram

22

Statistics gathering settings – avoid histograms

Histograms could provide misleading information for columns hosting long strings with common data pattern. When Oracle 11g creates a histogram on a VARCHAR2 column it only considers the first 32 characters in the column and thus wrong cardinality could be considered (especially for wildcard searches)"

e.g. forbid histogram on the DATASET_NAME column of a particular table. exec DBMS_STATS.SET_TABLE_PREFS(’table_owner', ‘table_name','METHOD_OPT', 'FOR ALL COLUMNS SIZE AUTO FOR COLUMNS SIZE 1 DATASET_NAME');

Common string pattern in the dataset names

Page 23: Bulgarian Oracle User Group conference Plovdiv (Bulgaria ... · Bulgarian Oracle User Group conference Plovdiv (Bulgaria), June 2015 ... When Oracle 11g creates a histogram

23

Explicit ‘maxval’ stats change on global table level for a column of DATE type

"For queries based on time range that refer to the recent past, it is of great importance the ‘maxval’ of the DATE column(s) to reflect the reality so that the CBO computes the cardinality more accurately. !

"Example from a logged output of a dedicated scheduler job "executed on 3rd May 2015 at 8 a.m. in the PanDA archive schema""Before the scheduler job execution : "⇒  lowest_value: 01-JAN-2014 00:00:00 "⇒  the highest_value: 03-MAY-2015 08:00:00!

After the scheduler job execution: "⇒  lowest_value: 01-JAN-2014 00:00:00 "⇒  the new highest_value: 04-MAY-2015 08:00:00""

To reflect the reality the DATE value is advanced by 24 hours

Page 24: Bulgarian Oracle User Group conference Plovdiv (Bulgaria ... · Bulgarian Oracle User Group conference Plovdiv (Bulgaria), June 2015 ... When Oracle 11g creates a histogram

Case study 2

NICOS

Partitioning feature in place: Reference partitioning

24

Page 25: Bulgarian Oracle User Group conference Plovdiv (Bulgaria ... · Bulgarian Oracle User Group conference Plovdiv (Bulgaria), June 2015 ... When Oracle 11g creates a histogram

25

NICOS

§  In the ATLAS Nightly build system (NICOS) database schema are stored results from ATLAS software tests, package tags, nightly job timestamps, status information.

§  The data volume is not huge (few 10s GB/year), but the application and the availability of the data in the database are considered critical.

§  The retention of the fact data is agreed to be 12 months. Having the above requirements was chosen the reference partitioning technique for the child tables that host the nightly job attribues. The parent table is configured to use range partitioning

Page 26: Bulgarian Oracle User Group conference Plovdiv (Bulgaria ... · Bulgarian Oracle User Group conference Plovdiv (Bulgaria), June 2015 ... When Oracle 11g creates a histogram

26

Reference partitioning – the parent table

§  A parent table JOBS is created with range partitioning based on a column of type NUMBER ( the table’s PK, equi-partitioned)

These number values are with a specific format and encoded logic : YYYYMMDDNNNNIII

Year Month Day Nightly branch ID Job ID

Each partition covers a time range of a calendar month

Page 27: Bulgarian Oracle User Group conference Plovdiv (Bulgaria ... · Bulgarian Oracle User Group conference Plovdiv (Bulgaria), June 2015 ... When Oracle 11g creates a histogram

27

Reference partitioning – children tables

§  Any child to the JOBS parent table is configured with reference partitioning achieving uniform data segmentation. All indices are locally partitioned as well

Page 28: Bulgarian Oracle User Group conference Plovdiv (Bulgaria ... · Bulgarian Oracle User Group conference Plovdiv (Bulgaria), June 2015 ... When Oracle 11g creates a histogram

Reference partitions organization in NICOS

28

Table with automatic interval partitioning

Parent table named “JOBS” with its six child tables: all partitioned in uniform way

Partition data segment is created only if the partition is used

Page 29: Bulgarian Oracle User Group conference Plovdiv (Bulgaria ... · Bulgarian Oracle User Group conference Plovdiv (Bulgaria), June 2015 ... When Oracle 11g creates a histogram

Query with partition-wise join taking place

29

Page 30: Bulgarian Oracle User Group conference Plovdiv (Bulgaria ... · Bulgarian Oracle User Group conference Plovdiv (Bulgaria), June 2015 ... When Oracle 11g creates a histogram

30

Reference partitioning : advantages

§  Tables are partitioned in uniform way.

§  On event of partition creation in the parent table Oracle automatically creates relevant partitions into the child tables.

§  Provides flexibility for maintaining any data sliding window. Dropping the parent partition triggers automatic removal of the relevant partitions in the child tables.

§  Partition pruning and partition-wise joins take place when retrieving

data.

Page 31: Bulgarian Oracle User Group conference Plovdiv (Bulgaria ... · Bulgarian Oracle User Group conference Plovdiv (Bulgaria), June 2015 ... When Oracle 11g creates a histogram

Case study 3

Rucio system!

Partitioning features in place: List partitioning type on heap and IOT tables Composite list partitioning with list sub-partitioning Range partitioning with automatic interval option (plus data sliding windows policies)

31

Page 32: Bulgarian Oracle User Group conference Plovdiv (Bulgaria ... · Bulgarian Oracle User Group conference Plovdiv (Bulgaria), June 2015 ... When Oracle 11g creates a histogram

32

Rucio system

"The Rucio project is a new generation Distributed Data Management system for allowing the ATLAS collaboration to manage large volumes of proton-proton collisions, processed or simulated data (currently >150 PB)."

"§  Data are physically stored in files (files can be grouped in datasets and

datasets in containers) on the Grid. "

§  In Rucio each file (or copy of it) is assigned to an user, group or ATLAS production activity. !

§  Rucio logical units (scopes) with their metadata for managing ATLAS data on the Grid turned to fit naturally to the Oracle’s data segmentation via the List partitioning feature. "

""" """"

Page 33: Bulgarian Oracle User Group conference Plovdiv (Bulgaria ... · Bulgarian Oracle User Group conference Plovdiv (Bulgaria), June 2015 ... When Oracle 11g creates a histogram

33

Composite paritioning in place

§  Each list partition hosts data of a single Rucio scope, further is sub-partitioned into three pieces ‘F’- files, ‘D’ - datasets, ‘C’ – containers.!

Page 34: Bulgarian Oracle User Group conference Plovdiv (Bulgaria ... · Bulgarian Oracle User Group conference Plovdiv (Bulgaria), June 2015 ... When Oracle 11g creates a histogram

34

Composite paritioning in place (2)

Statistics of a table partition and its three sub-partitions of a particular ATLAS production activity (data12_8TeV) "

List-list composite partitioning proved to be beneficial for highly needed application feature: free pattern searches on dataset names.!

Containers Datasets

Files

Sub-partitions of DATA12_8TEV partition

Page 35: Bulgarian Oracle User Group conference Plovdiv (Bulgaria ... · Bulgarian Oracle User Group conference Plovdiv (Bulgaria), June 2015 ... When Oracle 11g creates a histogram

35

Studies on free pattern search access paths

Many orders orders of magnitude (~ 50) more data block reads (313K compared to 6K buffers) when using the index range scan compared to full sub-partition scan (block = 8KB)"

Page 36: Bulgarian Oracle User Group conference Plovdiv (Bulgaria ... · Bulgarian Oracle User Group conference Plovdiv (Bulgaria), June 2015 ... When Oracle 11g creates a histogram

36

Studies on free pattern search access paths (2)

§  How Oracle compute the cardinality of free pattern search query SELECT <list of columns> FROM dids WHERE scope = :scope_1 AND did_type = :did_type_1 AND name LIKE :name_1 ESCAPE '\’;

Page 37: Bulgarian Oracle User Group conference Plovdiv (Bulgaria ... · Bulgarian Oracle User Group conference Plovdiv (Bulgaria), June 2015 ... When Oracle 11g creates a histogram

37

Studies on free pattern search access paths (3)

§  How to guarantee data access path we want to take place?

SELECT <list of columns> FROM dids WHERE scope = :scope_1 AND did_type = :did_type_1 AND name LIKE :name_1 ESCAPE '\’;

Option 1: -------------- Create a view object with encoded hints for ensuring full sub-partition scan

CREATE OR REPLACE view SEARCH_DATASETS_VIEW AS SELECT /*+ NO_INDEX(tab(SCOPE,NAME)) FULL(tab) */ <list of columns> FROM DIDS tab WHERE did_type = 'D’ WITH READ ONLY ;

Option 2: --------------- Specify the hints directly into the query of the DB application code

/*+ NO_INDEX(tab(SCOPE,NAME)) FULL(tab) */

Page 38: Bulgarian Oracle User Group conference Plovdiv (Bulgaria ... · Bulgarian Oracle User Group conference Plovdiv (Bulgaria), June 2015 ... When Oracle 11g creates a histogram

38

List partitioning: an operational challenge

§  List partitioning is appropriate, but is an operational challenge as partitions must be pre-created for each new partition key value. This is a burden as new Rucio accounts can be registered any time in the lifetime of the Rucio system.

§  In house created solution:

Automatic partition creation in all relevant tables whenever a new partition key value is inserted into a dimensional table.

“After insert” row level trigger gets fired and executes a PLSQL procedure responsible for ALTER TABLE … ADD PARTITION Logic for handling ORA-00054 “resource_busy” error is in place

Each partition creation action is logged into a dedicated logging table

Page 39: Bulgarian Oracle User Group conference Plovdiv (Bulgaria ... · Bulgarian Oracle User Group conference Plovdiv (Bulgaria), June 2015 ... When Oracle 11g creates a histogram

39

Some numbers about the current list partitions

§  The number of partitions in the Rucio DB schema depends on the number of registered accounts.

§  Currently there are a bit more than 5000 accounts.

As there are defined 5 tables with list partitioning (four are heap organized one of which with sub-partitions and one is Index-organized) => total number of list partitions is 26385 => total number of list sub-partitions is 15831 sub-partitions

§  The number of new accounts (respectively partitions) is not expected to grow significantly within the next couple of years.

§  Depends on the role of each index and the workload it is supposed to serve the indices are set either locally partitioned or global.

§  In several cases function-based indices are in place so that small fraction of the rows are indexed.

Page 40: Bulgarian Oracle User Group conference Plovdiv (Bulgaria ... · Bulgarian Oracle User Group conference Plovdiv (Bulgaria), June 2015 ... When Oracle 11g creates a histogram

40

Data distribution among several tablespaces

§  Currently the occupied space is about 1.4 TB (3.2 billion rows) and is expected to be within 2 to 3 TB by having in proper data life-cycle management.

§  Tables and their indices reside in 4 different tablespaces depending on the nature of the data.

- attributes - fact - transient - historical

§  Due to the dynamic placement of files and datasets on the Grid, large

fraction of the the data is transient and has to be handled properly."

When needed SHRINK SPACE CASCADE operations are performed on partitions with high deletion rate (info from the DBA_TAB_MODIFICATIONS) ""

Page 41: Bulgarian Oracle User Group conference Plovdiv (Bulgaria ... · Bulgarian Oracle User Group conference Plovdiv (Bulgaria), June 2015 ... When Oracle 11g creates a histogram

41

Challenge with statistics gathering

§  The large number of list partitions in the Rucio DB schema sets a challenge to the default Oracle stats gathering job. The elapsed time of the stats gathering job when having INCREMENTAL option ON is significant (> 30 hours). Many hours spent on the table with the composite List-List partitioning.

§  To achieve acceptable elapsed time for getting complete refreshed

statistics these settings are in place:

-  Histograms are forbidden for the largest partitioned tables. -  INCREMENTAL option is not set for the List (sub-)partitioned tables, but

only for the range partitioned tables based on a timestamp column. -  Stats are explicitly locked on schema level. -  New stats are gathered by an Oracle scheduler job weekly with “force=true”

option. Under consideration is to increase this interval to N months.

Page 42: Bulgarian Oracle User Group conference Plovdiv (Bulgaria ... · Bulgarian Oracle User Group conference Plovdiv (Bulgaria), June 2015 ... When Oracle 11g creates a histogram

42

Challenge with execution plans stability

§  The large number of list partitions with various sizes causes SQL execution plan instability on certain queries. When Oracle gets a request to read from a tiny partition, natural choice is full partition scan (if IOT, partition index fast full scan).

Leads to significant resource consumption and increased query elapsed time when plan is reused for large partition(s): the most negative impact was on a query which search is based on the PK columns of list partitioned IOT table (> 5000 partitions)

§  An implemented workaround for set of queries: use of hints INDEX_RS_ASC, NO_INDEX_FFS

§  Whether the removal of partition level stats would bring execution plan stability is currently under study.

Page 43: Bulgarian Oracle User Group conference Plovdiv (Bulgaria ... · Bulgarian Oracle User Group conference Plovdiv (Bulgaria), June 2015 ... When Oracle 11g creates a histogram

43

Key advantages by having list partitioning

§  Data of each Rucio account is stored in separate segment and maintenance operations does not impact the other partitions.

§  Provides advantage on straightforward MOVE, COALESE, DROP or REBUILD operations on partition level.

§  To serve the users free pattern searches, Oracle reads only the relevant (sub-)partition data chunk.

Page 44: Bulgarian Oracle User Group conference Plovdiv (Bulgaria ... · Bulgarian Oracle User Group conference Plovdiv (Bulgaria), June 2015 ... When Oracle 11g creates a histogram

44

"

"Thank you for the interest!!

!!

Questions?!!!"""