oracle 11g partitioning for developers - database...
TRANSCRIPT
Lewis Cunningham
Senior Solutions Architect
EnterpriseDB
Oracle 11g Partitioning
for Developers
Agenda
• Audience
• What is Partitioning?
• When Should I use Partitioning?
• How do I use Partitioning?
• Summary
Audience
• The changing nature of database
administration and development
• Designer = Developer?
• Anyone who needs performance on large
data sets
• Performance vs Maintenance and
Availability
What is Partitioning?
• A partition physically separates data
• Tables AND indexes can be partitioned
• Partitions improve performance(*)
• Partitioning is transparent to applications
What is Partitioning?
• This presentation covers Oracle only
• Oracle offers extremely robust partitioning
• The optimizer recognizes partitioned
objects
What is Partitioning?
• Data file
– Physical OS disk file that stores data
– All data in an Oracle database ends up in a
data file
– A tablespace may be made up of 1 or more
data files
– This allows data to be spread across multiple
physical or logical disks for performance
What is Partitioning?
• Segment
– A segment is a logical container for an object
in an Oracle database
– A normal table (i.e. non partitioned) is exactly
one segment
– A regular index is exactly one a segment
– A partitioned table will be made up of as
many segments as it has partitions
– The same is true of partitioned indexes
What is Partitioning?
• Tablespace
– A tablespace is a logical container for
segments.
– A tablespace may be empty but it will most
likely hold one or more segments.
– So, a tablespace is logically made up of
segments and physically made up of data files
What is Partitioning?
• Data Blocks & Extents
– Not really pertinent to a discussion of
partitioning.
– A data block is a physical chunk of data
stored in a data file and an extent is a set of
contiguous data blocks
– A segment is made up of extents.
– A segment can span multiple data files but an
extent must be contained within a single data
file
What is Partitioning?
• A normal table (non-partitioned) is a single
segment
• A partitioned table stores one partition per
segment
• A normal index is a single segment
• A partitioned index stores one partition per
segment
• Each segment may exist in a different
tablespace
What is Partitioning?
• How does all of this help me?
– A scan, of a table or an index, scans the
entire segment
– 1 Billion rows in a table means 1 billion rows
to scan
– 1 Billion rows partitioned in 10 partitions
*MIGHT* mean only 100 million rows to be
scanned
– You can have more than 10 partitions for
even better performance (maybe)
What is Partitioning?
True or False – Partitioning is only useful if
each partition is in its own tablespace?
False - This is a myth. Putting each in its
own tablespace can help administration
and management but you can still get
serious gains without that.
What is Partitioning?
• Partitioning is an important aspect of
designing your application
• Not so long ago, 1 million records was a
lot
• 1 billion records is becoming normal
• The 10 GB database of yesterday is the
10TB database of today
• Yahoo has an active 150PB database
What is Partitioning?
• Hybrid databases, OLTP + Reporting, are
becoming more common rather than less
• Is this a good idea? Doesn't matter. We
need to deal with it
• Partitioning is useful in OLTP databases
What is Partitioning?
• How does Oracle use partitioning?
– Partition Pruning is the Key
• Partition pruning let's Oracle only scan those
partitions that actually hold the data requested
• Partition pruning is based on a Partition Key
• The partition key identifies the partition where data
will – does – reside
• Identifying the correct partition key is critical to
useful partitioning
What is Partitioning?
• Partition Keys
– Can be, but rarely is, the primary key
– A primary key uniquely identified a row of data
– A partition key identifies a category of data
– Partition keys should be important to the
majority of data requests against a table
– In addition to the partition type, a partition key
helps decide the number of partitions
What is Partitioning?
How to identify a partition key
• How does the business look at the data?
• What type of partitioning makes the most
sense?
• In retail, a sale_date might make sense
• If geography is important, a region,
country or state might be a good choice
What is Partitioning?
Partition key = sale_date
On Insert, Oracle peeks at
the sale_date value and
inserts the row into the
correct partition.
On select, the optimizer
peeks at the sale_date and
compares that to the
partition key, only partitions
where the data COULD
reside will be scanned
What is Partitioning?
• Identifying the keys means you must
identify key data
• If you choose region, you must identify
which regions will live in which partition
• If you choose sale_date, will it be the year,
month-year, which years, etc
• Your partition key and partition values are
tightly couple to partition types
What is Partitioning?
Note: Oracle 11g allows you to use Virtual
Columns as partition keys, prior versions
did not.
A virtual column is an expression rather than
a specific column in the table. For
example, you can add to dates, or
consolidate using last_day(order_date) or
even substring out a piece of a column
and that is a virtual column.
What is Partitioning?
• Partition Types in 11g
– List Partition
– Range Partition
– Hash Partition
– Reference Partition (new and very cool!)
– System Partition
– Interval Partition
– Composite Partition
What is Partitioning?
• List Partition
– Very simple to implement and understand
– If you data is list or would make a good
lookup value
– Examples: state, region, sic code, color,
product type, industry code, make, model
– You must enumerate the list
– Values must be unambiguous
What is Partitioning?
List Partition Pseudo Code
Partition by LIST (state_column)
Partition 1 stores ('AZ', 'AK', 'AL')
Partition 2 stores ('NM', 'NY', 'MD')
Partition 3 stores ('LA', 'TX', 'MS')
Partition 4 stores (NULL VALUES) Partition 5 stores (ALL OTHER VALUES)
What is Partitioning?
List Partition Real Code
PARTITION BY LIST (state_column)
(
PARTITION state1 VALUES ('AZ', 'AK', 'AL'),
PARTITION state2 VALUES ('NM', 'NY', 'MD'),
PARTITION state3 VALUES ('LA', 'TX', 'MS'),
PARTITION null_states VALUES (NULL),
PARTITION other_states VALUES (DEFAULT)
)
What is Partitioning?
• Range Partition
– Very common partitioning type
– Like a between clause
– Like a hierarchy of < (less than)
• <5
• <10
• <20 etc
– All values must be enumerated
– Values must be unambiguous
What is Partitioning?
Range Partitioning Pseudo Code
Partition by RANGE (sale_date)
Partition 2005 stores sale_date < 01-jan-2005
Partition 2006 stores sale_date < 01-jan-2006
Partition 2007 stores sale_date < 01-jan-2007
Partition 2008 stores sale_date < 01-jan-2008
Partition infinity stores sale_date < INFINITY
• Order is important
What is Partitioning?
Range Partition Real Code PARTITION BY RANGE (sale_date)
(
PARTITION y2005 VALUES LESS THAN
(TO_DATE('01/01/2006', 'DD/MM/YYYY')),
PARTITION y2006 VALUES LESS THAN
(TO_DATE('01/01/2007', 'DD/MM/YYYY')),
PARTITION y2007 VALUES LESS THAN
(TO_DATE('01/01/2008', 'DD/MM/YYYY')),
PARTITION y2008 VALUES LESS THAN
(TO_DATE('01/01/2009', 'DD/MM/YYYY')),
PARTITION yInfinite VALUES LESS THAN (MAXVALUE)
);
What is Partitioning?
• Hash Partition
– Tables with no obvious partition key
– No logical range or list available
– Sequence could be a hash partition key
• Think Customer_id, product_id, etc
• Not as important now that reference partitioning is
available
– Balances data evenly across the number of
partitions defined
– DBAs should like it (for admin reasons)
What is Partitioning?
Hash Partitioning Pseudo Code Partition by hash (seq_pk)
create 3 Partitions
Hash Partitioning Real Code Partition by hash (seq_pk)
partitions 5;
What is Partitioning?
• Reference Partition
– New in 11g
– Foreign Key based partitioning
• Active, enforced FK -> PK required
– Child table is automatically by the parent
partition key, even when the column does not
exist in the child table
– Fast, fast, fast joins
What is Partitioning?
Reference Partitioning Pseudo Code Child_Table Constraint foreign_key_name
Foreign Key relates to parent_table (key)
Partitition by reference (foreign_key_name)
Reference Partitioning Real Code CONSTRAINT order_fk
FOREIGN KEY (order_Id)
REFERENCES orders(order_id) )
PARTITION BY REFERENCE (order_fk);
What is Partitioning?
• System Partition
– Programmatic partitioning
– You have to write all of the code
– Specialized third-party tools or applications
– Not really a day to day usage of partitioning
What is Partitioning?
• Interval Partition
– Special case of Range Partition
– Let Oracle create your partitions for you
– Range partitions can be converted to Interval
• ALTER TABLE SET INTERVAL
– Fix it and Forget it!
What is Partitioning?
Interval Partitioning Pseudo Code
Partition by RANGE (sale_date)
Interval of 1 year
Start with partition 1
sale_date < 01-jan-2005
What is Partitioning?
Interval Partitioning Real Code
Partition by RANGE (sale_date)
INTERVAL (NUMTOYMINTERVAL(1,'year'))
(PARTITION p_year VALUES
LESS THAN TO_DATE('01-JAN-2005') ) );
What is Partitioning?
• Composite Partition
– A composite is just a mix of two partition types
– First partition by high level order
– Follow with inner partition
– Plenty of composite types
• Range Hash
• Range List
• Range Range
• List List
• List Hash
• List Range
What is Partitioning?
Composite (Range Hash) Partitioning
Psuedo Code Partition by Range (sale_date)
Partition 2005 stores sale_date < 01-jan-2005
Subpartition by HASH (sic_cd)
create 5 subpartitions
Partition 2006 stores sale_date < 01-jan-2006
Subpartition by HASH (sic_cd)
create 5 subpartitions
What is Partitioning?
Composite (Range Hash) Partitioning Real
Code Partition by Range (sale_date)
Subpartition by HASH (sic_cd)
subpartitions 5
(
PARTITION y2005 VALUES LESS THAN
(TO_DATE('01/01/2006', 'DD/MM/YYYY')),
PARTITION y2006 VALUES LESS THAN
(TO_DATE('01/01/2007', 'DD/MM/YYYY'))
)
What is Partitioning?
• Indexing Partitions
– Indexes can be partitioned
– Partitioned indexes don't always equate to
better performance
– Three types of indexes
• Global non-partitioned index
• Global partitioned index
• Local index
What is Partitioning?
• Global Non-Partitioned Index
– Index is a single segment
– Your PK is probably a global non-partitioned
index
• If you PK is part of your partition key, it may not be
– This is a regular index like on a non-
partitioned table
– Useful for OLTP but sacrifices management
and availability
What is Partitioning?
• Global Partitioned Index
– Index exists in multiple segments
– Can only be a hash or range
– Partition key is different than table partition
key
– Range Partitioned Index is useful in OLTP
– Hash Partitioned Index is useful for
administration (spreads data out)
What is Partitioning?
• Local Index
– Partitioned by the same partition key as the
table
– Offer excellent through put (DSS and OLAP)
– Offer excellent availability
– Can be unique if partition key is part of unique
index key
– Better for OLAP but can be useful in OLTP
What is Partitioning?
• Oracle's Guidance to choosing an Index
– If the table partitioning column is a subset of
the index keys, use a local index
– If the index is unique and does not include the
partitioning key columns, then use a global
index
– If your priority is manageability, use a local
index
– If the application is an OLTP one and users
need quick response times, use a global
index
When Should I use Partitioning?
• Use partitioning when:
– A table is > 2GB (from Oracle)
– A Table is > 100 Million rows (from me)
• Think about it if table is > 1 million rows
– Partitioning is not free
• Balance your performance needs against your
budget
– If OLTP, can you remove some of the data?
• Data warehouse maybe?
When Should I use Partitioning?
• Use partitioning when:
– If OLAP or Reporting, seriously consider
paying the Partitioning license fee
– Is your data or customer base expected to
grow rapidly?
– Are you running on many CPUs?
• Partitioning can assist in parallelizing many
operations
How do I use Partitioning?
• I'm going to use an Order Entry system as
an example
• Powerpoint sucks for code display
• I'll do my best
How do I use Partitioning?
List Partition CREATE TABLE customers (
cust_id NUMBER NOT NULL PRIMARY KEY,
cust_name VARCHAR2(30),
region_code VARCHAR2(3) )
PARTITION BY LIST (region_code) (
PARTITION asia_region
VALUES ('AS', 'CN', 'JP', 'RS', 'CK'),
PARTITION euro_region
VALUES ('GB', 'FR', 'GR', 'IT', 'SC'),
PARTITION amer_region
VALUES ('US', 'MX', 'CD', 'GU', 'BR'),
PARTITION null_region
VALUES (NULL),
PARTITION error_region
VALUES (DEFAULT) );
How do I use Partitioning?
View the partitions created
SQL> select partition_name, high_value
2 from user_tab_partitions
3 where table_name = 'CUSTOMERS';
PARTITION_NAME HIGH_VALUE
------------------------------ ------------------------------
ASIA_REGION 'AS', 'CN', 'JP', 'RS', 'CK'
EURO_REGION 'GB', 'FR', 'GR', 'IT', 'SC'
AMER_REGION 'US', 'MX', 'CD', 'GU', 'BR'
NULL_REGION NULL
ERROR_REGION DEFAULT
How do I use Partitioning?
Range Partition CREATE TABLE orders (
order_id NUMBER NOT NULL PRIMARY KEY,
order_date DATE NOT NULL,
cust_id NUMBER,
extra_column VARCHAR2(10),
CONSTRAINT ordfk1
FOREIGN KEY (cust_id)
REFERENCES customers(cust_id) )
PARTITION BY RANGE (order_date) (
PARTITION y2005 VALUES LESS THAN (TO_DATE('01/01/2006', 'DD/MM/YYYY')),
PARTITION y2006 VALUES LESS THAN (TO_DATE('01/01/2007', 'DD/MM/YYYY')),
PARTITION y2007 VALUES LESS THAN (TO_DATE('01/01/2008', 'DD/MM/YYYY')),
PARTITION y2008 VALUES LESS THAN (TO_DATE('01/01/2009', 'DD/MM/YYYY')),
PARTITION yInfinite VALUES LESS THAN (MAXVALUE) );
How do I use Partitioning?
SQL> select partition_name, high_value
2 from user_tab_partitions
3 where table_name = 'ORDERS';
PARTITION_NAME HIGH_VALUE
------------------------------ ------------------------------
Y2006 TO_DATE(' 2007-01-01 00:00:00'
, 'SYYYY-MM-DD HH24:MI:SS', 'N
LS_CALENDAR=GREGORIA
Y2007 TO_DATE(' 2008-01-01 00:00:00'
, 'SYYYY-MM-DD HH24:MI:SS', 'N
LS_CALENDAR=GREGORIA
Y2008 TO_DATE(' 2009-01-01 00:00:00'
, 'SYYYY-MM-DD HH24:MI:SS', 'N
LS_CALENDAR=GREGORIA
YINFINITE MAXVALUE
Y2005 TO_DATE(' 2006-01-01 00:00:00'
, 'SYYYY-MM-DD HH24:MI:SS', 'N
LS_CALENDAR=GREGORIA
How do I use Partitioning?
Reference Partition CREATE TABLE order_lines (
order_id NUMBER NOT NULL,
line_num NUMBER NOT NULL,
line_amt NUMBER NOT NULL,
CONSTRAINT ol_pk
PRIMARY KEY (order_id, line_num),
CONSTRAINT order_fk
FOREIGN KEY (order_Id)
REFERENCES orders(order_id)
)
PARTITION BY REFERENCE (order_fk);
How do I use Partitioning?
SQL> select partition_name, high_value
2 from user_tab_partitions
3 where table_name = 'ORDER_LINES';
PARTITION_NAME HIGH_VALUE
------------------------------ --------------
Y2005
Y2006
Y2007
Y2008
YINFINITE
How do I use Partitioning?
• Orders is Partitioned by order_date
• Order_lines does not contain an
order_date field
• The reference partition ensures that the
order_lines table is also partioned by
order_date
How do I use Partitioning?
• Creating Indexes
– We already created global indexes when we
created the primary keys on the table
– Here are two local indexes:
CREATE INDEX ord_date_ind1 ON orders
(order_date) LOCAL;
CREATE INDEX extra_col_ind1 ON orders
(extra_column) LOCAL;
How do I use Partitioning?
SQL> select index_name, partition_name, high_value
2 from user_ind_partitions
3 order by index_name, partition_name;
INDEX_NAME PARTITION_ HIGH_VALUE
--------------- ---------- ----------------------------------------
EXTRA_COL_IND1 Y2005 TO_DATE(' 2006-01-01 00:00:00', 'SYYYY-M
M-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
EXTRA_COL_IND1 Y2006 TO_DATE(' 2007-01-01 00:00:00', 'SYYYY-M
M-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
EXTRA_COL_IND1 Y2007 TO_DATE(' 2008-01-01 00:00:00', 'SYYYY-M M-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
EXTRA_COL_IND1 Y2008 TO_DATE(' 2009-01-01 00:00:00', 'SYYYY-M
M-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
EXTRA_COL_IND1 YINFINITE MAXVALUE
ORD_DATE_IND1 Y2005 TO_DATE(' 2006-01-01 00:00:00', 'SYYYY-M
M-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
ORD_DATE_IND1 Y2006 TO_DATE(' 2007-01-01 00:00:00', 'SYYYY-M
M-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
ORD_DATE_IND1 Y2007 TO_DATE(' 2008-01-01 00:00:00', 'SYYYY-M
M-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
ORD_DATE_IND1 Y2008 TO_DATE(' 2009-01-01 00:00:00', 'SYYYY-M
M-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
ORD_DATE_IND1 YINFINITE MAXVALUE
10 rows selected.
How do I use Partitioning?
• Two local indexes created 10 partitions
• One index partition per table partition
How do I use Partitioning?
• Partition Maintenance
– Mainly a DBA task
– You add partitions, merge partitions, drop
partitions, etc
– Remember that index maintenance can affect
table availability
– Remember that table maintenance may affect
indexes
How do I use Partitioning?
• Querying Partitions
– In addition to querying a partitioned table, you
can also query individual partitions
– Normally don't need to worry about this
– Might want to see individual partition data
when the optimizer isn't being "smart"
– Add the partition keyword and a partition
name to query an individual partition
How do I use Partitioning?
Force a partition selction:
SELECT *
FROM orders
PARTITION (y2007)
WHERE order_date BETWEEN
to_date('01/02/2007', 'DD/MM/YYYY')
AND
to_date('15/02/2007', 'DD/MM/YYYY');
How do I use Partitioning?
• Tuning
– Way out of scope for this presentation
– See the Oracle 11g Performance Tuning
Guide
• One Note: Make sure to use
DBMS_STATS and be sure you analyze
partitions and indexes
How do I use Partitioning?
• Partition Advisor
– New in 11g
– Part of the SQL Access Advisor
– Fairly complex to use
– Since it's new, it's not the place to start
Summary
• Oracle Partitioning is robust
• Oracle Partitioning is performant
• Oracle Partitioning is useful in OLTP and
OLAP databases
• Oracle Partitioning is not free
• Choosing the right Partition Key is critical
Lewis Cunningham
Senior Solutions Architect
EnterpriseDB
Oracle 11g Partitioning
for Developers