Download - 169 ETL Presentation
ETL - Oracle Database Features and PL/SQL Techniques
Boyan Boev
CNsysBGOUG - 2005
Agenda
• ETL Concepts• ETL Process Development -
Recommendations– Oracle Database Configuration– PL/SQL Techniques– Database Features
• Conclusion
BGOUG - 2005
What is ETL?
• ETL Elements• ETL Purpose• ETL as a Part of Business Intelligence
Solution• Types of ETL
Using Tools Custom PL/SQL Scripts
BGOUG - 2005
ETL – Extract, Transform and Load
• Extract Pull the Data From the Source
• Transform Convert the Input Format to the Target Format Encode any Values
• Load Insert the Transformed Data to the Target Tables
BGOUG - 2005
Agenda
• ETL Concepts
• ETL Process Development - Recommendations– Oracle Database Configuration– PL/SQL Techniques– Database Features
• Conclusion
BGOUG - 2005
Oracle database configuration
• Don’t use ARCHIVE LOG Mode Reduce the Amount of I/O Activity
• Use NOLOGGING DML StatementsALTER TABLE … NOLOGGING;
INSERT /*+ nologging*/ …
• Change the stage schema data model IndexesConstraintsTriggers
BGOUG - 2005
Agenda
• ETL Concepts
• ETL Process Development - Recommendations– Oracle Database Configuration– PL/SQL Techniques– Database Features
• Conclusion
BGOUG - 2005
ETL – recommendations
• PL/SQL Techniques
Bulk Binding Collection Types Pipelined Table Functions Package Variables Native Compilation
BGOUG - 2005
PL/SQL – Binding
• Types of Binds IN-BindsOUT-Binds
• Bind Options Single Row BindsBulk Binds
BGOUG - 2005
Single Row Binds
• Cursor For LoopExample:
BGOUG - 2005
DECLARE
CURSORexample_cursor( p_id_in NUMBER )
IS SELECT * FROM customer_action
WHERE cust_id = p_id_in;
v_cust_id NUMBER := 1681;
BEGIN
FOR rec INexample_cursor( v_cust_id )
LOOP
INSERT
INTO cust_action_hist(…)
VALUES( … );
END LOOP;
END;
/
BGOUG - 2005
Context Switching
BGOUG - 2005
Bulk Binding
• IN-Binds - An Array of Values is passed to SQL Engine
• OUT-Binds - SQL Engine populates a PL/SQL Bind Array
• Context Switching once per Batch Instead of Once Per Row
• Performance Increase to Up to 15 Times
BGOUG - 2005
Bulk Bind Operators
• Bulk Collect Specifying of the LIMIT clause Be Careful to handle last batch
OPEN cursor …
LOOP
FETCH cursor
BULK COLLECT INTO … LIMIT 100;
EXIT WHEN cursor%NOTFOUND;
END LOOP;
BGOUG - 2005
Bulk Bind Operators• FORALL
Bulk DML Operator Not Looping Construct PL/SQL table is Referenced in the statement Handling and Reporting Exceptions VALUES OF , INDICES OF ( New in Oracle 10g )
FORALL i IN 1..20
INSERT INTO emp2
VALUES (enums(i), names(i), median(sals), ...);
BGOUG - 2005
Collection Types
• Associative Arrays( PL/SQL Tables ) PL/SQL type only - not a SQL Type Easy to Use
• Nested Tables Shared Type
BGOUG - 2005
Associative Arrays( PL/SQL Tables )
• automatic element allocation• no need to initialize
DECLARE
TYPE EmpTabTyp IS TABLE OF employees%ROWTYPE
INDEX BY PLS_INTEGER;
emp_tab EmpTabTyp;
BEGIN
SELECT * BULK COLLECT INTO emp_tab
FROM employees WHERE employee_id = 100;
END;
/
BGOUG - 2005
Nested Tables
• PL/SQL
Example
DECLARE
TYPE nest_tab_t IS TABLE OF NUMBER;
nt nest_tab_t := nest_tab_t();
BEGIN
FOR i IN 1..100 LOOP
nt.EXTEND;
nt(i) := i;
END LOOP;
END;
/
BGOUG - 2005
Nested Tables
• Global
Example
CREATE OR REPLACE TYPE email_demo_obj_t AS OBJECT
( email_id NUMBER,
demo_code NUMBER,
value VARCHAR2(30) );
/
CREATE OR REPLACE TYPE email_demo_nt_t AS TABLE OF email_demo_obj_t;
/
BGOUG - 2005
Nested Tables
• SQL-Defined Nested Tables Sorted Aggregated Ised in Dynamic In-Lists Joined with SQL Tables Joined with other PL/SQL nested tables
• Nested Tables Enable Table Functions TABLE Operator
o tell Oracle to treat the Variable like SQL Table
CAST Operator
oExplicitly Tells Oracle the Data Type to be used to Handle the Operation
-
BGOUG - 2005
Nested Tables
• Global Nested tables – example:DECLARE
eml_dmo_nt email_demo_nt_t := email_demo_nt_t();
BEGIN
-- Some logic that populates the nested table …
eml_dmo_nt.EXTEND(3);
eml_dmo_nt(1) := email_demo_obj_t(45, 3, '23');
eml_dmo_nt(2) := email_demo_obj_t(22, 3, '41');
eml_dmo_nt(3) := email_demo_obj_t(18, 7, 'over_100k');
-- Process the data in assending order of email id.
FOR r IN (SELECT * FROM
TABLE(CAST(eml_dmo_nt AS email_demo_nt_t))
ORDER BY 1)
LOOP
dbms_output.put_line(r.email_id || ' ' || r.demo_id);
END LOOP;
END;
/
BGOUG - 2005
Table Functions
• Data is Buffered in the Local Variables During Function Execution
• Pipelined Table Functions
– SELECT * FROM plsql_function;
BGOUG - 2005
Pipelined Functions - exampleFUNCTION pipelined_demo( data SYS_REFCURSOR ) RETURN demo_nt_t PIPELINED IS
CURSOR demo_cursor IS
SELECT demo_obj_t(emp_id, value) FROM emp;
emp_nt demo_nt_t;
BEGIN
OPEN email_demo_cur;
LOOP
EXIT WHEN email_demo_cur%NOTFOUND;
FETCH email_demo_cur BULK COLLECT INTO eml_dmo_nt LIMIT 1000;
FOR i IN 1..eml_dmo_nt.COUNT LOOP
/* Apply some business logic on the object here, and return a row. */
PIPE ROW (eml_dmo_nt(i));
END LOOP;
END LOOP;
RETURN;
END;
BGOUG - 2005
Package Variables
• What are Package Variables
Globally Declared Variables in Package, outside of a
Procedure or Function Definition
Values set by Package Initialization Code
Enable Data Caching
BGOUG - 2005
Native Compilation
• How it works? PL/SQL First is Compiled Down To P-Code C Source Code is Generated from P-Code Native Compiler is invoked Creating a 'C' Shared Object Library Subsequent Calls to PL/SQL Object are Run by the 'C' Library
• NComp and Performance
Agenda
• ETL Concepts
• ETL Process Development - Recommendations– Oracle Database Configuration– PL/SQL Techniques– Database Features
• Conclusion
BGOUG - 2005
ETL – recommendations
• Database Features and Utilities
External Tables( 9i )Transportable TablespacesDML Exception Handling( 10g R2 )Regular Expressions Support( 10g R1 )SQL*LoaderDirect Path INSERTParallel DML Statements
BGOUG - 2005
BGOUG - 2005
Database Features
• External tables File can be Queried as if it is a Real Database Table
Can Sort, Aggregate, Filter Rows, etc. External File Can be Queried in Parallel
Oracle 9i - Read-Only using SQL*Loader Engine Only read-only access to external files
Oracle 10g - Read-Write using Data Pump Engine Can do read/write operations with external files
• Transportable Tablespaces Oracle 10g - transport across different platforms
BGOUG - 2005
Database Features
• DML Exception Handling ( New in 10g R2 ) Catch the Error and Move On: Error Logging Clause Enabling:
SQL> exec dbms_errlog.CREATE_ERROR_LOG ('ACCOUNTS','ERR_ACCOUNTS');
SQL> insert into accounts select * from accounts_ny
log errors into err_accounts reject limit 200;
SQL> select ORA_ERR_NUMBER$, ORA_ERR_MESG$, ACC_NO
from err_accounts;
BGOUG - 2005
Database Features
• Regular Expression Support ( New in 10g R1 )
• SQL*Loader
• Direct Path Insert SQL*Loader CREATE TABLE AS SELECT … INSERT /*+ APPEND*/ … SELECT
• Parallel DML Statements
BGOUG - 2005
Agenda
• ETL Concepts
• ETL Process Development - Recommendations– Oracle Database Configuration– PL/SQL Techniques– Database Features
• Conclusion
BGOUG - 2005
Example
• create package with PIPELINED Function ....
FUNCTION transform (new_data SYS_REFCURSOR) RETURN email_demo_nt_t
PIPELINED
PARALLEL_ENABLE ( PARTITION new_data BY ANY ) IS....
• Transformation is Just a Simple INSERT as SELECT Elegant solution to Parallel , Transactional Co-Processing
INSERT /*+ append nologging*/
INTO ...
SELECT /*+ parallel(a,4)*/
FROM .. TABLE( CAST( package.pipe_func_name( select * from table_3 ) as TYPE ) a);
BGOUG - 2005
Conclusion
• High Perofmance ETL Solution
BGOUG - 2005
Recommended Readings – Sites
– http://www.oracle.com/technology//index.html
– http://rittman.net– http://www.intelligententerprise.com/– http://asktom.oracle.com– http://www.dbazine.com/datawarehouse– http://dbasupport.com/
BGOUG - 2005
Recommended Readings – Books
http://www.amazon.com/exec/obidos/tg/detail/-/0471200247/qid=1129731960/sr=2-1/ref=pd_bbs_b_2_1/104-9260048-7507956?v=glance&s=books
BGOUG - 2005
Recommended Readings – Books
http://www.amazon.com/exec/obidos/tg/detail/-/0764567578/ref=pd_sim_b_1/104-9260048-7507956?%5Fencoding=UTF8&v=glance
BGOUG - 2005
Recommended Readings – Books
http://www.amazon.com/exec/obidos/tg/detail/-/1555583350/qid=1129732390/sr=2-2/ref=pd_bbs_b_2_2/104-9260048-7507956?v=glance&s=books