ansi join sql
DESCRIPTION
Detailed description joins functionality in ansi sql joingTRANSCRIPT
To ANSI or Not To ANSI
Gravenstein, Costello, Maurer
Amtrust Bank
Session #420
Speaker Qualifications
Rumpi Gravenstein, Application Developer, Senior, Amtrust Bank
Newsletter Editor, North East Ohio Oracle Users Group
Has been working with Oracle since 1988 Past Presenter at meetings of:
Independent Oracle Users Group North East Ohio Oracle Users Group
Agenda
Brief History Review Join Technologies Analysis Recommendation
ANSI/Oracle Support History
ANSI here refers to SQL/99 Join Syntax The standard to which all RDBMS vendors
strive to comply
SQL/99 support started with Oracle 9i in 2001 Two years after the release of the standard,
Oracle supports it.
This presentation restricted to the Oracle implementation of the ANSI standard
3 Join Condition Types
Equijoin Columns with the same name Columns with different names
Outerjoin Left (left driving table) Right (right driving table) Full (both tables driving)
Cross/Cartesian product
Traditional Equijoin - Same Name
Traditional Oracle Approach
SELECT e.ename AS Employee_name,
d.deptno,
d.dname AS Department_name
FROM emp e,
dept d
WHERE e.deptno = d.deptno;
Table prefix is required to remove ambiguity on
common columns
Join conditions must be listed
ANSI Equijoin Natural Syntax
ANSI SQL Natural Join
SELECT ename AS employee_name,
deptno,
dname AS department_name
FROM emp
NATURAL JOIN dept;
Join columns implied, based on columns that have the same name
No table prefix if column is part of join condition.
No commas between tables.
God forbid that you accidentally add a non-join column to both tables…(um.. audit columns...)
ANSI Equijoin Using Syntax
ANSI SQL Join USING
SELECT d.dname,
e.ename
FROM emp d
JOIN dept d USING ( deptno );
Several columns share same name, only joining on some of them, in this
case deptno
Table prefix allowed on columns that are not part of the using clause
(join condition)
Add additional join columns using( deptno, join_col2, join_col3, …)
Traditional Equijoin Syntax
Traditional Join, columns different
SELECT d.department_name,
l.city
FROM departments d,
locations l
WHERE d.location_id = l.id;
Join column names are different
ANSI On Equijoin Syntax
ANSI SQL ON
SELECT d.department_name,
l.city
FROM departments d
JOIN locations l ON ( d.location_id = l.id );
Use ON when join column names are different
List join conditions here like traditional syntax
ANSI Equijoin Syntax
ANSI SQL Multi Table On
SELECT e.empno,
l.loc_id,
d.dname,
l.state_tx
FROM locations l
JOIN dept d ON ( d.location_id = l.id )
JOIN emp e ON ( d.deptno = e.deptno );
Bring in first table join
Bring in second table join Any prior table column is visible – joins from left to right
No commas between tables
ANSI Equijoin Syntax
ANSI SQL INNERSELECT e.emp_id,
l.city,
d.dept_name,
d.deptno
FROM locations l
INNER JOIN dept d ON ( d.location_id = l.id )
INNER JOIN emp e ON d.deptno = e.deptno;
INNER – an optional keyword stating this is an equijoin (not an
outer or cross join)
ON clause requires reference to join columns by table name to
resolve ambiguity
Parenthesis are optional, we like to include them for clarity
Traditional Outerjoin Syntax
Traditional Outer Join
SELECT e.ename,
d.dname
FROM emp e,
dept d
WHERE e.deptno (+) = d.deptno
Traditional Outer Join Notation(+) indicator denotes expand records on this side if needed
NULL in name if no employees in the department.
ANSI Outerjoin Syntax
Left Outer Join
SELECT e.ename,
d.dname
FROM dept d
LEFT OUTER JOIN emp e
ON (e.deptno = d.deptno);
LEFT denotes that the dominant table is to the left (dept) and that all of it’s rows will be returned. The
right table is expanded with NULL records
NULL in last name if no employees in the department.
OUTER keyword is optional.
ANSI Outerjoin Syntax
Left Outer Join
SELECT e.ename,
d.dname
FROM dept d
NATURAL LEFT JOIN emp e;
NATURAL can be used in an INNER and OUTER join.
We don’t recommend using it here either!
USING e.g. LEFT JOIN emp e USING (deptno)can be used in an INNER and OUTER join.
ANSI Outerjoin Syntax
Right Outer Join
SELECT e.ename,
d.dname
FROM emp e
RIGHT OUTER JOIN dept d
ON (e.deptno = d.deptno);
RIGHT OUTER denotes that the dominant table is to the right. The left table gets expanded with
NULLS.
NULL in last name if no employees in the department.
Traditional Outerjoin Syntax -Close
Full Outer Join Can only be represented with a “UNION” query.
SELECT e.ename,
d.dname
FROM emp e, dept d
WHERE e.deptno (+) = d.deptno
UNION
SELECT e.ename,
d.dname
FROM emp e, dept d
WHERE e.deptno = d.deptno (+)
Shouldn’t see many of these(We’ve never needed one)
UNION ALL is incorrect as it results in duplicate rows
UNION performs an implicit “DISTINCT” on result – possibly
removing desired rowsThis “full” join syntax can be found
on internet as “true” full join
Traditional Full Close-Executionscott@VOTER> SELECT e.ename, 2 d.dname 3 FROM emp e, dept d 4 WHERE e.deptno (+) = d.deptno 5 UNION 6 SELECT e.ename, 7 d.dname 8 FROM emp e, dept d 9 WHERE e.deptno = d.deptno (+) 10 ;
16 rows selected.
Execution Plan---------------------------------------------------------- 0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=15 Card=28 Bytes=588) 1 0 SORT (UNIQUE) (Cost=15 Card=28 Bytes=588) 2 1 UNION-ALL 3 2 HASH JOIN (OUTER) (Cost=7 Card=14 Bytes=294) 4 3 TABLE ACCESS (FULL) OF 'DEPT' (TABLE) (Cost=3 Card=5 Bytes=60) 5 3 TABLE ACCESS (FULL) OF 'EMP' (TABLE) (Cost=3 Card=14 Bytes=126) 6 2 HASH JOIN (OUTER) (Cost=7 Card=14 Bytes=294) 7 6 TABLE ACCESS (FULL) OF 'EMP' (TABLE) (Cost=3 Card=14 Bytes=126) 8 6 TABLE ACCESS (FULL) OF 'DEPT' (TABLE) (Cost=3 Card=5 Bytes=60)
Plan reflects the implicit DISTINCT inherent in UNION
clause
Notice Cost 15
Traditional Full Join - True
SELECT e.ename, d.dname FROM emp e, dept d WHERE e.deptno = d.deptno(+)UNION ALLSELECT NULL, d.dname FROM dept d WHERE NOT EXISTS (SELECT 1 FROM emp e WHERE e.deptno = d.deptno)
Outer join emp table with the dept table – all emp rows now
returned
Add Dept rows that have no match from emp
Use UNION ALL to avoid implicit DISTINCT inherent in UNION
This syntax can be found on internet at http://optimizermagic.blogspot.com under post on “Outerjoins in Oracle”
Traditional Full Join True-Execution
scott@VOTER> SELECT e.ename, d.dname 2 FROM emp e, dept d 3 WHERE e.deptno = d.deptno(+) 4 UNION ALL 5 SELECT NULL, d.dname 6 FROM dept d 7 WHERE NOT EXISTS 8 (SELECT 1 9 FROM emp e 10 WHERE e.deptno = d.deptno);
16 rows selected.
Execution Plan---------------------------------------------------------- 0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=13 Card=16 Bytes=324) 1 0 UNION-ALL 2 1 HASH JOIN (OUTER) (Cost=7 Card=14 Bytes=294) 3 2 TABLE ACCESS (FULL) OF 'EMP' (TABLE) (Cost=3 Card=14 Bytes=126) 4 2 TABLE ACCESS (FULL) OF 'DEPT' (TABLE) (Cost=3 Card=5 Bytes=60) 5 1 HASH JOIN (ANTI) (Cost=7 Card=2 Bytes=30) 6 5 TABLE ACCESS (FULL) OF 'DEPT' (TABLE) (Cost=3 Card=5 Bytes=60) 7 5 TABLE ACCESS (FULL) OF 'EMP' (TABLE) (Cost=3 Card=14 Bytes=42)
Explain plan for “True” traditional outer join – SORT (UNIQUE)
missing
Cost 13 here, prior plan had cost of 15
ANSI Outerjoin Syntax
Full Outer Join
SELECT e.ename,
d.dname
FROM emp e
FULL OUTER JOIN dept d
ON (e.deptno = d.deptno);
FULL OUTER denotes that the table to the right AND the table to the left will have all their
records returned
NULL in last name if no employees in the department.
NULL in department name if employee not in a department.
OUTER is optional
ANSI Full Execution Planscott@VOTER> SELECT e.ename, 2 d.dname 3 FROM emp e 4 FULL OUTER JOIN dept d 5 ON (e.deptno = d.deptno);
16 rows selected.
Execution Plan---------------------------------------------------------- 0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=13 Card=16 Bytes=256) 1 0 VIEW (Cost=13 Card=16 Bytes=256) 2 1 UNION-ALL 3 2 HASH JOIN (OUTER) (Cost=7 Card=14 Bytes=294) 4 3 TABLE ACCESS (FULL) OF 'EMP' (TABLE) (Cost=3 Card=14 Bytes=126) 5 3 TABLE ACCESS (FULL) OF 'DEPT' (TABLE) (Cost=3 Card=5 Bytes=60) 6 2 HASH JOIN (ANTI) (Cost=7 Card=2 Bytes=30) 7 6 TABLE ACCESS (FULL) OF 'DEPT' (TABLE) (Cost=3 Card=5 Bytes=60) 8 6 TABLE ACCESS (FULL) OF 'EMP' (TABLE) (Cost=3 Card=14 Bytes=42)
Cost 13 = Cost of Traditional True
Traditional Cross Join/Cross Product
Cross Join
SELECT emp_id,
ename,
dname
FROM emp e,
dept d
WHERE d.deptno = 10;
No join condition between tables… um, normally not good.
Inadvertent cross join not obvious if joined table has a single row
ANSI Cross Join/Cross Product
Cross Join
SELECT emp_id,
ename,
dname
FROM emp e
CROSS JOIN dept d
WHERE d.deptno = 10;
Explicit CROSS condition, impossible to do this accidentally.
ANSI Correlated Join
Correlated Query Syntax Join
SELECT empno,
ename
FROM emp e
WHERE EXISTS
( SELECT NULL
FROM dept d
INNER JOIN locations l
ON ( l.loc_id = d.loc )
WHERE d.deptno = e.deptno
);
ANSI doesn’t allow join clause on first table
Only tables in “current” FROM clause visible to
ANSI join logic
Not really a “mixed” syntax join
ANSI Correlated Subquery Issues
SELECT * FROM dept d INNER JOIN locations2 l USING ( loc ) WHERE EXISTS ( SELECT NULL FROM emp e WHERE e.loc = l.loc )
ORA-25154: column part of USING clause cannot have qualifier
Don’t be tempted to remove the table prefix “l”
“NATURAL” joins have the same issue
Scope of reference dictates that the closest column be used
ANSI Outer Join Subtleties
SELECT d.deptno, e.ename, e.job FROM dept d LEFT JOIN emp e ON ( e.deptno = d.deptno AND e.job = 'SALESMAN');
DEPTNO ENAME JOB====== ====== ========
30 ALLEN SALESMAN30 WARD SALESMAN 30 MARTIN SALESMAN30 TURNER SALESMAN50 40 20 10
Filter is applied before join is executed
A number of rows returned that have “OUTER” joined emp data
Only dept 30, Sales, has SALESMAN as a job.
ANSI Outer Join Subtleties
SELECT d.deptno, e.ename, e.job FROM dept d LEFT JOIN emp e ON ( e.deptno = d.deptno) WHERE e.job = 'SALESMAN';
DEPTNO ENAME JOB====== ====== ========
30 ALLEN SALESMAN30 WARD SALESMAN30 MARTIN SALESMAN30 TURNER SALESMAN
Filter is applied after join is executed
No outer joined data
The condition has effectively converted this OUTER JOIN query to an INNER JOIN
Mixed Traditional/ANSI Join
Mixed Syntax Join
SELECT emp_id,
ename,
dname
FROM emp e
INNER JOIN dept d USING (deptno),
dual
WHERE deptno = 10;
Uhg….Choose one or the other,
but not both please!
Oracle will run this without throwing an error
Presentation Bonus – This section not in conference
proceedings.
Mixed Syntax Join
scott@VOTER> SELECT empno, 2 ename, 3 dname, 4 dummy 5 FROM dual, 6 emp e 7 INNER JOIN dept d ON (d.deptno = e.deptno) 8 WHERE e.deptno = 10;
EMPNO ENAME DNAME D---------- ---------- -------------- - 7782 CLARK ACCOUNTING X 7839 KING ACCOUNTING X 7934 MILLER ACCOUNTING X
3 rows selected.
Oracle will run this without throwing an error
Multiple traditional and ANSI sections are allowed
Mixed Syntax Join
scott@VOTER> SELECT empno, 2 ename, 3 dname, 4 dummy 5 FROM emp e 6 dual, 7 INNER JOIN dept d 8 ON ( d.deptno = e.deptno ) 9 WHERE e.deptno = 10; dual, *ERROR at line 6:ORA-00933: SQL command not properly ended
Traditional mixed in between ANSI
Mixed Syntax Join
SELECT empno, ename, dname, dummy
FROM dual, emp e INNER JOIN dept d on d.deptno = e.deptno and dummy IS NOT NULL WHERE e.deptno = 10;
and dummy IS NOT NULL
*
ERROR at line 8:
ORA-00904: "DUMMY": invalid identifier
ANSI joins can only see other tables taking part in ANSI join
Presence of both comma (Traditional) and JOIN Mixed syntax join
ANSI vs Traditional Join Analysis
Impact areas Code Clarity
Readability Join Errors
Flexibility Ease of Use
Developer Training DBA Training
Legacy Code Standards
Code Clarity – Readability (Traditional) SELECT /*+ qb_name(orig) */ fdla.dim_borrower_v_id dim_borrower_v_id FROM dim_as_of_date_vw daod, dim_daily_loan_applctn_detl ddlad, dim_disbursement_date_vw dddv, dim_loan_originator dlo, fact_daily_loan_application fdla, dim_loan_applctn_status_vw dlasv WHERE daod.dim_as_of_date_v_id = ddlad.dim_as_of_date_v_id AND daod.dim_as_of_date_v_id = fdla.dim_as_of_date_v_id AND ddlad.dim_daily_loan_applctn_detl_id =fdla.dim_daily_loan_applctn_detl_id AND ddlad.dim_as_of_date_v_id = fdla.dim_as_of_date_v_id AND dddv.dim_disbursement_date_v_id = fdla.dim_disbursement_date_v_id AND dlo.dim_loan_originator_id = fdla.dim_loan_originator_id AND dlasv.DIM_LOAN_APPLCTN_STATUS_V_ID = fdla.DIM_LOAN_APPLCTN_STATUS_V_ID AND NOT (dlasv.STATUS_CODE BETWEEN '700' AND '740') AND NOT (dlasv.status_code BETWEEN '000' AND '429') AND daod.as_of_calendar_date = (CASE WHEN &in_DATE_SLICE IS NULL THEN LAST_DAY (ADD_MONTHS (TRUNC(SYSDATE), -1)) + &c_DEFAULT_SLICE_OFFESET ELSE TO_DATE( &in_DATE_SLICE, &c_DATE_FORMAT ) END) AND dddv.disburse_date BETWEEN
TRUNC(NVL(TO_DATE(&in_START_REPORT_MONTH,&c_DATE_FORMAT),ADD_MONTHS(SYSDATE, -1)), 'MM') AND TRUNC (LAST_DAY
(NVL(TO_DATE(&in_END_REPORT_MONTH,&c_DATE_FORMAT),ADD_MONTHS(SYSDATE, -1)))) AND ddlad.loan_transfer_status_code != 'T'
Can you quickly determine how tables
are joined?
This is a real join we’ve implemented as part of
a recent project
Code Clarity – Readability (ANSI)SELECT /*+ qb_name(orig) */ fdla.dim_borrower_v_id dim_borrower_v_id FROM dim_as_of_date_vw daod INNER JOIN fact_daily_loan_application fdla ON (daod.dim_as_of_date_v_id = fdla.dim_as_of_date_v_id) INNER JOIN dim_daily_loan_applctn_detl ddlad ON ( ddlad.dim_as_of_date_v_id = daod.dim_as_of_date_v_id AND ddlad.dim_daily_loan_applctn_detl_id = fdla.dim_daily_loan_applctn_detl_id AND ddlad.dim_as_of_date_v_id = fdla.dim_as_of_date_v_id ) INNER JOIN dim_disbursement_date_vw dddv ON ( dddv.dim_disbursement_date_v_id = fdla.dim_disbursement_date_v_id) INNER JOIN dim_loan_originator dlo ON (dlo.dim_loan_originator_id = fdla.dim_loan_originator_id) INNER JOIN dim_loan_applctn_status_vw dlasv ON (dlasv.dim_loan_applctn_status_v_id = fdla.dim_loan_applctn_status_v_id) WHERE NOT (dlasv.STATUS_CODE BETWEEN '700' AND '740') AND NOT (dlasv.status_code BETWEEN '000' AND '429') AND daod.as_of_calendar_date = (CASE WHEN &in_DATE_SLICE IS NULL THEN LAST_DAY (ADD_MONTHS (TRUNC(SYSDATE), -1)) + &c_DEFAULT_SLICE_OFFESET ELSE TO_DATE( &in_DATE_SLICE, &c_DATE_FORMAT ) END) AND dddv.disburse_date BETWEEN
TRUNC(NVL(TO_DATE(&in_START_REPORT_MONTH,&c_DATE_FORMAT),ADD_MONTHS(SYSDATE, -1)), 'MM') AND TRUNC (LAST_DAY
(NVL(TO_DATE(&in_END_REPORT_MONTH,&c_DATE_FORMAT),ADD_MONTHS(SYSDATE, -1)))) AND ddlad.loan_transfer_status_code != 'T'
TableJoinConditionsAreEasilyIdentified
Code Clarity – Join Errors
Traditional SyntaxSELECT col1,
col2,
...
FROM tab1 t1,
tab2 t2,
...
WHERE ...
Is a table join condition missing? How do you know?
Code Clarity – Join Errors
ANSI SyntaxSELECT col1,
col2,
...
FROM tab1 t1
[join type] tab2 t2
[join condition]
...
WHERE ...
must identify join type:INNEROUTERFULL
CROSS
Identifies join condition USING or ON
“Impossible” to do inadvertent CROSS join
Flexibility – (+) Restrictions not present in ANSI
You cannot specify the (+) operator in a query block that also contains FROM clause join syntax. The (+) operator can appear only in the WHERE clause or, in the context of left-correlation (that is, when specifying
the TABLE clause) in the FROM clause, and can be applied only to a column of a table or view. If A and B are joined by multiple join conditions, then you must use the (+) operator in all of these conditions. If you
do not, then Oracle Database will return only the rows resulting from a simple join, but without a warning or error to advise you that you do not have the results of an outer join.
The (+) operator does not produce an outer join if you specify one table in the outer query and the other table in an inner query.
You cannot use the (+) operator to outer-join a table to itself, although self joins are valid. For example, the following statement is not valid:
SELECT employee_id, manager_id FROM employees WHERE employees.manager_id(+) = employees.employee_id;
However, the following self join is valid:
SELECT e1.employee_id, e1.manager_id, e2.employee_id FROM employees e1, employees e2 WHERE e1.manager_id(+) = e2.employee_id;
The (+) operator can be applied only to a column, not to an arbitrary expression. However, an arbitrary expression can contain one or more columns marked with the (+) operator.
A WHERE condition containing the (+) operator cannot be combined with another condition using the OR logical operator.
A WHERE condition cannot use the IN comparison condition to compare a column marked with the (+) operator with an expression.
A WHERE condition cannot compare any column marked with the (+) operator with a subquery.
“Oracle strongly recommends that you use the more flexible FROM
clause (ANSI) join syntax”
Ease of Use/Developer & DBA Training Traditional Join
Long time Oracle developers do nothing New Oracle Developers need to learn Oracle
syntax ANSI Join
Works with SQL Server/Oracle/MySQL/… Syntax is more readable/self documenting Natural join is “un-natural” - can lead to errors
Installed Code Base
ANSI Joins not present in the Oracle installed code base
ANSI Joins present in other RDBMS installed code More of these databases coming all the time
Harm in having two join syntaxes Support personnel have to be comfortable with
both syntaxes Additional training required
Fishbone Diagram
Cod
e C
larit
yFe
wer
Joi
n
Con
ditio
n E
rror
s
Error FreeSQL
Only Support
One Coding
Style
Mul
ti-V
endo
r
RD
BM
S
Sup
port
Only Allow Traditional Join Syntax
Allow ANSI Join Syntax
Sta
ndar
ds B
ased
Feat
ure
Ric
h
Installed Oracle
Code Base
No Natural Join
Possibility
Recommendations
Allow both Provide training so that all are familiar with both Place some restrictions on ANSI syntax to prevent
problems Do not allow NATURAL joins and possibly USING
clause Single SQL statements should use one or the other
but not both New development should try to use same syntax
throughout Long term goal, ANSI only
Session Goals
Familiarity with ANSI Join Syntax Understanding the merits of the ANSI join
syntax Intention to start using the ANSI syntax
?
Questions
Thank You
Please complete the evaluation forms My Name: Rumpi Gravenstein Session Title: To ANSI or not to ANSI Session #: 420 If you have additional questions, I can be
reached at [email protected]