oaktable world 2014 toon koppelaars: database constraints polite excuse

47
Why SQL DBMS’s Still Lack Full Declarative Constraint Support (A polite excuse) Toon Koppelaars

Upload: kyle-hailey

Post on 02-Dec-2014

196 views

Category:

Software


0 download

DESCRIPTION

Toon Koppelaars presentation on Oracle database constraints

TRANSCRIPT

Page 1: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Why SQL DBMS’s Still

Lack Full Declarative

Constraint Support

(A polite excuse)

Toon Koppelaars

Page 2: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Who am I?

• A database guy, relational dude

– Developer (with DBA experience)

• Oracle technology user since 1987

• Co-author of this book

• Today’s talk = last chapter

Page 3: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Agenda

• SQL Assertions?

– A few examples

• Validation Execution Models

– More efficient along the way

• Serializability

– Probably the biggest problem

• Conclusions

Page 4: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

SQL Assertions?

• Examples

1. There is at most one president

2. One cannot manage more than two

departments

3. Department with president and/or

manager should have an administrator

Data integrity constraints

– Just like: CHECK, PK, UK, FK

– ‘all the other ones’

They constrain

data allowed in

our tables

They constrain

data allowed in

our tables

Page 5: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Syntax?

• Create assertion [some_name] as

check([some_SQL_expression]);

• For over 25 years part of Ansi/Iso

SQL standard

Page 6: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Example 1

• create assertion

at_most_one_president

as check

(1 =>

(select count(*)

from EMP e

where e.JOB = ‘PRESIDENT’

)

);

• Task of DBMS: make sure it’s true at all times

- Closed SQL

expression (no

free variables)

- Evaluates to

true or false

Page 7: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Example 2

• create assertion

cannot_manage_more_than_2

as check

(not exists

(select ‘x’

from (select d.MGR

,count(*) as cnt

from DEPT d

group by d.MGR)

where cnt > 2

)

);

Page 8: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Example 3

• create assertion admin_in_dept_with_vip as check (not exists (select ‘x’ from (select distinct e.DEPTNO from EMP e) e where exists (select ‘y’ from EMP e1 where e1.DEPTNO = e.DEPTNO and e1.JOB in (‘P’,’M’)) and not exists (select ‘z’ from EMP e2 where e2.DEPTNO = e.DEPTNO and e2.JOB = ‘ADMIN’) ) );

Page 9: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

• Can this be the killer feature of

OracleXIII?

Page 10: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Imagine...

• If DBMS vendor would support these

– How much less lines of code one would

need to write in application

development

– How much less bugs this results into

– How easy we could accomodate

change requested by the business

– How data quality would improve

Page 11: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Point to be made

• SQL assertions ‘cover’ all the other

declarative constraints available

– CHECK can be written as assertion

– PK/UK can be written as assertion

– FK can be written as assertion

• We just have shorthands for these,

since these are so common in every

database design

Page 12: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Check writen as assertion

• create assertion hire_only_18_plus

as check

(not exists

(select ‘x’

from EMP e

where e.HIRED - e.BORN <

18*365

)

);

CHECK((HIRED – BORN) >= 18*365)

Page 13: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

PK/UK written as assertion

• create assertion empno_is_unique

as check

(not exists

(select ‘x’

from EMP e1

,EMP e2

where e1.EMPNO = e2.EMPNO

and e1.rowid != e2.rowid

)

);

PRIMARY KEY (EMPNO)

Page 14: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

FK written as assertion

• create assertion work_in_known_dept as check (not exists (select ‘x’ from EMP e where not exists (select ‘y’ from DEPT d where d.deptno = e.deptno ) ) ); FOREIGN KEY (DEPTNO) REFERENCES DEPT(DEPTNO)

Page 15: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Scientific Problem

• Why is CREATE ASSERTION not

supported?

• Think about this:

– The DBMS has our assertion expression

but:

• Would you accept full evaluation of given

expression for every insert/update/delete?

– No.

• Performance would be horrible in real world db’s

• You require a much more sophisticated execution

model in the real world

Page 16: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Scientific Problem

• The challenge is:

– Given: arbitrary complex assertion expression

+

– Given: arbitrary complex SQL statement

– What is the minimum required check that

needs to be executed by the DBMS for the

assertion to remain true?

• Can be ‘nothing’ if assertion is immune for the given

DML

Page 17: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

For instance...

• Department that employs president or manager should also employ administrator – Obviously only when DML operates on EMP

1.Insert into emp values(100,’Smith’,’SALESREP’

,’1/1/80’,’20/11/07’,7000,20);

2.Insert into emp values(:b0,:b1,:b2

,:b3,:b4,:b5,:b6);

3.Insert into emp (select * from emp_loaded);

4.Update emp set msal=1.05*msal;

5.Delete from emp where empno=101;

Page 18: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Execution Models

• Using triggers:

– EM1: always

– EM2: involved tables

– EM3: involved columns

– EM4: table polarities

– EM5: involved literals (TE)

– EM6: TE + minimal check

Page 19: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Execution Model 1

• Evaluate every assertion on every

DML statement

– Evaluating every <boolean

expression with subqueries>

– On every DML statement

• In after-statement table trigger

– WOULD BE *VERY* INEFFICIENT

• Let’s quickly forget this “EM1”

Page 20: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Execution Model 2

• Only evaluate the assertions that involve the table that is being “DML-led”

– Finding involved tables by parsing the

assertion expression

– Per assertion 100% code generation of a

(insert/update/delete) “after statement” table

trigger

• Select from dual where [expression]

• Found? OK

• Not found raise_application_error

Page 21: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Execution Model 2

• Using examples 2 and 3: – Admin_in_dept_with_vip EMP table

– Cannot_manage_more_than_2 DEPT table

Page 22: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Execution Model 2

create trigger EMP_AIUDS_EMP01

after insert or update or delete on EMP

declare pl_dummy varchar(40);

begin

--

select 'Constraint EMP01 is satisfied' into pl_dummy

from DUAL

where not exists(select ‘a violation’

from (select distinct deptno from EMP) d

where exists(select e2.*

from EMP e2

where e2.DEPTNO = d.DEPTNO

and e2.JOB in ('PRESIDENT','MANAGER'))

and not exists(select e3.*

from EMP e3

where e3.DEPTNO = d.DEPTNO

and e3.JOB = 'ADMIN'));

--

exception when no_data_found then

raise_application_error(-20999,'Constraint EMP01 is violated.');

end;

Assertion

predicate

Page 23: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Execution Model 2

create trigger DEPT_AIUDS_DEPT01

after insert or update or delete on DEPT

declare pl_dummy varchar(1);

begin

--

select ‘a' into pl_dummy

from DUAL

where not exists (select ‘a violation’ from (select d.MGR ,count(*) as cnt from DEPT d group by d.MGR) where cnt > 2));

--

exception when no_data_found then

raise_application_error(-20999,'Constraint DEPT01 is violated.');

end;

EM2 could be supported declaratively

Page 24: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Execution Model 2

• Inefficiencies:

– EMP01 and DEPT01 are checked when updating columns that are not involved, for instance:

• Updating EMP.ENAME

• Updating DEPT.LOC

Page 25: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Execution Model 3

• For inserts + deletes: EM3 == EM2

• For updates: only evaluate

assertions that involve *columns*

being changed

– Simple parse will find columns

– Assumes ‘clean specification’

• Create trigger syntax allows

specification of columns that are

being changed

Page 26: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Execution Model 3

create trigger EMP_AUS_EMP01

after update of DEPTNO,JOB on EMP

declare pl_dummy varchar(40);

Begin

--

select 'Constraint EMP01 is satisfied' into pl_dummy

from DUAL

where not exists(select ‘department in violation’

from (select distinct deptno from EMP) d

where exists(select e2.*

from EMP e2

where e2.DEPTNO = d.DEPTNO

and e2.JOB in ('PRESIDENT','MANAGER'))

and not exists(select e3.*

from EMP e3

where e3.DEPTNO = d.DEPTNO

and e3.JOB = 'ADMIN'));

--

exception when no_data_found then

raise_application_error(-20999,'Constraint EMP01 is violated.');

end;

Page 27: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Execution Model 3

create trigger DEPT_AIUDS_DEPT01

after update of MGR on DEPT

declare pl_dummy varchar(40);

begin

--

select ‘a' into pl_dummy

from DUAL

where not exists (select ‘x’ from (select d.MGR ,count(*) as cnt from DEPT d group by d.MGR) where cnt > 2));

--

exception when no_data_found then

raise_application_error(-20999,'Constraint DEPT01 is violated.');

end;

EM3 could be supported declaratively

Page 28: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Execution Model 3

• Inefficiencies:

– Sometimes inserts (e)or deletes can

never violate a constraint

– For Cannot_manage_more_than_2,

deleting a department does not require

re-validation

– For Admin_in_dept_with_vip, both

inserts and deletes do require re-

validation

Page 29: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Execution Model 4

• For updates: EM4 = EM3

• For deletes and inserts EM4 drops unnecessary delete (e)or insert table triggers

– Polarity of a table for a given constraint • Positive: inserts can violate

• Negative: deletes can violate

• Neutral: both can violate

• Undefined: table is not involved

Polarity of table for given constraint can be

computed via special parsing

Page 30: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Execution Model 4

• EM4 maintains:

– All EM3 triggers, except for one:

• Drops:

– Cannot_manage_more_than_2 delete

trigger

EM4 could be supported declaratively

Page 31: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Execution Model 4

• Inefficiencies: – Admin_in_dept_with_vip: eg. inserting

SALESMAN, or deleting TRAINER does not require re-validation

• Start involving literals mentioned in assertions

• If assertion does not have literals then next EM is same as EM4 – Cannot_manage_more_than_2 has no literals

Page 32: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Execution Model 5

• How do we see: – Salesman is inserted? Trainer deleted?

• Could parse the DML statement

– But does not always work due to absence of literals

• Make use of column values of affected rows – Requires “Transition Effect” (TE) of a DML statement

• Common concept (aka. “delta”-tables)

• Inserted_rows, Updated_rows, Deleted_rows

– Maintaining TE is straightforward (see book)

• EM5: Only check the assertion when a property holds in the TE

Page 33: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Execution Model 5

• TE property for

Admin_in_dept_with_vip:

1. Inserted_rows holds a president or a

manager

or,

2. Deleted_rows holds an admininstrator

3. Updated rows shows that ...

Page 34: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Execution Model 5

create trigger EMP_AIS_EMP01

after insert on EMP

declare pl_dummy varchar(40);

begin

-- If this returns no rows, then EMP01 cannot be violated.

select 'EMP01 must be validated' into pl_dummy

from DUAL

where exists

(select 'A president or manager has just been inserted'

from inserted_rows

where JOB in ('PRESIDENT','MANAGER'));

--

begin

--

<same trigger code as EM4>

--

end;

exception when no_data_found then

-- No need to validate EMP01.

null;

--

end;

Page 35: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Execution Model 5

create trigger EMP_ADS_EMP01

after delete on EMP

declare pl_dummy varchar(40);

begin

-- If this returns no rows, then EMP01 cannot be violated.

select 'EMP01 must be validated' into pl_dummy

from DUAL

where exists

(select 'An administrator has just been deleted'

from deleted_rows

where JOB = 'ADMIN');

--

begin

--

<same trigger code as EM4>

--

end;

exception when no_data_found then

-- No need to validate EMP01.

null;

--

end;

Page 36: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Execution Model 5

• Update TE-property for EMP01:

select 'EMP01 is in need of validation'

from DUAL

where exists

(select 'Some department just won a president/

manager or just lost an administrator'

from updated_rows

where (n_job in ('PRESIDENT','MANAGER') and

o_job not in ('PRESIDENT','MANAGER')

or (o_job='ADMIN' and n_job<>'ADMIN')

or (o_deptno<>n_deptno and

(o_job='ADMIN' or n_job in

('PRESIDENT','MANAGER')))

• Can be deduced from insert + delete properties

• EM5 fully declarative too? – Here it gets complex...

Page 37: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Execution Model 5

• Inefficiencies:

– Admin_in_dept_with_vip:

triggers validate all departments

• Unacceptable in real-world databases

• Only some require re-validation

– Cannot_manage_more_than_2:

triggers validate all department

managers

• Unacceptable in real-world databases

• Only some require re-validation

Page 38: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Execution Model 6

• On TE-property + optimized

validation query

– Use the TE-query to find:

• Which deptno-values require re-validation

• Which mgr-values require re-validation

– Then use these values in the assertion-

expression

Page 39: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

create trigger EMP_AIS_EMP01

after insert on EMP

declare pl_dummy varchar(40);

begin

--

for r in (select distinct deptno

from inserted_rows

where JOB in ('PRESIDENT','MANAGER'));

loop

begin

-- Note: this now uses r.deptno value from preceeding TE-query.

select 'Constraint EMP01 is satisfied' into pl_dummy

from DUAL

where not exists(select ‘department in violation’

from (select distinct deptno from EMP where deptno = r.deptno) d

where exists(select e2.*

from EMP e2

where e2.DEPTNO = d.DEPTNO

and e2.JOB in ('PRESIDENT','MANAGER'))

and not exists(select e3.*

from EMP e3

where e3.DEPTNO = d.DEPTNO

and e3.JOB = 'ADMIN'));

--

exception when no_data_found then

--

raise_application_error(-20999,

'Constraint EMP01 is violated for department '||to_char(r.deptno)||'.');

--

end;

end loop;

end;

Page 40: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Execution Model 6

• This requires:

– Detecting that the ASSERTION can be (re)written as a universal quantification

– Can sometimes be done in multiple ways

• Which to choose?

– EM6 fully declarative? • Complexity introduced in EM5 further

increases

• Remember: given any arbitrary complex assertion + dml-statement

Page 41: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Still not there yet...

• Then there is something else too...

– Which is often overseen by database

professionals

– And, which is neglected in every

research paper (I’ve read...) that deals

with generating constraint validation

code

Serializability

Page 42: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Concurrent Transactions

• Deptno 13 has two admins and one

manager

– TX1 deletes an admin from 13

• Does not yet commit

– TX2 deletes the other admin 13

• Commits

– TX1 commits

• Constraint is violated for deptno 13!

– TX1 and TX2 must be serialized

Page 43: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Concurrent Transactions

• Note: this is *not* about locking rows of data, but rather: locking a constraint No two TX’s can validate at same time

• We can use DBMS_LOCK to serialize these transactions – See book for example code

• Again complexity further increases

Page 44: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Concurrent Transactions

• Concurrency impact of acquiring rule locks

– EM1: One TX at a time

– EM2: One TX per table at a time

– EM3, EM4, EM5 slowly relaxes • Up to EM5: not acceptable

– EM6: Only if two TX’s actually validate *and* involve same deptno (EMP01 assertion)

• Acceptable

Page 45: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Another complicating

factor

• Deferrabilty...

– Involves temporarily storing violation

cases for re-evaluation at commit time

– More comments on that in chapter 11

of the book

Page 46: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

The Polite Excuse

• Inefficient EM’s

– Could be supported, but: • Are unacceptable wgt. Performance &

transaction concurrency

• Efficient EM’s

– Aligned with business requirements • Performance and TX concurrency

But,

– Need more research to determine if they could be declaratively supported

Page 47: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Questions?

• For slidedeck email me at [email protected]

• Twitter @ToonKoppelaars