relation decomposition a, a, … a 12n given a relation r with attributes create two relations r1...

28
Relation Decomposition A , A , … A 1 2 n en a relation R with attributes e two relations R1 and R2 with attributes B , B , … B 1 2 m C , C , … C 1 2 l Such that: B , B , … B 1 2 m C , C , … C 1 2 l A , A , … A 1 2 n e always decompose a relation with no dire conseque we see a BCNF violation, how do we decompose?

Upload: bathsheba-wade

Post on 23-Dec-2015

215 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Relation Decomposition A, A, … A 12n Given a relation R with attributes Create two relations R1 and R2 with attributes B, B, … B 12m C, C, … C 12l Such

Relation DecompositionA , A , … A 1 2 n

Given a relation R with attributes

Create two relations R1 and R2 with attributes

B , B , … B 1 2 m C , C , … C 1 2 l

Such that:B , B , … B 1 2 m C , C , … C 1 2 l

A , A , … A 1 2 n

Can we always decompose a relation with no dire consequences?

When we see a BCNF violation, how do we decompose?

Page 2: Relation Decomposition A, A, … A 12n Given a relation R with attributes Create two relations R1 and R2 with attributes B, B, … B 12m C, C, … C 12l Such

Wild DecompositionsName Address Move-Date

Name Address

Name Move-Date

What’s wrong?

Page 3: Relation Decomposition A, A, … A 12n Given a relation R with attributes Create two relations R1 and R2 with attributes B, B, … B 12m C, C, … C 12l Such

Not This Example, Again...

SSN Name Phone Number

123-321-99 Fred (201) 555-1234

123-321-99 Fred (206) 572-4312 909-438-44 Joe (908) 464-0028 909-438-44 Joe (212) 555-4000

Page 4: Relation Decomposition A, A, … A 12n Given a relation R with attributes Create two relations R1 and R2 with attributes B, B, … B 12m C, C, … C 12l Such

More Careful Strategy

Given a dependency that violates the BCNF condition:

A , A , … A 1 2 n B , B , … B 1 2 m

A’sOthers B’s

R1 R2

Decompose the relation as follows:

Continue untilthere are noBCNF violationsleft.

Find a 2-attribute relation that isnot in BCNF.

Page 5: Relation Decomposition A, A, … A 12n Given a relation R with attributes Create two relations R1 and R2 with attributes B, B, … B 12m C, C, … C 12l Such

Example Decomposition

Name Social-security-number Age Eye Color Phone Number

Functional dependencies:

Name + Social-security-number Age, Eye Color

What if we also had an attribute Draft-worthy, and the FD: Age Draft-worthy

Page 6: Relation Decomposition A, A, … A 12n Given a relation R with attributes Create two relations R1 and R2 with attributes B, B, … B 12m C, C, … C 12l Such

A ProblemUnit Company Product

Unit Company

Unit Product

FD’s: unit -> company; company, product -> unitSo, there is a BCNF violation, and we decompose.

Page 7: Relation Decomposition A, A, … A 12n Given a relation R with attributes Create two relations R1 and R2 with attributes B, B, … B 12m C, C, … C 12l Such

So What’s the Problem?

Unit Company Product

Unit Company Unit Product

Galaga99 UW Galaga99 databasesNullValue UW NullValue databases

No problem so far. All local FD’s are satisfied.Let’s put all the data back into a single table again:

Galaga99 UW databasesNullValue UW databases

Violates the dependency: company, product -> unit!

Page 8: Relation Decomposition A, A, … A 12n Given a relation R with attributes Create two relations R1 and R2 with attributes B, B, … B 12m C, C, … C 12l Such

Solution: 3rd Normal Form

A simple condition for removing anomalies from relations:

A relation R is in 3rd normal form if and only if:

Whenever there is a nontrivial dependency for R , it is the case that { } a super-key for R, or B is part of a key.

A , A , … A 1 2 n

BA , A , … A 1 2 n

What happened to first and second normal forms?

Will we have more normal forms?

Page 9: Relation Decomposition A, A, … A 12n Given a relation R with attributes Create two relations R1 and R2 with attributes B, B, … B 12m C, C, … C 12l Such

Multi-valued Dependencies

Name SSN Phone Number Course

Fred 123-321-99 (206) 572-4312 CSE-444Fred 123-321-99 (206) 572-4312 CSE-341Fred 123-321-99 (206) 432-8954 CSE-444Fred 123-321-99 (206) 432-8954 CSE-341

The multi-valued dependencies are:

Name, SSN Phone Number Name, SSN Course

4th Normal form: replace FD by MVD.

Page 10: Relation Decomposition A, A, … A 12n Given a relation R with attributes Create two relations R1 and R2 with attributes B, B, … B 12m C, C, … C 12l Such

Summary of This Section

• Functional dependencies: basic definition.

• Properties of functional dependencies.

• Entailment and equivalence of FD’s:– Compare what FD’s say about sets of databases.

• Algorithm for finding the closure of a set of attributes A:– Start with A, and add attributes implied by the

FDs.

Page 11: Relation Decomposition A, A, … A 12n Given a relation R with attributes Create two relations R1 and R2 with attributes B, B, … B 12m C, C, … C 12l Such

Summary Continued• Algorithm for entailment: Does the set of FDs S entail

the dependency A -> B?– Compute the closure of A. If it includes the attribute B,

return Yes. Otherwise, return No.

• Algorithm for entailment of a set of FDs: does S entail S1?– Apply the previous algorithm for each FD in S1. Return

Yes only if it’s Yes for all.

• Algorithm for equivalence: apply entailment in both directions.

Page 12: Relation Decomposition A, A, … A 12n Given a relation R with attributes Create two relations R1 and R2 with attributes B, B, … B 12m C, C, … C 12l Such

More Summary

• Decompose a relation by identifying BCNF violations.• You can weaken BCNF by considering 3rd normal

form.• If multi-valued dependencies are present, consider 4th

normal form.• Given a relational schema and a set of dependencies,

we now know how to put it into normalized form.• Do we always want to do that??

Page 13: Relation Decomposition A, A, … A 12n Given a relation R with attributes Create two relations R1 and R2 with attributes B, B, … B 12m C, C, … C 12l Such

Summary Beyond This Section

• We built a conceptual model (in E/R or ODL):– We chose entities, relationships– We articulated the constraints on the domain.

• We translated it into a relational schema in the proper way.

• What do we do with the data now that it’s in tables? – How do we query it?– How do we combine data from multiple tables?

Page 14: Relation Decomposition A, A, … A 12n Given a relation R with attributes Create two relations R1 and R2 with attributes B, B, … B 12m C, C, … C 12l Such

Querying the Database• How do we specify what we want from our database?

Find all the employees who earn more than $50,000 and pay taxes in New Jersey.

• We design high-level query languages:– SQL (used everywhere)– Datalog (used by database theoreticians, their students,

friends and family)

• Relational algebra: a basic set of operations on relations that provide the basic principles.

Page 15: Relation Decomposition A, A, … A 12n Given a relation R with attributes Create two relations R1 and R2 with attributes B, B, … B 12m C, C, … C 12l Such

Relational Algebra at a Glance• Operators: sets as input, new set as output

• Basic Set Operators– union, intersection, difference, but no complement.

• Selection:• Projection:

• Cartesian Product: X• Joins (natural,equi-join, theta join, semi-join)

• Renaming:

Page 16: Relation Decomposition A, A, … A 12n Given a relation R with attributes Create two relations R1 and R2 with attributes B, B, … B 12m C, C, … C 12l Such

Set Operations• Binary operations

– Result is table(set) with same attributes

• Watch out for naming of attributes in resulting relation.

• Union: all tuples in R1 or R2• Intersection: all tuples in R1 and R2• Difference: all tuples in R1 and not in R2• No complement. Why?• Bags later.

Page 17: Relation Decomposition A, A, … A 12n Given a relation R with attributes Create two relations R1 and R2 with attributes B, B, … B 12m C, C, … C 12l Such

Selection

• Produce a subset of the tuples in a relation which satisfy a given condition

• Unary operation… returns set with same attributes, but ‘selects’ rows

• Use and, or, not, >, <… to build condition

• Find all employees with salary more than $40,000:

Page 18: Relation Decomposition A, A, … A 12n Given a relation R with attributes Create two relations R1 and R2 with attributes B, B, … B 12m C, C, … C 12l Such

Selection Example

EmployeeSSN Name DepartmentID Salary999999999 John 1 30,000777777777 Tony 1 32,000888888888 Alice 2 45,000

SSN Name DepartmentID Salary888888888 Alice 2 45,000

Find all employees with salary more than $40,000.

Page 19: Relation Decomposition A, A, … A 12n Given a relation R with attributes Create two relations R1 and R2 with attributes B, B, … B 12m C, C, … C 12l Such

Projection

• Unary operation, selects columns

• Eliminates duplicate tuples

• Example: project social-security number and names.

Page 20: Relation Decomposition A, A, … A 12n Given a relation R with attributes Create two relations R1 and R2 with attributes B, B, … B 12m C, C, … C 12l Such

Projection Example

EmployeeSSN Name DepartmentID Salary999999999 John 1 30,000777777777 Tony 1 32,000888888888 Alice 2 45,000

SSN Name999999999 John777777777 Tony888888888 Alice

Page 21: Relation Decomposition A, A, … A 12n Given a relation R with attributes Create two relations R1 and R2 with attributes B, B, … B 12m C, C, … C 12l Such

Cartesian Product

• Binary Operation

• Result is tuples combining any element of R1 with any element of R2, for R1 X R2

• Schema is union of Schema(R1) & Schema(R2)

Page 22: Relation Decomposition A, A, … A 12n Given a relation R with attributes Create two relations R1 and R2 with attributes B, B, … B 12m C, C, … C 12l Such

Cartesion Product Example

EmployeeName SSNJohn 999999999Tony 777777777

DependentsEmployeeSSN Dname999999999 Emily777777777 Joe

Employee_DependentsName SSN EmployeeSSN DnameJohn 999999999 999999999 EmilyJohn 999999999 777777777 JoeTony 777777777 999999999 EmilyTony 777777777 777777777 Joe

Page 23: Relation Decomposition A, A, … A 12n Given a relation R with attributes Create two relations R1 and R2 with attributes B, B, … B 12m C, C, … C 12l Such

Join (Natural)• Most important, expensive and exciting.

• Combines two relations, selecting only related tuples

• Equivalent to a cross product followed by selection

• Resulting schema has all attributes of the two relations, but one copy of join condition attributes

Page 24: Relation Decomposition A, A, … A 12n Given a relation R with attributes Create two relations R1 and R2 with attributes B, B, … B 12m C, C, … C 12l Such

Join Example

EmployeeName SSNJohn 999999999Tony 777777777

DependentsEmployeeSSN Dname999999999 Emily777777777 Joe

Employee_DependentsName SSN DnameJohn 999999999 EmilyTony 777777777 Joe

Page 25: Relation Decomposition A, A, … A 12n Given a relation R with attributes Create two relations R1 and R2 with attributes B, B, … B 12m C, C, … C 12l Such

Other Joins and Renaming

• Theta join: the join involves a predicate– R S

• Semi-join: the attributes of one relation are included in the other.

• Renaming:

Page 26: Relation Decomposition A, A, … A 12n Given a relation R with attributes Create two relations R1 and R2 with attributes B, B, … B 12m C, C, … C 12l Such

Complex Queries

Product ( name, price, category, maker)Purchase (buyer, seller, store, product)Company (name, stock price, country)Person( name, phone number, city)

Find phone numbers of people who bought gizmos from Fred.

Find telephony products that somebody bought

Page 27: Relation Decomposition A, A, … A 12n Given a relation R with attributes Create two relations R1 and R2 with attributes B, B, … B 12m C, C, … C 12l Such

Exercises

Product ( name, price, category, maker)Purchase (buyer, seller, store, product)Company (name, stock price, country)Person( name, phone number, city)

Ex #1: Find people who bought telephony products.Ex #2: Find names of people who bought American productsEx #3: Find names of people who bought American products and did not buy French productsEx #4: Find names of people who bought American products and they live in Seattle.Ex #5: Find people who bought stuff from Joe or bought products from a company whose stock prices is more than $50.

Page 28: Relation Decomposition A, A, … A 12n Given a relation R with attributes Create two relations R1 and R2 with attributes B, B, … B 12m C, C, … C 12l Such

Operations on Bags (and why we care)

Basic operations:

Projection Selection Union Intersection Set difference Cartesian product

Join (natural join, theta join)