er normalization

54
© © 2008 2008 Prof. Dr. Rainer Manthey Prof. Dr. Rainer Manthey Foundations of IM Foundations of IM 1 1 Database Design Database Design Principles Principles of Database Design of Database Design Chapter Chapter 4 4 Foundations of Information Foundations of Information Management Management (WS 200 (WS 200 8 8 /0 /0 9 9 ) ) ? ?

Upload: shanu

Post on 14-Oct-2014

28 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: ER Normalization

©© 2008 2008 Prof. Dr. Rainer MantheyProf. Dr. Rainer Manthey Foundations of IMFoundations of IM 11

Database DesignDatabase Design

Principles of Database DesignPrinciplesPrinciples of Database Designof Database Design

–– Chapter Chapter 44 ––

Foundations of Information Management (WS 2008/09)Foundations of Information Foundations of Information ManagementManagement (WS 200(WS 20088/0/099))

??

Page 2: ER Normalization

©© 2008 2008 Prof. Dr. Rainer MantheyProf. Dr. Rainer Manthey Foundations of IMFoundations of IM 22

Design of DB Design of DB applicationsapplications

Requirements analysisRequirementsRequirements analysisanalysis

DesignDesignDesign

ImplementationImplementationImplementation

UsageUsage

Database Database developmentdevelopmentRelative Relative costscosts forfor compensatingcompensatingthethe consequencesconsequences of of errorserrors::

111

101010

100100100

>> 100>>>> 100100

Topic of Topic of thisthischapterchapter

Page 3: ER Normalization

©© 2008 2008 Prof. Dr. Rainer MantheyProf. Dr. Rainer Manthey Foundations of IMFoundations of IM 33

DBDB--Design: Design: PhasesPhases

ApplicationApplication areaareain in thethe "real "real worldworld""

Design Design documentationdocumentation

PhysicalPhysical structurestructurein in storagestorage

LogicalLogical levellevel

PhysicalPhysical levellevel

ConceptualConceptualdesigndesign

PhysicalPhysicaldesigndesign

ConceptualConceptual levellevel

LogicalLogical structurestructure„„on on paperpaper""

LogicalLogicaldesigndesign

Page 4: ER Normalization

©© 2008 2008 Prof. Dr. Rainer MantheyProf. Dr. Rainer Manthey Foundations of IMFoundations of IM 44

ConceptualConceptual designdesign

ApplicationApplication areaareain in thethe "real "real worldworld""

Design Design documentationdocumentation

PhysicalPhysical structurestructurein in storagestorage

LogicalLogical levellevel

PhysicalPhysical levellevel

ConceptualConceptualdesigndesign

PhysicalPhysicaldesigndesign

ConceptualConceptual levellevel

LogicalLogical structurestructure„„on on paperpaper""

LogicalLogicaldesigndesign

Formally define what you knowabout „the world“ without thinkingof computers and databases (yet)!

FormallyFormally definedefine whatwhat youyou knowknowaboutabout „„thethe worldworld““ withoutwithout thinkingthinkingof of computerscomputers and and databasesdatabases ((yetyet)!)!

Page 5: ER Normalization

©© 2008 2008 Prof. Dr. Rainer MantheyProf. Dr. Rainer Manthey Foundations of IMFoundations of IM 55

TheThe EntityEntity--RelationshipRelationship modelmodel

•• EntityEntity--RelationshipRelationship datadata modelmodel (ER (ER modelmodel):):•• ProposedProposed in 1976 in a in 1976 in a paperpaper byby Peter ChenPeter Chen•• GraphicalGraphical notationnotation forfor applicationapplication modelingmodeling (ER (ER diagramsdiagrams))•• Independent Independent semanticsemantic datadata modelmodel

((aimingaiming at at thethe meaningmeaning of of conceptsconcepts in real in real worldworld))•• PredecessorPredecessor of of todaytoday‘‘ss objectobject--orientedoriented datadata modelsmodels•• ExtremelyExtremely successfulsuccessful as a as a meansmeans of of „„prepre--designdesign““ of relational of relational DBsDBs

•• TheThe ER ER modelmodel offersoffers fewfew veryvery simple and simple and basicbasic conceptsconcepts::

•• EntitiesEntities ((objectsobjects), ), characterizedcharacterized byby attributesattributes ((propertiesproperties))•• BinaryBinary oror nn--aryary relationshipsrelationships betweenbetween entitiesentities,,

possiblypossibly characterizedcharacterized byby attributesattributes as wellas well•• OftenOften notnot mentionedmentioned explicitlyexplicitly, , butbut importantimportant and and basicbasic::

ValuesValues:: printableprintable symbolssymbols as as valuesvalues of of attributesattributes;;play a play a subordinatesubordinate rolerole ((characterizingcharacterizing objectsobjects))

•• RolesRoles: : NamesNames forfor thethe specialspecial meaningmeaning an an entityentity has has withinwithin aarelationshiprelationship

Page 6: ER Normalization

©© 2008 2008 Prof. Dr. Rainer MantheyProf. Dr. Rainer Manthey Foundations of IMFoundations of IM 66

An An entityentity

"Rainer""Rainer" "Manthey""Manthey" 5353 "Informatik""Informatik"765765

firstNamefirstNamepersNrpersNrnamename ageage topictopic

"C3""C3"

rankrank

an entityan an entityentity

its attributesitsits attributesattributes

their valuestheirtheir valuesvalues

EachEach entityentity isis completelycompletely characterizedcharacterizedbyby thethe valuesvalues of of allall itsits attributesattributes. .

Page 7: ER Normalization

©© 2008 2008 Prof. Dr. Rainer MantheyProf. Dr. Rainer Manthey Foundations of IMFoundations of IM 77

TwoTwo similarsimilar entitiesentities

another entityanotheranother entityentity

identical attributesidenticalidentical attributesattributes

"Rainer""Rainer" "Manthey""Manthey" 5353 "Informatik""Informatik"765765

firstNamefirstNamepersNrpersNrnamename ageage topictopic

"C3""C3"

rankrank

partially different valuespartiallypartially different different valuesvalues

„„Armin B."Armin B." „„Cremers"Cremers" 6060 "Informatik""Informatik"????

firstNamefirstNamepersNrpersNrnamename ageage topictopic

"C4""C4"

rankrank

Page 8: ER Normalization

©© 2008 2008 Prof. Dr. Rainer MantheyProf. Dr. Rainer Manthey Foundations of IMFoundations of IM 88

EntityEntity typestypes

•• SimilarSimilar entitiesentities cancan bebe combinedcombined intointo entityentity typestypes..

•• „„SimilaritySimilarity" " requiresrequires at least at least identicalidentical attributeattribute structurestructure..(Attribute (Attribute namesnames and and correspondingcorresponding valuevalue domainsdomains areare identicalidentical.).)

•• EntityEntity typestypes areare graphicallygraphically representedrepresented byby rectanglesrectangles. Attributes . Attributes labellabel thetheline line connectingconnecting an an entityentity typetype and a and a valuevalue domaindomain ((oftenoften symbolizedsymbolizedbyby an an ovaloval):):

professorprofessor

StringString IntIntIntInt

firstNamefirstNamepersNrpersNrnamename ageage topictopic

StringString

rankrank

StringString StringString

NameName of of thethe typetype

Page 9: ER Normalization

©© 2008 2008 Prof. Dr. Rainer MantheyProf. Dr. Rainer Manthey Foundations of IMFoundations of IM 99

EntityEntity typestypes: Alternative : Alternative representationsrepresentations

•• Domain Domain namesnames areare oftenoften omittedomitted (in (in casecase theythey areare obviousobvious oror irrelevant).irrelevant).InsteadInstead attributesattributes areare placedplaced insideinside ovalsovals: :

professorprofessor

firstNamefirstNamepersNrpersNr namename ageage topictopic rankrank

•• In larger In larger diagramsdiagrams, , thethe attributeattribute structurestructure isis oftenoften entirelyentirely omittedomitted in order toin order tosave save spacespace ((oror isis writtenwritten down in down in abbreviatedabbreviated form form onlyonly): ):

professorprofessor

persNrpersNr, , firstNamefirstName, , namename, . . ., . . .

Page 10: ER Normalization

©© 2008 2008 Prof. Dr. Rainer MantheyProf. Dr. Rainer Manthey Foundations of IMFoundations of IM 1010

InstancesInstances and and populationpopulation of an of an entityentity typetype

•• EachEach entityentity „„belongsbelongs to" at least to" at least oneone entityentity typetype: : ItIt isis calledcalled an an instanceinstance of of thisthis typetype..

professorprofessoran an instanceinstance ofof

•• TheThe setset of of allall currentcurrent instancesinstances of an of an entityentity typetype isis callescalles itsits currentcurrent populationpopulation..

professorprofessora a populationpopulation ofof

Page 11: ER Normalization

©© 2008 2008 Prof. Dr. Rainer MantheyProf. Dr. Rainer Manthey Foundations of IMFoundations of IM 1111

Multiple Multiple classificationclassification

•• One and One and thethe samesame entityentity cancan bebe an an instanceinstance of of variousvarious entityentity typestypes..

professorprofessorinstanceinstance ofof

fanfan ofofWerder BremenWerder Bremen

instanceinstance ofof

•• In such a In such a casecase, , thethe attributesattributes of of thethe different different typestypes of of thisthis entityentity maymay well well bebequitequite differentdifferent..

Page 12: ER Normalization

©© 2008 2008 Prof. Dr. Rainer MantheyProf. Dr. Rainer Manthey Foundations of IMFoundations of IM 1212

ClassificationClassification in in presencepresence of of identicalidentical attributesattributes

•• EntitiesEntities withwith thethe samesame attributesattributes do do notnot at allat all havehave to to bebe instancesinstances of of thethe samesametypetype!!

AssAssinstanceinstance ofof

Fan of Fan of footballfootball clubclubWerder BremenWerder Bremeninstanceinstance ofof

•• Almost Almost alwaysalways additional additional classificationclassification criteriacriteria areare requiredrequired, , usuallyusually notnotderivablederivable fromfrom thethe attributeattribute structurestructure..

namename, , firstNamefirstName, age, age

namename, , firstNamefirstName, age, age

Page 13: ER Normalization

©© 2008 2008 Prof. Dr. Rainer MantheyProf. Dr. Rainer Manthey Foundations of IMFoundations of IM 1313

Key Key attributesattributes

•• As in As in thethe relational relational modelmodel (Access, SQL), (Access, SQL), therethere areare usuallyusually oneone oror moremore attributesattributesper per entityentity typetype thethe valuesvalues of of whichwhich areare sufficientsufficient forfor uniquelyuniquely identifyingidentifying eacheachinstanceinstance::

professorprofessor

firstNamefirstNamepersNrpersNr namename ageage topictopic rankrank

Key attributesKey Key attributesattributes

•• ((PrimaryPrimary) ) keykey attributesattributes areare usuallyusually underlinedunderlined in an in an ERER--diagramdiagram..•• KeysKeys oughtought to to bebe ""minimalminimal" (no " (no attributeattribute cancan bebe omittedomitted).).•• A A distinctiondistinction betweenbetween primaryprimary keykey and and otherother candidatecandidate keyskeys isis notnot mademade in in thethe

ERER--modelmodel, , eveneven thoughthough itit wouldwould bebe usefuluseful to do so.to do so.

Page 14: ER Normalization

©© 2008 2008 Prof. Dr. Rainer MantheyProf. Dr. Rainer Manthey Foundations of IMFoundations of IM 1414

RelationshipsRelationships

19921992

•• 22ndnd mainmain conceptconcept of of thethe ERER--modelmodel: : elementaryelementary relationshipsrelationships betweenbetween twotwooror moremore entitiesentities ((possiblypossibly withwith theirtheir ownown attributeattribute valuesvalues))

yearyear of of assignmentassignment

UniversitUniversitäätt BonnBonn765765

PersNrPersNr namename

RelationshipRelationshipRelationship

•• EachEach relationshiprelationship isis uniquelyuniquely characterizedcharacterized byby thethe keykey valuesvalues of of thethe participartici--patingpating entitiesentities and and byby thethe valuesvalues of all of all relationshiprelationship attributesattributes. .

•• HoweverHowever, , therethere areare no separate no separate keykey attributesattributes forfor relationshipsrelationships, as , as thethe keyskeys of of thethe participatingparticipating entitiesentities alwaysalways sufficesuffice forfor uniqueunique identificationidentification. .

Page 15: ER Normalization

©© 2008 2008 Prof. Dr. Rainer MantheyProf. Dr. Rainer Manthey Foundations of IMFoundations of IM 1515

Multiple Multiple relationshipsrelationships

19921992

19991999

EntititesEntitites maymay participateparticipate in in variousvarious relationshipsrelationships (also (also similarsimilar onesones).).

yearyear of of assignmentassignment

yearyear of of assignmentassignment

Page 16: ER Normalization

©© 2008 2008 Prof. Dr. Rainer MantheyProf. Dr. Rainer Manthey Foundations of IMFoundations of IM 1616

RelationshipRelationship typestypes

•• SimilarSimilar relationshipsrelationships maymay bebe groupedgrouped intointo relationshiprelationship typestypes..

•• „„SimilaritySimilarity" " requiresrequires at least at least identicalidentical attributeattribute structurestructure andand identicalidentical typestypes(and (and numbernumber) ) of of participatingparticipating entitiesentities..

•• RelationshipRelationship typestypes areare graphicallygraphically denoteddenoted in in thethe ERER--modelmodel byby a a diamonddiamond. . Attributes Attributes areare writtenwritten as as forfor entityentity typestypes..

professorprofessor

persNrpersNr, . . ., . . .

universityuniversity

namename, . . ., . . .

IntInt

assigned_toassigned_to

yearyear of of assignmentassignment

Page 17: ER Normalization

©© 2008 2008 Prof. Dr. Rainer MantheyProf. Dr. Rainer Manthey Foundations of IMFoundations of IM 1717

InstancesInstances of a of a relationshiprelationship typetype

19921992UniversitUniversitäätt BonnBonn765765

persNrpersNr namename

professorprofessor

persNrpersNr, . . ., . . .

universityuniversity

namename, . . ., . . .

IntInt

assigned_toassigned_to

. . .. . . . . .. . .

assigned_toassigned_to

RelationshipRelationship typestypes havehave instancesinstances, , tootoo: : IndividualIndividual relationshipsrelationships betweenbetween individualindividualentitiesentities areare analogouslyanalogously consideredconsidered instancesinstances of of thethe correspondingcorresponding RR--typetype..

yearyear of of assignmentassignment

yearyear of of assignmentassignment

Page 18: ER Normalization

©© 2008 2008 Prof. Dr. Rainer MantheyProf. Dr. Rainer Manthey Foundations of IMFoundations of IM 1818

Type vs. Type vs. instanceinstance

19921992UniversitUniversitäätt BonnBonn765765

persNrpersNr namename

professorprofessor

persNrpersNr, . . ., . . .

universityuniversity

namename, . . ., . . .

IntInt

assigned_toassigned_to

. . .. . . . . .. . .

assigned_toassigned_to

TheThe distinctiondistinction betweenbetween typestypes and and instancesinstances isis difficultdifficult eveneven forfor specialistsspecialists::TryTry to to bebe preciseprecise fromfrom thethe veryvery beginningbeginning!!

entity typeentityentity typetype relationship typerelationshiprelationship typetype

entityentityentity relationshiprelationshiprelationship

yearyear of of assignmentassignment

yearyear of of assignmentassignment

Page 19: ER Normalization

©© 2008 2008 Prof. Dr. Rainer MantheyProf. Dr. Rainer Manthey Foundations of IMFoundations of IM 1919

RolesRoles

•• One and One and thethe samesame entityentity typetype maymay participateparticipate in a in a relationshiprelationship typetype moremore thanthanonceonce, e.g.:, e.g.:

personperson

descendent_ofdescendent_ofmothermother

daughterdaughter

•• For a (For a (syntacticalsyntactical) ) distinctiondistinction betweenbetween thethe different different „„formsforms of of participationparticipation““,,specialspecial designatorsdesignators areare usedused, , calledcalled rolesroles. . RolesRoles areare usedused as as labelslabels of of thethe lineslinesconnectingconnecting thethe resp. resp. entityentity and and relationshiprelationship typetype..

Page 20: ER Normalization

©© 2008 2008 Prof. Dr. Rainer MantheyProf. Dr. Rainer Manthey Foundations of IMFoundations of IM 2020

ExtendedExtended GeoGeo--DB: DB: ERER--diagramdiagram

citycity countrycountry

riverriver regionregionrelationshiprelationship typetype

rolerole

city_in_countrycity_in_countrystadt_an_flussstadt_an_fluss regionregion__

in_countryin_countryriverriver__throughthrough__countrycountry

neighbour_countryneighbour_country

source_ofsource_of

sourcesource riverriver

countrycountry 11

countrycountry 22

capital_of_countrycapital_of_countrycapitalcapital

entityentity typetype

Page 21: ER Normalization

©© 2008 2008 Prof. Dr. Rainer MantheyProf. Dr. Rainer Manthey Foundations of IMFoundations of IM 2121

RelationshipRelationship typestypes and and mathematicalmathematical relationsrelations

On On thethe levellevel of of instancesinstances,, a a relationrelation overover thethe populationspopulations of of thethe entityentity typestypes involvedinvolvedisis associatedassociated withwith eacheach relationshiprelationship typetype..

EE11 EE22RR

entitiesentities notnotinvolvedinvolved in Rin R

Page 22: ER Normalization

©© 2008 2008 Prof. Dr. Rainer MantheyProf. Dr. Rainer Manthey Foundations of IMFoundations of IM 2222

Functional Functional relationshipsrelationships

citycity countrycountry

city_in_countrycity_in_country

In In manymany casescases, , at at mostmost oneone entityentity on on oneone „„sideside" of a " of a relationshiprelationship typetype maymay bebeassociatedassociated withwith a a particularparticular entityentity on on thethe otherother „„sideside““ of of thethe relationshiprelationship, e.g.: , e.g.:

• Each city is in exactly one country.• In each country, however, arbitrarily many cities may be situated.

•• EachEach citycity isis in in exactlyexactly oneone countrycountry..•• In In eacheach countrycountry, , howeverhowever, , arbitrarilyarbitrarily manymany citiescities maymay bebe situatedsituated..

Page 23: ER Normalization

©© 2008 2008 Prof. Dr. Rainer MantheyProf. Dr. Rainer Manthey Foundations of IMFoundations of IM 2323

Functional Functional relationshipsrelationships (2)(2)

•• MathematicallyMathematically, , city_in_countrycity_in_country isis a a functionfunction::

city_in_country: city → countrycity_in_countrycity_in_country:: citycity →→ countrycountry

•• MoreMore exactlyexactly: . . . a : . . . a functionfunction fromfrom thethe populationpopulation of of citycity intointo thethepopulationpopulation of of countrycountry. .

populationpopulation(city(city)) populationpopulation(country(country))

Page 24: ER Normalization

©© 2008 2008 Prof. Dr. Rainer MantheyProf. Dr. Rainer Manthey Foundations of IMFoundations of IM 2424

Functional Functional relationshipsrelationships (3)(3)

citycity countrycountry

city_in_countrycity_in_country

•• In In thethe ERER--modelmodel, such , such restrictionsrestrictions of of thethe admissibleadmissible combinationscombinations cancan bebeexpressedexpressed byby meansmeans of of soso--calledcalled functionalitiesfunctionalities, , annotationsannotations attachedattached to to thetheedgesedges connectingconnecting entityentity and and relationshiprelationship typestypes. .

•• ThereThere areare fourfour different different kindskinds of of binarybinary relationshipsrelationships expressibleexpressible byby meansmeans ofoffunctionalitiesfunctionalities::

•• In In thisthis contextcontext, N resp. M , N resp. M standsstands forfor arbitraryarbitrary integer integer valuesvalues ≥≥ 0. 0.

•• In In thethe ‚‚city_in_country'city_in_country'--exampleexample, an , an N:1N:1--relationship relationship isis appropriateappropriate: : ThereThere isisexactlyexactly oneone countrycountry per per citycity, , butbut arbitrarilyarbitrarily manymany cititescitites per per countrycountry (N (N ≥≥ 0).0).

NN 11

1 : 1 1 : N N : 1 N : M1 : 11 : 1 1 : N N : 1 N : M1 : N N : 1 N : M

Page 25: ER Normalization

©© 2008 2008 Prof. Dr. Rainer MantheyProf. Dr. Rainer Manthey Foundations of IMFoundations of IM 2525

Functional Functional relationshipsrelationships (4)(4)

•• FunctionalitiesFunctionalities of of typetype 1 : 1, 1 : N 1 : 1, 1 : N oror N : 1 N : 1 definedefine partial partial functionsfunctions wherewhere somesomeof of thethe instancesinstances of of thethe typestypes involvedinvolved possiblypossibly areare notnot relatedrelated at all.at all.

citycity countrycountry

city_in_countrycity_in_country

NN 11

City City withoutwithout countrycountry!!

•• In In thethe „„normalnormal““ ERER--modelmodel, total , total functionsfunctions cannotcannot bebe distinguisheddistinguished fromfrom partialpartialonesones –– in in extensionsextensions of of thethe modelmodel therethere areare additional additional graphicalgraphical meansmeans forfor exex--plicitlyplicitly statingstating whetherwhether a a functionfunction isis partial partial oror total.total.

Page 26: ER Normalization

©© 2008 2008 Prof. Dr. Rainer MantheyProf. Dr. Rainer Manthey Foundations of IMFoundations of IM 2626

Functional Functional relationshipsrelationships (5)(5)

•• In a In a 1 : 11 : 1--relationshiprelationship eacheach instanceinstance of of oneone of of thethe entityentity typestypes involvedinvolved isis relatedrelatedto to nonenone oror exactlyexactly oneone of of thethe instancesinstances of of thethe otherother entityentity typetype. .

•• An An N : N : MM--relationshiprelationship cancan bebe consideredconsidered thethe „„normal normal casecase" " withoutwithout restrictionsrestrictionson on thethe numbernumber of of participatingparticipating entitiesentities..

•• IfIf no no functionalitiesfunctionalities havehave beenbeen statedstated forfor a a relationshiprelationship typetype, , thenthen an an implicitimplicitN : M N : M functionalityfunctionality isis assumedassumed..

•• FunctionalitiesFunctionalities cancan bebe defineddefined forfor relationshipsrelationships withwith moremore thanthan twotwo entitiesentities inin--volvedvolved, , tootoo::

EE11

EE22

EE33

RRNNMM

11

R: E1 × E2 → E3R: ER: E11 ×× EE22 →→ EE33

Page 27: ER Normalization

©© 2008 2008 Prof. Dr. Rainer MantheyProf. Dr. Rainer Manthey Foundations of IMFoundations of IM 2727

Functional Functional relationshipsrelationships (6)(6)

EE11

EE22

EE33

RRNN11

11

R(1): E1 × E2 → E3

R(2): E1 × E3 → E2

RR(1)(1): E: E11 ×× EE22 →→ EE33

RR(2)(2): E: E11 ×× EE33 →→ EE22

•• IfIf in an in an nn--aryary relationshiprelationship severalseveral edgesedges areare markedmarked byby '1','1',thenthen thethe resp. resp. relationshiprelationship typetype representsrepresents severalseveral partial partial functionsfunctions::

•• . . . and . . . and analogouslyanalogously::

EE11

EE22

EE33

RR1111

11

R(1): E1 × E2 → E3

R(2): E1 × E3 → E2

R(3): E2 × E3 → E1

RR(1)(1): E: E11 ×× EE22 →→ EE33

RR(2)(2): E: E11 ×× EE33 →→ EE22

RR(3)(3): E: E22 ×× EE33 →→ EE11

Page 28: ER Normalization

©© 2008 2008 Prof. Dr. Rainer MantheyProf. Dr. Rainer Manthey Foundations of IMFoundations of IM 2828

ExtendedExtended ER ER modelmodel: : GeneralizationGeneralization

•• TheThe conceptsconcepts introducedintroduced so far so far havehave beenbeen containedcontained in Chenin Chen‘‘s original s original proposalproposalthroughoutthroughout. . SinceSince thenthen, , howeverhowever, , variousvarious extensionsextensions havehave beenbeen proposedproposed::

ExtendedExtended EntityEntity--RelationshipRelationship ModelModel ((EEREER--modelmodel))

•• Most Most importantimportant extensionsextensions (as in (as in objectobject--orientedoriented modelsmodels):):

•• ThisThis meansmeans::•• formationformation of of subtypessubtypes of of entityentity typestypes•• subsub--/supertype/supertype relationshipsrelationships ((typetype hierarchyhierarchy))•• inheritanceinheritance of of attributesattributes and of and of „„participationsparticipations" in " in RR--typestypes

•• Special Special graphicalgraphical notationnotation forfor generalizationgeneralization of of EE--typestypes::

GeneralizationGeneralizationGeneralization

is_ais_a

. . .. . .

Page 29: ER Normalization

©© 2008 2008 Prof. Dr. Rainer MantheyProf. Dr. Rainer Manthey Foundations of IMFoundations of IM 2929

GeneralizationGeneralization (2)(2)

is_ais_a

regionregion

countrycountry statestate

namename, , areaarea, , inhabitantsinhabitants

prime prime ministerminister governorgovernor

citycity

capital_ofcapital_of supertypesupertype

subtypessubtypes

„„InheritanceInheritance" in " in thisthis exampleexample meansmeans::•• BothBoth subtypessubtypes inheritinherit all all attributesattributes of of thethe supertypesupertype,,

i.e., i.e., theythey „„ownown" " thesethese attributesattributes withoutwithout explicitexplicit definitiondefinition..•• BothBoth subtypessubtypes participateparticipate in in thethe relationshiprelationship typetype capital_ofcapital_of',',

whichwhich has has beenbeen explicitlyexplicitly defineddefined forfor thethe supertypesupertype onlyonly..

Page 30: ER Normalization

©© 2008 2008 Prof. Dr. Rainer MantheyProf. Dr. Rainer Manthey Foundations of IMFoundations of IM 3030

GeneralizationGeneralization (3)(3)

is_ais_a

regionregion

countrycountry statestate populationpopulation(region(region))

populationpopulation(country(country))

populationpopulation(state(state))

GeneralizationGeneralization alwaysalways meansmeansthatthat thethe populationspopulations of of thethe subtypessubtypesareare subsetssubsets of of thethe populationpopulation of of thethesupertypesupertype..

ThisThis circumstancecircumstance motivatesmotivates thethenotionnotion ''is_a'is_a'--relationshiprelationship::

""EveryEvery countrycountry isis aa regionregion."."

QuantifierQuantifier overover instancesinstances!!

Page 31: ER Normalization

©© 2008 2008 Prof. Dr. Rainer MantheyProf. Dr. Rainer Manthey Foundations of IMFoundations of IM 3131

GeneralizationGeneralization (4)(4)

In In thethe exampleexample: : specialspecial casecasedisjointdisjoint generalizationgeneralization

((EmptyEmpty intersectionintersection of of thethe populationspopulations))

In In generalgeneral::ThisThis form of form of thethe ''is_a'is_a'--notationnotationjust just meansmeans somesome form of form of subsetsubset formationformation, i.e., i.e.overlappingoverlapping, , incompleteincompletesubdivisionsubdivision..

Page 32: ER Normalization

©© 2008 2008 Prof. Dr. Rainer MantheyProf. Dr. Rainer Manthey Foundations of IMFoundations of IM 3232

LogicalLogical design: ER design: ER ⇒⇒ Relations Relations

Design Design documentationdocumentation

ConceptualConceptualdesigndesign

PhysicalPhysicaldesigndesign

LogicalLogical structurestructure„„on on paperpaper""

LogicalLogicaldesigndesign

ER modelER ER modelmodel

Relational modelRelational Relational modelmodel

MappingMapping

ImprovementImprovement

1st step

Map ER types systema-tically to relations:

From semantic conceptsto syntactic structures !

11stst stepstep

MapMap ER ER typestypes systemasystema--ticallytically to relations:to relations:

FromFrom semanticsemantic conceptsconceptsto to syntacticsyntactic structuresstructures !!

Page 33: ER Normalization

©© 2008 2008 Prof. Dr. Rainer MantheyProf. Dr. Rainer Manthey Foundations of IMFoundations of IM 3333

ER ER ⇒⇒ Relations: Relations: MappingMapping EntityEntity typestypes

•• MappingMapping fromfrom ER ER modelmodel to relational to relational modelmodel::•• in in principleprinciple veryvery easyeasy: per : per typetype oneone relationrelation (table)(table)•• in in detaildetail and and forfor extensionsextensions: : quitequite difficultdifficult

•• MappingMapping of of entityentity typestypes: : ratherrather obviousobvious•• typetype ⇒⇒ table table namename

•• attributeattribute ⇒⇒ columncolumn namename

EE

AA11 AA22 AAnn. . .. . .

AA1 1 AA2 2 AAnn. . .. . .EE

•• Key Key attributesattributes areare mappedmapped to to primaryprimary keykey columnscolumns..

Page 34: ER Normalization

©© 2008 2008 Prof. Dr. Rainer MantheyProf. Dr. Rainer Manthey Foundations of IMFoundations of IM 3434

ER ER ⇒⇒ Relations: Relations: MappingMapping RelationshipRelationship typestypes

•• For For relationshiprelationship typestypes::•• UseUse thethe samesame basicbasic ideaidea: : oneone relationrelation per per typetype..•• butbut: : HowHow areare participatingparticipating entityentity typestypes representedrepresented??

statestatecitycity

city_in_statecity_in_state

sincesincenamenameinhabitantsinhabitants. . .. . .

namenameareaarea. . .. . .

•• ObviousObvious solutionsolution as well: as well: TheThe participatingparticipating entitiesentities areare representedrepresented byby meansmeansof of thethe valuesvalues of of theirtheir primaryprimary keykey attributesattributes ((possiblypossibly afterafter renamingrenaming).).

citycity..namename statestate..namename sincesincecity_in_statecity_in_state

ForeignForeign keykey

Page 35: ER Normalization

©© 2008 2008 Prof. Dr. Rainer MantheyProf. Dr. Rainer Manthey Foundations of IMFoundations of IM 3535

ER ER ⇒⇒ Relations: Relations: MappingMapping RelationshipRelationship typestypes (2)(2)

In In presencepresence of of specialspecial functionalitiesfunctionalities (1 : N, N : 1, 1 : 1 resp.), a separate (1 : N, N : 1, 1 : 1 resp.), a separate relationrelation--shipship table table isis notnot necessarynecessary, as , as thethe relationshiprelationship informationinformation cancan bebe embeddedembedded intointothethe table of table of thethe entityentity typetype on on thethe NN--sideside: :

statestatecitycity

city_in_statecity_in_state

sincesincenamenameinhabitantsinhabitants. . .. . .

namenameareaarea. . .. . .

citycity.name.name inhabitantsinhabitants . . . . . . statestate.name.name sincesincecitycity

ForeignForeign keykey

11NN

Page 36: ER Normalization

©© 2008 2008 Prof. Dr. Rainer MantheyProf. Dr. Rainer Manthey Foundations of IMFoundations of IM 3636

Relational Relational representationrepresentation of of generalizationgeneralization hierarchieshierarchies

is_ais_a

EE11

EE22

AA11, A, A22, . . . , . . .

BB11, B, B22, . . . , . . .

inheritanceinheritance

How to realize inheritance andsub-/super type relationships relationally ?

HowHow to to realizerealize inheritanceinheritance andandsubsub--/super /super typetype relationshipsrelationships relationallyrelationally ??

•• Relational Relational modelmodel: : doesdoes notnot knowknow anyany inheritanceinheritance!!

•• InheritanceInheritance thusthus has to has to bebe ""simulatedsimulated" . " .

Relational Relational representationrepresentation of a of a super super typetype EE11 isis obviousobvious: :

AA11 AA22 . . . . . . EE11

Page 37: ER Normalization

©© 2008 2008 Prof. Dr. Rainer MantheyProf. Dr. Rainer Manthey Foundations of IMFoundations of IM 3737

Relational Relational representationrepresentation of of generalizationgeneralization hierarchieshierarchies (2)(2)

is_ais_a

EE11

EE22

AA11, A, A22, . . . , . . .

BB11, B, B22, . . ., . . .

InheritanceInheritance

AA11 AA22 . . . . . . EE11

AA11 AA22 . . . . . . BB11 BB22 . . . . ..

EE22

inheritedinherited „„native" native" attributesattributes

•• ObviousObvious relational relational realizationrealization of a of a subtypesubtype::subtypesubtype relationrelation EE2 2 ownsowns „„native" and native" and inheritedinherited attributesattributes..

•• ButBut: : ValuesValues of of thethe inheritedinherited attributesattributes of all Eof all E22--instances instances havehave to to bebe((redundantlyredundantly) ) repeatedrepeated in in thethe EE11--relation in relation in thisthis casecase!!

•• ReasonReason: : EachEach EE2 2 --instanceinstance isis an Ean E11--instance, instance, tootoo !!

aa1111 aa1212 . . .. . . bb1111 bb1212 . . .. . .

aa1111 aa1212 . . .. . .aa2121 aa2222 . . .. . .

Page 38: ER Normalization

©© 2008 2008 Prof. Dr. Rainer MantheyProf. Dr. Rainer Manthey Foundations of IMFoundations of IM 3838

Relational Relational representationrepresentation of of generalizationgeneralization hierarchieshierarchies (3)(3)

is_ais_a

EE11

EE22

AA11, A, A22, . . . , . . .

BB11, B, B22, . . ., . . .

inheritanceinheritance AA11 AA22 . . . . . . EE11

AA11 BB11 BB22 . . . . ..

EE22

„„native" native" attributesattributes

aa1111 bb1111 bb1212 . . .. . .

aa1111 aa1212 . . .. . .aa2121 aa2222 . . .. . .

inheritedinherited attributesattributes

jointjointkeykey

AvoidingAvoiding duplicationduplication: Store : Store onlyonly „„native" native" attributesattributes(+ (+ keykey forfor joiningjoining) in ) in thethe subtypesubtype relationrelation!!

But in this case relation E2 doesno longer contain all attributes ofE1-entities !

ButBut in in thisthis casecase relationrelation EE22 doesdoesno no longerlonger containcontain allall attributesattributes ofofEE11--entities !entities !

Page 39: ER Normalization

©© 2008 2008 Prof. Dr. Rainer MantheyProf. Dr. Rainer Manthey Foundations of IMFoundations of IM 3939

Relational Relational representationrepresentation of of generalizationgeneralization hierarchieshierarchies (4)(4)

AA11 AA22 . . . . . . EE11

AA11 BB11 BB22 . . . . . .

EE22--locallocal

„„native" native" attributesattributes

aa1111 bb1111 bb1212 . . .. . .

aa1111 aa1212 . . .. . .aa2121 aa2222 . . .. . .

inheritedinherited attributesattributes

way outway out: E2: E2--population population isis completelycompletely realizedrealized byby meansmeans of a of a viewview joiningjoiningthethe inheritedinherited and and thethe native native attributesattributes. .

AA11 AA22 . . . B. . . B11 BB22 . . . . . .

EE22--globalglobal

aa1111 aa1212 . . .. . . bb1111 bb1212 . . .. . .

CREATE VIEW E2-global AS(SELECT *FROM E1 INNER JOIN E2-local

ON E1.A1 = E2-local.A1)

CREATE VIEW CREATE VIEW E2E2--globalglobal ASAS(SELECT *(SELECT *FROM E1 INNER FROM E1 INNER JOINJOIN E2E2--locallocal

ON E1.A1 = E2ON E1.A1 = E2--local.A1)local.A1)

Page 40: ER Normalization

©© 2008 2008 Prof. Dr. Rainer MantheyProf. Dr. Rainer Manthey Foundations of IMFoundations of IM 4040

Relational Relational representationrepresentation of of generalizationgeneralization hierarchieshierarchies (5)(5)

AA11 AA22 . . . . . .

EE11--locallocal

„„native" native" attributesattributes

aa2121 aa2222 . . .. . .

inheritedinherited

EE11--globalglobal

CREATE VIEW E1-global AS( TABLE E1-local )UNION

(SELECT A1, A2, . . .FROM E2 )

CREATE VIEW CREATE VIEW E1E1--globalglobal ASAS( TABLE ( TABLE E1E1--locallocal ))UNIONUNION

(SELECT A(SELECT A11, A, A22, . ., . . ..FROM E2FROM E2 ))

33rdrd alternativealternative (also (also freefree of of redundanciesredundancies and and usingusing a a viewview):):DistributeDistribute valuesvalues of of thethe inheritedinherited attributesattributes to to differentdifferent relations relations ––super super typestypes areare reconstructedreconstructed via via viewsviews. .

AA11 AA22 . . . B. . . B11 BB22 . . . . . .

EE22

aa1111 aa1212 . . .. . . bb1111 bb1212 . . .. . .

AA11 AA22 . . . . . . aa1111 aa1212 . . .. . .aa2121 aa2222 . . .. . . ∪∪

ππ

Page 41: ER Normalization

©© 2008 2008 Prof. Dr. Rainer MantheyProf. Dr. Rainer Manthey Foundations of IMFoundations of IM 4141

Relational Relational representationrepresentation of of generalizationgeneralization hierarchieshierarchies (6)(6)

Which of the three alternatives is „the best" ?WhichWhich of of thethe threethree alternatives alternatives isis „„thethe best"best" ??

1. Relations E1. Relations E11 and Eand E22, , nono viewsviews::

2. Relations E2. Relations E11 and Eand E22--local, local, viewview EE22--global (JOIN):global (JOIN):

3. Relations E3. Relations E22 and Eand E11--local, local, viewview EE11--global (PROJECTglobal (PROJECT--UNION):UNION):

++−−

++−−

++−−

shortshort accessaccess timetime ((withoutwithout anyany joinjoin of of tablestables))high high requirementsrequirements forfor spacespace ((duedue to redundant to redundant storagestorage))

OnlyOnly keykey attributeattribute valuesvalues areare storedstored redundantlyredundantly..Access to Access to EE22--attributes attributes isis slowerslower ((duedue to to joinjoin). ).

No No duplicationduplication of of anyany attributeattribute valuesvalues..Access to Access to EE11--attributes attributes isis slowerslower ((duedue to to projectionprojection and and unionunion) .) .

Page 42: ER Normalization

©© 2008 2008 Prof. Dr. Rainer MantheyProf. Dr. Rainer Manthey Foundations of IMFoundations of IM 4242

GeneralizationGeneralization and and updatesupdates

is_ais_a

EE11

EE22

AA11, A, A22, . . . , . . .

BB11, B, B22, . . ., . . .

What happens if an E2-entity is deleted ?WhatWhat happenshappens ifif an Ean E22--entity entity isis deleteddeleted ??

•• relational relational variantvariant 1 (1 (inheritedinherited attributesattributes duplicatedduplicated):):DeletionDeletion fromfrom bothboth relations relations isis necessarynecessary..

•• relational relational variantvariant 2 (2 (inheritedinherited and native A. and native A. separatedseparated):):DeletionDeletion fromfrom bothboth relations relations isis necessarynecessary..

•• relational relational variantvariant 3 (E3 (E22--attributes attributes onlyonly in in oneone relationrelation):):nono propagationpropagation of of deletionsdeletions requiredrequired

⇒ In variants 1 and 2: referential integrity constraintswith delete cascade is required.

⇒⇒ In In variantsvariants 1 and 2: 1 and 2: referentialreferential integrityintegrity constraintsconstraintswithwith deletedelete cascadecascade isis requiredrequired..

•• For For insertionsinsertions and and modificationsmodifications: : ChangesChanges in in severalseveral relations relations maymay bebe necessarynecessary,,tootoo ((dependingdepending on on thethe chosenchosen strategystrategy).).

•• DeletionDeletion of of instancesinstances of of thethe super super typetype EE11: : CascadingCascading deletiondeletion ifif thethe resp. resp. instanceinstance isis an Ean E22--instance, instance, tootoo ((againagain referentialreferential integrityintegrity).).

Page 43: ER Normalization

©© 2008 2008 Prof. Dr. Rainer MantheyProf. Dr. Rainer Manthey Foundations of IMFoundations of IM 4343

RelationshipsRelationships and and referentialreferential integrityintegrity

EachEach relationshiprelationship typetype inducesinduces FOREIGN FOREIGN KEYKEY--constraintsconstraints as well:as well:

•• WithWith N : N : MM--functionalityfunctionality::

•• WithWith N : 1N : 1--functionality:functionality:

statestateriverriver

river_through_stateriver_through_state

CREATE TABLE river_through_state( River String REFERENCES river,

State String REFERENCES state )

CREATE TABLE CREATE TABLE river_through_stateriver_through_state( River String ( River String REFERENCESREFERENCES riverriver,,

State String State String REFERENCESREFERENCES statestate ))

NN MM

statestatecitycity

citycity_in__in_statestate11NN

CREATE TABLE city( . . . ,

State String REFERENCES state )

CREATE TABLE CREATE TABLE citycity( . . . ,( . . . ,

State String State String REFERENCESREFERENCES statestate ))

Page 44: ER Normalization

©© 2008 2008 Prof. Dr. Rainer MantheyProf. Dr. Rainer Manthey Foundations of IMFoundations of IM 4444

FunctionalitiesFunctionalities as constraintsas constraints

statestatecitycity

citycity_in__in_statestate11NN

CREATE TABLE city( . . . ,CHECK COUNT (SELECT State

FROM city S WHERE S.Name = Name) =< 1

CREATE TABLE CREATE TABLE citycity( . . . ,( . . . ,CHECK CHECK COUNT COUNT (SELECT State (SELECT State

FROM FROM citycity S S WHERE S.Name = Name) WHERE S.Name = Name) =<=< 11

•• ButBut: : UniquenessUniqueness of of thethe statestate in in thethe citycity_in__in_statestate--relationshiprelationship hashasnotnot yetyet beenbeen expressedexpressed !!

•• An An additional CHECKadditional CHECK--constraintconstraint isis requiredrequired contrainingcontraining thethe numbernumber of of statestate instancesinstances::

ImplicitImplicit universal universal quantifierquantifier rangingrangingoverover eacheach citycity rowrow !!

Page 45: ER Normalization

©© 2008 2008 Prof. Dr. Rainer MantheyProf. Dr. Rainer Manthey Foundations of IMFoundations of IM 4545

ER ER ⇒⇒ Relations (2)Relations (2)

Design Design documentationdocumentation

ConceptualConceptualdesigndesign

PhysicalPhysicaldesigndesign

Logical structureLogical structure„„on on paperpaper""

LogicalLogicaldesigndesign

ER modelER ER modelmodel

Relational modelRelational Relational modelmodel

MappingMapping

ImprovementImprovement

2nd step (after reaching relations):

Improve design by„fine tuning“ thestructure of tables

22ndnd step step ((afterafter reachingreaching relations): relations):

ImproveImprove design design byby„„fine fine tuningtuning““ thethestructurestructure of of tablestables

Page 46: ER Normalization

©© 2008 2008 Prof. Dr. Rainer MantheyProf. Dr. Rainer Manthey Foundations of IMFoundations of IM 4646

RedundanciesRedundancies duedue to bad designto bad design

•• ExampleExample of of „„bad designbad design““ ((remainingremaining afterafter mappingmapping fromfrom ER ER levellevel): ): ‚‚City_in_City_in_statestate‘‘ and and ‚‚capitalcapital_of_of‘‘ havehave beenbeen placedplaced intointo a a singlesingle table.table.

City City State CapitalState Capital

BonnBonn NWNW DDüüsseldorfsseldorfKKöölnln NWNW DDüüsseldorfsseldorfEssenEssen NWNW DDüüsseldorfsseldorf. . .. . .MainzMainz RPRP MainzMainzTrierTrier RPRP MainzMainz. . .. . .

Redundantly stored information:Düsseldorf is the capitalof North Rhine-Westphalia.

RedundantlyRedundantly storedstored informationinformation::DDüüsseldorf sseldorf isis thethe capitalcapitalof North of North RhineRhine--WestphaliaWestphalia..

•• ObviouslyObviously oneone topictopic ((WhichWhich citycity isis thethe capitalcapital of . . . ?) has of . . . ?) has beenbeen combinedcombined withwithanotheranother topictopic (In (In whichwhich statestate isis a a certaincertain citycity situatedsituated ?) in such an ?) in such an „„unluckyunlucky““mannermanner thatthat considerableconsiderable redundanciesredundancies occuroccur, , resultingresulting in in wastewaste of of spacespace..

What does „a topic“ mean ?WhatWhat doesdoes „„a a topictopic““ meanmean ??

Page 47: ER Normalization

©© 2008 2008 Prof. Dr. Rainer MantheyProf. Dr. Rainer Manthey Foundations of IMFoundations of IM 4747

AnomaliesAnomalies resultingresulting fromfrom redundanciesredundancies

City City State CapitalState Capital

BonnBonn NWNW DDüüsseldorfsseldorfKKöölnln NWNW DDüüsseldorfsseldorfEssenEssen NWNW DDüüsseldorfsseldorf. . .. . .MainzMainz RPRP MainzMainzTrierTrier RPRP MainzMainz. . .. . .

•• An immediate An immediate consequenceconsequence of such of such casescases of of storingstoring multiple multiple topicstopics in in oneone tabletableisis thethe occurrenceoccurrence of soof so--calledcalled anomaliesanomalies whenwhen updatingupdating such such tablestables::

AssumeAssume KKööln (ln (as largestas largestcitycity in NW) in NW) replacesreplacesDDüüsseldorfsseldorf as capitalas capital::

One One factfact changeschanges,,butbut multiple multiple updatesupdateshavehave to to bebe mademade..

•• AnaloguosAnaloguos anomaliesanomalies maymay happenhappen duedue to to insertionsinsertions and and deletionsdeletions::•• An An instanceinstance of of topictopic 1 1 disappearsdisappears,, as soonas soon as itas it isis no no longerlonger associated associated

withwith anyany instanceinstance of of topictopic 2.2.•• A A newnew instanceinstance of of topictopic 1 1 cancan onlyonly bebe insertedinserted, , ifif itit isis combined combined

withwith an an instanceinstance of of topictopic 2 (2 (oror null null valuesvalues areare usedused).).

Page 48: ER Normalization

©© 2008 2008 Prof. Dr. Rainer MantheyProf. Dr. Rainer Manthey Foundations of IMFoundations of IM 4848

DecompositionsDecompositions avoidavoid redundanciesredundancies and and anomaliesanomalies

How to prevent such „defects“ (anomalies, redundancies) ?HowHow to to preventprevent such such „„defectsdefects““ ((anomaliesanomalies, , redundanciesredundancies) ?) ?

In In thethe exampleexample, , therethere isis a simple a simple remedyremedy: : SeparateSeparate thethe twotwo topicstopics intointo different relations!different relations!

City City State CapitalState Capital

BonnBonn NWNW DDüüsseldorf sseldorf KKöölnln NWNW DDüüsseldorfsseldorfEssenEssen NWNW DDüüsseldorfsseldorf. . .. . .MainzMainz RPRP MainzMainzTrierTrier RPRP MainzMainz. . .. . .

City City State State BonnBonn NW NW KKöölnln NWNWNeussNeuss NWNW. . . . . .

State Capital State Capital

NWNW DDüüsseldorf sseldorf RPRP MainzMainz. . .. . .No No redundancyredundancy, no , no anomalyanomaly !!

DecompositionDecomposition

Page 49: ER Normalization

©© 2008 2008 Prof. Dr. Rainer MantheyProf. Dr. Rainer Manthey Foundations of IMFoundations of IM 4949

Functional dependenciesFunctional dependencies and and decompositionsdecompositions

•• AlreadyAlready discovereddiscovered byby CoddCodd beforebefore 1970:1970: Functional relationshipsFunctional relationships betweenbetweenattributesattributes areare of of helphelp forfor findingfinding meaningfulmeaningful decompositionsdecompositions and and forfor avoidingavoidingreduncanciesreduncancies!!

•• ResultingResulting fromfrom thisthis observationobservation, , CoddCodd developeddeveloped an an elaborateelaborate theorytheory of of relational normal relational normal formsforms..

•• PrerequisitePrerequisite: Designers : Designers identifyidentify such such functionalfunctional relationshipsrelationships duringduring schema schema design (design (quitequite similarsimilar to to identifyingidentifying functionalitiesfunctionalities in in thethe ER ER modelmodel) and ) and express express themthem as as specialspecial integrityintegrity constraintsconstraints..

•• PrinciplePrinciple of of functionalfunctional dependencydependency: : •• LetLet A and B A and B bebe attributesattributes of a of a relation relation R.R.•• B B dependsdepends functionallyfunctionally on Aon A, , ifif in in eacheach statestate of R of R eacheach AA--valuevalue always always

occursoccurs in in combinationcombination withwith thethe samesame, , uniquelyuniquely determineddetermined BB--valuevalue..•• symbolicsymbolic notationnotation: : A A →→ BB

Functional dependencyFunctional dependencyFunctional dependency ((shortshort: FD): FD)

Page 50: ER Normalization

©© 2008 2008 Prof. Dr. Rainer MantheyProf. Dr. Rainer Manthey Foundations of IMFoundations of IM 5050

FDsFDs in in thethe exampleexample

City City State Capital State Capital

BonnBonn NWNW DDüüsseldorfsseldorfKKöölnln NWNW DDüüsseldorfsseldorfEssenEssen NWNW DDüüsseldorfsseldorf. . .. . .MainzMainz RPRP MainzMainzTrierTrier RPRP MainzMainz. . .. . .

•• EachEach citycity lies in lies in exactlyexactly oneone statestate: : City City →→ StateState•• EachEach statestate has has exactlyexactly oneone capitalcapital: : State State →→ CapitalCapital•• ButBut alsoalso: : IfIf a a citycity isis a a capitalcapital thenthen itit isis thethe capitalcapital of of exactlyexactly oneone statestate: :

Capital Capital →→ StateState•• EachEach citycity isis associatedassociated withwith exactlyexactly oneone capitalcapital ((namelynamely thethe capitalcapital of of itsits statestate): ):

City City →→ CapitalCapital

City City State State

State State CapitalCapital

•• DecompositionDecomposition separates separates FDsFDs::

•• One FD One FD seemsseems to to bebe lostlost, , howeverhowever: City : City →→ CapitalCapital

Page 51: ER Normalization

©© 2008 2008 Prof. Dr. Rainer MantheyProf. Dr. Rainer Manthey Foundations of IMFoundations of IM 5151

""Codd'sCodd's thesisthesis" "

•• Thesis (Thesis (claimclaim)) on on whichwhich thethe normalizationnormalization theorytheory of of CoddCodd isis basedbased::

Attributes connected via an FD represent semantically significant topicsof the application domain.

Attributes Attributes connectedconnected viavia anan FDFD representrepresent semanticallysemantically significantsignificant topicstopicsof of thethe applicationapplication domaindomain. .

•• ThatThat isis: : EveryEvery FDFD identifiesidentifies a a topictopic –– thusthus separatingseparating topicstopics meansmeans separatingseparating FDsFDs..

•• ButBut: : Not Not everyevery such such topictopic isis necessarilynecessarily representedrepresented byby an FD.an FD.

•• MoreoverMoreover: FD: FD--connectionconnection isis a a sufficientsufficient, , butbut notnot a a necessarynecessary criterioncriterion forfor the the existenceexistence of a of a ‚‚topictopic‘‘..

•• Basic Basic ideaidea of of CoddCodd‘‘s s approachapproach to to normalizationnormalization of relations:of relations:DecomposeDecompose relations in such a way relations in such a way thatthat „„normallynormally"" eacheach FD has a FD has a compocompo--nentnent relationrelation of of itsits ownown. . ButBut trytry to to identifyidentify exceptionsexceptions wherewhere severalseveral FDsFDsmaymay „„coexistcoexist““ in in oneone and and thethe samesame relationrelation !!

Page 52: ER Normalization

©© 2008 2008 Prof. Dr. Rainer MantheyProf. Dr. Rainer Manthey Foundations of IMFoundations of IM 5252

Transitive Transitive FDsFDs and Armstrongand Armstrong‘‘s s axiomsaxioms

•• ThereThere areare FDsFDs whichwhich areare derivablederivable fromfrom otherother FDsFDs, , alreadyalready knownknown. An . An importantimportantexampleexample areare soso--calledcalled transitivetransitive FDsFDs::

•• EveryEvery such transitive such transitive casecase leadsleads to a (to a (newnew) ) functionalfunctional dependencydependency..

αα →→ ββ isis a a transitivetransitive FD FD ifif therethere isis an an attributeattribute setset γγ,,such such thatthat αα →→ γγ and and γγ →→ ββ areare bothboth FDsFDs, , butbut notnot γγ →→ αα. .

•• ThereThere areare twotwo otherother such such inferenceinference rulesrules forfor FDsFDs, , calledcalled thethe „„Armstrong Armstrong axiomsaxioms““((as theyas they havehave beenbeen discovereddiscovered byby thethe Canadian scientistCanadian scientist W. Armstrong).W. Armstrong).

•• Not to Not to bebe foundfound in in thethe literatureliterature, , butbut quitequite usefuluseful: Special : Special notionnotion forfor FDsFDs which which areare notnot transitive transitive FDsFDs::

αα →→ ββ isis a a directdirect FD FD ifif therethere isis nono attributeattribute setset γγ,,such such thatthat αα →→ γγ and and γγ →→ ββ areare bothboth FDsFDs, , butbut notnot γγ →→ αα..

β ⊆ α ⇒ α→ β

α → β ⇒ α γ → β γ

ββ ⊆⊆ αα ⇒⇒ αα →→ ββ

αα →→ ββ ⇒⇒ αα γγ →→ ββ γγ

EveryEvery subsetsubset dependsdepends on on itsits supersetsuperset..

AugmentationAugmentation on on bothboth sidessides..

Page 53: ER Normalization

©© 2008 2008 Prof. Dr. Rainer MantheyProf. Dr. Rainer Manthey Foundations of IMFoundations of IM 5353

Normal Normal formsforms

„Codd's recepy“ (in short from):Redundancies can be safely avoided if FDs always originate fromcandidate keys of a relation.

„„Codd'sCodd's recepyrecepy““ (in (in shortshort fromfrom):):RedundanciesRedundancies cancan bebe safelysafely avoidedavoided ifif FDsFDs alwaysalways originateoriginate fromfromcandidatecandidate keyskeys of a of a relationrelation. .

•• ProperlyProperly determiningdetermining FDsFDs and and investigatinginvestigating theirtheir propertiesproperties isis thethe basisbasis forfor eacheachmeaningfulmeaningful decompositiondecomposition of relations of relations intointo componentscomponents freefree of of redundanciesredundancies..

•• CoddCodd defineddefined variousvarious degreesdegrees of FD of FD separationseparation, , calledcalled normal normal formsforms –– thethe processprocessof of transformingtransforming a a givengiven schemaschema intointo relations all of relations all of whichwhich exhibitexhibit a a givengiven normalnormalform form isis calledcalled normalizationnormalization..

•• TheThe mostmost importantimportant normal form normal form isis thethe thirdthird normal formnormal form ((shortshort: : 3NF3NF) ) defineddefined asasfollowsfollows::

•• ThereThere areare variousvarious otherother normal normal formsforms (1NF, 2NF, 4NF and (1NF, 2NF, 4NF and othersothers).).

A relation is in 3rd normal form ⇔Each non-key attribute functionally depends directlyon each candidate key of the relation .

A A relationrelation isis in in 33rdrd normal formnormal form ⇔⇔EachEach nonnon--keykey attributeattribute functionallyfunctionally dependsdepends directlydirectlyon on eacheach candidatecandidate keykey of of thethe relationrelation ..

Page 54: ER Normalization

©© 2008 2008 Prof. Dr. Rainer MantheyProf. Dr. Rainer Manthey Foundations of IMFoundations of IM 5454

NormalizationNormalization in in thethe exampleexample schemaschema

City City StateState Capital Capital BonnBonn NWNW DDüüsseldorfsseldorfKKöölnln NWNW DDüüsseldorfsseldorfEssenEssen NWNW DDüüsseldorfsseldorf. . .. . .

City City State State

State State CapitalCapital

((ℜℜ, , )) ((ℜℜ , , ))

((ℜℜ , , ))

•• OurOur exampleexample schemaschema fromfrom thethe geographicgeographic domaindomain originallyoriginally was was notnot in 3NF, in 3NF, as as itit still still containscontains a transitive a transitive dependencydependency pointingpointing fromfrom a a candidatecandidate keykey (City) (City) to a to a nonnon--keykey attributeattribute (Capital):(Capital):

•• After After decompositiondecomposition, , howeverhowever, , eacheach of of thethe resultingresulting componentcomponent schemasschemas isisin 3NF! Such a in 3NF! Such a decompositiondecomposition will will alwaysalways bebe possiblepossible..

•• TheThe „„lostlost““ dependencydependency City City →→ Capital Capital cancan bebe reconstructedreconstructed byby meansmeans of of thethetransitivitytransitivity axiomaxiom of Armstrong.of Armstrong.

3NF3NF

3NF3NF

3NF3NF