1 conceptual design with er model lecture #2. 2 lecture outline logistics steps in building a...
TRANSCRIPT
1
Conceptual Design with ER Model
Lecture #2
2
Lecture Outline
• Logistics
• Steps in building a database application
• Conceptual design with ER model
3
Steps in Building a DB Application
• Step 1: Pick an application– something where you think you will have quite a bit
of data– need to manage multiple users who process/query the
data
• Step 2: Application analysis and conceptual design– discuss what to model for the application– need a modeling language to express what you want– ER model is the most popular such language– output: an ER diagram of the application
4
address name field
Professor
Advises
Takes
Teaches
CourseStudent
name category
quarter
name
ssn
Example ER Diagram
cid
5
Steps in Building a DB Application
• Step 3: Translate the ER diagram into a relational schema– using a set of rules
• Step 4: Refine the relational schema (normalization)– use a set of schema refinement rules to transform the
above rel. schema into a good rel. schema
• At this point– you have a good relational schema on paper
6
Example Relational Schema
SSN Name Category 123-45-6789 Charles undergrad 234-56-7890 Dan grad … …
SSN CID 123-45-6789 CSE444 123-45-6789 CSE444 234-56-7890 CSE142 …
Students: Takes:
CID Name Quarter CSE444 Databases fall CSE541 Operating systems winter
Courses:
7
Steps in Building a DB Application
• Step 5: Pick a relational DB management system (RDBMS)– MS SQL server, IBM DB2, Oracle– MySQL, PostreSQL– SQLite, ...
• Step 6: Implement your DB in that RDBMS– first, implement the schema– next, add the tuples– you can do all these using a language called SQL
8
Steps in Building a DB Application
• Step 7: Query your DB using SQL– now that your DB is in place, you can query it using
SQL in the “command-line interface”
select C.namefrom Students S, Takes T, Courses Cwhere S.name = “Mary” and S.ssn = T.ssn and T.cid = C.cid
select C.namefrom Students S, Takes T, Courses Cwhere S.name = “Mary” and S.ssn = T.ssn and T.cid = C.cid
Find all courses that Mary takes
9
Steps in Building a DB Application• Subsequent steps
– you may not want to access the DB just via the command line interface• ordinary users may not know how to interact with the database directly & ask SQL queries• and the database also cannot do everything you want; SQL is a limited language (not Turing complete, can’t do recursive stuff)
– hence you often want to write an application program in C++, Java, Perl, etc to give lay users an easier way to query the DB and take care of things that the database cannot do
– many such applications user a Web interface to interact with users
10
Database management system
(RDBMS)
DB 2
DB 3
DB 1PhP code
App 2
Web browser
interface
11
12
Database management system
(RDBMS)
DB 2
DB 3
DB 1 PhP code
Database developer
Database system administrator (DBA)
Lay users
13
Steps in Building a DB Application• Subsequent steps
– When the tables have many tuples, queries may run very slowly– so you may need to build indexes to speed up query execution– in the long run, need to maintain the DB, expand, clean, etc.
• This class will discuss the above steps, and will look into the internals of RDBMSs
14
We Will Start with Step 2
• Step 1: Pick an application– something where you think you will have quite a bit
of data– may need to manage multiple users who
process/query the data
• Step 2: Application analysis and conceptual design– discuss what to model for the application– need a modeling language to express what you want– ER model is the most popular such language– output: an ER diagram of the application
15
ER Model
• Gives us a language to specify– what information the db must hold– what are the relationships among components of that
information
• Proposed by Peter Chen in 1976
• What we will cover– basic stuff: entities, relations– constraints– weak entity sets– design principles
16address name ssn
Person
buys
makes
employs
CompanyProduct
name category
stockprice
name
price
Basic Concepts
17
Entities and Attributes
• Entities– real-world objects distinguishable from other objects– described using a set of attributes
• Attributes– each has an atomic domain: string, integers, reals, etc.
• Entity set: a collection of similar entities
Product
name categoryprice
Company
stockprice
name
18
Relations
• A mathematical definition:– if A, B are sets, then a relation R is a subset of A x B
• A={1,2,3}, B={a,b,c,d}, R = {(1,a), (1,c), (3,b)}
makes is a subset of Product x Company:
1
2
3
a
b
c
d
A=
B=
makes CompanyProduct
19address name ssn
Person
buys
makes
employs
CompanyProduct
name category
stockprice
name
price
ER Diagram
20
More about relationships ...
21
Multiplicity of E/R Relations
• one-one:
• many-one
• many-many
123
abcd
123
abcd
123
abcd
makes CompanyProduct
22
Multiplicity of E/R Relations
• one-one:
• many-one
• many-many
123
abcd
123
abcd
123
abcd
makes CompanyProduct
1 b
2 b
3 d
1 c
2 a
3 d
23
Important: Difference with the Cow Book (Raghu Book)
• You should use the notations in the lectures
Manages DepartmentEmployee
• We will use
Manages DepartmentEmployee
• Cow book use (see Page 33, 3rd edition)
24
Multiway RelationshipsHow do we model a purchase relationship between buyers,products and stores?
Purchase
Product
Person
Store
Can still model as a mathematical set (how?)
• When creating an ER diagram, you will create various relationships
• Then decide on the “cardinality” of those– should they be 1-1, 1-many, many-1, many-many, etc
• Then specify these cardinalities on the ER diagram
• You may or may not be able to specify the *exact* cardinality requirements– specially for multiway relationships
• May need to approximate
25
26
Given the following relationship
Suppose invoice determines the rest
How would we specify this?
Rental
VideoStore
Person
Movie
Invoice
Example
27
There is no good way; here’s a possible solution
Q: Why is this incomplete ?
Rental
VideoStore
Person
Movie
Invoice
Example
28
A: Because of the meaning of such an arrow sign
Rental
VideoStore
Person
Movie
Invoice
If I know the store, person, invoice, I know the movie too
29
Roles in Relationships
Purchase
What if we need an entity set twice in one relationship?
Product
Person
Store
salesperson buyer
Person
• This is a bit confusing, because we have no idea what role each Person entity would play
• So we should label the edges to indicate the roles
30
Purchase
Product
Person
Store
salesperson buyer
Person
31
We can condense this further by collapsing the two Persons into one
Purchase
Product
Person
Store
salesperson buyer
32
Attributes on Relationships
Purchase
Product
Person
Store
date
Attributes of Entities and Relationships always have atomic values!
33
34
Consider Attribute Phone of Professors
34address name phone
Professor
Advises
Takes
Teaches
CourseStudent
name category
quarter
name
ssncid
35
Converting Multiway Relationships to Binary
Purchase
Person
Store
Product
StoreOf
ProductOf
BuyerOf
date
36
Relationships: Summary
• Modeled as a mathematical set
• Binary and multiway relationships
• Converting a multiway one into many binary ones
• Constraints on the degree of the relationship– many-one, one-one, many-many– limitations of arrows
• Attributes of relationships– not necessary, but useful
37
Product
name category
price
isa isa
Educational ProductSoftware Product
Age Groupplatforms
Subclasses in ER Diagrams
38
Subclasses
• Subclass = special case = fewer entities = more properties.
• Example: Ales are a kind of beer.– Not every beer is an ale, but some are.– Let us suppose that in addition to all the properties
(attributes and relationships) of beers, ales also have the attribute color.
39
Subclasses in ER Diagrams
• Assume subclasses form a tree.– I.e., no multiple inheritance.
• Isa triangles indicate the subclass relationship.– Point to the superclass.
40
Example
Beers
Ales
isa
name manf
color
41
Constraints
• A constraint = an assertion about the database that must be true at all times
• Part of the database schema
• Very important in database design
• When creating the ER diagram, you need to find as many constraints as possible
• And specify them on the ER diagram
42
Modeling ConstraintsFinding constraints is part of the modeling process. Commonly used constraints:
Keys: social security number uniquely identifies a person.
Single-value constraints: a person can have only one father.
Referential integrity constraints: if you work for a company, it must exist in the database.
Domain constraints: peoples’ ages are between 0 and 150. General constraints: all others (at most 50 students enroll in a class)
43
Why Constraints are Important?
• Give more semantics to the data– help us better understand it
• Prevent wrong data entry
• Allow us to refer to entities (e.g, using keys)
• Enable efficient storage, data lookup, etc.
44
Keys in E/R Diagrams
address name ssn
Person
Product
name category
price
No formal way to specify multiplekeys in E/R diagrams
Underline:
45
More about Keys
• Every entity set must have a key– why?
• A key can consist of more than one attribute
• There can be more than one key for an entity set– one key will be designated as primary key
• Requirement for key in an isa hierarchy– not covered in this lecture
46
Single Value Constraint
• At most one value play a particular role
• An attribute of a entity set has a single value– we can specify if the value must be present or can be
missing (represented with say NULL or -1)– example in real-estate domain
• price vs. house-style
• A many-one relation implies single value const.
47
Referential Integrity Constraint
• Single value constraint: at most one value exists in a given role
• Ref. int. constraint: exactly one value exists in a given role
• An attribute has a non-null, single value– this can be considered a kind of ref. int. constraint
• However, we more commonly use such constraints to refer to relationships
48
Referential Integrity Constraints
• In some formalisms we may refer to other object but get garbage instead– e.g. a dangling pointer in C/C++
• the Referential Integrity Constraint on relationships explicitly requires a reference to exists
49
Referential Integrity Constraints
CompanyProduct makes
CompanyProduct makes
• This will be even clearer once we get to relational databases
50
Other Kinds of Constraints
• Domain constraints
• Constraints on degree of a relationship
• Other more general constraints
• See the book
51
Weak Entity Sets
Entity sets are weak when their key attributes come from otherclasses to which they are related.
This happens if:
- part-of hierarchies - splitting n-ary relations to binary.
UniversityTeam affiliation
numbersport name
)
52
Weak Entity Sets
• Occasionally, entities of an entity set need “help” to identify them uniquely.
• Entity set E is said to be weak if in order to identify entities of E uniquely, we need to follow one or more many-one relationships from E and include the key of the related entities from the connected entity sets.
53
Now, about design techniques ...
54
Design Principles 1: Be Faithful
PurchaseProduct Person
President PersonCountry
Teaches CourseInstructor
55
Design Principles 2:Avoid Redundancy
Purchase
Product
Store
date
personNamepersonAddr
56
Design Principles 3: KISS
Purchase
Product
Person
Store
dateDates
• Read the book for more design principles
57
More on Design Techniques (from Ullman's slides)
1. Avoid redundancy.
2. Limit the use of weak entity sets.
3. Don’t use an entity set when an attribute will do.
58
Avoiding Redundancy
• Redundancy occurs when we say the same thing in two different ways.
• Redundancy wastes space and (more importantly) encourages inconsistency.– The two instances of the same fact may become
inconsistent if we change one and forget to change the other, related version.
59
Example: Good
Beers ManfsManfBy
name
This design gives the address of each manufacturer exactly once.
name addr
60
Example: Bad
Beers ManfsManfBy
name
This design states the manufacturer of a beer twice: as an attribute and as a related entity.
name
manf
addr
61
Example: Bad
Beers
name
This design repeats the manufacturer’s address once for each beer; loses the address if there are temporarily no beers for a manufacturer.
manf manfAddr
62
Entity Sets Versus Attributes
• An entity set should satisfy at least one of the following conditions:
– It is more than the name of something; it has at least one nonkey attribute.
or– It is the “many” in a many-one or many-many
relationship.
63
Example: Good
Beers ManfsManfBy
name
•Manfs deserves to be an entity set because of the nonkey attribute addr.
•Beers deserves to be an entity set because it is the “many” of the many-one relationship ManfBy.
name addr
64
Example: Good
Beers
name
There is no need to make the manufacturer an entity set, because we record nothing about manufacturers besides their name.
manf
65
Example: Bad
Beers ManfsManfBy
name
Since the manufacturer is nothing but a name, and is not at the “many” end of any relationship, it should not be an entity set.
name
66
Don’t Overuse Weak Entity Sets
• Beginning database designers often doubt that anything could be a key by itself.– They make all entity sets weak, supported by all
other entity sets to which they are linked.
• In reality, we usually create unique ID’s for entity sets.– Examples include social-security numbers,
automobile VIN’s etc.
67
When Do We Need Weak Entity Sets?
• The usual reason is that there is no global authority capable of creating unique ID’s.
• Example: it is unlikely that there could be an agreement to assign unique player numbers across all football teams in the world.
68
ER Review
• Basic stuff– entity, attribute, entity set– relation: binary, multiway, converting from multiway– relationship roles, attributes on relationships– subclasses (is-a)
• Constraints– on relations
• many-one, one-one, many-many
• limitations of arrows
– keys, single-valued, ref integrity, domain & general constraints
69
ER Review
• Weak entity set
• Design principles– be faithful – avoid redundancy– KISS