sql basics and normalization

50
Examples of Data Types Examples of Data Types

Upload: dotram

Post on 12-Feb-2017

234 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: SQL Basics and Normalization

Examples of Data TypesExamples of Data Types

Page 2: SQL Basics and Normalization

Constraints in Create Table

• Adding constraints to a table enables the

database system to enforce data integrity.

• However, adding constraints also makes

inserting data slower.

• Different types of constraints:

* Not Null * Default Values

* Unique * Primary Key

* Foreign Key * Check Condition

Page 3: SQL Basics and Normalization

Not Null Constraint

CREATE TABLE Employee(

ID INTEGER NOT NULL,

Fname VARCHAR(20),

Lname VARCHAR(20),

Gender CHAR(1),

Salary INTEGER NOT NULL,

Dept INTEGER

);

Page 4: SQL Basics and Normalization

Default Values

CREATE TABLE Employee(

ID INTEGER NOT NULL,

Fname VARCHAR(20),

Lname VARCHAR(20),

Gender CHAR(1) default(‘F’),

Salary INTEGER NOT NULL,

Dept INTEGER

);

Page 5: SQL Basics and Normalization

Unique Constraint

(Syntax 1)

CREATE TABLE Employee(

ID INTEGER UNIQUE NOT NULL,

Fname VARCHAR(20),

Lname VARCHAR(20),

Gender CHAR(1) default(‘F’),

Salary INTEGER NOT NULL,

Dept INTEGER

);

Page 6: SQL Basics and Normalization

Unique Constraint

(Syntax 2)

CREATE TABLE Employee(

ID INTEGER NOT NULL,

Fname VARCHAR(20),

Lname VARCHAR(20),

Gender CHAR(1) default(‘F’),

Salary INTEGER NOT NULL,

Dept INTEGER,

UNIQUE(ID)

);

Page 7: SQL Basics and Normalization

Unique Constraint

(Another Example)

Can this be written differently?

CREATE TABLE Employee(

ID INTEGER NOT NULL,

Fname VARCHAR(20),

Lname VARCHAR(20),

Gender CHAR(1) default(‘F’),

Salary INTEGER NOT NULL,

Dept INTEGER,

UNIQUE(FNAME,LNAME)

);

Page 8: SQL Basics and Normalization

Primary Key Constraint

Primary Key implies: * NOT NULL * UNIQUE.

There can only be one primary key.

CREATE TABLE Employee(

ID INTEGER PRIMARY KEY,

Fname VARCHAR(20),

Lname VARCHAR(20),

Gender CHAR(1),

Salary INTEGER NOT NULL,

Dept INTEGER,

UNIQUE(FNAME,LNAME)

);

Page 9: SQL Basics and Normalization

Primary Key Constraint

(Syntax 2)

CREATE TABLE Employee(

ID INTEGER,

Fname VARCHAR(20),

Lname VARCHAR(20),

Gender CHAR(1),

Salary INTEGER NOT NULL,

Dept INTEGER,

PRIMARY KEY(FNAME,LNAME)

);

Page 10: SQL Basics and Normalization

CREATE TABLE Department(

DeptNumber INTEGER PRIMARY KEY,

Name VARCHAR(20),

ManagerId INTEGER

);

Shouldn’t all department numbers in

Employee appear in Department?

CREATE TABLE Employee(

ID INTEGER primary key,

Fname VARCHAR(20),

Lname VARCHAR(20),

Gender CHAR(1),

Salary INTEGER NOT NULL,

Dept INTEGER );

Page 11: SQL Basics and Normalization

Foreign Key

ID FName LName Gender Sallar

y

Dept

02334 Larry Bird M 230000 12

04556 Magic Johnso

n

M 270000 45

��Foreign KeyForeign Key

Dept Name ManID

12 Sales 988

45 Repair 876

Employee

��DepartmentDepartment

Page 12: SQL Basics and Normalization

Foreign Key Constraint

Note 1: DeptNumber must be unique (or primary key) in

Department

Note 2:You can use this syntax only if the name of the

fields in both tables are identical

Note 3: The referencing attribute (DeptNumber) can be

null

CREATE TABLE Employee (

ID INTEGER primary key,

Fname VARCHAR(20),

Lname VARCHAR(20),

Gender CHAR(1),

Salary INTEGER NOT NULL,

DeptNumber INTEGER REFERENCES Department

);

Page 13: SQL Basics and Normalization

Foreign Key Constraint (different names of

attributes)

CREATE TABLE Employee(

ID INTEGER primary key,

Fname VARCHAR(20),

Lname VARCHAR(20),

Gender CHAR(1),

Salary INTEGER NOT NULL,

Dept INTEGER REFERENCES

Department(DeptNumber)

);

Page 14: SQL Basics and Normalization

Foreign Key Constraint

CREATE TABLE Employee(

ID INTEGER primary key,

Fname VARCHAR(20),

Lname VARCHAR(20),

Gender CHAR(1),

Salary INTEGER NOT NULL,

DeptNumber INTEGER,

FOREIGN KEY (DeptNumber)

REFERENCES Department

);

Using this syntax allows a pair or more of attributes to be

a foreign key

Page 15: SQL Basics and Normalization

Deleting referenced

values

What happens to Employeess in

department 312 when Department 312

is removed from the Department

table?

Page 16: SQL Basics and Normalization

Deleting a Referenced ValueDeleting a Referenced Value

If nothing additional is specified, then

postgreSQL will not allow Department 312 to

be deleted if there are Employees working in

(referencing to) this department.

You can also specify this explicitly:

FOREIGN KEY (Dept) REFERENCES

Department ON DELETE RESTRICT

�Alternatively, if the constraint is written

as

FOREIGN KEY (Dept) REFERENCES

Department ON DELETE CASCADE

then Employees working in 312 will be deleted

automatically from the Employee table, when

312 is deleted from Departments

Page 17: SQL Basics and Normalization

Check Conditions

A check condition is a Boolean

expression:

� “And”s and “Or”s of conditions of the

type X > 5…

�On a column: it can refer only to

the column

�On a table: it can refer only to

multiple columns in the table

Page 18: SQL Basics and Normalization

Check Constraints

CREATE TABLE Employee(CREATE TABLE Employee(

ID INTEGER primary key,ID INTEGER primary key,

FnameFname VARCHAR(VARCHAR(2020),),

LnameLname VARCHAR(VARCHAR(2020),),

Gender CHAR(Gender CHAR(11) ) check(gender=‘F’ or check(gender=‘F’ or

gender=‘M’),gender=‘M’),

Salary INTEGER NOT NULL,Salary INTEGER NOT NULL,

DeptNumberDeptNumber INTEGER INTEGER

););

Page 19: SQL Basics and Normalization

Deleting a Table

To delete the table Employee :

DROP TABLE Employee;

Mind the order of dropping when there are foreign key

constraints.

Page 20: SQL Basics and Normalization

From ER Diagrams To Relational Database

Generally, but not always

• An entity set is converted into a table

• A relationship is converted into a table

When converting ER diagrams to Relations, we should try to:

1. Reduce duplicate information

2. Constrain as tightly as possible

Note:

1. Some scenarios can be represented in different ways.

2. Sometimes we will not be able to fully represent

constraints, or will be forced to represent

information more than once.

Page 21: SQL Basics and Normalization

Simple entity translationSimple entity translation

General Rule:

� Create a table with the name of the Entity.

� There is a column for each attribute

� The key in the diagram is the primary key of the

table

ActorActor

idid

namename addressaddress

birthdaybirthday

create table Actor(id varchar(20) primary key,

name varchar(40),

birthday date,

address varchar(100));

Page 22: SQL Basics and Normalization

Translating Entities with Relationships Translating Entities with Relationships

(without constraints)(without constraints)

1. Create tables for the entities as before

2. Create a table with the name of the relationship

3. Relationship table attributes: its own attributes (salary) + all

keys of the relating entities (title, id).

Q: What is the primary key of the table?

A: A composite of both entity keys

Q: What foreign keys are needed?

A: From the relationship table to the entities

Actor

id

name

address

birthday

Acted In

Film

title

type

year

salary

Page 23: SQL Basics and Normalization

create table ActedIn(id varchar(20) references

actor, title varchar(20) references film,salary

integer, primary key(id,title));

Page 24: SQL Basics and Normalization

Translating Recursive Relationships Translating Recursive Relationships

(without constraints)(without constraints)

Employee

id

name

address

Manages

manager

worker

Relation: Manages (Wid, Mid)

What would be the table definition?

create table Manages(

Eid varchar(20) references Employee(id),

Mid varchar(20) references Employee(id),

Primary key(Eid, Mid));How would you prevent someone

from being his own manager?

Page 25: SQL Basics and Normalization
Page 26: SQL Basics and Normalization

More About The Example

The given ER diagram is clear, other than

• Discovered, which is the continent in which a

particular species was first discovered

Each child is a “dependant” of only one employee in our

database

• If both parents are employees, the child is

“assigned” to one of them

We are given additional information about the application

• Values of attributes in a primary key must not be

missing (this is a general rule, not only for this

example)

• Other than attributes in a primary key, other

attributes unless stated otherwise may be missing

• The value of Name is known for every Employee

To build up our intuition, let’s look at some specific

instance of our application

Page 27: SQL Basics and Normalization

Country

There are four countries, listing for them: Country name,

Population

•United States,

•India, 1150

•Canada, 1330

•Russia,

We create a table for Country “in the most obvious way,”

by creating a column for each attribute (underlying the

attributes of the primary key) and this works:

Country Name Population

United States

India 1150

Canada 1330

Russia

Note that some “slots” are NULL, indicated by emptiness

Page 28: SQL Basics and Normalization

Animal

There are five animals, listing for them: Species,

Discovered (note, that even though not required,

Discovered happens to be known for every Species):

• Horse, Asia

• Wolf, Asia

• Cat, Africa

• Yak, Asia

• Zebra, Africa

We create a table for Animal as before, and this

works:

Species Discovered

Horse Asia

Wolf Asia

Cat Africa

Yak Asia

Zebra Africa

Page 29: SQL Basics and Normalization

Employee

There are five employees, listing for them: ID#, Name,

(name of) Child (note there may be any number of Child

values for an Employee, zero or more):

• 9168, Nitin, Shivam, Bhargav

• 9270, Damodar, Anushka

• 1234, Shrikant, Monali,Rutuja

• 2140, Gajanan,

• 2167, Shrikala, Rajan, Harshad

ID Name Child Child

9168 Nitin Shivam Bhargav

9270 Damodar Anushka

1234 Shrikant Rutuja Monali

2140 Gajanan

2167 Shrikala Rajan Harshad

Page 30: SQL Basics and Normalization

Likes

Note that there are foreign key constraints

• ID# appearing in Likes is a foreign key

referencing Employee

• Species appearing in Likes is a foreign key

referencing Animal

And two many-to-one mappings are induced

ID Name

9168 Nitin

9270 Damodar

1234 Shrikant

2140 Gajanan

2167 Shrikala

ID Species

9168 Horse

9168 Horse

9270 Yak

2140 Zebra

2167 Zebra

Species Discovered

Horse Asia

Wolf Asia

Cat Africa

Yak Asia

Zebra Africa

Page 31: SQL Basics and Normalization

Born

• Born needs to specify which employees were born in which countries

(for whom this information is known)

• Such specification can done using the primary keys of the entities/

tables

• The relation Born contains some tuples:

ID Country Name

9168 India

9270 India

1234 United State

2140 Canada

2167 Russia

Page 32: SQL Basics and Normalization

Functional DependenciesFunctional Dependencies

We say an attribute, B, has a functional dependency on another attribute, A, if

for any two records, which have

the same value for A, then the values for B in these two records must be the

same. We illustrate this as:

A � B

Example: Suppose we keep track of employee email addresses, and we

only track one email address for each employee. Suppose each employee

is identified by their unique employee number. We say there is a

functional dependency of email address on employee number:

employee number � email address

Page 33: SQL Basics and Normalization

Functional DependenciesEmpNum EmpEmail EmpFname EmpLname

If EmpNum is the PK then the FDs:

EmpNum � EmpEmail

EmpNum � EmpFname

EmpNum � EmpLname

must exist.

Page 34: SQL Basics and Normalization

Functional Dependencies

EmpNum � EmpEmail

EmpNum � EmpFname

EmpNum � EmpLname

EmpNum

EmpEmail

EmpFname

EmpLname

EmpNum EmpEmail EmpFname EmpLname

3 different ways you

might see FDs

depicted

Page 35: SQL Basics and Normalization

Determinant

Functional Dependency

EmpNum � EmpEmail

Attribute on the LHS is known as the determinant

• EmpNum is a determinant of EmpEmail

Page 36: SQL Basics and Normalization

Transitive dependency

Transitive dependency

Consider attributes A, B, and C, and where

A � B and B � C.

Functional dependencies are transitive, which means that we also

have the functional dependency A � C

We say that C is transitively dependent on A through B.

Page 37: SQL Basics and Normalization

Transitive dependency

EmpNum EmpEmail DeptNum DeptNname

EmpNum EmpEmail DeptNum DeptNname

DeptName is transitively dependent on EmpNum via DeptNum

EmpNum � DeptName

EmpNum � DeptNum

DeptNum � DeptName

Page 38: SQL Basics and Normalization

Partial dependencyA partial dependency exists when an attribute B is functionally

dependent on an attribute A, and A is a component of a multipart

candidate key.

InvNum LineNum Qty InvDate

Candidate keys: {InvNum, LineNum} InvDate is partially

dependent on {InvNum, LineNum} as InvNum is a determinant of

InvDate and InvNum is part of a candidate key

Page 39: SQL Basics and Normalization

Normalization Process in DatabasesNormalization Process in Databases

Page 40: SQL Basics and Normalization

Database NormalizationDatabase Normalization

• Database normalization is the process of

removing redundant data from your tables

in to improve storage efficiency, data

integrity, and scalability.

• In the relational model, the methods for

quantifying how efficient a database is we

can use methods which are classified as

normal forms (or NF). and there are

algorithms for converting a given database

between them.

Page 41: SQL Basics and Normalization

• Typically, a Normalization process

generally involves splitting existing

tables into multiple ones, which must be

re-joined or linked each time a query is

issued.

Page 42: SQL Basics and Normalization

HistoryHistory

• Edgar F. Codd first proposed the processof normalization and what came to be knownas the 1st normal form in his paper “ARelational Model of Data for Large SharedData Banks**” Codd stated:

“There is, in fact, a very simpleelimination procedure which we shall callnormalization. Through decomposition non-simple domains are replaced by ‘domainswhose elements are atomic (non-decomposable) values.’”

** http://www.seas.upenn.edu/~zives/03f/cis550/codd.pdf** http://www.seas.upenn.edu/~zives/03f/cis550/codd.pdf

Page 43: SQL Basics and Normalization

Normal FormNormal Form

• Edgar F. Codd originally established three normal

forms: 1NF, 2NF and 3NF.

• There are now others that are generally accepted,

but 3NF is widely considered to be sufficient for

most applications.

• Most tables when reaching 3NF are also in BCNF

(Boyce-Codd Normal Form).

Page 44: SQL Basics and Normalization

•• 11NFNF EliminateEliminate RepeatingRepeating GroupsGroups - Make a

separate table for each set of related

attributes, and give each table a primary

key.

•• 22NFNF EliminateEliminate RedundantRedundant DataData - If an

attribute depends on only part of a multi-

valued key, remove it to a separate table.

•• 33NFNF EliminateEliminate ColumnsColumns NotNot DependentDependent OnOn KeyKey

- If attributes do not contribute to a

description of the key, remove them to a

separate table.

Page 45: SQL Basics and Normalization

Movie_idMovie_id TitleTitle ActorsActors

11 ShaanShaan AmitabhAmitabh BachchanBachchan

ShashiShashi KapoorKapoor

Sunil Sunil DuttDutt

ShatruganShatrugan SinhaSinha

22 DhoomDhoom AbhishekAbhishek BachchanBachchan

UdayUday ChopraChopra

33 KoiKoi Mil GayaMil Gaya HritikHritik RoshanRoshan

44 JodhaJodha AkbarAkbar HritikHritik RoshanRoshan

55 JaniJani DushmanDushman Sunny Sunny DeolDeol

Movie List

Page 46: SQL Basics and Normalization

Movie_id Title

1 Shaan

2 Dhoom

3 Koi Mil Gaya

4 Jodha Akbar

5 Jani Dushman

Actor Id Movie_id Actors

1 1 Amitabh Bachchan

2 1 Shashi Kapoor

3 1 Sunil Dutt

4 1 Shatrugan Sinha

5 2 Abhishek Bachchan

6 2 Uday Chopra

7 3 Hritik Roshan

8 4 Hritik Roshan

9 5 Sunney Deol

1 NF1 NF

SeparatingSeparating thethe repeatingrepeating

groupsgroups ofof databasesdatabases

fromfrom thethe membermember

informationinformation resultsresults inin

firstfirst normalnormal formform..

Movie

Table

Actor

Table

Page 47: SQL Basics and Normalization

Actor Id Movie_id Actors

1 1 Amitabh Bachchan

2 1 Shashi Kapoor

3 1 Sunil Dutt

4 1 Shatrugan Sinha

5 2 Abhishek

Bachchan

6 2 Uday Chopra

7 3 Hritik Roshan

8 4 Hritik Roshan

9 5 Sunney Deol

Actor

Table

2 NF2 NF

EliminateEliminate RedundantRedundant

DataData inin SecondSecond normalnormal

formform..

Page 48: SQL Basics and Normalization

Movie_id Title

1 Shaan

2 Dhoom

3 Koi Mil Gaya

4 Jodha Akbar

5 Janni Dushman

Movie

Table

Actor Id Actors

1 Amitabh Bachchan

2 Shashi Kapoor

3 Sunil Dutt

4 Shatrugan Sinha

5 Abhishek Bachchan

6 Uday Chopra

7 Hritik Roshan

8 Sunney Deol

Actor

Table

Actor Id Movie_id

1 1

2 1

3 1

4 1

5 2

6 2

7 3

7 4

8 5

Mo

vie

_A

cto

r T

ab

le

Page 49: SQL Basics and Normalization

3 3 NFNF

Eliminate Columns Not

Dependent On Key

Mem_idMem_id NameName CompanyCompany Company_LocationCompany_Location

11 ManojManoj IBMIBM HinjewadiHinjewadi

22 RajeshRajesh GSGS LabsLabs AundhAundh

33 RakeshRakesh PSPLPSPL ShardaSharda CenterCenter

44 PoorviPoorvi SymantecSymantec BanerBaner RoadRoad

55 PrathameshPrathamesh GSGS LabsLabs AundhAundh

66 GautamGautam GS LabsGS Labs AundhAundh

77 UdayUday SymantecSymantec BanerBaner RoadRoad

88 PankajPankaj PSPLPSPL ShardaSharda CenterCenter

Member Table Member Table

Page 50: SQL Basics and Normalization

Mem_idMem_id NameName Company_IdCompany_Id

11 ManojManoj 1

22 RajeshRajesh 2

33 RakeshRakesh 3

44 PoorviPoorvi 4

55 PrathameshPrathamesh 2

66 GautamGautam 2

77 UdayUday 4

88 PankajPankaj 3

Company_IdCompany_Id CompanyCompany Company Company

LocationLocation

1 IBMIBM HinjewadiHinjewadi

2 GSGS LabsLabs AundhAundh

3 PSPLPSPL ShardaSharda CenterCenter

4 SymantecSymantec BanerBaner RoadRoad

Member Table Member Table Company Table Company Table