1 introduction to database design methodology chapter 6 data redundancy and anomalies

23
1 Introduction to Introduction to Database Design Database Design Methodology Methodology Chapter 6 Data Redundancy and Anomalies

Upload: jaelyn-hadwin

Post on 15-Dec-2015

220 views

Category:

Documents


0 download

TRANSCRIPT

1

Introduction to Introduction to Database Design Database Design MethodologyMethodology

Chapter 6 Data Redundancy and Anomalies

2

Learning GoalsLearning Goals

To explain the problems associated with poorly-designed database

To identify good and poor table structures

3

Table of ContentsTable of Contents

Problems with Poorly-designed Database

Data Redundancy and Data Inconsistency

Anomalies in Modifying Data

Example of Poorly-designed Tables

4

5

Problems with Poorly-designed Problems with Poorly-designed Database Database In relational database, a poorly-designed

database is the one with poorly-designed tables.

Poorly-designed tables may lead to:

unnecessary data redundancy, which will

increase the chance of data inconsistency

require more storage space for storing the redundant data

different anomalies in modifying data in the database

6

Data Redundancy and Data Data Redundancy and Data Inconsistency Inconsistency Remark:

Data redundancy occurs when the same data are stored more than once in one or more tables in a relational database.

Data inconsistency means that the various copies of the data in a database no longer agree.

7

Anomalies in Updating Data Anomalies in Updating Data Anomalies refer to the inconsistencies or errors that may arise

when a user attempts to modify data in a database containing redundant data.

There are three main types of anomalies in modifying data in a database:

Update AnomalyIn updating a data item in a database, data inconsistency will arise if not all the corresponding copies of redundant data are updated.

Insertion AnomalyIt may not be able to insert certain data in a database unless some data of other entities are stored.

Deletion AnomalyDeletion of some data may result in unintended loss of data in other entities.

8

Example of Poorly-designed Example of Poorly-designed Tables Tables Example 1

Suppose you are a pop song fan and would like to develop a database to store the information of pop songs. The original design of the database consists of a table with the following structure:

song_id song_title art_id artist artist_web0001 A great song 002 Alan Chan www.alanxxx.org0002 Freedom 007 Peter Pan www.peterxxx.org0003 Escape 019 Edison Chow www.edxxxx.com0004 A poor song 002 Alan Chan www.alanxxx.org

…. ……… ……… ………..

Song_Artist

Assume that each song is associated with one artist only.

9

Example of Poorly-designed Example of Poorly-designed Tables Tables

Questions:

1. What is the primary key of the above table structure?

2. Are there any redundant data in the above table? If yes, give an example.

3. Give some examples of anomalies in updating data in the above table.

song_id song_title art_id artist artist_web0001 A great song 002 Alan Chan www.alanxxx.org0002 Freedom 007 Peter Pan www.peterxxx.org0003 Escape 019 Edison Chow www.edxxxx.com0004 A poor song 002 Alan Chan www.alanxxx.org

…. ……… ……… ………..

Example 1 (con’t)

Song_Artist

10

Example of Poorly-designed Example of Poorly-designed Tables Tables

Answers:

1. What is the primary key of the given table structure?

Clearly, song_id is the primary key field as its value is unique for each record.

2. Are there any redundant data in the given table? If yes, give an example.

Some data in the artist and artist_web fields are redundant. For example, records with song_id “0001” and “0004” have the same artist and artist_web values.

3. Give some examples of anomalies in modifying data in the above table.

(See next slide)

Example 1 (con’t)

11

Example of Poorly-designed Example of Poorly-designed Tables Tables

Example of Update Anomaly:

Suppose the artist_web of Alan Chan should be amended to www.alanyyy.org.

Carelessly, you only updated the artist_web of the first record (song_id = “0001”). The other records of Alan Chan (e.g. the 4th record) have not been updated.

The term for describing this situation is called data ___________ .

song_id song_title art_id artist artist_web0001 A great song 002 Alan Chan www.alanxxx.org0002 Freedom 007 Peter Pan www.peterxxx.org0003 Escape 019 Edison Chow www.edxxxx.com0004 A poor song 002 Alan Chan www.alanxxx.org

…. ……… ……… ………..

Example 1 (con’t)

Song_Artist

12

Example of Poorly-designed Example of Poorly-designed Tables Tables

Example of Insertion Anomaly:

Suppose we want to insert the information of a new artist, King Kong (art_id = “099”), into the above table. King Kong has not yet published any song.

The information of King Kong cannot be stored unless we also enter his song information. (Why?)

Example 1 (con’t)

song_id song_title art_id artist artist_web0001 A great song 002 Alan Chan www.alanxxx.org0002 Freedom 007 Peter Pan www.peterxxx.org0003 Escape 019 Edison Chow www.edxxxx.com0004 A poor song 002 Alan Chan www.alanxxx.org

…. ……… ……… ………..

Song_Artist

13

Example of Poorly-designed Example of Poorly-designed Tables Tables

Example of Deletion Anomaly:

Peter Pan had only published one song (“Freedom”) up to now.

Suppose you want to delete the record of the song “Freedom” (song_id=“0002”) from the table.

What would be the problem of deleting the record (song_id=“0002”) from the above table?

Example 1 (con’t)

song_id song_title art_id artist artist_web0001 A great song 002 Alan Chan www.alanxxx.org0002 Freedom 007 Peter Pan www.peterxxx.org0003 Escape 019 Edison Chow www.edxxxx.com0004 A poor song 002 Alan Chan www.alanxxx.org

…. ……… ……… ………..

Song_Artist

14

15

Task 1Task 1

Complete Task 1 in Activity 1 of Chapter 6.

16

Example of Poorly-designed Example of Poorly-designed Tables Tables

song_id song_title art_id artist artist_web0001 A great song 002 Alan Chan www.alanxxx.org0002 Freedom 007 Peter Pan www.peterxxx.org0003 Escape 019 Edison Chow www.edxxxx.com0004 A poor song 002 Alan Chan www.alanxxx.org

…. ……… ……… ………..

Example 1 (con’t)

Suppose the original table Song_Artist is decomposed into two tables Song and Artist:

song_id song_title art_id0001 A great song 0020002 Freedom 0070003 Escape 0190004 A poor song 002

…. ………

art_id artist artist_web002 Alan Chan www.alanxxx.org007 Peter Pan www.peterxxx.org019 Edison Chow www.edxxxx.com….. ……… ………..

(relationship)

Song_Artist

Song Artist

17

Example of Poorly-designed Example of Poorly-designed Tables Tables Example 1 (con’t)

Questions

1. Are there any redundant data in the above tables?

2. Will similar anomalies occur when modifying the records with this design?

Answers

1. Except the data in the foreign key (art_id) of the table Song, there is no redundant data in the new tables.

2. All the anomalies in modifying data in the database have been removed.

18

19

Task 2Task 2

Complete Task 2 in Activity 1 of Chapter 6.

20

Conclusions from Example 1 Conclusions from Example 1 Conclusions from the findings in Example 1:

1. Poorly-designed table structures may lead to:

Unnecessary data redundancy

Anomalies in:

Updating

Insertion

Deletion

2. Poor table structures can be “repaired” by decomposing the tables into smaller ones.

In the above Example, it is quite nature to decompose the original table Song_Artist into the tables Song and Artist.

If we construct the ER diagram for the problem, the two basic entities that would be identified are probably Song and Artist.

21

Techniques for Constructing Techniques for Constructing Well-Designed Tables Well-Designed Tables

In Chapters 4 and 5, you learnt how to construct an ER diagram and to transform the diagram into table structures for designing a relational database.

Although this approach can help us identify the entities, attributes and relationships in a database, the identification process is mainly based on the collected users’ requirements and some common senses.

It is still possible that the resulting tables are not well-designed.

22

Techniques for Constructing Techniques for Constructing Well-Designed TablesWell-Designed Tables

Badly-structured tables may arise from the errors in the original ER diagram or in the process of translating the ER diagram into tables.

Therefore, we need a more standardized technique for: Checking whether the tables are well-structured;

“Repairing” poorly-designed table structures (mainly by decomposing the original tables);

Assisting the construction of tables from the collected users’ requirements.

The technique we will study in next Chapter is called “normalisation”, which was introduced by Dr. E.F. Codd in 1972. (http://en.wikipedia.org/wiki/Database_normalization )

23

ReferencesReferences http://en.wikipedia.org/wiki/Database_normalization

http://adbc.kennesaw.edu/

http://mis.chna.edu.tw/teacher_web/hhhsu/資料庫系統 _9.ppt

http://web.ydu.edu.tw/~taniah/f7970-03.ppt

Peter Rob, Carlos Coronel, Database Systems – Design, Implementation, & Management, 5th Ed., Thomson Learning, Chapter 4.

Thomas Connolly, Carolyn Begg, Database Solutions, end Ed., Pearson Addison Wesley, Chapter 8.