dbms mca-section a

Introduction to DBMS

Definition Of Database –

A database is a collection of information that is organized so that it can easily beaccessed, managed, and updated. In one view, databases can be classifiedaccording to types of content: bibliographic, full-text, numeric, and images.

In computing, databases are sometimes classified according to theirorganizational approach. The most prevalent approach is the relational database,a tabular database in which data is defined so that it can be reorganized andaccessed in a number of different ways.

A distributed database is one that can be dispersed or replicated amongdifferent points in a network.

An object-oriented programming database is one that is congruent with thedata defined in object classes and subclasses.

Computer databases typically contain aggregations of data records or files,such as sales transactions, product catalogs and inventories, and customerprofiles. Typically, a database manager provides users the capabilities ofcontrolling read/write access, specifying report generation, and analyzingusage

Database –

Often abbreviated DB. A collection of information organized in such a way that acomputer program can quickly select desired pieces of data. You can think of adatabase as an electronic filing system.

To access information from a database, you need a database management system(DBMS). This is a collection of programs that enables you to enter, organize, andselect data in a database.

Database Management System (DBMS) -

A database management system (DBMS) is a software package designed to define,manipulate, retrieve and manage data in a database. A DBMS generallymanipulates the data itself, the data format, field names, record structure and filestructure. It also defines rules to validate and manipulate this data. A DBMSrelieves users of framing programs for data maintenance. Fourth-generation querylanguages, such as SQL, are used along with the DBMS package to interact with adatabase.

Data Base Management System

DBMS is a collection of programs that enables users to create and maintain adata base. DBMS is a general purpose software system that provides theprocess of (i) Defining (ii) Constructing & (iii) manipulating database forapplications.

• Defining : It involves specifying the data type structures, and constraints

for the data to be stored in the data base.

• Constructing : It is the process of storing the data itself on some storage

medium that is controlled by the DBMS.

• Manipulating : It is a data base includes function as querying the database

to retrieve specific data, updating the database to reflect changes.

Types Of Database organization

There are four main types of database organization:

Relational Database: Data is organized as logically independent tables.Relationships among tables are shown through shared data. The data in one tablemay reference similar data in other tables, which maintains the integrity of thelinks among them. This feature is referred to as referential integrity - an importantconcept in a relational database system. Operations such as select and join can beperformed on these tables. This is the most widely used system of databaseorganization.

Flat Database: Data is organized in a single kind of record with a fixed number offields. This database type encounters more errors due to the repetitive nature ofdata.

Object Oriented Database: Data is organized with similarity to object oriented programming concepts. An object consists of data and methods, while classes group objects having similar data and methods.

Hierarchical Database: Data is organized with hierarchical relationships. It becomes a complex network if the one-to-many relationship is violated.

Database Environment

CASE

Tools

DBMS

User

Interface

Application

Programs

Repository Database

Database Components

DBMS===============

Design toolsTable Creation

Form Creation

Query Creation

Report Creation

Procedural

language

compiler (4GL)

=============

Run timeForm processor

Query processor

Report Writer

Language Run time

User

Interface

Applications

Application

ProgramsDatabase

Database contains:

User’s Data

Metadata

Indexes

Application Metadata

Types of Database Systems

PC Databases Centralized Databases

Central

Computer

Types of Database Systems

• Client Server Databases

NetworkClient

Client

Client

Database

Server

Distributed Databases

From File Systems to DBMS

Problems with file processing systems

• Inconsistent data• Inflexibility• Limited data sharing• Poor enforcement of standards• Excessive program maintenance

DBMS Benefits

• Minimal data redundancy• Consistency of data• Integration of data• Sharing of data• Ease of application development• Uniform security, privacy, and integrity controls• Data accessibility and responsiveness• Data independence• Reduced program maintenance

Database Management System Vs File

Management System

• File Management Systems

Advantages Disadvantages

Simpler to use Typically does not support multi-user access

Less expensive· Limited to smaller databases

Fits the needs of many small businesses

and home users

Limited functionality (i.e. no support for

complicated transactions, recovery, etc.)

Popular FMS’s are packaged along with the

operating systems of personal computers

(i.e. Microsoft Cardfile and Microsoft

Works)

Decentralization of data

Good for database solutions for hand held

devices such as Palm PilotRedundancy and Integrity issues

Database Management System Vs File

Management System

Advantages Disadvantages

Simpler to use Typically does not support multi-user access

Less expensive· Limited to smaller databases

Fits the needs of many small businesses and

home users

Limited functionality (i.e. no support for

complicated transactions, recovery, etc.)

Popular FMS’s are packaged along with the

operating systems of personal computers

(i.e. Microsoft Cardfile and Microsoft

Works)

Decentralization of data

Good for database solutions for hand held

devices such as Palm PilotRedundancy and Integrity issues

Database Management Systems

File Oriented Approach The file processing system or File Management System is used to store data in a computerizeddata base. Before the advent of DBMS, the application programs are defined & maintained in amaster file and other supporting transaction files. Hence one master file with one or moretransaction files are used. Here each user defines & implements the files needed for specificapplications as part of programming application. The draw backs of file oriented approach arefollowing:

(1) It leads to Data redundancy : The same data may be present in more than one file.

(2) Wasted Memory Space : Redundant or duplicate copies of the same data results in wastage of storage space.

(3) Loss of data Integrity : Data redundancy leads to inconsistency problems.It results that the same data item has a different value in different files.

(4) Difficulty in accessing data : To access data from the files, programming language programs must be written.

(5) Information is available in reports : Information is available only through reports, in between queries are not supported.

(6) Data Isolation : Data are accessed through reports. This results in data isolation.

(7) Inadequate Security : Only operating system level security is provided. In depth security is not provided.

(8) High Maintenance Cost : Software maintenance cost will be very expense. (9) Each data file of an application is a separate entity.

Traditional Approach for Data Storage and the Need of DBMS

• Traditional Data Storage Model

In traditional approach, information is stored in flat files which aremaintained by the file system under the operating system’s control.Application programs go through the file system in order to access these flatfiles

Traditional Approach for Data Storage and the Need of DBMS

How data is stored in flat files

• Data is stored in flat files as records.

• Records consist of various fields which are delimited by a space, comma, pipe, any special character etc.

• End of records and end of files will be marked using any predetermined character set or special characters in order to identify them

• Example: Storing employee data in flat files

Problems with traditional approach for storing data

• Data Security

The data stored in the flat file(s) can be easily accessible and hence it is not secure.

Example: Consider an online banking application where we store the account related information of all customers in flat files. A customer will have access only to his account related details. However from a flat file, it is difficult to put such constraints. It is a big security issue.

• Data Redundancy

In this storage model, the same information may get duplicated in two or more files. This may lead to higher storage and access cost. it also may lead to data inconsistency.

For Example, assume the same data is repeated in two or more files. If a changeis made to data stored in one file, other files also needs to be change accordingly.Example: Assume employee details such as first name, last name, emailed arestored in employee details file and employee salary file. If a change needs to bemade to emailed, both employee details file and emplyee_salary file need to beupdated otherwise it will lead to inconsistent data.


• Data Isolation

Data Isolation means that all the related data is not available in one file. Usually thedata is scattered in various files having different formats. Hence writing newapplication programs to retrieve the appropriate data is difficult.

• Lack of Flexibility

The traditional systems are able to retrieve information for predetermined requests fordata. If we need unanticipated data, huge programming effort is needed to make theinformation available, provided the information is there in the files. By the time theinformation is made available, it may no longer be required or useful.

Example :

Consider a software application which is able to generate employee salary report.Assume that all the data is stored in flat files. Suppose we now have a requirement toretrieve all the employee details whose salary is greater than Rs.10000. It is not easy togenerate such on-demand reports and lot of time is needed for application developers tomodify the application to meet such requirements.


• Program/Data Dependence

In traditional file approach, application programs are closely dependent on the files inwhich data is stored. If we make any changes in the physical format of the file(s), likeaddition of a data field , etc, all application programs needs to be changed accordingly.Consequently, for each of the application programs that a programmer writes or maintains,the programmer must be concerned with data management. There is no centralizedexecution of the data management functions. Data management is scattered among all theapplication programs.

Example: Consider the banking system. An employee salary file exists which has details about the salary of employees. An employee salary record is described by

• employee_id• firstname• lastname• salary_amount

• An application program is available to display all the details about the salary of allemployees. Assume a new data field, the date_of_joining is added to the employeesalary file. Since the application program depends on the file, it also needs to be altered.

• If the physical format of the employee salary file for example the field delimiter, record delimiter, etc. are changed, it necessitates that the application program which depends on it, also be altered.


• Concurrent Access Anomalies

Many traditional systems allow multiple users to access and update the samepiece of data simultaneously. However this concurrent updates may result ininconsistent data. To guard against this possibility, the system must maintainsome form of supervision. But supervision is difficult because data may beaccessed by many different application programs and these applicationprograms may not have been coordinated previously.

• Example :

Consider a personal information system which has the data of all employees.Now there may be an employee updating his address details in the systemand at the same time, an administrator may be taking a report containing thedata of all employees. This is called concurrent access. Since the employee'saddress is being updated at the same time, there is a possibility of theadministrator reading an incorrect address.

These difficulties lead to the development of database systems.

Master and Transaction Files

• Master File

The data stored in these files are permanent by nature

This file is empty while nature

This files are updated only through recent transactions

This file stores large amount of data

Example : customer ledgers, student database

• Transaction File

The data stored in these files are temporary by nature

This file contains data only for period of time and send to the master file

Any data to be modified is done in this file

In this file the data to be modified is stored .

Example : price of the products, customers order for the products, inserting new data to the database etc.

Master and Transaction Files

Components of DBMS Environment

A database management system (DBMS) consists of several components.

Each component plays very important role in the database management

system environment. The major components of database management

system are:

Software

Hardware

Data

Procedures

Database Access Language


Software

• The main component of a DBMS is the software. It is the set of programs

used to handle the database and to control and manage the overall

computerized database

• DBMS software itself, is the most important software component in the

overall system

• Operating system including network software being used in network, to

share the data of database among multiple users.

• Application programs developed in programming languages such as

C++, Visual Basic that are used to access database in database

management system. Each program contains statements that request

the DBMS to perform operation on database. The operations may

include retrieving, updating, deleting data etc. . The application

program may be conventional or online workstations or terminals.


Hardware

Hardware consists of a set of physical electronic devices such as computers(together with associated I/O devices like disk drives), storage devices, I/Ochannels, electromechanical devices that make interface betweencomputers and the real world systems etc. and so on. It is impossible toimplement the DBMS without the hardware devices, In a network, apowerful computer with high data processing speed and a storage devicewith large storage capacity is required as database server.

Data

Data is the most important component of the DBMS. The main purpose ofDBMS is to process the data. In DBMS, databases are defined, constructedand then data is stored, updated and retrieved to and from the databases.The database contains both the actual (or operational) data and themetadata (data about data or description about data).


Procedures

• Procedures refer to the instructions and rules that help to design the

database and to use the DBMS. The users that operate and manage

the DBMS require documented procedures on hot use or run the

database management system. These may include.

• Procedure to install the new DBMS.

• To log on to the DBMS.

• To use the DBMS or application program.

• To make backup copies of database.

• To change the structure of database.

• To generate the reports of data retrieved from database.


Database Access Language

• The database access language is used to access the data to and from the database. The

users use the database access language to enter new data, change the existing data in

database and to retrieve required data from databases. The user write a set of

appropriate commands in a database access language and submits these to the DBMS.

The DBMS translates the user commands and sends it to a specific part of the DBMS called

the Database Jet Engine. The database engine generates a set of results according to the

commands submitted by user, converts these into a user readable form called an Inquiry

Report and then displays them on the screen. The administrators may also use the

database access language to create and maintain the databases.

• The most popular database access language is SQL (Structured Query Language).

Relational databases are required to have a database query language.


• Users

The users are the people who manage the databases and perform different operations on the databases in the database system. There are three kinds of people who play different roles in database system

Application Programmers

Database Administrators

End-Users


• Application Programmers

The people who write application programs in programming languages (such as

Visual Basic, Java, or C++) to interact with databases are called Application

Programmer.

• Database Administrators

A person who is responsible for managing the overall database management system

is called database administrator or simply DBA.

• End-Users

The end-users are the people who interact with database management system to

perform different operations on database such as retrieving, updating, inserting,

deleting data etc.

Database Schema versus Database InstanceWhile working with any data model, it is necessary to distinguish between the overalldesign or description of the database (database schema) and the database itself. Asdiscussed earlier, the overall design or description of the database is knownas database schema or simply schema. The database schema is also knownas intension of the database, and is specified while designing thedatabase.Figure 1.6 shows the database schema for Online Book database.

Database Schema versus Database Instance

The data in the database at a particular point of time is known as database instance or

database state or snapshot. The database state is also called an extension of the

schema.

The various states of the database are given here.

• Empty state: When a new database is defined, only its schema is specified. At this

point, the database is said to be in empty state as it contains no data.

• Initial state: When the database is loaded with data for the first time, it is said to

be in initial state.

• Current state: The data in the database is updated frequently. Thus, at any point

of time, the database is said to be in the current state

Database Schema versus Database Instance

The DBMS is responsible to check whether the database state is valid state. Thus, each time the database isupdated, DBMS ensures that the database remains in the valid state. The DBMS refers to DBMS catalogwhere the metadata is stored in order to check whether the database state satisfies the structure andconstraints specified in the schema. Figure 1.7 shows an example of an instance for PUBLISHER schema.

Fig. 1.7. An example of instance - PUBLISHER instance

P_ID Pname Address State Phone Email_id

P001 Hills Publications 12, Park street, Atlanta Georgia 7134019 [email protected]

P002 Sunshine Publishers Ltd.

45, Second street, Newark New Jersey 6548909 Null

P003 Bright Publications 123, Main street, Honolulu Hawai 7678985 [email protected]

P004 Paramount Publishing House

789, Oak street, New York New York 9254834 [email protected]

P005 Wesley Publications

456, First street, Las Vegas Nevada 5683452 Null

The schema and instance can be compared with a program written in a programming language. Thedatabase schema is similar to a variable declared along with the type description in a program. Thevariable contains a value at a given point of time. This value of a variable corresponds to an instance of adatabase schema

mailto:[email protected]



DBMS Architecture and Data Independence

The DBMS architecture describes how data in the database is viewed by the users. It isnot concerned with how the data is handled and processed by the DBMS. The databaseusers are provided with an abstract view of the data by hiding certain details of howdata is physically stored. This enables the users to manipulate the data withoutworrying about where it is located or how it is actually stored.

In this architecture, the overall database description can be defined at three levels,namely, internal, conceptual, and external levels and thus, named three-level DBMSarchitecture.

The three levels are discussed here.

Internal level:

It is the lowest level of data abstraction that deals with the physical representation ofthe database on the computer and thus, is also known as physical level. Itdescribes how the data is physically stored and organized on the storage medium. Atthis level, various aspects are considered to achieve optimal runtime performance andstorage space utilization. These aspects include storage space allocation techniques fordata and indexes, access paths such as indexes, data compression and encryptiontechniques, and record placement

DBMS Architecture and Data Independence

Conceptual level:

This level of abstraction deals with the logical structure of the entire database and thus, isalso known as logical level. It describes what data is stored in the database, therelationships among the data and complete view of the user’s requirements without anyconcern for the physical implementation. That is, it hides the complexity of physicalstorage structures. The conceptual view is the overall view of the database and it includesall the information that is going to be represented in the database.

External level:

It is the highest level of abstraction that deals with the user’s view of the database andthus, is also known as view level. In general, most of the users and application programsdo not require the entire data stored in the database. The external level describes a part ofthe database for a particular group of users. It permits users to access data in a way that iscustomized according to their needs, so that the same data can be seen by different usersin different ways, at the same time. In this way, it provides a powerful and flexiblesecurity mechanism by hiding the parts of the database from certain users, as the user isnot aware of existence of any attributes that are missing from the view.

DBMS Architecture and Data IndependenceThese three levels are used to describe the schema of the database at various levels.Thus, the three-level architecture is also known as three-schema architecture. Theinternal level has an internal schema, which describes the physical storage structureof the database. The conceptual level has a conceptual schema, which describes thestructure of entire database. The external level has external schemas or userviews, The three-schema architecture is shown in Figure 1.1.

DBMS Architecture ExampleOnline Book database as shown in Figure 1.2. In this figure, two views (view 1 and view 2) of

the BOOK file have been defined at the external level. Different database users can see these views. Thedetails of the data types are hidden from the users. At the conceptual level, the BOOK records aredescribed by a type definition. The application programmers and the DBA generally work at this level ofabstraction. At the internal level, the BOOK records are described as a block of consecutive storagelocations such as words or bytes. The database users and the application programmers are not aware ofthese details; however, the DBA may be aware of certain details of the physical organization of the data.

Advantages of Database SystemDatabase approach came into existence due to the bottlenecks of file processing system. In thedatabase approach, the data is stored at a central location and is shared among multiple users.Thus, the main advantage of DBMS is centralized data management. The centralized nature ofdatabase system provides several advantages, which overcome the limitations of theconventional file processing system. These advantages are listed here.

• Controlled data redundancy:

During database design, various files are integrated and each logical data item is stored atcentral location. This eliminates replicating the data item in different files, and ensuresconsistency and saves the storage space. Note that the redundancy in the database systemscannot be eliminated completely as there could be some performance and technical reasonsfor having some amount of redundancy. However, the DBMS should be capable of controllingthis redundancy in order to avoid data inconsistencies.

• Enforcing data integrity:

In database approach, enforcing data integrity is much easier. Various integrity constraints areidentified by database designer during database design. Some of these data integrityconstraints can be enforced automatically by the DBMS, and others may have to be checked bythe application programs.

• Data sharing:

The data stored in the database can be shared among multiple users or application programs. Moreover, new applications can be developed to use the same stored data. Due to shared data, it is possible to satisfy the data requirements of the new applications without having to create any additional data or with minimal modification.

Advantages of Database System

• Ease of application development

The application programmer needs to develop the application programs according to the users’needs. The other issues like concurrent access, security, data integrity, etc., are handled by theDBMS itself. This makes the application development an easier task.

• Data security

Since the data is stored centrally, enforcing security constraints is much easier. The DBMSensures that the only means of access to the database is through an authorized channel. Hence,data security checks can be carried out whenever access is attempted to sensitive data. Toensure security, a DBMS provides security tools such as user codes and passwords. Differentchecks can be established for each type of access (addition, modification, deletion, etc.) to eachpiece of information in the database.

• Multiple user interfaces

In order to meet the needs of various users having different technical knowledge, DBMS

provides different types of interfaces such as query languages, application program interfaces, and graphical user interfaces (GUI) that include forms-style and menu-driven interfaces. A form-style interface displays a form to each user and user interacts using these forms. In menu-driven interface, the user interaction is through lists of options known as menus.

Advantages of Database System

• Backup and recovery:

The DBMS provides backup and recovery subsystem that is responsible for recovery from hardware and

software failures. For example, if the failure occurs in between the transaction, the DBMS recovery

subsystem either reverts back the database to the state which existed prior to the start of the transaction or

resumes the transaction from the point it was interrupted so that its complete effect can be recorded in the

database.

• Data abstraction:

The property of DBMS that allows program-data independence is known as data abstraction. Data

abstraction allows the database system to provide an abstract view of the data to its users without giving

the physical storage and implementation details.

• Supports multiple views of the data:

A database can be accessed by many users and each of them may have a different perspective or view of the

data. A database system provides a facility to define different views of the data for different users. A view is

a subset of the database that contains virtual data derived from the database files but it does not exist in

physical form. That is, no physical file is created for storing the data values of the view; rather, only the

definition of the view is stored.

Disadvantage of DBMS

Although there are many advantages of DBMS, the DBMS may also have some minor disadvantages. These are:

• Cost of Hardware and Software

A processor with high speed of data processing and memory of large size is required to run the DBMS software. It means that you have to up grade the hardware used for file-based system. Similarly, DBMS software is also very costly,.

• Cost of Data Conversion

When a computer file-based system is replaced with database system, the data stored into data file must be converted to database file. It is very difficult and costly method to convert data of data file into database. You have to hire database system designers along with application programmers. Alternatively, you have to take the services of some software house. So a lot of money has to be paid for developing software.

• Cost of Staff Training

Most database management system are often complex systems so the training for users to use the DBMS is required. Training is required at all levels, including programming, application development, and database administration. The organization has to be paid a lot of amount for the training of staff to run the DBMS.

• Appointing Technical Staff

The trained technical persons such as database administrator, application programmers, data entry operations etc. are required to handle the DBMS. You have to pay handsome salaries to these persons. Therefore, the system cost increases.

• Database Damage

In most of the organization, all data is integrated into a single database. If database is damaged due to electric failure or database is corrupted on the storage media, the your valuable data may be lost forever.

Difference between Centralized and Distributed databases

Centralized database is a database in which data is stored and maintained in a single location.

This is the traditional approach for storing data in large enterprises. Distributed database is a

database in which data is stored in storage devices that are not located in the same physical

location but the database is controlled using a central Database Management System (DBMS).

In a centralized database, all the data of an organization is stored in a single place such as a

mainframe computer or a server. Users in remote locations access the data through the Wide

Area Network (WAN) using the application programs provided to access the data. The

centralized database (the mainframe or the server) should be able to satisfy all the requests

coming to the system, therefore could easily become a bottleneck. But since all the data reside in

a single place it easier to maintain and back up data. Furthermore, it is easier to maintain data

integrity, because once data is stored in a centralized database, outdated data is no longer

available in other places

Distributed database

• In a distributed database, the data is stored in storage devices that are located indifferent physical locations. They are not attached to a common CPU but the database iscontrolled by a central DBMS. Users access the data in a distributed database byaccessing the WAN. To keep a distributed database up to date, it uses the replication andduplication processes. The replication process identifies changes in the distributeddatabase and applies those changes to make sure that all the distributed databases lookthe same. Depending on the number of distributed databases, this process could becomevery complex and time consuming. The duplication process identifies one database as amaster database and duplicates that database. This process is not complicated as thereplication process but makes sure that all the distributed databases have the same data

.

What is the difference between Distributed distributed-database-and-vs-centralized-database

While a centralized database keeps its data in storage devices that are in a singlelocation connected to a single CPU, a distributed database system keeps its data instorage devices that are possibly located in different geographical locations andmanaged using a central DBMS.

A centralized database is easier to maintain and keep updated since all the data arestored in a single location. Furthermore, it is easier to maintain data integrity andavoid the requirement for data duplication. But, all the requests coming to accessdata are processed by a single entity such as a single mainframe, and therefore itcould easily become a bottleneck. But with distributed databases, this bottleneck canbe avoided since the databases are parallelized making the load balanced betweenseveral servers.

keeping the data up to date in distributed database system requires additionalwork, therefore increases the cost of maintenance and complexity and also requiresadditional software for this purpose.

Furthermore, designing databases for a distributed database is more complex than centralized data base .

Users are of 4 types:

1. Application programmers or Ordinary users

2. End users

3. Database Administrator (DBA)

4. System Analyst

Application programmers or Ordinary users:

These users write application programs to interact with the database. Applicationprograms can be written in some programming language such a COBOL, PL/I, C++,JAVA or some higher level fourth generation language. Such programs access thedatabase by issuing the appropriate request, typically a SQL statement to DBMS.

Types of users in DBMS

2. End Users:

End users are the users, who use the applications developed. End users need not know about the working, database design, the access mechanism etc. They just use the system to get their task done. End users are of two types:

a) Direct users b) Indirect users

a) Direct users:

Direct users are the users who se the computer, database system directly, by following instructions provided in the user interface. They interact using the application programs already developed, for getting the desired result. E.g. People at railway reservation counters, who directly interact with database.

b) Indirect users:

Indirect users are those users, who desire benefit form the work of DBMS indirectly. They use theoutputs generated by the programs, for decision making or any other purpose. They are justconcerned with the output and are not bothered about the programming part.


3. Database Administrator (DBA):

Database Administrator (DBA) is the person which makes the strategic and policy decisions

regarding the data of the enterprise, and who provide the necessary technical support for

implementing these decisions. Therefore, DBA is responsible for overall control of the system

at a technical level. In database environment, the primary resource is the database itself and

the secondary resource is the DBMS and related software administering these resources is the

responsibility of the Database Administrator (DBA).

4. System Analyst:

System Analyst determines the requirement of end users, especially naïve and parametric

end users and develops specifications for transactions that meet these requirements. System

Analyst plays a major role in database design, its properties; the structure prepares the system

requirement statement, which involves the feasibility aspect, economic aspect, technical aspect

etc. of the system.


Introduction to the Data Dictionary

One of the most important parts of an Oracle database is its data dictionary, which is a read-only set

of tables that provides information about the database. A data dictionary contains:

• The definitions of all schema objects in the database (tables, views, indexes, clusters, synonyms,

sequences, procedures, functions, packages, triggers, and so on)

• How much space has been allocated for, and is currently used by, the schema objects

• Default values for columns

• Integrity constraint information

• The names of Oracle users

• Privileges and roles each user has been granted

• Auditing information, such as who has accessed or updated various schema objects

• Other general database information

• The data dictionary is structured in tables and views, just like other database data. All the data

dictionary tables and views for a given database are stored in that

database's SYSTEM tablespace.

Structure of the Data Dictionary

The data dictionary consists of the following:

Base Tables

The underlying tables that store information about the associated database. Only Oracle should write to and read these tables. Users rarely access them directly because they are normalized, and most of the data is stored in a cryptic format.

User-Accessible Views

The views that summarize and display the information stored in the base tables of the datadictionary. These views decode the base table data into useful information, such as user ortable names, using joins and WHERE clauses to simplify the information. Most users aregiven access to the views rather than the base tables.

SYS, Owner of the Data Dictionary

The Oracle user SYS owns all base tables and user-accessible views of the data dictionary.No Oracle user should ever alter (UPDATE, DELETE, or INSERT) any rows or schemaobjects contained in the SYS schema, because such activity can compromise data integrity.The security administrator must keep strict control of this central account.

How the Data Dictionary Is Used

The data dictionary has three primary uses:

• Oracle accesses the data dictionary to find information about users, schema objects,

and storage structures.

• Oracle modifies the data dictionary every time that a data definition language (DDL)

statement is issued.

• Any Oracle user can use the data dictionary as a read-only reference for information

about the database.

Data Model

A database model is a type of data model that determines the logical structure of a database

and fundamentally determines in which manner data can be stored, organized, and

manipulated. The most popular example of a database model is the relational model (or the

SQL approximation of relational), which uses a table-based format.

Common logical data models for databases include:

Hierarchical database model Network model Relational model Entity–relationship model Enhanced entity–relationship model Object model

Data Model

• A hierarchical database model is a data model in which the data is organized into a tree-like structure.

The structure allows representing information using parent/child relationships: each parent can have

many children, but each child has only one parent (also known as a 1-to-many relationship). All

attributes of a specific record are listed under an entity type.

Example of a hierarchical model

• In a database an entity type is the equivalent of a table. Each individual record is represented as a row,

and each attribute as a column. Entity types are related to each other using 1:N mappings, also known

as one-to-many relationships.

• Mother–child relationship: Child may only have one mother but a mother can have multiple children.

Mothers and children are tied together by links called "pointers". A mother will have a list of pointers

to each of her children.

• A hierarchical data model is a data model that the data is organized into a tree like structure. The

structure permits repeating information using parent/child relationships: Every parent can have many

children but each child only has one parent. All attributes of a precise record are listed under an entity

type.

Data Model

Advantages:

a) The representation of records is completed using an ordered tree which is natural

method of implementation of one–to-many relationships.

b) Proper ordering of the tree results in easier as well as faster retrieval of records.

c) Permits the use of virtual records. This result in a steady database especially when

modification of the data base is made.

Hierarchical Model

The E-R Model

• The entity-relationship model (or ER model) is a way of graphically representing the

logical relationships of entities (or objects) in order to create a database

• The entity-relationship model is based on a perception of the world as consisting of a

collection of basic objects (entities) and relationships among these objects.

• An entity is a distinguishable object that exists.

• Each entity has associated with it a set of attributes describing it.

• E.g. number and balance for an account entity.

• A relationship is an association among several entities.

• e.g. A cust_acct relationship associates a customer with each account he or she has.

• The set of all entities or relationships of the same type is called the entity set or

relationship set.

• Another essential element of the E-R diagram is the mapping cardinalities, which

express the number of entities to which another entity can be associated via a

relationship set.

The E-R Model

• The overall logical structure of a database can be expressed graphically by an E-R

diagram:

• Rectangles: represent entity sets.

• Ellipses: represent attributes.

• Diamonds: represent relationships among entity sets.

• Lines: link attributes to entity sets and entity sets to relationships.

Graphical Representation in E-R diagram

Rectangle -- Entity

Ellipses -- Attribute (underlined attributes are [part of] the

primary key)

Double ellipses -- multi-valued attribute

Dashed ellipses-- derived attribute, e.g. age is derivable from

birthdate and current date.

Symbols & Notations

The E-R Model

Entities and Attributes

Entity: an object that is involved in the enterprise and that be distinguished from other objects. (not shown in the ER diagram--is an instance)

Can be person, place, event, object, concept in the real world

Can be physical object or abstraction

Ex: "John", "CSE305“

Entity Type: set of similar objects or a category of entities; they are well defined

A rectangle represents an entity set

Ex: students, courses

We often just say "entity" and mean "entity type"

Entities and Attributes

Attribute: describes one aspect of an entity type; usually [and best when] single valued and indivisible (atomic)

Represented by oval on E-R diagram

Ex: name, maximum enrollment

May be multi-valued – use double oval on E-R diagram

May be composite – attribute has further structure; also use oval for composite attribute, with ovals for components connected to it by lines

May be derived – a virtual attribute, one that is computable from existing data in the database, use dashed oval. This helps reduce redundancy

Relationships

Relationship: connects two or more entities into an association/relationship"John" majors in "Computer Science“

Relationship Type: set of similar relationshipsStudent (entity type) is related to Department (entity type) by MajorsIn(relationship type).

Relationship Types may also have attributes in the E-R model. When they are mapped to the relational model, the attributes become part of the relation. Represented by a diamond on E-R diagram.

Relationship types can have descriptive attributes like entity setsRelationships tend to be verbs or verb phrases; attributes of relationships are again nouns

Cardinality of Relationships

• Cardinality is the number of entity instances to which another entity set can map under the relationship. This does not reflect a requirement that an entity has to participate in a relationship. Participation is another concept.

• One-to-one: X-Y is 1:1 when each entity in X is associated with at most one entity in Y, and each entity in Y is associated with at most one entity in X.

• One-to-many: X-Y is 1:M when each entity in X can be associated with many entities in Y, but each entity in Y is associated with at most one entity in X.

• Many-to-many: X:Y is M:M if each entity in X can be associated with many entities in Y, and each entity in Y is associated with many entities in X ("many" =>one or more and sometimes zero)

Generalization, Specialization and Aggregation are the ways to represent special relationships between entities and attribute

Generalization

• Generalization is a bottom-up approach in which two lower level entities combine toform a higher level entity. In generalization, the higher level entity can also combinewith other lower level entity to make further higher level entity.


Specialization

Specialization is opposite to Generalization. It is a top-down approach in which onehigher level entity can be broken down into two lower level entity. In specialization,some higher level entities may not have lower-level entity sets at all.


Aggregation

Aggregation is a process when relation between two entity is treated as a single entity. Here the relation between Center and Course, is acting as an Entity in relation with Visitor.

Generalization, Specialization and Aggregation Database technology Notes, RDBMS,SQL Query, E-R diagram, Generalization, Specialization, Aggregation, Database Model,Normalization, SQL Sequence, SQL Constraints, Database View, table, row, SQL JoinGeneralization, Specialization and Aggregation are the ways to represent specialrelationships between entities and attributes

Transforming a Logical Data Model to a Physical Database Design ( Refer Till Slide Number -74 )

The following procedures can be successfully used to transform a Logical Analytical Data

Model into a physical database, such as a data warehouse or a data mart. By following these

procedures, one can ensured that:

• The data model has integrity (e.g., that business rules have been honored),

• The process has integrity (i.e., that it accounts for all the necessary steps and feedback

loops), and

• The overall process is pragmatic (e.g., that it addresses issues of query, report and load

performance). This chapter describes a process that is generic enough to be used as a guide

for different projects.

Transforming a Logical Data Model to a Physical Database Design

Overview

• Let us start with an overview of the overall process for creating logical and physical data models. Here are the major steps in the transformation of the model:

1-The business authorization to proceed is received.

3-Business requirements are gathered and represented in a logical data model, which will completely represent the business data requirements and will be non-redundant.

5-The logical model is transformed into a first-cut physical model by applying several simple modifications, such as splitting a large table or combining entities in a 1:1 relationship.

7-The logical model is then transformed into a second-cut physical model by iteratively applying three levels of optimizations or compromises. The outcome from this is a physical database design.

b. Apply safe compromises to the model, such as splitting a table or combining two tables.d. Apply aggressive compromises to the model, such as adding redundant data.e. Apply technical optimizations to the model, such as indices, referential integrity or partitioning.

9- The physical database design is then converted to a physical structure by generating or writing the DDL and installing the database.

11- Steps 4 and 5 are iteratively performed so that the database can be tested before going into production. Sometime aggressive compromises are done so that the effect of them can be properly tested. Aggressive compromising is an iterative process. If views are used up-front, it may even be transparent to the application teams and business users.

Analysis Phase - Develop a Logical Data Model

This step is obviously the crucial starting point. There is no substitute for business knowledge. Any

team that does not know the business has no business building the logical data model. The key

characteristics of this model are that it:

• Must declare the grain.

• Should be atomic to the appropriate level.

• Should fully satisfy user requirements.

• Should be non-redundant.

• Should be independent of technology, implementation and organizational constraints.

• Should also contain any external data that is essential to the business.

Information Gathering

The most critical task in the development of the data model is requirementsgathering. There are three general ways to approach model expansion:

• Top-down,• Inside-out and

• Bottom-up.

• Top-down expansion essentially starts with a high level model to represent the breadth of data in the DW. This model is rough cut and coarse but represents the major subject areas of data that the DW must support. Top-down expansion starts 3with this rough-cut model and progressively enlarges it by adding data elements to it.

• Inside out expansion starts with an initial high level model, which expands by adding details. For example, we can start with Customer and a few attributes, and eventually populate it with many attributes.

• Bottom-up expansion proceeds by collection of views or use cases. The data from these is

incorporated into the model. A typical view could be as simple as a query. A typical query could be "How much volume did we sell last month (and this year to date) versus a year ago last month (and last year to date).“

Information Gathering

There are several vehicles to use for information gathering:

• user interviews,

• facilitated sessions,

• examination of current data and queries and

• direct observation.

Scenarios

Queries and reports can be used to verify that the database will support them. From these it is best to

identify a set of key user scenarios that can be used to pass against the model. A scenario is a sequence of

activities that respond to a major business event. A scenario is actually a test case -- but a business test

case. It consists of two parts: the definition of the business case and a script of values. The purpose of

scenarios is to test the correctness and completeness of the model. A major business event is something of

importance that happens to the business. Usually, but not always, the event is something that is initiated

by some agent who is outside of the business area. A scenario should be defined for each major business

event. Some people call scenarios Use Case

How to Use Scenarios

• The simplest way to use scenarios is to manually walk them through the model. It is

easiest to do in a facilitated session. The facilitator takes the model and the scenario.

Each activity, data element and sample value in a scenario are compared to the model

to ensure that the model is complete and correct. One wants to ensure the data is

there and that the model will allow one to navigate properly. Usually, the first 3 or 4

scenarios disrupt the model. This trauma is essential for the health of the model.

• Using scenarios and queries with actual sample values validates the data model,

ensures that the model satisfies the user needs, guarantees model completeness and

simplifies database changes.

• It is important to collect whatever volumetric information you can. Volumetric

information includes the number of occurrences of data, the frequency of query

execution, the ratio between tables, and such things. These factors substantially

influence the physical design and should be a part of the process for gathering

business requirements.

Finalization of the Logical Model

• Two important additions need to be made to the logical model before it is

complete:

• Completion of definitions of all entities and attributes. In a data warehouse

environment, this is especially critical. Users will need to understand the data in

order to use it. Do not underestimate the magnitude of this task. Do yourself a

favor and compile the definitions as the model grows.

• Addition of domain information. A domain is the set of physical characteristics

that one or more columns can have in common, such as length and data type.

• In the next step, do not neglect to forward engineer the definitions to the physical

model.

Design Phase - First Cut Physical ModelBefore doing extensive usage analysis, it is possible and even advisable to construct a first-cutphysical model. This can be done quickly by applying the following modifications to the datamodel. These seven steps represent what is considered the first cut physical model:

• Ensure physical columns characteristicsUse domains or assign data type, length, null ability, optionality, etc.

• Ensure proper physical database object names are usedFollow standard (enforced by CASE tool) for physical names.

• Add necessary constraints and business rulesTake example of an Order Item that must belong to Order and that an Order must have Order Item. The first is enforced by referential integrity (RI); the second requires a trigger or stored procedure

• Resolve implementation of subtypesDecide how the entire hierarchy will be implemented.

• Perform any obvious denormalization or other safe compromises

Safe compromises are those trade-offs that do not compromise the integrity and non-redundancy of the data,e.g., splitting or partitioning a table based on usage. To store mailing address both in both Customer and inCustomer Address table is an example.

• Adjust keysEnsure keys truly provide uniqueness and stability (even considering anomalies over time, such as Order Numbers getting reused). Add surrogate keys as justified.

• Add indexes on primary keys, major foreign keys only

Introduce indexing sparingly at first to determine raw capabilities of the design.

• Add any production objects

• Example of these are transient tables, processing tables/columns (e.g. batch control tables).

• Forward Engineer Logical Model to a Physical Database Design

Using these scenarios, the entire data model should be reviewed table by tableand column by column to confirm what is really required, what could beeliminated, which data elements should be combined and which should bebroken into smaller tables to improve ease of use and reduce query and reportresponse time.

For example-

it is conceivable to reduce the number of tables from 100 to 60. It is alsopossible to eliminate unnecessary columns when many of the remaining tablesare code tables, used only to look up descriptions. Consequently, the number ofactual fact tables and dimensions could significantly be reduced. Also,information gathered during this activity is useful in determining indexdeployment, load balancing across devices and optimal data clustering. Alloptimization changes based on discoveries made during this process should beimplemented in the physical model. Any business rule changes or additionalcolumns should be incorporated into the logical model and then forwardengineered again.

Remember to forward engineer all relevant data, including definitions. A data warehouse is useless without these.

• Design Data Model OptimizationThere are three general ways to optimize a database:

get better hardware (or take better advantage of it),

get better software (or take better advantage of it), or

optimize the data structure itself.

Before the database can be generated a variety of optimizations need to be evaluated for their applicability to the logical model. These optimizations are actually trade-offs. A trade-off is the emphasis of one feature, which is considered to be an advantage, while compromising another, which is considered to be less important.

These trade-offs are of two types:

safe trade-offs and aggressive trade-offs.

Safe Trade-Offs

• Safe trade-offs do not compromise the integrity of the data. They are exemplified bythe following model changes:

• Collapse some or all subtypes into a supertype

• Collapse a supertype into one or more subtypes

• Split one entity into multiple tables,

• Merge multiple entities into a single table,

• Collapsing certain code tables,

• Converting sub-types to tables (there are various ways)

• Violate first normal form, among others.

Aggressive Trade-Offs

• Aggressive trade-offs compromise the integrity of the model. They are exemplified byseveral main modifications:

• Storing derived data and

• Adding redundant data or relationships and

• Adding surrogate keys.

Volume Determination and Database Sizing

Volume determination and database sizing is facilitated by determining the best source fordata knowledge and getting a best estimate for each table. The numbers can be plugged into atool, such as Platinum Desktop DBA, that can generate space requirements, or into aspreadsheet designed to perform the same estimations, or manually. It is important to padyour estimates in case actual row counts exceed what was anticipated. An additional 20percent when sizing each table is reasonable. Determine Hardware Requirements

• Subsequent Steps

While the completion of the above step signifies the end of the logical model to physicaldatabase transformation, the task of delivering a production database includes severaladditional activities that could result in modifications. Issues like load file content, new queryrequirements, additional data requirements or new user requirements can result innecessitating database modifications. It is recommended that all database changes areimplemented through a physical design tool, as this will provide a central authoritative sourcefor the physical database environment and allow the generation of scripts that can recreate thedatabase server if necessary.

Merits and Demerits of E-R Modeling.Advantages and Disadvantages of E-R Data Model

Following are advantages of an E-R Model:• Straightforward relation representation: Having designed an E-R diagram for a database application, the relational representation of the database model becomes relatively straightforward.• Easy conversion for E-R to other data model: Conversion from E-R diagram to a network or hierarchical data model can· easily be accomplished.• Graphical representation for better understanding: An E-R model gives graphical and diagrammatical representation of various entities, its attributes and relationships between entities. This is turn helps in the clear understanding of the data structure and in minimizing redundancy and other problems.

• Disadvantages of E-R Data Model

Following are disadvantages of an E-R Model:

• No industry standard for notation: There is no industry standard notation for developing an E-R diagram.

• Popular for high-level design: The E-R data model is especially popular for high level.

Record Based Logical Models

Record based logical models are used in describing data at the logical and viewlevels. In contrast to object based data models, they are used to specify theoverall logical structure of the database and to provide a higher-leveldescription of the implementation. Record based models are so named becausethe database is structured in fixed format records of several types. Each recordtype defines a fixed number of fields, or attributes, and each field is usually ofa fixed length.

The three most widely accepted record based data models are:

• Hierarchical Model

• Network Model

• Relational Model

Hierarchical Database Model

• Hierarchical Database model is one of the oldest database models, dating from late1950s. One of the first hierarchical databases Information Management System (IMS)was developed jointly by North American Rockwell Company and IBM. This model islike a structure of a tree with the records forming the nodes and fields forming thebranches of the tree.

Hierarchical Database Model

• In the hierarchical data model, records are linked with other superior records on whichthey are dependent and also on the records, which are dependent on them. A tree structuremay establish one-to-many relationship. Figure illustrates the structure of a family. Greatgrandparent is the root of the structure. Parents can have many children exhibiting one tomany relationships. The great grandparent record is known as the root of the tree. Thegrandparents and children are the nodes or dependents of the root. In general, a root mayhave any number of dependents. Each of these dependent may have any number of lowerlevel dependents, and so on, with no restriction of levels.

The different elements (e.g. records) present in the hierarchical tree structure have Parent-Child relationship. A Parent element can have many children elements but a Child element cannot have many parent elements. That is, hierarchical model cannot represent many to many relationships among records.

Operations on Hierarchical Model

There are four basic operations Insert, Update, Delete and Retrieve that can be performed on each model. Now, we consider in detail that how these basic operations are performed in hierarchical database model.

• Insert Operation:

It is not possible to insert the information of the supplier e.g. S4 who does notsupply any part. This is because a node cannot exist without a root. Since, a part P5that is not supplied by any supplier can be inserted without any problem, because aparent can exist without any child. So, we can say that insert anomaly exists onlyfor those children, which has no corresponding parents.

• Update Operation:

Suppose we wish to change the city of supplier S1 from Qadian to Jalandhar, thenwe will have to carry out two operations such as searching S1 for each part and thenmultiple updations for different occurrences of S1. But, if we wish to change the cityof part P1 from Qadian to Jalandhar, then these problems will not occur becausethere is only a single entry for part P I and the problem of inconsistency will notarise. So, we can say that update anomalies only exist for children not for parentbecause children may have multiple entries in the database.

Operations on Hierarchical Model

• Delete Operation:

In hierarchical model, quantity information is incorporated into supplier record. Hence, theonly way to delete a shipment (or supplied quantity) is to delete the corresponding supplierrecord. But such an action will lead to loss of information of the supplier, which is not desired.For example: Supplier S2 stops supplying 250 quantity of part PI, then the whole record of S2has to be deleted under part PI which may lead to loss the information of supplier. Anotherproblem will arise if we wish to delete a part information and that part happens to be onlypart supplied by some supplier. In hierarchical model, deletion of parent causes the deletion ofchild records also and if the child occurrence is the only occurrence in the whole database,then the information of child records will also lost with the deletion of parent. For example: ifwe wish to delete the information of part P2 then we also lost the information of S3, S2 and S1supplier. The information of S2 and Sl can be obtained from PI, but the information aboutsupplier S3 is lost with the deletion of record for P2.

• Record Retrieval:

Record retrieval methods for hierarchical model are complex and asymmetric which can be clarified with the following queries:

Advantages Of Hierarchical Model1. Simplicity

Data naturally have hierarchical relationship in most of the practical situations. Therefore, it iseasier to view data arranged in manner. This makes this type of database more suitable for thepurpose.

2. Security

These database system can enforce varying degree of security feature unlike flat-file system.

3. Database Integrity

Because of its inherent parent-child structure, database integrity is highly promoted in thesesystems.

4. Efficiency:

The hierarchical database model is a very efficient, one when the database contains a largenumber of I: N relationships (one-to-many relationships) and when the users require large numberof transactions, using data whose relationships are fixed.

Disadvantages Of Hierarchical Model1. Complexity of Implementation:The actual implementation of a hierarchical database depends on thephysical storage of data. This makes the implementation complicated.

2. Difficulty in Management:The movement of a data segment from one location to another cause all theaccessing programs to be modified making database management acomplex affair.

3. Complexity of Programming:Programming a hierarchical database is relatively complex because theprogrammers must know the physical path of the data items.

4. Poor Portability:

The database is not easily portable mainly because there is little or no standardexisting for these types of database.

5. Database Management Problems:If you make any changes in the database structure of a hierarchicaldatabase, then you need to make the necessary changes in all theapplication programs that access the database. Thus, maintaining thedatabase and the applications can become very difficult.

The Network model replaces the hierarchical tree with a graph thus allowing more generalconnections among the nodes. The main difference of the network model from the hierarchicalmodel, is its ability to handle many to many (N:N) relations. In other words, it allows a record tohave more than one parent. Suppose an employee works for two departments. The stricthierarchical arrangement is not possible here and the tree becomes a more generalized graph - anetwork. The network model was evolved to specifically handle non-hierarchical relationships. Asshown below data can belong to more than one parent. Note that there are lateral connections aswell as top-down connections. A network structure thus allows 1:1 (one: one), l: M (one: many), M:M (many: many) relationships among entities.

• In network database terminology, a relationship is a set. Each set is made up of at least two

types of records: an owner record (equivalent to parent in the hierarchical model) and a

member record (similar to the child record in the hierarchical model).

• The database of Customer-Loan, which we discussed earlier for hierarchical model, is now

represented for Network model as shown.

• In can easily depict that now the information about the joint loan L1 appears single time, but in

case of hierarchical model it appears for two times. Thus, it reduces the redundancy and is

better as compared to hierarchical model.

Network Model

Network Model

Operations on Network Model• Detailed description of all basic operations in Network Model is as under:


To insert a new record containing the details of a new supplier, we simply create a new record occurrence. Initially, there will be no connector. The new supplier's chain will simply consist of a single pointer starting from the supplier to itself.

• For example, supplier S4 can be inserted in network model that does not supply any part as a new record occurrence with a single pointer from S4 to itself. This is not possible in case of hierarchical model. Similarly a new part can be inserted who does not supplied by any supplier.

• Consider another case if supplier S 1 now starts supplying P3 part with quantity 100, then a new connector containing the 100 as supplied quantity is added in to the model and the pointer of S1 and P3 are modified as shown in the below.


Unlike hierarchical model, where updation was carried out by search and had manyinconsistency problems, in a network model updating a record is a much easier process. Wecan change the city of S I from Qadian to Jalandhar without search or inconsistencyproblems because the city for S1 appears at just one place in the network model. Similarly,same operation is performed to change the any attribute of part.

• Delete operation:

If we wish to delete the information of any part say PI, then that record occurrence can be deleted by removing the corresponding pointers and connectors, without affecting the supplier who supplies that part i.e. P1, the model is modified as shown. Similarly, same operation is performed to delete the information of supplier.

• Retrieval Operation:

Record retrieval methods for network model are symmetric but complex. In order to

understand this considers the following example queries:

Advantages and Disadvantages of Network Model

The Network model retains almost all the advantages of the hierarchical model while eliminating some of its shortcomings.

The main advantages of the network model are:

Conceptual simplicity:Just like the hierarchical model, the network model IS also conceptually simple and easy to design.

Capability to handle more relationship types:

The network model can handle the one to- many (l:N) and many to many (N:N) relationships, which is a real help in modeling the real life situations.

• Ease of data access: The data access is easier and flexible than the hierarchical model.

• Data Integrity: The network model does not allow a member to exist without an owner. Thus, a user must first define the owner record and then the member record. This ensures the data integrity.

• Data independence: The network model is better than the hierarchical model in isolating the programs from the complex physical storage details.

• Database Standards: One of the major drawbacks of the hierarchical model was the non-availability of universal standards for database design and modeling. The network model is based on the standards formulated by the DBTG and augmented by ANSI/SP ARC (American National Standards Institute/Standards Planning and Requirements Committee) in the 1970s. All the network database management systems conformed to these standards. These standards included a Data Definition Language (DDL) and the Data Manipulation Language (DML), thus greatly enhancing database administration and portability.

Disadvantages of Network Model

• Even though the network database model was significantly better than the hierarchical database model, it also had many drawbacks. Some of them are

• System complexity:

All the records are maintained using pointers and hence the whole database structure becomes very complex.

• Operational Anomalies:

As discussed earlier, network model's insertion, deletion and updatingoperations of any record require large number of pointer adjustments, whichmakes its implementation very complex and complicated.

• Absence of structural independence:

Since the data access method in the network database model is a navigationalsystem, making structural changes to the database is very difficult in mostcases and impossible in some cases. If changes are made to the databasestructure then all the application programs need to be modified before they canaccess data. Thus, even though the network database model succeeds inachieving data independence, it still fails to achieve structural independence.

Relational Model

• Relational model stores data in the form of tables. This concept purposed by Dr. E.F. Codd, a researcher of IBM in the year 1960s. The relational model consists of three major components:

1. The set of relations and set of domains that defines the way data can be

represented (data structure).

2. Integrity rules that define the procedure to protect the data (data integrity).

3. The operations that can be performed on data (data manipulation).

A rational model database is defined as a database that allows you to group its

data items into one or more independent tables that can be related to one

another by using fields common to each related table.

• Basic Terminology used in Relational ModelThe figure shows a relation with the. Formal names of the basic components marked the entire structure is, as we have said, a relation.

Tuples of a RelationEach row of data is a tuple. Actually, each row is an n-tuple, but the "n-" is usually dropped.Cardinality of a relation: The number of tuples in a relation determines its cardinality. In this case, the relation has a cardinality of 4.Degree of a relation: Each column in the tuple is called an attribute. The number of attributes in a relation determines its degree. The relation in figure has a degree of 3.Domains: A domain definition specifies the kind of data represented by the attribute.

Operations in Relational Model

The four basic operations Insert, Update, Delete and Retrieve operations are shown below on

the sample database in relational model:


Suppose we wish to insert the information of supplier who does not supply any part, can beinserted in S table without any anomaly e.g. S4 can be inserted in Stable. Similarly, if we wishto insert information of a new part that is not supplied by any supplier can be inserted into a Ptable. If a supplier starts supplying any new part, then this information can be stored inshipment table SP with the supplier number, part number and supplied quantity. So, we cansay that insert operations can be performed in all the cases without any anomaly.


Suppose supplier S1 has moved from Qadian to Jalandhar. In that case we need to makechanges in the record, so that the supplier table is up-to-date. Since supplier number is theprimary key in the S (supplier) table, so there is only a single entry of S 1, which needs a singleupdate and problem of data inconsistencies would not arise. Similarly, part and shipmentinformation can be updated by a single modification in the tables P and SP respectivelywithout the problem of inconsistency. Update operation in relational model is very simple andwithout any anomaly in case of relational model.

• Delete Operation:

Suppose if supplier S3 stops the supply of part P2, then we have to delete the shipment connecting part P2and supplier S3 from shipment table SP. This information can be deleted from SP table without affecting thedetails of supplier of S3 in supplier table and part P2 information in part table. Similarly, we can delete theinformation of parts in P table and their shipments in SP table and we can delete the information suppliers inS table and their shipments in SP table.

• Record Retrieval:Record retrieval methods for relational model are simple and symmetric which can be clarified with the following queries:

• Query1: Find the supplier numbers for suppliers who supply part P2.

Solution

In order to get this information we have to search the information of part P2 in the SP table (shipment table). For this a loop is constructed to find the records of P2 and on getting the records, corresponding supplier numbers are printed.

Algorithm

do until no more shipments;get next shipment where PNO=P2;print SNO;end;

Advantages and Disadvantages of Relational Model

The major advantages of the relational model are:

Structural independence:

In relational model, changes in the database structure do not affect the data access. When it is possible to make change to the database structure without affecting the DBMS's capability to access data, we can say that structural independence has been achieved. So, relational database model has structural independence.

• Conceptual simplicity:

We have seen that both the hierarchical and the network database model were conceptually simple. But the relational database model is even simpler at the conceptual level. Since the relational data model frees the designer from the physical data storage details, the designers can concentrate on the logical view of the database.

• Design, implementation, maintenance and usage ease:

The relational database model\ achieves both data independence and structure independence making the database design, maintenance, administration and usage much easier than the other models.

Disadvantages of Relational ModelThe relational model's disadvantages are very minor as compared to the advantages and their capabilitiesfar outweigh the shortcomings Also, the drawbacks of the relational database systems could be avoided ifproper corrective measures are taken. The drawbacks are not because of the shortcomings in the databasemodel, but the way it is being implemented.

Some of the disadvantages are:

• Hardware overheads:

Relational database system hides the implementation complexities and the physical data storage detailsfrom the users. For doing this, i.e. for making things easier for the users, the relational database systemsneed more powerful hardware computers and data storage devices. So, the RDBMS needs powerfulmachines to run smoothly. But, as the processing power of modem computers is increasing at anexponential rate and in today's scenario, the need for more processing power is no longer a very big issue.

• Ease of design can lead to bad design: .

The relational database is an easy to design and use. The users need not know the complex details ofphysical data storage. They need not know how the data is actually stored to access it. This ease of designand use can lead to the development and implementation of very poorly designed database managementsystems. Since the database is efficient, these design inefficiencies will not come to light when the databaseis designed and when there is only a small amount of data. As the database grows, the poorly designeddatabases will slow the system down and will result in performance degradation and data corruption.

•

Characteristic Hierarchical model Network model Relational model

Data structureOne to many or one to one

relationshipsAllowed the network model to

support many to many relationshipsOne to One,One to many, Many to many

relationships

In hierarchical model only one-to-many or one-to-one relationships can be exist. But in network data model makes it possible to map many to many relationships In relational each record can have multiple parents and multiple child records. In effect, it supports many to many

relationships

Data structureBased on parent child

relationshipA record can have many parents as

well as many children.Based on relational data structures

In hierarchical model relationship based in terms of parent child. So a child may only have one parent but a parent can have multiple children. But in network data model a record can have many parents as well as many children. Relational data model is based on relational

data structures

Comparison between Hierarchical, Network and Relational Model

Comparison between Hierarchical, Network and Relational Model

Characteristic Hierarchical model Network model Relational model

Data manipulationDoes not provide an independent

stand alone query interfaceCODASYL (Conference on Data Systems Languages)

Relational databases are what brings many sources into a common query (such as SQL)

In relational database it use powerful operations such as SQL languages or query by example are used to manipulate data stored in the database But in hierarchical data model it does not

provide an independent stand alone query interface while network model uses CODASYL

Data manipulationretrieve algorithms are complex and

asymmetricRetrieve algorithms are complex and symmetric

Retrieve algorithms are simple and symmetric

In hierarchical data model and network model retrieve algorithms are complex and symmetric But in relational data model retrieve algorithms are simple and symmetric

Data integrityCannot insert the information of a

child who does not have any parent.Does not suffer form any insertion anomaly.

Does not suffer from any insert anomaly.

In hierarchical data model we cannot insert the information of a child who does not have any parent. But in network model does not suffer form any insertion anomaly. relational model does not suffer from any insert anomaly

Data integrity

Multiple occurrences of child records which lead to problems of inconsistency during the update

operation

Free from update anomalies.Free form update

anomalies

In network model it is free from update anomalies because there is only a single occurrence for each record set.In relational model it also free form update anomalies because it removes the redundancy of data by proper designing through normalization process. But in hierarchical model there are multiple occurrences of child records. which lead

to problems of inconsistency during the update operation

Data intergirtyDeletion of parent results in deletion

of child recordsFree from delete anomalies

Free from delete anomalies

In hierarchical model it is based on parent child relationship and deletion of parent results in deletion of child records .But in network model and in relational model it is free from deletion anomalies. Because information is stored in different tables.

dbms mca-section a

Technology

dbmsdefinition of database

database manager

tabular database

distributed database

data format

data maintenance

similar data

shared data