chapter1_rdbms

8/4/2019 Chapter1_RDBMS

1/30

Chapter 1

Database System concept & Data Modeling

Database concepts

A data is a piece of information

A database is a set of interrelated and meaningful data related to an enterprise.

A database Management system (DBMS) is a collection of interrelated data and a

set of programs to access those data.

The primary goal of a DBMS is to provide an efficient way to store & retrieve datato or from a database.

Database System Applications /Applications of DBMS

Databases are widely used. Here are some representative applications:

1. Banking: To maintain customer information, accounts, loans, and banking

transactions.

2. Airlines: For flight reservations and schedule information. Airlines were among the

first to use databases

3. Universities: For student information, course registrations, and grades.

4. Credit card transactions: For purchases on credit cards and generation of monthly

statements.

5. Telecommunication: For keeping records of phone calls made, generating monthly

phone bills, maintaining balances on prepaid calling cards, and storing information

about the communication networks.

6. Finance: For storing information about holdings, sales, and purchases of financial

instrument such as stocks and bonds.

7. Sales: For customer, product, and purchase information.

1


2/30

8. Manufacturing: For management of supply chain and for keeping track of

production of items in factories, inventories of items in warehouses/stores, and orders

for items.

9. Human resources: For information about employees, salaries, payroll taxes and

benefits, and for generation of paychecks.

Disadvantage of File processing System:

File processing system has a number of disadvantages. They are as follows:

1) Data redundancy and inconsistency:

Since different

programmers create files and application programs.

Various files have

different formats and the programs may be written in several programming languages.

Moreover, the same

information may be duplicated in several places (files). For example, the address and

telephone number of a particular student may be stored in office file while taking

admissions and in department file also. This storing of same data more than once is

called as data redundancy.

This redundancy

leads to higher storage and access cost.

Data redundancy

leads to data inconsistency; which means multiple copies of a data may have

different values for a particular field. For example, a students changed address may be

updated in the department file but not in the Office file. Thus this leads to incorrect

data.

2) Difficulty in accessing data:

Suppose that a bank officers needs to find out the names of all customers who live

in kothrud.

2


3/30

The officer asks the data-processing department to generate such a list. Because the

designers of the original system did not anticipate this request, there is no application

program on hand to meet it.

There is, however, an application program to generate the list of all customers.

The bank officer has now two choices: either obtain the list of all customers and

extract the needed information manually or ask a system programmer to write the

necessary application program.

Both alternatives are obviously unsatisfactory. Suppose that such a program is

written, and that, several days later, the same officer needs to trim that list to include

only those customers who have an account balance of $10,000 or more.

As expected, a program to generate such a list does not exist. Again, the officer has

the preceding two options, neither of which is satisfactory.

The point here is that conventional file-processing environments do not allow

needed data to be retrieved in a convenient and efficient manner.

More responsive data-retrieval systems are required for general use.

3) Data isolation.

Because data are scattered in various files, and files were in different

formats, writing new application programs to access the appropriate data

was difficult.

4)Integrity problems

The data values stored in the database must satisfy certain types of

consistency constraints.

For example, the balance of a bank account may never fall below a

prescribed amount (say, $25).

Developers enforce these constraints in the system by adding appropriate

code in the various application programs.

However, when new constraints are added, it is difficult to change the

programs to enforce them.

3


4/30

The problem is compounded when constraints involve several data items

from different files.

5)Atomicity problems.

A computer system, like any other mechanical or electrical device, is subject to

failure.

In many applications, it is crucial that, if a failure occurs, the data be restored to the

consistent state that existed prior to the failure.

Consider a program to transfer $50 from account A to account B. If a system failure

occurs during the execution of the program, it is possible that the $50 was removed

from account a but was not credited to account B, resulting in an inconsistent database

state.

Clearly, it is essential to database consistency that either both the credit and debit

occur, or that neither occur.

That is, the funds transfer must be atomicit must happen in its entirety or not at

all. It is difficult to ensure atomicity in a conventional file-processing system.

6)Concurrent-access anomalies

For the sake of overall performance of the system and faster response, many

systems allow multiple users to update the data simultaneously.

In such an environment, interaction of concurrent updates may result in

inconsistent data.

Consider bank account a, containing $500. If two customers withdraw funds (say

$50 and $100 respectively) from account A at about the same time, the result of the

concurrent executions may leave the account in an incorrect (or inconsistent) state.

Suppose that the programs executing on behalf of each withdrawal read the oldbalance, reduce that value by the amount being withdrawn, and write the result back.

If the two programs run concurrently, they may both read the value $500, and write

back $450 and $400, respectively.

4


5/30

Depending on which one writes the value last, the account may contain $450 or

$400, rather than the correct value of $350.

To guard against this possibility, the system must maintain some form of

supervision.

But supervision is difficult to provide because many different application programs

that have not been coordinated previously may access data.

7)Security problems.

Not every user of the database system should be able to access all the data.

For example, in a banking system, payroll personnel need to see only that part of

the database that has information about the various bank employees.

They do not need access to information about customer accounts. But, since

application programs are added to the system in an ad hoc manner, enforcing such

security constraints is difficult.

Advantage of a DBMS over file processing System

1) Redundancy Can be controlled

In traditional non-database system, each department maintains its own files for

handling its data processing application.

2) Data can be shared

3) Standard can be enforced

4) Security measure can be applied

5) Integrity can be maintained

6) Conflicting requirements can be balanced

View of Data

A database system is a collection of interrelated files and a set of programs that allow users

to access and modify these files. A major purpose of a database system is to provide users

with an abstract view of the data. That is, the system hides certain details of how the data

are stored and maintained

5


6/30

Data Abstraction

For the system to be usable, it must retrieve data efficiently. The need for efficiency has led

designers to use complex data structures to represent data in the database. Since many

database-systems users are not computer trained, developers hide the complexity fromusers through several levels of abstraction, to simplify users interactions with the system:

Physical level.

1) The lowest level of abstraction describes how the data are actually stored. The physical

level describes complex low-level data structures in detail.

Logical level.

1) The next-higher level of abstraction describes what data are stored in the database, and

what relationships exist among those data.

2) The logical level thus describes the entire database in terms of a small number of

relatively simple structures.

3) It involves complex physical-level structures but its logical level structure is simple. The

user of the logical level does not need to be aware of this complexity.

4) Database administrators, who must decide what information to keep in the database, use

the logical level of abstraction.

View level.

1) The highest level of abstraction describes only part of the entire database.

2) Even though the logical level uses simpler structures, complexity remains because of the

variety of information stored in a large database.

3) Many users of the database system do not need all this information; instead, they need to

access only a part of the database.

4) The view level of abstraction exists to simplify their interaction with the system. The

system may provide many views for the same database.

5) Figure 1.1 shows the relationship among the three levels of abstraction.

6


7/30

Fig 1.1 the three level of data Abstraction

Most high-level programming languages support the notion of a record type. For example,

in a Pascal-like language, we may declare a record as follows:

Type customer = record

Customer-id: string;

Customer-name: string;

Customer-street: string;

Customer-city: string; end;

This code defines a new record type called customer with four fields. Each field has a name

and a type associated with it. A banking enterprise may have several such record types,

including

Account, with fields account-number and balance

Employee, with fields employee-name and salary

At the physical level, a customer, account, or employee record can be described as a block

of consecutive storage locations (for example, words or bytes). The language compiler

hides this level of detail from programmers. Similarly, the database system hides many of

the lowest-level storage details from database programmers. Database administrators, on

the other hand, may be aware of certain details of the physical organization of the data.

7


8/30

At the logical level, each such record is described by a type definition, as in the previous

code segment, and the interrelationship of these record types is defined as well.

Programmers using a programming language work at this level of abstraction. Similarly,

database administrators usually work at this level of abstraction.

Finally, at the view level, computer users see a set of application programs that hide

details of the data types. Similarly, at the view level, several views of the database are

defined, and database users see these views. In addition to hiding details of the logical level

of the database, the views also provide a security mechanism to prevent users from

accessing certain parts of the database. For example, tellers in a bank see only that part of

the database that has information on customer accounts; they cannot access information

about salaries of employees.

Database Languages:

There are two types of languages as DDL and DML. A database system provides a data

definition language to specify the database schema and a data manipulation language to

express database queries and updates.

In practice, the data definition and data manipulation languages are not two separate

languages; instead they simply form parts of a single database language, such as the widely

used SQL language.

1) Data Manipulation Language

Data manipulation language (DML) is a language that enables users to access or

manipulate data as organized by the appropriate data model. The types of access are

The retrieval of information stored in the database

The insertion of new information into the database

The deletion of information from the database

The modification of information stored in the database

There are basically two types:

8


9/30

Procedural DMLs require a user to specify what data are needed and how to get

those data.

Declarative DMLs (also referred to as nonprocedural DMLs) require a user to

Specify whatdata are needed withoutspecifying how to get those data.

Declarative DMLs are usually easier to learn and use than are procedural DML s. since a

user does not have to specify how to get the data, the database system has to figure out an

efficient means of accessing data. The DML component of the SQL language is

nonprocedural.

A query is a statement requesting the retrieval of information. The portion of a DML that

involves information retrieval is called a query language.

This query in the SQL language finds the name of the customer whose customer-id

Is 101:

Select customer. Customer-name

From customer

Where customer. customer-id= 101

The query specifies that those rows from the table customer where the customer-id is

101 must be retrieved, and the customer-name attribute of these rows must be

displayed.

2) Data-Definition Language

A special language used to specify a database schema by a set of definitions called a data-

definition language (DDL). For instance, the following statement in the SQL language

defines the account table:

Create table account (account-numberchar (10), balance integer (4));

9


10/30

Execution of the above DDL statement creates the account table. In addition, it updates a

special set of tables called the data dictionary or data directory. A data dictionary

contains metadatathat is, data about data . The schema of a table is an example of

metadata. A database system consults the data dictionary before reading or modifying

actual data. We specify the storage structure and access methods used by the database

system by a set of statements in a special type of DDL called a data storage and

definition language.

These statements define the implementation details of the database schemas, which are

usually hidden from the users. The data values stored in the database must satisfy certain

consistency constraints. For example, suppose the balance on an account should not fall

below $100. The DDL provides facilities to specify such constraints. The database systems

check these constraints every time the database is updated.

1) Domain Constraints

A domain of possible values must be associated with every attribute. (For example

integer type, character type, date/time type)

Declaring an attribute to be of a particular domain acts as a constraints on the values

that it can take.

Its an elementary part of the integrity constraints.

They are tested easily by the system whenever a new data items is inserted into the

database

2) Referential Integrity

In this case, we ensure that the value appear in one relation for a given set of

attributes also appears for a certain set of attributes in another relation (Referential

Integrity).

Database modification can cause violation of the referential integrity

When a referential integrity constraints is violated, the normal procedure is to reject

the action that caused the violation

3) Assertions

An assertion is the condition that the database must always satisfy

10


11/30

Domain constraints and referential integrity constraints are the special form of

the assertion.

But there are many constraints that we cant specify by using only these special

forms.

For example, every loan has at least one customer who maintains an account with

a minimum balance of $1000.00.

When an assertion is created, the system test for its validity

If the assertion is valid, then any future modification to the database is allowed

without violating the assertion.

4) Authorization

We can differentiate among user based on type of access they are permitted

on various data values in the database are expressed in terms of authorization

Authorization are of following type

Read Authorization: which allows reading but not modification of data.

Insert Authorization: which allows insertion of new data , but not

modification of the existing data

Update Authorization: Which allows modification of data but not deletion of

the data

Delete Authorization: which allows deletion of data

Instances And Schema

Database changes over a time as information is inserted and deleted over a time.

The collection of information stored in the database at a particular moment is called

as instance of the database

The overall design of the database is called the database schema

The concept of database schema and instances can be understood by comparing it

with programming language.

A database schema corresponds to the variable declarations (along with type

definitions) in a program

11


12/30

Each variable has a particular value at a given instant. The value of the variable in a

program at a point in time correspond to an instances of the database schema

Database system have several schemas based on the level of abstraction

1) Physical Schema: It describes database schemas at physical level

2) Logical schema: It describes database schemas at logical level

3) Subschema: A database may have several schemas at the view level called

subschema that describes different views of the database.

Physical data independence

It is the ability to modify the physical schema without causing application program at the

physical level is called as Physical data independence.

Or

The physical schema is hidden and can usually be changed easily without affecting

application programs. Application programs are said to exhibit physical data

independence if they do not depend on the physical schema, and thus need not be rewritten

if the physical schema changes.

Logical data independence

It is the ability to modify the logical schema without causing application program at the

logical level is called as logical data independence.

Database Users and Administrators

A primary goal of a database system is to retrieve information from and store new

information in the database. People who work with a database can be categorized as

database users or database administrators.

Database Users and User Interfaces

There are four different types of database-system users, differentiated by the way they

expect to interact with the system. Different types of user interfaces have been designed for

the different types of users.

1) Naive users

These are unsophisticated users who interact with the system by invoking one of the

application programs that have been written previously.

12


13/30

For example, a bank teller who needs to transfer $50 from accountA to accountB

invokes a program called transfer.

This program asks the teller for the amount of money to be transferred, the account

from which the money is to be transferred, and the account to which the money is to

be transferred.

As another example, consider a user who wishes to find her account balance over

the World Wide Web. Such a user may access a form, where she enters her account

number.

An application program at the Web server then retrieves the account balance, using

the given account number, and passes this information back to the user.

2) Application programmers

These are computer professionals who write application programs. Application

programmers can choose from many tools to develop user interfaces.

Rapid application development (RAD) tools are tools that enable an application

programmer to construct forms and reports without writing a program.

There are also special types of programming languages that combine imperative

control structures (for example, for loops, while loops and if-then-else statements)

with statements of the data manipulation language.

These languages, sometimes calledfourth-generation languages.

4) Sophisticated users

These areinteracting with the system without writing programs. Instead, they form

their requests in a database query language. They submit each such query to a

query processor, whose function is to break down DML statements into

instructions that the storage manager understands.

Analysts who submit queries to explore data in the database fall in this category

Online analytical processing (OLAP) tools simplify analysts tasks by letting

them view summaries of data in different ways.

13


14/30

For instance, an analyst can see total sales by region (for example, North, South,

East, and West), or by product, or by a combination of region and product (that is,

total sales of each product in each region).

The tools also permit the analyst to select specific regions, look at data in more

detail (for example, sales by city within a region) or look at the data in less detail

(for example, aggregate products together by category).

Another class of tools for analysts is data mining tools, which help them, find

certain kinds of patterns in data.

5) Specialized users

These are sophisticated users who write specialized database applications that

do not fit into the traditional data-processing framework.

Among these applications are computer-aided design systems, knowledge

base and expert systems, systems that store data with complex data types (for

example, graphics data and audio data), and environment-modeling systems.

Database Administrator

One of the main reasons for using DBMSs is to have central control of both the data and

the programs that access those data. A person who has such central control over the system

is called a database administrator (DBA).

The functions of a DBA include:

Schema definition. The DBA creates the original database schema by executing a set of

data definition statements in the DDL.

Storage structure and access-method definition.

Schema and physical-organization modification. The DBA carries out changes to the

schema and physical organization to reflect the changing needs of the organization, or to

alter the physical organization to improve performance.

Granting of authorization for data access. By granting different types of authorization,

the database administrator can regulate which parts of the database various users can

access. The authorization information is kept in a special system structure that the database

system consults whenever someone attempts to access the data in the system.

14


15/30

Routine maintenance. Examples of the database administrators routine maintenance

activities are:

Periodically backing up the database, either onto tapes or onto remote servers, to

prevent loss of data in case of disasters such as flooding.

Ensuring that enough free disk space is available for normal operations, and

upgrading disk space as required

Monitoring jobs running on the database and ensuring that performance is not

degraded by very expensive tasks submitted by some users.

Database System Structure / Overall Structure

of DBMS

15


16/30

16


17/30

A database system is partitioned into modules. The functional components of a database

system can be broadly divided into the storage manager and the query processor

components. The storage manager is important because databases typically require a large

amount of storage space. Corporate databases range in size from hundreds of gigabytes to,

for the largest databases, terabytes of data. Since the main memory of computers cannot

store this much information, the information is stored on disks. Data are moved between

disk storage and main memory as needed. Since the movement of data to and from disk is

slow relative to the speed of the central processing unit, it is imperative that the database

system structures the data so as to minimize the need to move data between disk and main

memory. The query processor is important because it helps the database system simplify

and facilitate access to data. High-level views help to achieve this goal; with them, users of

the system are not being burdened unnecessarily with the physical details of the

implementation of the system. However, quick processing of updates and queries is

important. It is the job of the database system to translate updates and queries written in a

nonprocedural language, at the logical level, into an efficient sequence of operations at the

physical level.

1) Storage Manager

Astorage manageris a program module that provides the interface between the low level

data stored in the database and the application programs and queries submitted to the

system. The storage manager is responsible for the interaction with the file manager. The

raw data are stored on the disk using the file system, which is usually provided by a

conventional operating system. The storage manager translates the various DML statements

into low-level file-system commands. Thus, the storage manager is responsible for storing,

retrieving, and updating data in the database.

The storage manager components include:

Authorization and integrity manager, which tests for the satisfaction of integrity

constraints and checks the authority of users to access data.

Transaction manager, which ensures that the database remains in a consistent (correct)

state despite system failures, and that concurrent transaction executions proceed without

conflicting.

17


18/30

File manager, which manages the allocation of space on disk storage and the data

structures used to represent information stored on disk.

Buffer manager, which is responsible for fetching data from disk storage into main

memory, and deciding what data to cache in main memory. The buffer manager is a critical

part of the database system, since it enables the database to handle data sizes that are much

larger than the size of main memory. The storage manager implements several data

structures as part of the physical system implementation:

Data files, which store the database itself.

Data dictionary, which stores metadata about the structure of the database, in particular

the schema of the database.

Indices, which provide fast access to data items that hold particular values.

Statistical data: It stores statistics about the data stored.

2) The Query Processor

The query processor components include

DDL interpreter, which interprets DDL statements and records the definitions in the data

dictionary.

DML compiler, which translates DML statements in a query language into an evaluation

plan consisting of low-level instructions that the query evaluation engine understands. A

query can usually be translated into any of a number of alternative evaluation plans that all

give the same result. The DML compiler also performs query optimization, that is, it picks

the lowest cost evaluation plan from among the alternatives.

Query evaluation engine, which executes low-level instructions generated by

The DML compiler.

Query processing refers to the range of activities involved in extracting data from a

database. The activities include translation of queries in high-level database languages into

expressions that can be used at the physical level of the file system

The steps involved in processing a query appear

1. Parsing and translation

18


19/30

2. Optimization

3. Evaluation

Before query processing can begin, the system must translate the query into a

usable form.

Thus, the first action the system must take in query processing is to translate a given

query into its internal form.

In generating the internal form of the query, the parser checks the syntax of the

users query, verifies that the relation names appearing in the query are names of

the relations in the database, and so on.

The system constructs a parse-tree representation of the query, which it then

translates into a relational-algebra expression.

If the query was expressed in terms of a view, the translation phase also replaces all

uses of the view by the relational-algebra expression that defines the view.

Furthermore, the relational-algebra representation of a query specifies only partially how to

evaluate a query; there are usually several ways to evaluate relational-algebra expressions.

As an illustration, consider the query

Select balance

from account

Where balance < 2500

This query can be translated into either of the following relational-algebra expressions:

abalance


20/30

.

A sequence of primitive operations that can be used to evaluate a query is a query

execution

Query-evaluation plan. Fig shows an evaluation plan for our example query. The query-

execution engine takes a query-evaluation plan, executes that plan, and returns the answers

to the query.

The different evaluation plans for a given query can have different costs. It is the

responsibility of the system to construct a query-evaluation plan that minimizes the cost of

query evaluation.

Once the query plan is chosen, the query is evaluated with that plan, and the result of the

query is output.

Query optimization is the process of selecting the most efficient query-evaluation plan

especially if the query is complex. We do not expect users to write their queries so that they

can be processed efficiently. Rather, we expect the system to construct a query-evaluation

plan that minimizes the cost of query evaluation.

Consider the relational-algebra expression for the query Find the names of all customers

Who have an account at any branch located in Brooklyn.

Data Model

A data Model is a collection of conceptual tools for describing data, data relationship, data

semantics and consistency constraints. There are three different types of model

1) E_R model

2) Network Model

3) Hierarchical Model

20


21/30

1) E_R Model

The Entityrelationship (E-R) model is a high-level data model. It is based on a perception

of a real world that consists of a collection of basic objects, called entities, and of

relationships among these objects.

The relational model is a lower-level model. It uses a collection of tables to represent both

data and the relationships among those data. Its conceptual simplicity has led to its

widespread adoption; today a vast majority of database products are based on the relational

model. Designers often formulate database schema design by first modeling data at a high

level, using the E-R model, and then translating it into the relational model.

The E-R data model employs three basic notions: entity sets, relationship sets, and

attributes.

1) An entity is a thing or object in the real world that is distinguishable from all other

objects. For example, each person in an enterprise is an entity. An entity has a set of

properties, and the values for some set of properties may uniquely identify an entity.

2) Relationship set

3) Attributes

Mapping Cardinalities

Mapping cardinalities, or cardinality ratios, express the number of entities to which another

entity can be associated via a relationship set. Mapping cardinalities are most useful in

describing binary relationship sets, although they can contribute to the description of

relationship sets that involve more than two entity sets. In this section, we shall concentrate

on only binary relationship sets.

For a binary relationship setRbetween entity setsA andB, the mapping cardinality must be

one of the following:

One to one. An entity inA is associated with at mostone entity inB, and an entity inB is

associated with at mostone entity inA.

One to many. An entity inA is associated with any number (zero or more) of entities in

B. An entity inB, however, can be associated with at mostone entity inA.

Many to one. An entity in A is associated with at mostone entity in B. An entity in B,

however, can be associated with any number (zero or more) of entities in A.

21


22/30

Many to many. An entity inA is associated with any number (zero or more) of entities in

B, and an entity inB is associated with any number (zero or more) of entities inA.

Entity-Relationship Diagram

An E-R diagram can express the overall logical structure of a database graphically. Such a

diagram consists of the following major components: Rectangles, which represent entity sets

Ellipses, which represent attributes

Diamonds, which represent relationship sets

Lines, which link attributes to entity sets and entity sets to relationship sets

Double ellipses, which represent multivalued attributes

22


23/30

Dashed ellipses, which denote derived attributes

Double lines, which indicate total participation of an entity in a relationship set

Double rectangles, which represent weak entity sets

2)

Network Model

The popularity of the network data model coincided with the popularity of the hierarchical

data model. Some data were more naturally modeled with more than one parent per child.

So, the network model permitted the modeling of many-to-many relationships in data. A set

23

Employee

Name

First name

Middlename

Last name

Student

DOB


24/30

consists of an owner record type, a set name, and a member record type. A member record

type can have that role in more than one set; hence the multiparent concept is supported.

An owner record type can also be a member or owner in another set. The CODASYL

network model is based on mathematical set theory.

The network model's original inventor was Charles Bachman, and it was developed into a

standard specification published in 1969 by the CODASYL Consortium.

In network model data are represented by a collection of records and relationships among

data are represented by links. It is used to handle non hierarchical relationshipshis model

uses pointers toward stored data.

Network Model

A network database consists of a collection of records connected to one another through

links. A record is in many respects similar to an entity in the E-R model. Each record is a

collection of fields (attributes), each of which contains only one data value. A link is an

association between precisely two records.

Advantage

1) Simplicity

The network model is also conceptually simple and easy to design

2) Ability to handle more relationship types

The network model can handle one to many and many to many relationship.

24
http://en.wikipedia.org/wiki/Charles_Bachmanhttp://en.wikipedia.org/wiki/CODASYLhttp://en.wikipedia.org/wiki/File:Network_Model.jpghttp://en.wikipedia.org/wiki/Charles_Bachmanhttp://en.wikipedia.org/wiki/CODASYL


25/30

3) Ease of data access

In network database terminology, a relationship is a set. Each set comprises of two

types os records-an owner record and a member record. In a network model an

application can access an owner record and then the menber records within a set.

4) Data Integrity

In network model no member can exit without an owner. A user must therefore first

define the owner record and then the menber record.tgis ensure the data integrity.

5) Data Indpendence

The apllication program run independently of the data.any changes made in the data

characheristics do not effect the application program

6) Database Standards

The standards form by the DBTG (Database Task Group of CODASYL comittee)

form th basis of the network model.all the network database management systems

adhere to these standards . These standards comprise of a (DDL) and DML that

augments database administrotion and portability.

Disadvantage

1) System coplexity

In this model data can be accessed one record at a time. This make it essential for

the database designer administrators and programmer to be familiar with the

internal data structure to gain access to data.therefore user friendly database

management system cannot be created by Netwok model

2) Lack of structural Independence

Making the structural modification to the database is very difficult as data access

method is navigational

3) Hierarchical model

The data is sorted hierarchically, using a downward tree. This model uses pointers to

navigate between stored data. It was the first DBMS model.

25


26/30

A hierarchical data model is a data model in which the data is organized into a tree-like

structure. The structure allows repeating information using parent/child relationships: each

parent can have many children but each child only has one parent. All attributes of a

specific record are listed under an entity type.

Example of a Hierarchical Model.

In a database, an entity type is the equivalent of a table; each individual record is

represented as a row and an attribute as a column. Entity types are related to each other

using 1: Nmapping, also known as one-to-many relationships.

The most recognized and used hierarchical databases are IMS developed by IBM and

Windows Registry by Microsoft.

An example of a hierarchical data model would be if an organization had records of

employees in a table (entity type) called "Employees". In the table there would be

attributes/columns such as First Name, Last Name, Job Name and Wage. The company

26
http://en.wikipedia.org/wiki/Data_modelhttp://en.wikipedia.org/wiki/Tree_data_structurehttp://en.wikipedia.org/wiki/One-to-manyhttp://en.wikipedia.org/wiki/Information_Management_Systemhttp://en.wikipedia.org/wiki/International_Business_Machineshttp://en.wikipedia.org/wiki/Windows_Registryhttp://en.wikipedia.org/wiki/Microsofthttp://en.wikipedia.org/wiki/File:Hierarchical_Model.jpghttp://en.wikipedia.org/wiki/Data_modelhttp://en.wikipedia.org/wiki/Tree_data_structurehttp://en.wikipedia.org/wiki/One-to-manyhttp://en.wikipedia.org/wiki/Information_Management_Systemhttp://en.wikipedia.org/wiki/International_Business_Machineshttp://en.wikipedia.org/wiki/Windows_Registryhttp://en.wikipedia.org/wiki/Microsoft


27/30

also has data about the employees children in a separate table called "Children" with

attributes such as First Name, Last Name, and date of birth. The Employee table represents

a parent segment and the Children table represents a Child segment. These two segments

form a hierarchy where an employee may have many children, but each child may only

have one parent. In a hierarchical database the parent-child relationship is one to many.

This restricts a child segment to having only one parent segment.

This simple model is commonly known as the adjacency list model, and was introduced by

Dr. Edgar F. Codd after initial criticisms surfaced that the relational model could not model

hierarchical data.

Advantage

1) Simplicity

Data base is based on hierarchical structure therelationship between various layers

is logically simple.Thus design of data model is simple

2) Data security

It is the first model which provide data security and enforced by DBMS

3) Data Integrity

As it is based on parent child relationship.there is always a link between parent

segment and the child segment under it.The child segments are always

automatically referenced to its parent, this model promotes data integrity

4) Efficiency

This model is very efficient when the database contains a large number of 1: n

relationship and when the users require large number of trascations using data

whose relationships are fixed

Disadvantage

27
http://en.wikipedia.org/wiki/Adjacency_listhttp://en.wikipedia.org/wiki/Edgar_F._Coddhttp://en.wikipedia.org/wiki/Adjacency_listhttp://en.wikipedia.org/wiki/Edgar_F._Codd


28/30

1) Implementation Complexity

Although the model is simple and easy to design ,it is quite complex to implement

The database designer should have very good knowledge of the data storage

characterics.

2) Database Management Problem

If you make any changes in the database structure of hierarchical database then it is

required to make the necessary changes in all the application prgrams that accessthe

database.Thus maintaining the database and the applications can become very

difficult.

3) Lack of structural Independence

Structural independence exits when the change made in the database structure does

not affect the DBMSs ability to access the data. It uses the physical storage paths

to navigates to the different data segments .so application programmer should have

a good knowledge of the relevant data path to access data.so if the physical

structure is changed the application will also have to be altered.

4) Programming Complexity

Due to structural independence and navigational structure the application

programmers and end user must know precisely how the data is distributed

physically in the database in order to access the data.It requires the more knowledge

of a system which is difficult for the user who have little or no programming

knowlegde

5) Implementation Limitation

Many of the common relationship de not conform to the 1:n format required by this

model.the m:m (Many to many) relationship,which is more common in real life is

very difficult to implement in this model

28


29/30

Client Server Architecture

The architecture of a database system is greatly influenced by the underlying

computer system on which it runs, in particular by such aspects of computer architecture

as networking, parallelism, and distribution:

Networking of computers allows some tasks to be executed on a server

system, and some tasks to be executed on client systems. This division of work has led to

clientserver database systems

As personal computers became faster, more powerful, and cheaper, there

was a shift away from the centralized system architecture.

Personal computers support terminals connected to centralized systems that

are used to handle directly by the centralized systems.

As a result, centralized systems today act as server systems that satisfy

requests generated by client systems.

Figure 1 shows the general structure of a clientserver system. Database

functionality can be broadly divided into two partsthe front end and the back endas

in Figure 2

The back end manages access structures, query evaluation and

optimization, concurrency control, and recovery. The front end of a database system

consists of tools such as forms, report writers, and graphical user interface facilities.

The interface between the front end and the back end is through SQL, or

through an application program to interface clients with servers.

29


30/30

Any client that uses the ODBC or JDBC interfaces can connect to any server that

provides the interface.

In earlier-generation database systems, the lack of such standards necessitated that

the front end and the back end be provided by the same software vendor.

With the growth of interface standards, the front-end user interface and the back-

end server are often provided by different vendors.

Application development toolsare used to construct user interfaces; they provide

graphical tools that can be used to construct interfaces without any programming.

Some of the popular application development tools are PowerBuilder, Magic, and

Borland Delphi; Visual Basic is also widely used for application development.

Further, certain application programs, such as spreadsheets and statistical-

analysis packages, use the clientserver interface directly to access data from a back-end

server.

In effect, they provide front ends specialized for particular tasks some transaction-

processing systems provide a transactional remote procedure call interface to connect

clients with a server.

These calls appear like ordinary procedure calls to the programmer, but all the

remote procedure calls from a client areenclosed in a single transaction at the server end.

Thus, if the transaction aborts, the server can undo the effects of the individual

remote procedure calls.

chapter1_rdbms

Documents