lectures ppt for unit 1
TRANSCRIPT
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 1/81
What is Data and Information?
Data is the term, that may be new to beginners, but it is veryinteresting and simple to understand. It can be anything like
name of a person or a place or a number etc. Data is the name
given to basic facts and entities such as names and numbers.
The main examples of data are weights, prices, costs, numbers
of items sold, employee names, product names, addresses, taxcodes, registration marks etc.
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 2/81
Information: Information is data that has been converted into
a more useful or intelligible form. It helps human beings in
their decision making process. Examples are: Time Table,Merit List, Report card, Headed tables, printed documents,
pay slips, receipts, reports etc. The information is obtained by
assembling items of data into a meaningful form.
For example, marks obtained by students and their roll
numbers form data, the report card/sheet is the information.
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 3/81
Difference between Data and
Information
Data is the material on which computer programs work upon. It
can be numbers, letters of the alphabet, words, special symbols
etc. But by themselves they have no meaning. For example, thefollowing sequence of digits 240343 is meaningless by itself
since it could refer to a data of birth, a part number for a
automobile, the number of rupees spent on a project, population
of a town, the number of people employed in a largeorganization etc. Once we know what the sequence refers to,
then it becomes meaningful and can be called information.
When we write above as 24-03-43, it may mean date of birth as
24th March 1943.
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 4/81
A set of words would be data but text would be information.
For example “ANNUAL-EXAMINATION, AMITABH,
JYOTSNA, PHYSICS” is a set of data and “JYOTSNA
SCORED THE HIGHEST MARKS IN PHYSICS IN
ANNUAL EXAMINATION” is information.
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 5/81
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 6/81
Database
The related information when placed is an organized form makes a
database. The organization of data/information is necessary becauseunorganized information has no meaning. There are so many examples
of organized information, more precisely and the most common are, the
dictionary, the telephone directory, student record register, your own
address book and many more. In each of these the data is stored in some
particular order i.e. in an organized form.
In dictionary, the words are arranged in alphabetic order along with their
meanings. So that it becomes easier to search any word whose meaning
is required. If this ordering would not have done, how could you find one
word out of say 10,000 words. Similarly everybody can make a database
of his/her own to keep the information in an organized manner. Think of
your own address directory where you keep the addresses and phone
numbers of your near and dear ones and it is also a database. Now let us
move one step ahead. What do we do with that database?
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 7/81
There are so many operations like:
¨ To add new information (e.g. to add the address of a new friend in
your address book)
To view or retrieve the stored information (e.g. you have to find
the address of one of your old friends)
¨ To modify or edit the existing information (e.g. your friend hasshifted to a new place so his address would get changed)
¨ To remove or delete the unwanted information (e.g. your friend has
changed his/her mobile number, so his/her mobile number would have to
be removed from list)¨ Arranging the information in a desired order etc.
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 8/81
Manual database and its
problems Consider an example of accounts department of an organization. To
make the salary calculations of the employees every month they are to
keep the record of every employee and do a number of calculations such
as addition of allowances like DA, HRA to the basic salary and to makeseveral deductions as loan recoveries, income tax and insurance etc and
at the end, to make the pay slips of the net pay. This whole procedure is
repeated every month and is very tedious and laborious job. It‟s a mere
calculation job and does not require any logic or intelligence. So to waste
the skills and intelligence of human beings on such repetitivecalculations is not a wise decision.
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 9/81
Consider another situation where a magazine publisher, who has 10000
subscribers, receives a cheque from Mr. Suneet Bhatia with a request to
renew his subscription for the magazine, but Mr. Suneet Bhatia does not
mention his subscription number. Now, the publisher has to search the
entire list of 10,000 names to find out the subscription number of Mr.
Suneet Bhatia. This is a boring job, isn‟t it?
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 10/81
Consider the case list of shareholders maintained by a public limited
company. Usually there are thousands of shareholders in each company.
Some large companies may have even millions of shareholders. Every
day, the company receives several requests for a change of address.
Similarly, when a person sells a share certificate to another person, the
buyer sends the share certificate to the company for transferring of the
corresponding shares to his name. Then the company has to update its
records.
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 11/81
The company also has to send regular communications for meetings of
shareholders, annual accounts, payment of dividends, offers of additionalshares, etc. Next, assume that a company, which has one million share
holders wants to pay dividend. And according to the latest Government
regulations the company must complete the job within 10 days. Now, to
calculate the dividend payable to its one million shareholders, and writecheques and mail letters, in ten days, even by putting 100 people on the
job, is very difficult to accomplish within the time frame. By employing
extra staff or by taking additional time, the company may be able to
complete the job. However, there are bound to be several mistakes, such
as a wrong calculation of the dividend amount, or mailing the letters to
wrong addresses.
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 12/81
Database and Computers
¨ Computer has a large storage capacity. It can store thousands of
records at a time.
¨ It has high speed, within no time it searches any desired
information, arrange the data in alphabetical order, do calculations onthe data and make repetitions and so on.
¨ Computer is more accurate.
¨ Data in computers can be stored in the form of a file, records andfields.
¨ There are two approaches for storing data in computers such as
File based approach and Database approach.
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 13/81
File Based Approach
File Based system: File-based systems were an early attempt to
computerize the manual filing system that we are all familiar with.
A file system is a method for storing and organizing computer files and
the data they contain to make it easy to find and access them.
File systems may use a storage device such as a hard disk or CD-ROM
and involve maintaining the physical location of the files
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 14/81
In our own home, we probably have some sort of filing system, which
contains receipts, guarantees, invoices, bank statements, and such like.
When we need to look something up, we go to the filing system andsearch through the system starting from the first entry until we find what
we want.
Alternatively, we may have an indexing system that helps to locate what
we want more quickly. For example we may have divisions in the filingsystem or separate folders for different types of item that are in some
way logically related.
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 15/81
The manual filing system works well when the number of items to bestored is small. It even works quite adequately when there are large
numbers of items and we have only to store and retrieve them. However,
the manual filing system breaks down when we have to cross-reference
or process the information in the files. For example, a typical real estate
agent‟s office might have a separate file for each property for sale or rent,
each potential buyer and renter, and each member of staff.
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 16/81
Clearly the manual system is inadequate for this type of work. The file
based system was developed in response to the needs of industry for
more efficient data access. In early processing systems, an
organization‟s information was stored as groups of records in separate
files. These file processing systems consisted of a few data files and
many application programs.
Each file called a flat file, contained and processed information for one
specific function, such as accounting or inventory.
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 17/81
In addition, early programmers focused on physical implementation andaccess procedures when designing a database. These physical procedures
were written into database application; therefore, physical changes
resulted in intensive rework on the part of the programmer. As systems
became more complex, file processing systems offered little flexibility,
presented many limitations, and were difficult to maintain.
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 18/81
There are following problems associated with the File Based Approach:
Separated and Isolated Data
To make a decision, a user might need data from two separate files. First,
the files were evaluated by analysts and programmers to determine the
specific data required from each file and the relationships between the
data and then applications could be written in a programming languageto process and extract the needed data. Imagine the work involved if data
from several files was needed.
Limitations of the File-Based Approach
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 19/81
Duplication of data
Often the same information is stored in more than one file. Uncontrolled
duplication of data is not required for several reasons, such as:
¨ Duplication is wasteful. It costs time and money to enter the data
more than once.
¨ It takes up additional storage space, again with associated costs.
¨ Duplication can lead to loss of data integrity; in other words the data
is no longer consistent. For example, consider the duplication of data
between the Payroll and Personnel departments. If a member of staff
moves to new house and the change of address is communicated only to
Personnel and not to Payroll, the person‟s pay slip will be sent to thewrong address. A more serious problem occurs if an employee is
promoted with an associated increase in salary. Again, the change is
notified to Personnel but the change does not filter through to Payroll.
Now, the employee is receiving the wrong salary. When this error isdetected, it will take time and effort to resolve.
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 20/81
Data Dependence
In file processing systems, files and records were described by specific
physical formats that were coded into the application program by
programmers. If the format of a certain record was changed, the code in
each file containing that format must be updated. Furthermore,
instructions for data storage and access were written into the
application‟s code.
Therefore, changes in storage structure or access methods could greatlyaffect the processing or results of an application.
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 21/81
Difficulty in representing data from the user’s view
To create useful applications for the user, often data from various files
must be combined. In file processing it was difficult to determine
relationships between isolated data in order to meet user requirements.
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 22/81
Data Inflexibility
Program-data interdependency and data isolation, limited the flexibility
of file processing systems in providing users with ad hoc information
requests.
Incompatible file formats
As the structure of files is embedded in the application programs, the
structures are dependent on the application programming language. For
example, the structure of a file generated by a COBOL program may be
different from the structure of a file generated by a „C‟ program. The
direct incompatibility of such files makes them difficult to process jointly.
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 23/81
Database Approach
In order to remove all the above limitations of the File Based Approach,
a new approach was required that must be more effective known as
Database approach.
A database is a computer based record keeping system whose over all
purpose is to record and maintain information. The database is a single,
large repository of data, which can be used simultaneously by many
departments and users.
Instead of disconnected files with redundant data, all data items are
integrated with a minimum amount of duplication. The database is no
longer owned by one department but is a shared corporate resource.
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 24/81
The database holds not only the organization‟s operational data but also a
description of this data. For this reason, a database is also defined as a
self-describing collection of integrated records. The description of the
data is known as the Data Dictionary or Meta Data (the „data about
data‟). It is the self-describing nature of a database that providesprogram-data independence.
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 25/81
A database implies separation of physical storage from use of the data byan application program to achieve program/data independence.
Using a database system, the user or programmer or application
specialist need not know the details of how the data are stored and such
details are “transparent to the user”. Changes (or updating) can be madeto data without affecting other components of the system. These changes
include, for example, change of data format or file structure or relocation
from one device to another.
Characteristics of data in a database
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 26/81
Characteristics of data in a database
The data in a database should have the following features:
Shared: Data in a database are shared among different users and
applications.
Validity/Integrity/Correctness: Data should be correct with respect to
the real world entity that they represent.
Security: Data should be protected from unauthorized access.Consistency: Whenever more than one data element in a database
represents related real world values, the values should be consistent with
respect to the relationship.
Non-redundancy: No two data items in a database should represent the
same real world entity.
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 27/81
Independence: Data at different levels should be independent of each
other so that the changes in one level should not affect the other levels.
To create, manage and manipulate data in databases, a management
system known as database management system was developed.
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 28/81
The Database Management System (DBMS)
DBMS A database management system is the software system
that allows users to define, create and maintain a database and providescontrolled access to the data.
A database management system (DBMS) is basically a collection of
programs that enables users to store, modify, and extract information
from a database as per the requirements. DBMS is an intermediate layer
between programs and the data. Programs access the DBMS, which then
accesses the data. There are different types of DBMS ranging from
small systems that run on personal computers to huge systems that runon mainframes. The following are main examples of database
applications:
Computerized library systems
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 29/81
Computerized library systems
¨ Automated teller machines
¨ Flight reservation systems
¨ Computerized parts inventory systems
Commercially available Database management systems in the market
are dbase, Foxpro, IMS and Oracle.
These systems allow users to create, update, and extract information
from their databases. Compared to a manual filing system, the biggestadvantages to a computerized database system are speed, accuracy, and
accessibility.
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 30/81
DBMS provides the following languages to provide the facilities to its
users:
Data Definition Language (DDL): It is a language that allows the users to define data and their relationship
to other types of data. It is mainly used to create files, databases, data
dictionary and tables within databases.
It is also used to specify the structure of each table, set of associated
values with each attribute, integrity constraints, security and
authorization information for each table and physical storage structure of
each table on disk.
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 31/81
Data Manipulation Language (DML)
It is a language that provides a set of operations to support the basic data
manipulation operations on the data held in the databases. It allows users
to insert, update, delete and retrieve data from the database. The part of
DML that involves data retrieval (SELECT command) is called a query
language.
DML can be distinguished between two types: procedural and non-
procedural.
The procedural language specify how the output of a DML statement
must be obtained, while the non-procedural DMLs describe only what
output is to be obtained. Typically procedural language treats records
individually, while non-procedural languages operate on sets of records.
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 32/81
Components of the DBMS Environment
Hardware
¨ Software
¨ Data
¨ Users
Procedures
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 33/81
Hardware
The hardware is the actual computer system used for keeping andaccessing the database. Conventional DBMS hardware consists of
secondary storage devices, usually hard disks, on which the database
physically resides, together with the associated Input- Output devices,
device controllers and so forth. Databases run on a range of machines,
from Micro-computers to large mainframes. Other hardware issues for a
DBMS includes database machines, which is hardware designed
specifically to support a database system.
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 34/81
Software
The software is the actual DBMS. Between the physical database itself
(i.e. the data as actually stored) and the users of the system is a layer of software, usually called the Database Management System or DBMS. All
requests from users for access to the database are handled by the DBMS.
One general function provided by the DBMS is thus the shielding of
database users from complex hardware-level detail.The DBMS allows the users to communicate with the database. In a
sense, it is the mediator between the database and the users. The DBMS
controls the access and helps to maintain the consistency of the data.
Utilities are usually included as part of the DBMS. Some of the mostcommon utilities are report writers, application development tools and
other design aids. Examples of DBMS software include Microsoft
Access, Oracle Corporation‟s Personal Oracle and IBM DB2.
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 35/81
Data
It is the most important component of DBMS environment from the endusers point of view. As shown in above figure 1.2 observe that data acts
as a bridge between the machine components and the user components.
The database contains both the operational data and the meta-data, the
„data about data‟.
The database should contain all the data needed by the organization. One
of the major features of databases is that the actual data are separated
from the programs that use the data. A database should always be
designed, built and populated for a particular audience and for a specificur ose.
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 36/81
Users
There are a number of users who can access or retrieve data on demand
using the applications and interfaces provided by the DBMS. Each type
of user needs different software capabilities. The users of a database
system can be classified in the following groups, depending on their
degrees of expertise or the mode of their interactions with the DBMS.
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 37/81
The users can be:
¨ Naive Users
¨ Online Users
¨ Application Programmers
¨ DBA
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 38/81
Naive Users
Naive Users are those users who need not be aware of the presence of thedatabase system or any other system supporting their usage. Naive users
are end users of the database who work through a menu driven
application program where the type and range of response is always
indicated to the user.
A user of an Automatic Teller Machine (ATM) falls in this category. The
user is instructed through each step of a transaction. He or she then
responds by pressing a coded key or entering a numeric value.
The operations that can be performed by naïve users are very limited and
affect only a precise portion of the database. For example, in the case of
the user of the Automatic Teller Machine, user‟s action affects only one
or more of his/her own accounts.
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 39/81
Online Users
Online users are those who may communicate with the database directly
via an online terminal or indirectly via a user interface and applicationprogram. These users are aware of the presence of the database system
and may have acquired a certain amount of expertise with in the limited
interaction permitted with a database.
Application Programmers
Professional programmers are those who are responsible for developing
application programs or user interface. The application programs could
be written using general purpose programming language or the
commands available to manipulate a database.
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 40/81
Database Administrator
The database administrator (DBA) is the person or group in charge for
implementing the database system within an organization. The DBAhas all the system privileges allowed by the DBMS and can assign
(grant) and remove (revoke) levels of access (privileges) to and form
other users. DBA is also responsible for the evaluation, selection and
implementation of DBMS package.
P d
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 41/81
Procedures
Procedures refer to the instructions and rules that govern the design and
use of the database. The users of the system and the staff that manage the
database require documented procedures on how to use or run thesystem.
These may consist of instructions on how to:
¨ Log on to the DBMS.¨ Use a particular DBMS facility or application program.
¨ Start and stop the DBMS.
¨ Make backup copies of the database.
¨ Handle hardware or software failures.
¨ Change the structure of a table, reorganize the database across
multiple disks, improve performance, or archive data to secondary
storage.
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 42/81
Advantages of DBMS Controlling Redundancy
In file system, each application has its own private files, which cannot
be shared between multiple applications. This can often lead to
considerable redundancy in the stored data, which results in wastage of
storage space. By having centralized database most of this can be
avoided. It is not possible that all redundancy should be eliminated.
Sometimes there are sound business and technical reasons for
maintaining multiple copies of the same data. In a database system,
however this redundancy can be controlled.
For example: In case of college database, there may be the number of
applications like General Office, Library, Account Office, Hostel etc.
Each of these applications may maintain the following information into
own private file applications:
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 43/81
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 44/81
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 45/81
Integrity can be enforced
Integrity of data means that data in database is always accurate, such
that incorrect information cannot be stored in database. In order to
maintain the integrity of data, some integrity constraints are enforced on
the database. A DBMS should provide capabilities for defining and
enforcing the constraints.
For Example: Let us consider the case of college database and suppose
that college having only BA, BCA, BIT, BBA and BCOM classes. But if
a user enters the class MCA, then this incorrect information must not be
stored in database and must be prompted that this is an invalid data entry.
In order to enforce this, the integrity constraint must be applied to the
class attribute of the student entity. But, in case of file system thisconstraint must be enforced on all the application separately (because all
applications have a class field).
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 46/81
In case of DBMS, this integrity constraint is applied only once on the
class field of the General Office (because class field appears only once inthe whole database), and all other applications will get the class
information about the student from the General Office table so the
integrity constraint is applied to the whole database. Now, we can say
that integrity constraint can be easily enforced in centralized DBMSsystem as compared to file system.
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 47/81
Inconsistency can be avoided
When the same data is duplicated and changes are made at one site,
which is not propagated to the other site, it gives rise to inconsistency
and the two entries regarding the same data will not agree. At such times
the data is said to be inconsistent. So if the redundancy is removed
chances of having inconsistent data is also removed.
Let us again consider the college system and suppose that in case of
General_Office file it is indicated that Roll_Number 5 lives in Amritsar
but in library file it is indicated that Roll_Number 5 lives in Jalandhar.
Then this is a state at which the two entries of the same object do not
agree with each other (that is one is updated and other is not). At such
time the database is said to be inconsistent.
An inconsistent database is capable of supplying incorrect or conflicting
information. So, there should be no inconsistency in database. It can be
clearly shown that inconsistency can be avoided in centralized system
very well as compared to file system.
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 48/81
Let us consider again the example of college system and suppose that
RollNo 5 is shifted from Amritsar to Jalandhar, then address information
of Roll Number 5 must be updated, whenever Roll number and address
occurs in the system. In case of file system, the information must be
updated separately in each application, but if we make updation only at
three places and forget to make updation at fourth application, then the
whole system show the inconsistent results about Roll Number 5.
In case of DBMS, Roll number and address occurs together only single
time in General_Office table. So, it needs single updation and then all
other application retrieve the address information from General_Office
which is updated so, all application will get the current and latest
information by providing single update operation and this single updateoperation is propagated to the whole database or all other application
automatically, this property is called as Propagation of Update.
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 49/81
Data can be shared
As explained earlier, the data about Name, Class, Father_name etc. of
General_Office is shared by multiple applications in centralized DBMS
as compared to file system so now applications can be developed to
operate against the same stored data. The applications may be developed
without having to create any new stored files.
Standards can be enforced
Since DBMS is a central system, so standard can be enforced easily may
be at Company level, Department level, National level or International
level. The standardized data is very helpful during migration or
interchanging of data. The file system is an independent system sostandard cannot be easily enforced on multiple independent applications.
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 50/81
Restricting unauthorized access
When multiple users share a database, it is likely that some users will
not be authorized to access all information in the database. For exampleaccount office data is often considered confidential, and hence only
authorized persons are allowed to access such data. In addition, some
users may be permitted only to retrieve data, whereas other are allowed
both to retrieve and to update. Hence the type of access operation
retrieval or update must also be controlled. Typically, users or usergroups are given account numbers protected by passwords, which they
can use to gain access to the database. A DBMS should provide a
security and authorization subsystem, which the DBA uses to create
accounts and to specify account restrictions. The DBMS should thenenforce these restrictions automatically.
Note: Without such checks the security of the data may actually be more
at risk in database system than in a traditional file system.
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 51/81
Solving enterprise requirement than individual requirement
Since many types of users with varying level of technical knowledge use
a database, a DBMS should provide a variety of user interface. The
overall requirements of the enterprise are more important than the
individual user requirements. So the DBA can structure the database
system to provide an overall service that is “best for the enterprise”.
For example: A representation can be chosen for the data in storage that
gives fast access for the most important application at the cost of poor
performance in some other application. But the file system favors the
individual requirements than the enterprise requirements.
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 52/81
Providing Backup and Recovery
A DBMS must provide facilities for recovering from hardware or
software failures. The backup and recovery subsystem of the DBMS is
responsible for recovery. For example, if the computer system fails in the
middle of a complex update program, the recovery subsystem is
responsible for making sure that the database is restored to the state it
was in before the program started executing.
i f S
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 53/81
The disadvantages of the database approach are summarized as follows:
Complexity
The provision of the functionality that is expected of a good DBMS
makes the DBMS an extremely complex piece of software. Database
designers, developers, database administrators and end-users must
understand this functionality to take full advantage of it. Failure tounderstand the system can lead to bad design decisions, which can have
serious consequences for an organization.
Disadvantages of DBMS
Si
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 54/81
Size
The complexity and breadth of functionality makes the DBMS an
extremely large piece of software, occupying many megabytes of disk
space and requiring substantial amounts of memory to run efficiently.
Performance
Typically, a File Based system is written for a specific application, such
as invoicing. As result, performance is generally very good. However,the DBMS is written to be more general, to cater for many applications
rather than just one. The effect is that some applications may not run as
fast as they used to.
Higher impact of a failure
The centralization of resources increases the vulnerability of the system.
Since all users and applications rely on the availability of the DBMS, the
failure of any component can bring operations to a halt.
C t f DBMS
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 55/81
Cost of DBMS
The cost of DBMS varies significantly, depending on the environment
and functionality provided. There is also the recurrent annual
maintenance cost.
Additional Hardware costs
The disk storage requirements for the DBMS and the database may
necessitate the purchase of additional storage space. Furthermore, to
achieve the required performance it may be necessary to purchase a
larger machine, perhaps even a machine dedicated to running the DBMS.
The procurement of additional hardware results in further expenditure.
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 56/81
Cost of Conversion
In some situations, the cost of the DBMS and extra hardware may be
insignificant compared with the cost of converting existing applications
to run on the new DBMS and hardware. This cost also includes the cost
of training staff to use these new systems and possibly the employmentof specialist staff to help with conversion and running of the system. This
cost is one of the main reasons why some organizations feel tied to their
current systems and cannot switch to modern database technology.
Wh t t U DBMS
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 57/81
When not to Use a DBMS
In spite of the advantages of using a DBMS, there are a few situations in
which such a system may involve unnecessary overhead costs, as that
would not be incurred in traditional file processing.
The overhead costs of using a DBMS are due to the following:
¨ High initial investment in hardware, software, and training
¨ Generality that a DBMS provides for defining and processing data.
¨ Overhead for providing security, concurrency control, recovery, and
integrity functions.
Additional problems may arise if the database designers and DBA do notproperly design the database or if the database systems applications are
not implemented properly.
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 58/81
Hence, it may be more desirable to use regular files under the following
circumstances:
¨ The database and applications are simple, well defined and not
expected to change.
¨ There are tight real-time requirements for some programs that may
not be met because of DBMS overhead.
¨ Multiple user access to data is not required.
¨ An application may need to manipulate the data in a way not
supported by the query language.
Comparison of File Management System with Database
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 59/81
Comparison of File Management System with Database
Management System
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 60/81
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 61/81
An early proposal for a standard terminology and general architecture
for database systems was produced in 1971 by the DBTG (Data
Base Task Group) appointed by the Conference on Data Systems and
Languages (CODASYL, 1971). The DBTG recognized the need for atwo level approach with a system view called the schema and user
views called subschema. The American National Standards Institute
(ANSI) recognized the need for a three level approach with a system
catalog.
Three Level Architecture
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 62/81
There are following three levels or layers of DBMS architecture:
¨ External Level
¨ Conceptual Level
¨ Internal Level
Objective of the Three Level Architecture
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 63/81
Objective of the Three Level Architecture
¨ Each user should be able to access the same data, but have a
different customized view of the data.
¨ User‟s interaction with the database should be independent of
storage considerations.
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 64/81
¨ The Database Administrator (DBA) should be able to change the
database storage structures without affecting the user‟s views.
¨ The internal structure of the database should be unaffected by
changes to the physical aspects of storage, such as the changeover to anew storage device.
¨ The DBA should be able to change the conceptual structure of thedatabase without affecting all users.
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 65/81
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 66/81
External Level or View level
It is the users‟ view of the database. This level describes that part of thedatabase that is relevant to each user. External level is the one, which is
closest to the end users. This level deals with the way in which individual
users view data. Individual users are given different views according to
the user‟s requirement.
External level is also known as the view level. In addition different views
may have different representations of the same data. For example, one
user may view dates in the form (day, month, year), while another may
view dates as (year, month, day).
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 67/81
Conceptual Level or Logical level
It is the community view of the database. This level describes what data
is stored in the database and the relationships among the data. Themiddle level in the three level architecture is the conceptual level.
This level contains the logical structure of the entire database as seen by
the DBA. It is a complete view of the data requirements of the
organization that is independent of any storage considerations. Theconceptual level represents:
¨ All entities, their attributes, and their relationships;
¨ The constraints on the data;
¨ Security and integrity information.
.
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 68/81
This level must not contain any storage dependent details. For instance,
the description of an entity should contain only data types of attributes
(for example, integer, real, character) and their length (such as the
maximum number of digits or characters), but not any storage
considerations, such as the number of bytes occupied. Conceptual level
is also known as the logical level
Internal Level or Storage level
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 69/81
It is the physical representation of the database on the computer. This
level describes how the data is stored in the database. The internal level
is the one that concerns the way the data are physically stored on the
hardware.
The internal level covers the physical implementation of the database to
achieve optimal runtime performance and storage space utilization. It
covers the data structures and file organizations used to store data on
storage devices. It interfaces with the operating system access methods to
place the data on the storage devices, build the indexes, retrieve the data,
and so on.
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 70/81
The internal level is concerned with such things as:
¨ Storage space allocation for data and indexes;
¨ Record descriptions for storage (with stored sizes for data items);
¨ Record placement;
¨ Data compression and data encryption techniques.
There will be only one conceptual view, consisting of the abstract
representation of the database in its entirely. Similarly there will be only
one internal or physical view, representing the total database, as it isphysically stored.
Schema
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 71/81
The overall description of the database is called the Database Schema.
There are three different types of schema in the database corresponding
to each data view of database. In other words, the data views at each of three levels are described by schema.
A schema is defined as an outline or a plan that describes the
records and relationships existing at the particular level. The External
view is described by means of a schema called external schema thatcorrespond to different views of the data. Similarly the Conceptual view
is defined by conceptual schema, which describes all the entities,
attributes, and relationship together with integrity constraints. Internal
View is defined by internal schema, which is a complete description of the internal model, containing definition of stored records, the methods
of representation, the data fields, and the indexes used.
There is only one conceptual schema and one internal schema per
database The schema also describes the way in which data elements at
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 72/81
database. The schema also describes the way in which data elements at
one level can be mapped to the corresponding data elements in the next
level.
Thus, we can say that schema establishes correspondence between the
records and relationships in the two levels. In a relational database, the
schema defines the tables, the fields in each table, and the relationships
between fields and tables. Schema are generally stored in a data
dictionary.
The schema is specified during the database design process and is not
expected to change frequently. However the actual data in the database
may change frequently; The data in the database at any particular point
in time is called a database instance. Therefore, many database instancescan correspond to the same database schema. The schema is sometimes
called the intension of the database, while an instance is called an
extension (or state) of the database.
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 73/81
To understand the difference between the three levels, consider again
the database schema that describes College Database system. If User1
is a Library clerk, the external view would contain only the studentand book information. If User2 is a account office clerk then he/she
may be interested in students detail and fee detail.
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 74/81
Mapping between Views
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 75/81
Mapping between Views
External/Conceptual Mapping: Each external schema is related to the
conceptual schema by the external/conceptual mapping. A mapping
between the external and conceptual views gives the correspondenceamong the records and the relationships of the external and conceptual
views.. There is a mapping from a particular logical record in the
external view to one (or more) conceptual record(s) in the conceptual
view.
Names of the fields and records, for instance, may be different. A number
of conceptual fields can be combined into a single external field, for
example, Last_Name and First_Name at the conceptual level but Name
at the external level. A given external record could be derived from anumber of conceptual records.
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 76/81
Conceptual/Internal Mapping: Conceptual schema is related to the
internal schema by the conceptual/internal mapping. This enables the
DBMS to find the actual record or combination of records in physical
storage that constitute a logical record in conceptual schema. Mapping
between the conceptual and the internal levels specifies the method of
deriving the conceptual record from the physical database.
Differences between Conceptual/Internal Views
¨ Representation of numeric values could be different in the two
views.
¨ Representation of string data can be considered by the two views to
be coded differently.
¨
Data Independence-Achievement of Layered Architecture of DBMS
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 77/81
Data Independence Achievement of Layered Architecture of DBMS
There are two kinds of data independence:
¨ Logical data independence¨ Physical data independence
Logical data independence
Logical data independence indicates that the conceptual schema can bechanged without affecting the existing external schemas. The change
would be absorbed by the mapping between the external and conceptual
levels.
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 78/81
Physical data independence
Physical data independence indicates that the physical storage structures
or devices could be changed without affecting conceptual schema. Thechange would be absorbed by the mapping between the conceptual and
internal levels. Physical data independence is achieved by the presence
of the internal level of the database and the mapping or transformation
from the conceptual level of the database to the internal level.If there is a need to change the file organization or the type of physical
device used as a result of growth in the database or new technology, a
change is required in the conceptual/internal mapping between the
conceptual and internal levels.The physical data independence criterion requires that the conceptual
level does not specify storage structures or the access methods (indexing,
hashing etc.) used to retrieve the data from the physical storage medium.
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 79/81
The Logical data independence is difficult to achieve than physical data
independence as it requires the flexibility in the design of database and
programmer has to foresee the future requirements or modifications in
the design.
Th P d f D b A
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 80/81
The Procedure for Database Access
A user‟s request for data is received by the data manager, which
determines the physical record required. The decision as to whichphysical record is needed may require some preliminary consultation of
the database and/or the data dictionary prior to the access of the actual
data itself.
The data manager sends the request for a specific physical record to the
file manager. The file manager decides which physical block of
secondary storage devices contains the required record and sends the
request for the appropriate block to the disk manager. A block is a unit of physical input/output operations between primary and secondary storage.
The disk manager retrieves the block and sends it to the file manager,
which sends the required record to the data manager.
8/4/2019 Lectures Ppt for Unit 1
http://slidepdf.com/reader/full/lectures-ppt-for-unit-1 81/81