chapter1_rdbms

Upload: omkar-joshi

Post on 07-Apr-2018

225 views

Category:

Documents


0 download

TRANSCRIPT

  • 8/4/2019 Chapter1_RDBMS

    1/30

    Chapter 1

    Database System concept & Data Modeling

    Database concepts

    A data is a piece of information

    A database is a set of interrelated and meaningful data related to an enterprise.

    A database Management system (DBMS) is a collection of interrelated data and a

    set of programs to access those data.

    The primary goal of a DBMS is to provide an efficient way to store & retrieve datato or from a database.

    Database System Applications /Applications of DBMS

    Databases are widely used. Here are some representative applications:

    1. Banking: To maintain customer information, accounts, loans, and banking

    transactions.

    2. Airlines: For flight reservations and schedule information. Airlines were among the

    first to use databases

    3. Universities: For student information, course registrations, and grades.

    4. Credit card transactions: For purchases on credit cards and generation of monthly

    statements.

    5. Telecommunication: For keeping records of phone calls made, generating monthly

    phone bills, maintaining balances on prepaid calling cards, and storing information

    about the communication networks.

    6. Finance: For storing information about holdings, sales, and purchases of financial

    instrument such as stocks and bonds.

    7. Sales: For customer, product, and purchase information.

    1

  • 8/4/2019 Chapter1_RDBMS

    2/30

    8. Manufacturing: For management of supply chain and for keeping track of

    production of items in factories, inventories of items in warehouses/stores, and orders

    for items.

    9. Human resources: For information about employees, salaries, payroll taxes and

    benefits, and for generation of paychecks.

    Disadvantage of File processing System:

    File processing system has a number of disadvantages. They are as follows:

    1) Data redundancy and inconsistency:

    Since different

    programmers create files and application programs.

    Various files have

    different formats and the programs may be written in several programming languages.

    Moreover, the same

    information may be duplicated in several places (files). For example, the address and

    telephone number of a particular student may be stored in office file while taking

    admissions and in department file also. This storing of same data more than once is

    called as data redundancy.

    This redundancy

    leads to higher storage and access cost.

    Data redundancy

    leads to data inconsistency; which means multiple copies of a data may have

    different values for a particular field. For example, a students changed address may be

    updated in the department file but not in the Office file. Thus this leads to incorrect

    data.

    2) Difficulty in accessing data:

    Suppose that a bank officers needs to find out the names of all customers who live

    in kothrud.

    2

  • 8/4/2019 Chapter1_RDBMS

    3/30

    The officer asks the data-processing department to generate such a list. Because the

    designers of the original system did not anticipate this request, there is no application

    program on hand to meet it.

    There is, however, an application program to generate the list of all customers.

    The bank officer has now two choices: either obtain the list of all customers and

    extract the needed information manually or ask a system programmer to write the

    necessary application program.

    Both alternatives are obviously unsatisfactory. Suppose that such a program is

    written, and that, several days later, the same officer needs to trim that list to include

    only those customers who have an account balance of $10,000 or more.

    As expected, a program to generate such a list does not exist. Again, the officer has

    the preceding two options, neither of which is satisfactory.

    The point here is that conventional file-processing environments do not allow

    needed data to be retrieved in a convenient and efficient manner.

    More responsive data-retrieval systems are required for general use.

    3) Data isolation.

    Because data are scattered in various files, and files were in different

    formats, writing new application programs to access the appropriate data

    was difficult.

    4)Integrity problems

    The data values stored in the database must satisfy certain types of

    consistency constraints.

    For example, the balance of a bank account may never fall below a

    prescribed amount (say, $25).

    Developers enforce these constraints in the system by adding appropriate

    code in the various application programs.

    However, when new constraints are added, it is difficult to change the

    programs to enforce them.

    3

  • 8/4/2019 Chapter1_RDBMS

    4/30

    The problem is compounded when constraints involve several data items

    from different files.

    5)Atomicity problems.

    A computer system, like any other mechanical or electrical device, is subject to

    failure.

    In many applications, it is crucial that, if a failure occurs, the data be restored to the

    consistent state that existed prior to the failure.

    Consider a program to transfer $50 from account A to account B. If a system failure

    occurs during the execution of the program, it is possible that the $50 was removed

    from account a but was not credited to account B, resulting in an inconsistent database

    state.

    Clearly, it is essential to database consistency that either both the credit and debit

    occur, or that neither occur.

    That is, the funds transfer must be atomicit must happen in its entirety or not at

    all. It is difficult to ensure atomicity in a conventional file-processing system.

    6)Concurrent-access anomalies

    For the sake of overall performance of the system and faster response, many

    systems allow multiple users to update the data simultaneously.

    In such an environment, interaction of concurrent updates may result in

    inconsistent data.

    Consider bank account a, containing $500. If two customers withdraw funds (say

    $50 and $100 respectively) from account A at about the same time, the result of the

    concurrent executions may leave the account in an incorrect (or inconsistent) state.

    Suppose that the programs executing on behalf of each withdrawal read the oldbalance, reduce that value by the amount being withdrawn, and write the result back.

    If the two programs run concurrently, they may both read the value $500, and write

    back $450 and $400, respectively.

    4

  • 8/4/2019 Chapter1_RDBMS

    5/30

    Depending on which one writes the value last, the account may contain $450 or

    $400, rather than the correct value of $350.

    To guard against this possibility, the system must maintain some form of

    supervision.

    But supervision is difficult to provide because many different application programs

    that have not been coordinated previously may access data.

    7)Security problems.

    Not every user of the database system should be able to access all the data.

    For example, in a banking system, payroll personnel need to see only that part of

    the database that has information about the various bank employees.

    They do not need access to information about customer accounts. But, since

    application programs are added to the system in an ad hoc manner, enforcing such

    security constraints is difficult.

    Advantage of a DBMS over file processing System

    1) Redundancy Can be controlled

    In traditional non-database system, each department maintains its own files for

    handling its data processing application.

    2) Data can be shared

    3) Standard can be enforced

    4) Security measure can be applied

    5) Integrity can be maintained

    6) Conflicting requirements can be balanced

    View of Data

    A database system is a collection of interrelated files and a set of programs that allow users

    to access and modify these files. A major purpose of a database system is to provide users

    with an abstract view of the data. That is, the system hides certain details of how the data

    are stored and maintained

    5

  • 8/4/2019 Chapter1_RDBMS

    6/30

    Data Abstraction

    For the system to be usable, it must retrieve data efficiently. The need for efficiency has led

    designers to use complex data structures to represent data in the database. Since many

    database-systems users are not computer trained, developers hide the complexity fromusers through several levels of abstraction, to simplify users interactions with the system:

    Physical level.

    1) The lowest level of abstraction describes how the data are actually stored. The physical

    level describes complex low-level data structures in detail.

    Logical level.

    1) The next-higher level of abstraction describes what data are stored in the database, and

    what relationships exist among those data.

    2) The logical level thus describes the entire database in terms of a small number of

    relatively simple structures.

    3) It involves complex physical-level structures but its logical level structure is simple. The

    user of the logical level does not need to be aware of this complexity.

    4) Database administrators, who must decide what information to keep in the database, use

    the logical level of abstraction.

    View level.

    1) The highest level of abstraction describes only part of the entire database.

    2) Even though the logical level uses simpler structures, complexity remains because of the

    variety of information stored in a large database.

    3) Many users of the database system do not need all this information; instead, they need to

    access only a part of the database.

    4) The view level of abstraction exists to simplify their interaction with the system. The

    system may provide many views for the same database.

    5) Figure 1.1 shows the relationship among the three levels of abstraction.

    6

  • 8/4/2019 Chapter1_RDBMS

    7/30

    Fig 1.1 the three level of data Abstraction

    Most high-level programming languages support the notion of a record type. For example,

    in a Pascal-like language, we may declare a record as follows:

    Type customer = record

    Customer-id: string;

    Customer-name: string;

    Customer-street: string;

    Customer-city: string; end;

    This code defines a new record type called customer with four fields. Each field has a name

    and a type associated with it. A banking enterprise may have several such record types,

    including

    Account, with fields account-number and balance

    Employee, with fields employee-name and salary

    At the physical level, a customer, account, or employee record can be described as a block

    of consecutive storage locations (for example, words or bytes). The language compiler

    hides this level of detail from programmers. Similarly, the database system hides many of

    the lowest-level storage details from database programmers. Database administrators, on

    the other hand, may be aware of certain details of the physical organization of the data.

    7

  • 8/4/2019 Chapter1_RDBMS

    8/30

    At the logical level, each such record is described by a type definition, as in the previous

    code segment, and the interrelationship of these record types is defined as well.

    Programmers using a programming language work at this level of abstraction. Similarly,

    database administrators usually work at this level of abstraction.

    Finally, at the view level, computer users see a set of application programs that hide

    details of the data types. Similarly, at the view level, several views of the database are

    defined, and database users see these views. In addition to hiding details of the logical level

    of the database, the views also provide a security mechanism to prevent users from

    accessing certain parts of the database. For example, tellers in a bank see only that part of

    the database that has information on customer accounts; they cannot access information

    about salaries of employees.

    Database Languages:

    There are two types of languages as DDL and DML. A database system provides a data

    definition language to specify the database schema and a data manipulation language to

    express database queries and updates.

    In practice, the data definition and data manipulation languages are not two separate

    languages; instead they simply form parts of a single database language, such as the widely

    used SQL language.

    1) Data Manipulation Language

    Data manipulation language (DML) is a language that enables users to access or

    manipulate data as organized by the appropriate data model. The types of access are

    The retrieval of information stored in the database

    The insertion of new information into the database

    The deletion of information from the database

    The modification of information stored in the database

    There are basically two types:

    8

  • 8/4/2019 Chapter1_RDBMS

    9/30

    Procedural DMLs require a user to specify what data are needed and how to get

    those data.

    Declarative DMLs (also referred to as nonprocedural DMLs) require a user to

    Specify whatdata are needed withoutspecifying how to get those data.

    Declarative DMLs are usually easier to learn and use than are procedural DML s. since a

    user does not have to specify how to get the data, the database system has to figure out an

    efficient means of accessing data. The DML component of the SQL language is

    nonprocedural.

    A query is a statement requesting the retrieval of information. The portion of a DML that

    involves information retrieval is called a query language.

    This query in the SQL language finds the name of the customer whose customer-id

    Is 101:

    Select customer. Customer-name

    From customer

    Where customer. customer-id= 101

    The query specifies that those rows from the table customer where the customer-id is

    101 must be retrieved, and the customer-name attribute of these rows must be

    displayed.

    2) Data-Definition Language

    A special language used to specify a database schema by a set of definitions called a data-

    definition language (DDL). For instance, the following statement in the SQL language

    defines the account table:

    Create table account (account-numberchar (10), balance integer (4));

    9

  • 8/4/2019 Chapter1_RDBMS

    10/30

    Execution of the above DDL statement creates the account table. In addition, it updates a

    special set of tables called the data dictionary or data directory. A data dictionary

    contains metadatathat is, data about data . The schema of a table is an example of

    metadata. A database system consults the data dictionary before reading or modifying

    actual data. We specify the storage structure and access methods used by the database

    system by a set of statements in a special type of DDL called a data storage and

    definition language.

    These statements define the implementation details of the database schemas, which are

    usually hidden from the users. The data values stored in the database must satisfy certain

    consistency constraints. For example, suppose the balance on an account should not fall

    below $100. The DDL provides facilities to specify such constraints. The database systems

    check these constraints every time the database is updated.

    1) Domain Constraints

    A domain of possible values must be associated with every attribute. (For example

    integer type, character type, date/time type)

    Declaring an attribute to be of a particular domain acts as a constraints on the values

    that it can take.

    Its an elementary part of the integrity constraints.

    They are tested easily by the system whenever a new data items is inserted into the

    database

    2) Referential Integrity

    In this case, we ensure that the value appear in one relation for a given set of

    attributes also appears for a certain set of attributes in another relation (Referential

    Integrity).

    Database modification can cause violation of the referential integrity

    When a referential integrity constraints is violated, the normal procedure is to reject

    the action that caused the violation

    3) Assertions

    An assertion is the condition that the database must always satisfy

    10

  • 8/4/2019 Chapter1_RDBMS

    11/30

    Domain constraints and referential integrity constraints are the special form of

    the assertion.

    But there are many constraints that we cant specify by using only these special

    forms.

    For example, every loan has at least one customer who maintains an account with

    a minimum balance of $1000.00.

    When an assertion is created, the system test for its validity

    If the assertion is valid, then any future modification to the database is allowed

    without violating the assertion.

    4) Authorization

    We can differentiate among user based on type of access they are permitted

    on various data values in the database are expressed in terms of authorization

    Authorization are of following type

    Read Authorization: which allows reading but not modification of data.

    Insert Authorization: which allows insertion of new data , but not

    modification of the existing data

    Update Authorization: Which allows modification of data but not deletion of

    the data

    Delete Authorization: which allows deletion of data

    Instances And Schema

    Database changes over a time as information is inserted and deleted over a time.

    The collection of information stored in the database at a particular moment is called

    as instance of the database

    The overall design of the database is called the database schema

    The concept of database schema and instances can be understood by comparing it

    with programming language.

    A database schema corresponds to the variable declarations (along with type

    definitions) in a program

    11

  • 8/4/2019 Chapter1_RDBMS

    12/30

    Each variable has a particular value at a given instant. The value of the variable in a

    program at a point in time correspond to an instances of the database schema

    Database system have several schemas based on the level of abstraction

    1) Physical Schema: It describes database schemas at physical level

    2) Logical schema: It describes database schemas at logical level

    3) Subschema: A database may have several schemas at the view level called

    subschema that describes different views of the database.

    Physical data independence

    It is the ability to modify the physical schema without causing application program at the

    physical level is called as Physical data independence.

    Or

    The physical schema is hidden and can usually be changed easily without affecting

    application programs. Application programs are said to exhibit physical data

    independence if they do not depend on the physical schema, and thus need not be rewritten

    if the physical schema changes.

    Logical data independence

    It is the ability to modify the logical schema without causing application program at the

    logical level is called as logical data independence.

    Database Users and Administrators

    A primary goal of a database system is to retrieve information from and store new

    information in the database. People who work with a database can be categorized as

    database users or database administrators.

    Database Users and User Interfaces

    There are four different types of database-system users, differentiated by the way they

    expect to interact with the system. Different types of user interfaces have been designed for

    the different types of users.

    1) Naive users

    These are unsophisticated users who interact with the system by invoking one of the

    application programs that have been written previously.

    12

  • 8/4/2019 Chapter1_RDBMS

    13/30

    For example, a bank teller who needs to transfer $50 from accountA to accountB

    invokes a program called transfer.

    This program asks the teller for the amount of money to be transferred, the account

    from which the money is to be transferred, and the account to which the money is to

    be transferred.

    As another example, consider a user who wishes to find her account balance over

    the World Wide Web. Such a user may access a form, where she enters her account

    number.

    An application program at the Web server then retrieves the account balance, using

    the given account number, and passes this information back to the user.

    2) Application programmers

    These are computer professionals who write application programs. Application

    programmers can choose from many tools to develop user interfaces.

    Rapid application development (RAD) tools are tools that enable an application

    programmer to construct forms and reports without writing a program.

    There are also special types of programming languages that combine imperative

    control structures (for example, for loops, while loops and if-then-else statements)

    with statements of the data manipulation language.

    These languages, sometimes calledfourth-generation languages.

    4) Sophisticated users

    These areinteracting with the system without writing programs. Instead, they form

    their requests in a database query language. They submit each such query to a

    query processor, whose function is to break down DML statements into

    instructions that the storage manager understands.

    Analysts who submit queries to explore data in the database fall in this category

    Online analytical processing (OLAP) tools simplify analysts tasks by letting

    them view summaries of data in different ways.

    13

  • 8/4/2019 Chapter1_RDBMS

    14/30

    For instance, an analyst can see total sales by region (for example, North, South,

    East, and West), or by product, or by a combination of region and product (that is,

    total sales of each product in each region).

    The tools also permit the analyst to select specific regions, look at data in more

    detail (for example, sales by city within a region) or look at the data in less detail

    (for example, aggregate products together by category).

    Another class of tools for analysts is data mining tools, which help them, find

    certain kinds of patterns in data.

    5) Specialized users

    These are sophisticated users who write specialized database applications that

    do not fit into the traditional data-processing framework.

    Among these applications are computer-aided design systems, knowledge

    base and expert systems, systems that store data with complex data types (for

    example, graphics data and audio data), and environment-modeling systems.

    Database Administrator

    One of the main reasons for using DBMSs is to have central control of both the data and

    the programs that access those data. A person who has such central control over the system

    is called a database administrator (DBA).

    The functions of a DBA include:

    Schema definition. The DBA creates the original database schema by executing a set of

    data definition statements in the DDL.

    Storage structure and access-method definition.

    Schema and physical-organization modification. The DBA carries out changes to the

    schema and physical organization to reflect the changing needs of the organization, or to

    alter the physical organization to improve performance.

    Granting of authorization for data access. By granting different types of authorization,

    the database administrator can regulate which parts of the database various users can

    access. The authorization information is kept in a special system structure that the database

    system consults whenever someone attempts to access the data in the system.

    14

  • 8/4/2019 Chapter1_RDBMS

    15/30

    Routine maintenance. Examples of the database administrators routine maintenance

    activities are:

    Periodically backing up the database, either onto tapes or onto remote servers, to

    prevent loss of data in case of disasters such as flooding.

    Ensuring that enough free disk space is available for normal operations, and

    upgrading disk space as required

    Monitoring jobs running on the database and ensuring that performance is not

    degraded by very expensive tasks submitted by some users.

    Database System Structure / Overall Structure

    of DBMS

    15

  • 8/4/2019 Chapter1_RDBMS

    16/30

    16

  • 8/4/2019 Chapter1_RDBMS

    17/30

    A database system is partitioned into modules. The functional components of a database

    system can be broadly divided into the storage manager and the query processor

    components. The storage manager is important because databases typically require a large

    amount of storage space. Corporate databases range in size from hundreds of gigabytes to,

    for the largest databases, terabytes of data. Since the main memory of computers cannot

    store this much information, the information is stored on disks. Data are moved between

    disk storage and main memory as needed. Since the movement of data to and from disk is

    slow relative to the speed of the central processing unit, it is imperative that the database

    system structures the data so as to minimize the need to move data between disk and main

    memory. The query processor is important because it helps the database system simplify

    and facilitate access to data. High-level views help to achieve this goal; with them, users of

    the system are not being burdened unnecessarily with the physical details of the

    implementation of the system. However, quick processing of updates and queries is

    important. It is the job of the database system to translate updates and queries written in a

    nonprocedural language, at the logical level, into an efficient sequence of operations at the

    physical level.

    1) Storage Manager

    Astorage manageris a program module that provides the interface between the low level

    data stored in the database and the application programs and queries submitted to the

    system. The storage manager is responsible for the interaction with the file manager. The

    raw data are stored on the disk using the file system, which is usually provided by a

    conventional operating system. The storage manager translates the various DML statements

    into low-level file-system commands. Thus, the storage manager is responsible for storing,

    retrieving, and updating data in the database.

    The storage manager components include:

    Authorization and integrity manager, which tests for the satisfaction of integrity

    constraints and checks the authority of users to access data.

    Transaction manager, which ensures that the database remains in a consistent (correct)

    state despite system failures, and that concurrent transaction executions proceed without

    conflicting.

    17

  • 8/4/2019 Chapter1_RDBMS

    18/30

    File manager, which manages the allocation of space on disk storage and the data

    structures used to represent information stored on disk.

    Buffer manager, which is responsible for fetching data from disk storage into main

    memory, and deciding what data to cache in main memory. The buffer manager is a critical

    part of the database system, since it enables the database to handle data sizes that are much

    larger than the size of main memory. The storage manager implements several data

    structures as part of the physical system implementation:

    Data files, which store the database itself.

    Data dictionary, which stores metadata about the structure of the database, in particular

    the schema of the database.

    Indices, which provide fast access to data items that hold particular values.

    Statistical data: It stores statistics about the data stored.

    2) The Query Processor

    The query processor components include

    DDL interpreter, which interprets DDL statements and records the definitions in the data

    dictionary.

    DML compiler, which translates DML statements in a query language into an evaluation

    plan consisting of low-level instructions that the query evaluation engine understands. A

    query can usually be translated into any of a number of alternative evaluation plans that all

    give the same result. The DML compiler also performs query optimization, that is, it picks

    the lowest cost evaluation plan from among the alternatives.

    Query evaluation engine, which executes low-level instructions generated by

    The DML compiler.

    Query processing refers to the range of activities involved in extracting data from a

    database. The activities include translation of queries in high-level database languages into

    expressions that can be used at the physical level of the file system

    The steps involved in processing a query appear

    1. Parsing and translation

    18

  • 8/4/2019 Chapter1_RDBMS

    19/30

    2. Optimization

    3. Evaluation

    Before query processing can begin, the system must translate the query into a

    usable form.

    Thus, the first action the system must take in query processing is to translate a given

    query into its internal form.

    In generating the internal form of the query, the parser checks the syntax of the

    users query, verifies that the relation names appearing in the query are names of

    the relations in the database, and so on.

    The system constructs a parse-tree representation of the query, which it then

    translates into a relational-algebra expression.

    If the query was expressed in terms of a view, the translation phase also replaces all

    uses of the view by the relational-algebra expression that defines the view.

    Furthermore, the relational-algebra representation of a query specifies only partially how to

    evaluate a query; there are usually several ways to evaluate relational-algebra expressions.

    As an illustration, consider the query

    Select balance

    from account

    Where balance < 2500

    This query can be translated into either of the following relational-algebra expressions:

    abalance

  • 8/4/2019 Chapter1_RDBMS

    20/30

    .

    A sequence of primitive operations that can be used to evaluate a query is a query

    execution

    Query-evaluation plan. Fig shows an evaluation plan for our example query. The query-

    execution engine takes a query-evaluation plan, executes that plan, and returns the answers

    to the query.

    The different evaluation plans for a given query can have different costs. It is the

    responsibility of the system to construct a query-evaluation plan that minimizes the cost of

    query evaluation.

    Once the query plan is chosen, the query is evaluated with that plan, and the result of the

    query is output.

    Query optimization is the process of selecting the most efficient query-evaluation plan

    especially if the query is complex. We do not expect users to write their queries so that they

    can be processed efficiently. Rather, we expect the system to construct a query-evaluation

    plan that minimizes the cost of query evaluation.

    Consider the relational-algebra expression for the query Find the names of all customers

    Who have an account at any branch located in Brooklyn.

    Data Model

    A data Model is a collection of conceptual tools for describing data, data relationship, data

    semantics and consistency constraints. There are three different types of model

    1) E_R model

    2) Network Model

    3) Hierarchical Model

    20

  • 8/4/2019 Chapter1_RDBMS

    21/30

    1) E_R Model

    The Entityrelationship (E-R) model is a high-level data model. It is based on a perception

    of a real world that consists of a collection of basic objects, called entities, and of

    relationships among these objects.

    The relational model is a lower-level model. It uses a collection of tables to represent both

    data and the relationships among those data. Its conceptual simplicity has led to its

    widespread adoption; today a vast majority of database products are based on the relational

    model. Designers often formulate database schema design by first modeling data at a high

    level, using the E-R model, and then translating it into the relational model.

    The E-R data model employs three basic notions: entity sets, relationship sets, and

    attributes.

    1) An entity is a thing or object in the real world that is distinguishable from all other

    objects. For example, each person in an enterprise is an entity. An entity has a set of

    properties, and the values for some set of properties may uniquely identify an entity.

    2) Relationship set

    3) Attributes

    Mapping Cardinalities

    Mapping cardinalities, or cardinality ratios, express the number of entities to which another

    entity can be associated via a relationship set. Mapping cardinalities are most useful in

    describing binary relationship sets, although they can contribute to the description of

    relationship sets that involve more than two entity sets. In this section, we shall concentrate

    on only binary relationship sets.

    For a binary relationship setRbetween entity setsA andB, the mapping cardinality must be

    one of the following:

    One to one. An entity inA is associated with at mostone entity inB, and an entity inB is

    associated with at mostone entity inA.

    One to many. An entity inA is associated with any number (zero or more) of entities in

    B. An entity inB, however, can be associated with at mostone entity inA.

    Many to one. An entity in A is associated with at mostone entity in B. An entity in B,

    however, can be associated with any number (zero or more) of entities in A.

    21

  • 8/4/2019 Chapter1_RDBMS

    22/30

    Many to many. An entity inA is associated with any number (zero or more) of entities in

    B, and an entity inB is associated with any number (zero or more) of entities inA.

    Entity-Relationship Diagram

    An E-R diagram can express the overall logical structure of a database graphically. Such a

    diagram consists of the following major components: Rectangles, which represent entity sets

    Ellipses, which represent attributes

    Diamonds, which represent relationship sets

    Lines, which link attributes to entity sets and entity sets to relationship sets

    Double ellipses, which represent multivalued attributes

    22

  • 8/4/2019 Chapter1_RDBMS

    23/30

    Dashed ellipses, which denote derived attributes

    Double lines, which indicate total participation of an entity in a relationship set

    Double rectangles, which represent weak entity sets

    2)

    Network Model

    The popularity of the network data model coincided with the popularity of the hierarchical

    data model. Some data were more naturally modeled with more than one parent per child.

    So, the network model permitted the modeling of many-to-many relationships in data. A set

    23

    Employee

    Name

    First name

    Middlename

    Last name

    Student

    DOB

  • 8/4/2019 Chapter1_RDBMS

    24/30

    consists of an owner record type, a set name, and a member record type. A member record

    type can have that role in more than one set; hence the multiparent concept is supported.

    An owner record type can also be a member or owner in another set. The CODASYL

    network model is based on mathematical set theory.

    The network model's original inventor was Charles Bachman, and it was developed into a

    standard specification published in 1969 by the CODASYL Consortium.

    In network model data are represented by a collection of records and relationships among

    data are represented by links. It is used to handle non hierarchical relationshipshis model

    uses pointers toward stored data.

    Network Model

    A network database consists of a collection of records connected to one another through

    links. A record is in many respects similar to an entity in the E-R model. Each record is a

    collection of fields (attributes), each of which contains only one data value. A link is an

    association between precisely two records.

    Advantage

    1) Simplicity

    The network model is also conceptually simple and easy to design

    2) Ability to handle more relationship types

    The network model can handle one to many and many to many relationship.

    24

    http://en.wikipedia.org/wiki/Charles_Bachmanhttp://en.wikipedia.org/wiki/CODASYLhttp://en.wikipedia.org/wiki/File:Network_Model.jpghttp://en.wikipedia.org/wiki/Charles_Bachmanhttp://en.wikipedia.org/wiki/CODASYL
  • 8/4/2019 Chapter1_RDBMS

    25/30

    3) Ease of data access

    In network database terminology, a relationship is a set. Each set comprises of two

    types os records-an owner record and a member record. In a network model an

    application can access an owner record and then the menber records within a set.

    4) Data Integrity

    In network model no member can exit without an owner. A user must therefore first

    define the owner record and then the menber record.tgis ensure the data integrity.

    5) Data Indpendence

    The apllication program run independently of the data.any changes made in the data

    characheristics do not effect the application program

    6) Database Standards

    The standards form by the DBTG (Database Task Group of CODASYL comittee)

    form th basis of the network model.all the network database management systems

    adhere to these standards . These standards comprise of a (DDL) and DML that

    augments database administrotion and portability.

    Disadvantage

    1) System coplexity

    In this model data can be accessed one record at a time. This make it essential for

    the database designer administrators and programmer to be familiar with the

    internal data structure to gain access to data.therefore user friendly database

    management system cannot be created by Netwok model

    2) Lack of structural Independence

    Making the structural modification to the database is very difficult as data access

    method is navigational

    3) Hierarchical model

    The data is sorted hierarchically, using a downward tree. This model uses pointers to

    navigate between stored data. It was the first DBMS model.

    25

  • 8/4/2019 Chapter1_RDBMS

    26/30

    A hierarchical data model is a data model in which the data is organized into a tree-like

    structure. The structure allows repeating information using parent/child relationships: each

    parent can have many children but each child only has one parent. All attributes of a

    specific record are listed under an entity type.

    Example of a Hierarchical Model.

    In a database, an entity type is the equivalent of a table; each individual record is

    represented as a row and an attribute as a column. Entity types are related to each other

    using 1: Nmapping, also known as one-to-many relationships.

    The most recognized and used hierarchical databases are IMS developed by IBM and

    Windows Registry by Microsoft.

    An example of a hierarchical data model would be if an organization had records of

    employees in a table (entity type) called "Employees". In the table there would be

    attributes/columns such as First Name, Last Name, Job Name and Wage. The company

    26

    http://en.wikipedia.org/wiki/Data_modelhttp://en.wikipedia.org/wiki/Tree_data_structurehttp://en.wikipedia.org/wiki/One-to-manyhttp://en.wikipedia.org/wiki/Information_Management_Systemhttp://en.wikipedia.org/wiki/International_Business_Machineshttp://en.wikipedia.org/wiki/Windows_Registryhttp://en.wikipedia.org/wiki/Microsofthttp://en.wikipedia.org/wiki/File:Hierarchical_Model.jpghttp://en.wikipedia.org/wiki/Data_modelhttp://en.wikipedia.org/wiki/Tree_data_structurehttp://en.wikipedia.org/wiki/One-to-manyhttp://en.wikipedia.org/wiki/Information_Management_Systemhttp://en.wikipedia.org/wiki/International_Business_Machineshttp://en.wikipedia.org/wiki/Windows_Registryhttp://en.wikipedia.org/wiki/Microsoft
  • 8/4/2019 Chapter1_RDBMS

    27/30

    also has data about the employees children in a separate table called "Children" with

    attributes such as First Name, Last Name, and date of birth. The Employee table represents

    a parent segment and the Children table represents a Child segment. These two segments

    form a hierarchy where an employee may have many children, but each child may only

    have one parent. In a hierarchical database the parent-child relationship is one to many.

    This restricts a child segment to having only one parent segment.

    This simple model is commonly known as the adjacency list model, and was introduced by

    Dr. Edgar F. Codd after initial criticisms surfaced that the relational model could not model

    hierarchical data.

    Advantage

    1) Simplicity

    Data base is based on hierarchical structure therelationship between various layers

    is logically simple.Thus design of data model is simple

    2) Data security

    It is the first model which provide data security and enforced by DBMS

    3) Data Integrity

    As it is based on parent child relationship.there is always a link between parent

    segment and the child segment under it.The child segments are always

    automatically referenced to its parent, this model promotes data integrity

    4) Efficiency

    This model is very efficient when the database contains a large number of 1: n

    relationship and when the users require large number of trascations using data

    whose relationships are fixed

    Disadvantage

    27

    http://en.wikipedia.org/wiki/Adjacency_listhttp://en.wikipedia.org/wiki/Edgar_F._Coddhttp://en.wikipedia.org/wiki/Adjacency_listhttp://en.wikipedia.org/wiki/Edgar_F._Codd
  • 8/4/2019 Chapter1_RDBMS

    28/30

    1) Implementation Complexity

    Although the model is simple and easy to design ,it is quite complex to implement

    The database designer should have very good knowledge of the data storage

    characterics.

    2) Database Management Problem

    If you make any changes in the database structure of hierarchical database then it is

    required to make the necessary changes in all the application prgrams that accessthe

    database.Thus maintaining the database and the applications can become very

    difficult.

    3) Lack of structural Independence

    Structural independence exits when the change made in the database structure does

    not affect the DBMSs ability to access the data. It uses the physical storage paths

    to navigates to the different data segments .so application programmer should have

    a good knowledge of the relevant data path to access data.so if the physical

    structure is changed the application will also have to be altered.

    4) Programming Complexity

    Due to structural independence and navigational structure the application

    programmers and end user must know precisely how the data is distributed

    physically in the database in order to access the data.It requires the more knowledge

    of a system which is difficult for the user who have little or no programming

    knowlegde

    5) Implementation Limitation

    Many of the common relationship de not conform to the 1:n format required by this

    model.the m:m (Many to many) relationship,which is more common in real life is

    very difficult to implement in this model

    28

  • 8/4/2019 Chapter1_RDBMS

    29/30

    Client Server Architecture

    The architecture of a database system is greatly influenced by the underlying

    computer system on which it runs, in particular by such aspects of computer architecture

    as networking, parallelism, and distribution:

    Networking of computers allows some tasks to be executed on a server

    system, and some tasks to be executed on client systems. This division of work has led to

    clientserver database systems

    As personal computers became faster, more powerful, and cheaper, there

    was a shift away from the centralized system architecture.

    Personal computers support terminals connected to centralized systems that

    are used to handle directly by the centralized systems.

    As a result, centralized systems today act as server systems that satisfy

    requests generated by client systems.

    Figure 1 shows the general structure of a clientserver system. Database

    functionality can be broadly divided into two partsthe front end and the back endas

    in Figure 2

    The back end manages access structures, query evaluation and

    optimization, concurrency control, and recovery. The front end of a database system

    consists of tools such as forms, report writers, and graphical user interface facilities.

    The interface between the front end and the back end is through SQL, or

    through an application program to interface clients with servers.

    29

  • 8/4/2019 Chapter1_RDBMS

    30/30

    Any client that uses the ODBC or JDBC interfaces can connect to any server that

    provides the interface.

    In earlier-generation database systems, the lack of such standards necessitated that

    the front end and the back end be provided by the same software vendor.

    With the growth of interface standards, the front-end user interface and the back-

    end server are often provided by different vendors.

    Application development toolsare used to construct user interfaces; they provide

    graphical tools that can be used to construct interfaces without any programming.

    Some of the popular application development tools are PowerBuilder, Magic, and

    Borland Delphi; Visual Basic is also widely used for application development.

    Further, certain application programs, such as spreadsheets and statistical-

    analysis packages, use the clientserver interface directly to access data from a back-end

    server.

    In effect, they provide front ends specialized for particular tasks some transaction-

    processing systems provide a transactional remote procedure call interface to connect

    clients with a server.

    These calls appear like ordinary procedure calls to the programmer, but all the

    remote procedure calls from a client areenclosed in a single transaction at the server end.

    Thus, if the transaction aborts, the server can undo the effects of the individual

    remote procedure calls.