dbms1-7

531
PAN African e-Network Project PGDIT DBMS Semester - II Session - 1 By- Mr. Gaurav Dubey

Upload: savoir001

Post on 16-Sep-2015

254 views

Category:

Documents


19 download

TRANSCRIPT

  • PAN African e-Network ProjectPGDIT

    DBMSSemester - II

    Session - 1

    By- Mr. Gaurav Dubey

  • Module 1. Introduction to DBMS

    What is Database and Database Management System?

    Data Model :- Definition and Types

    Role of Database Administrator

    File System Vs. DBMS Approach

  • Module 1. Introduction to DBMS

    Advantage of Using DBMS

    Data Independence : Logical and Physical

    Schema and Instances

    Architecture of Database Management System

    Levels of Database Management System

  • What is a database

    A database is any organized collection of data. Some examples of databases you may encounter in your daily life are: a telephone book T.V. Guide airline reservation system motor vehicle registration records papers in your filing cabinet files on your computer hard drive.

  • Data vs. information:What is the difference?

    What is data? Data can be defined in many

    ways. Information science defines data as unprocessed information.

    What is information? Information is data that

    have been organized and communicated in a coherent and meaningful manner.

    Data is converted into information, and information is converted into knowledge.

    Knowledge; information evaluated and organized so that it can be used purposefully.

  • Why do we need a database?

    Keep records of our: Clients Staff Volunteers

    To keep a record of activities and interventions;

    Keep sales records; Develop reports; Perform research Longitudinal tracking

  • What is the ultimate purpose of a database management system?

    DataData InformationInformation KnowledgeKnowledge ActionAction

    Is to transform

  • More about database definition

    What is a database?

    Its an organized collection of data. A database managementsystem (DBMS) such as Access, FileMaker, Lotus Notes,Oracle or SQL Server which provides you with thesoftware tools you need to organize that data in a flexiblemanner. It includes tools to add, modify or delete datafrom the database, ask questions (or queries) about thedata stored in the database and produce reportssummarizing selected contents.

  • Database Management System (DBMS)Defn:

    A software system that allows the users to define the structure of database , store and maintain the data into the database and provides controlled access to the database.

    A database management system is a complex piece of software that usually consists of a number of modules.

  • Database Administrator (DBA)

    A Database administrator (DBA) performs all activities related to maintaining a successful database environment.

    DBA is said to be the custodian of Database.

    DBA is a person or group of persons responsible for managing the Database.

  • Responsibility of DBA Includes:

    Designing, implementing, and maintaining the database system.Establishing policies and procedures pertaining to the management,

    Security, maintenance, and use of the database management system .

    Training employees in database management and use.

  • Database administrator (DBA)

    A DBA is expected to be knowledgeable of emergingt e c h n o l o g i e s a n d n e w d e s i g n a p p r o a c h e s .

    DBA has either a degree in Computer Science or some on-the-job training with a particular database product or more extensive experience with a range of database products.

    A DBA is usually expected to have experience with one or more of the major database management products, such as Structured Query Language, SAP, and Oracle-based database.

  • File System Vs. DBMS Approach

    File-Based Approach : Each program defines and manages its own data.

    Drawbacks of using file systems to store data:

    Data redundancy and inconsistencyMultiple file formats, duplication of information in

    different files.

    Difficulty in accessing data Need to write a new program to carry out each new task

  • File System Vs. DBMS Approach

    Data isolation multiple files and formats.

    Integrity problems:

    Integrity constraints (e.g. account balance > 0) become part of program code.

    Hard to add new constraints or change existing ones.

  • Drawbacks of using file systems (cont.)

    Atomicity of updates:

    Failures may leave database in an inconsistent state with partial updates carried out.

    E.g. transfer of funds from one account to another should either complete or not happen at all

  • Drawbacks of using file systems (cont.)

    Concurrent access by multiple users:

    Concurrent accessed needed for performance.

    Uncontrolled concurrent accesses can lead to inconsistencies.

    E.g. two people reading a balance and updating it at the same time.

    Security problems.

  • Database Approach

    Database Approach : A shared collection of logically related data, designed to meet the information needs of an organization.

    Database systems offer solutions to all the above problems.

  • Advantages of DBMS

    Controlled data redundancy:

    Data consistency:

    More information from the same amount of data

    Sharing of data:

  • Advantages of DBMS

    Increased concurrency.

    Improved data integrity:

    Improved backup and recovery services

  • Disadvantages of DBMS

    - Complexity & Size,

    - Cost of Software & Additional H/W costs

    - Cost of conversion, Performance,

    - Higher impact of a failure.

  • Architecture of DBMS

    Three-Tier / Three level architecture suggested by

    ANSI / SPARC

  • Architecture of DBMS

    A commonly used views of data approach is the three-level architecture suggested by ANSI/SPARC (American National Standards Institute/Standards Planning and Requirements Committee).

  • Architecture of DBMS

    ANSI/SPARC produced a final report in 1977. The reports proposed an architectural framework for databases. Under this approach, a database is considered as containing data about an enterprise.

    The three levels of the architecture are three different views of the data.

  • Levels of Abstraction

    Many external schemata,single conceptual(logical)schema and physical schema. External schemata describe

    how users see the data. Conceptual schema defines

    logical structure Physical schema describes the

    files and indexes used.

    Physical Schema

    Conceptual Schema

    ExternalSchema 1

    ExternalSchema 3

    ExternalSchema 2

  • Database Design

    Conceptual design

    Logical design

    Physical design

  • External level

    The external level is the view that the individual user of the database has.

    This view is often a restricted view of the database and the same database may provide a number of different views for different classes of users.

    In general, the end users and even the applications programmers are only interested in a subset of the database.

  • External Level

    For example:A department head may only be interested in the

    departmental finances and student enrolments but not the library information.

    The librarian would not be expected to have any interest in the information about academic staff.

    The payroll office would have no interest in student enrolments

  • Conceptual Level

    The conceptual view is the overall community view of the database and it includes all the information that is going to be represented in the database.

    The conceptual view is defined by the conceptual schema which includes definitions of each of the various types of data.

  • Internal Level

    The internal view is the view about the actual physical storage of data.

    It tells us what data is stored in the database and how.

    At least the following aspects are considered at this level:

  • Data Independence

    Applications insulated from how data is structured and stored. Logical data independence: Protection from changes in

    logical structure of data. Physical data independence: Protection from changes in

    physical structure of data.

    * One of the most important benefits of using a DBMS!

  • Internal Level

    Storage allocation.

    Access paths e.g. specification of primary and secondary keys, indexes and pointers and sequencing.

    Miscellaneous e.g. data compression and encryption techniques, optimization of the internal structures.

  • Architecture of DBMS

  • Architecture of DBMS

    Physical level describes how a record (e.g., customer) is stored.

    Logical level: describes data stored in database, and the relationships among the data.

    type customer = recordname : string;street : string;city : integer;

    end; View level: application programs hide details of data types.

    Views can also hide information (e.g., salary) for security purposes.

  • Level for a database system

  • Data Model

    Information systems and computer sciences use data modeling to manage and organize large quantities of structured and unstructured data.

    A data model describes the information to be stored in vast database management systems like relational databases.

    Data models do not include unstructured data such as email messages, word processing documents.

  • Data Model Data modeling establishes implicit and explicit constrains

    and limitations of the structured data.

    Data Modeling Analysts use data modeling functions to supply an accurate representation of the enterprise.

    Data modeling is used to accurately reflect the data of the organization. Based on this information, a database is created.

  • Data Model (Definition) A generalized, user-defined view of data representing the real

    world.

    A description of the structure of data elements.

    Collection of concepts allowing for the representation of an environment according to arbitrary requirements.

    A diagram that shows the various subjects about which information is stored, and illustrates the relationships between those subjects

  • Data Model (Definition)

    A logical map that represents the inherent properties of the data independent of software, hardware or machine performance considerations.

    The model shows data elements grouped into records, as well as the association around those records.

    A data model is a collection of descriptions of data structures and their contained fields, together with the operations or functions that manipulate them.

  • Types of Data Model

    There are a number of data models that are used to describe how a database is structures and used, these are:

    Hierarchical Model. Network Model. Relational Model. Object Relational Model. Entity-Relationship Model

  • Hierarchical Model.

    The hierarchical data model organizes data in a tree structure.

    There is a hierarchy of parent and child data segments. This structure implies that a record can have

    repeating information, generally in the child data segments.

    Data in a series of records, which have a set of field values attached to it.

    It collects all the instances of a specific record together as a record type.

  • Hierarchical Model.

    These record types are the equivalent of tables in the relational model, and with the individual records being the equivalent of rows.

    Today, the hierarchical model is rarely in modern databases.

    It is, however primarily used storing information, ranging from geographic, file systems to the Windows registry to XML documents.

    IBM's Information Management System (IMS) DBMS, were popular Hierarchical DBMSs .

  • Network Model

    The popularity of the network data model coincided with the popularity of the hierarchical data model.

    Some data were more naturally modeled with more than one parent per child. So, the network model permitted the modeling of many-to-many relationships in data.

    In 1971, the Conference on Data Systems Languages (CODASYL) formally defined the network model.

  • Network Model

    The basic data modeling construct in the network model is the set construct. A set consists of an owner record type, a set name, and a member record type.

    The data model is a simple network, and link and intersection record types.

  • Relational Model.

    RDBMS (relational database management system) A database based on the relational model developed by E.F. Codd.

    A relational database allows the definition of data structures, storage and retrieval operations and integrity constraints.

    In such a database the data and relations between them are organized in tables.

    A table is a collection of records and each record in a table contains the same fields.

  • A Sample Relational Database

  • A Sample Relational Database

  • Object Relational Database Management Systems (ORDBMS) add new object storage capabilities to the relational systems at the core of modern information systems.

    These new facilities integrate management of traditional fielded data, complex objects such as time-series and geospatial data and diverse binary media such as audio, video, images, and applets.

    By encapsulating methods with data structures, an ORDBMS server can execute complex analytical and data manipulation operations to search and transform multimedia and other complex objects.

    Object Relational Model

  • Entity-Relationship Model

    E-R model of real world Entities : Real world objects (Includes both

    living or Non-living) E.g. customers, accounts, branch.Entities are of two types: Strong Entity and

    Week Entity

  • Entity-Relationship Model

    Relationships: Association between instances of entities

    E.g. Account A-101 is held by customer JohnsonRelationship are of Three Types:

    One-to-One (1:1)One-to-Many (1:M)Many-to-Many (M:N)

  • Entity-Relationship Model

    Most Widely used for database design

    Database design in E-R model usually converted todesign in the relational model which is used fors t o r a g e a n d p r o c e s s i n g

  • Entity-Relationship Model

  • Instances and Schemas

    Schema the logical structure of the database e.g., the database consists of information about a

    set of customers and accounts and the relationship between them.

    Physical schema: database design at the physical level.

    Logical schema: database design at the logical level

  • Instances and Schemas

    Instance the actual content of the database at a particular point in time.

    Analogous to the value of a variable Occurrences of an Entity.For Example: Name : John , City : Delhi and Age: 23

  • Physical Data Independence

    the ability to modify the physical schema without changing the logical schema Applications depend on the logical schema In general, the interfaces between the various levels

    and components should be well defined so that changes in some parts do not seriously influence others.

  • Physical Data Independence

    Alteration in the internal schema might include.* Using new storage device.* Switching from one access method to another.Using different file organizations or storage

    structures.* Modifying indexes.

  • Logical Data Independence:

    Logical data independence is the ability to modify the conceptual schema without having alteration in external schemas or program.

    Alterations in the conceptual schema may include:Addition or deletion of fresh entities, Attributes or

    relationships and should be possible without having alteration to existing external schemas or having to rewrite application programs.

  • Functions of a DBMS

    1.Data storage, retrieval, and update:Support of Query Language

    2. A user-accessible catalog: Data Dictionary

    3. Transaction support: Transaction Manager

    4. Concurrency control services: Lock Manager

    5. Recovery services.

  • Functions of a DBMS

    6. Authorization services7. Support for data communication8. Integrity services9. Services to promote data independence

  • Components of the DBMS Environment

    Hardware Software Data -Procedures People

  • Components of a DBMS

    ApplicationPrograms

    QueriesDatabaseSchema

    PreprocessorQuery

    processorDDL

    compiler

    Programobject code

    Databasemanager

    Dictionarymanager

    Accessmethods

    Filemanager

    Systembuffers

    DBMS

    Programmers Users DBA

    Database andsystem catalog

  • Category of Database User :

    Database Designer or Administrator.

    Application Programmer.

    End User.

  • Database Language

    DDL: Data Definition Language. DML: Data Manipulation Language. DQL: Data Query Language. DCL: Data Control Language

  • DDL: Data Definition Language. Example: CREATE Statement.

    ALTER ADD Statement.

    ALTER DROP Statement.

    ALTER MODIFY Statement

  • DML: Data Manipulation Language.

    Example: Insert Statement.

    Update Statement.

    Delete Statement.

  • DQL: Data Query Language

    Example:

    Select Statement.

    With Where Clause.

    With Order By Clause.

    With group By Clause

  • DCL: Data Control Language

    Example:

    Grant Statement.

    Revoke Statement.

  • Example: Creating Table:

    CREATE TABLE STATION (ID number(6) PRIMARY KEY, CITY CHAR(20), STATE CHAR(10), zip_code number (6));

  • Example: Inserting Data into Table:

    INSERT INTO STATION VALUES (13, 'Phoenix', 'AZ', 33112);

    INSERT INTO STATION VALUES (44, 'Denver', 'CO', 40105);

  • Example: Retrieving Data from Table:

    SELECT * FROM STATION;

    SELECT ID, CITY FROM STATION;

    SELECT STATE , CITY FROM STATION;

    SELECT STATE , CITY FROM STATIONWHERE CITY = MUMBAI;

  • DELETING Data from Table:

    DELETE FROM STATION;

    DELETE FROM STATION WHERE PIN = 53461;

  • ALTERING STRUCTURE OF TABLE

    ALTER TABLE STATION ADD EMAIL VARCHAR2(12) ALTER TABLE STATION MODIFY CITY VARCHAR2(20)

  • Question & Answer

    1. Database is defined as: Collection of similar type of entities. Processed data Collection of logically related data items Raw data

    2. Data independence allows: sharing the same database by several applications extensive modification of applications no data sharing between applications elimination of several application programs

  • Question & Answer

    3. One to Many Relationship is represented by:a) 1:Mb) M:Mc) N:Nd) M: N4. Entities are defined as :a) Real world objectb) Association among the instancesc) Property of attributesd) Property of DBMS

  • Question & Answer

    5. By data redundancy in a file based system we mean that(a) Unnecessary data is stored (b) Same data is duplicated in many files (c) Data is unavailable (d) Files have redundant data

    6. Overall logical structure of a database can be expressed graphically by

    (A). ER diagram(B). Records(C). Relations(D). Hierarchy

  • Question & Answer

    7. . A table can have how many unique key A). 1 B). any number C). 255 D). None of the above.

    8. Entity is represented by the symbol. A) Double Circle B) Ellipse C) Rectangle D) Square

  • Question & Answer

    9. Attributes arei) Properties of relationship ii) Degree to entities iii) Properties of members of an entity set

    (a ) i (b) i and ii (c) i and iii (d) iii10 A relationship is

    a) an item in an application b) a meaningful dependency between entities c) a collection of related entities d) related data

  • Thank You

    Please forward your query

    To: [email protected]: [email protected]

  • PAN African e-Network ProjectPGDIT

    DBMSSemester - II

    Session - 2

    By- Mr. Gaurav Dubey

  • Module 2. Relational Database & ER Model

    Entity , Entity Set & Type

    Attributes

    Week & Strong Entity

    Relationship Types

    E-R-Diagram

  • Module 2. Relational Database & ER Model

    Relational system

    Codds Rule

    Optimization

    Table & View

  • Relational Model

    RDBMS Relational Data Base Management System

    A database based on the relational model developed by E.F. Codd.

    A relational database allows the definition of data structures, storage and retrieval operations and integrity constraints.

  • Relational Model

    In such a database the data and relations between them are organized in tables. - List of all logically related data are placed into one table or set of tables.

    A table is a collection of records and each record in a table contains the same fields.

  • Codd's rules Codd's 12 rules are set of twelve rules

    proposed by Edgar F. Codd,

    A pioneer of the relational model for databases. It is designed to define what is required from a

    database management system in order for it to be considered Relational DBMS

  • Codd's rules Rule 1: The Information Rule. All data should be presented to the user in table.

  • Codd's rules

  • Codd's rules

    Rule 2: Guaranteed Access Rule.All data should be accessible without ambiguity.

    This can be accomplished through a combination of the table name, primary keyand column name.

  • Codd's rules

    Rule 3: Systematic Treatment of Null Values. A field should be allowed to remain empty. This involves the support of a null value which is

    distinct from an empty string or a number with a value of zero.

  • Codd's rules Rule 4: Dynamic On-Line Catalog Based on the

    Relational Model. A relational database must provide access to its

    structure through the same tools that are used to access the data.

  • Codd's rules Rule 5: Comprehensive Data Sublanguage

    Rule. The database must support at least one clearly

    defined language that includes functionality for data definition, data manipulation, data integrity and database transaction control.

    All commercial relational databases use forms of the standard SQL (Structured Query Language)

  • Codd's rulesRule 6:View Updating Rule.Data can be presented to the user in different

    logical combinations called views. View are practically categorized as :

    Updatable View & Non- Updatable View .

  • Rule 6:Each view should support the same full range of

    data manipulation that direct-access to a table has available.

    In practice providing update and delete access to logical views is difficult and is not fully supported by any current database.

  • Codd's rules

    Rule 7: High-level Insert Update and Delete . Data can be retrieved from a relational

    database in sets constructed of data from multiple rows and / or multiple tables.

    This rule states that insert update and delete operations should be supported for any retrievable set rather than just for a single row in a single table

  • Codd's rules Rule 8: Physical Data Independence. The user is isolated from the physical method of

    storing and retrieving information from the database.

    Changes can be made to the underlying architecture ( hardware disk storage methods ) without affecting how the user accesses it

  • Codd's rules Rule 9: Logical Data Independence. How a user views data should not change when

    the logical structure (tables structure) of the database changes. This rule is particularly difficult to satisfy. Most databases rely on strong ties between the user view of the data and the actual structure of the underlying tables.

  • Codd's rules Rule 10: Integrity Independence. The Database Language (like SQL)

    should support constraints on user input that maintain database integrity. Database System should not accept any

    invalid input from user side.

  • Codd's rules

    Rule 10: This rule is not fully implemented by most major vendors.

    At a minimum all databases do preserve two constraints through SQL.

    No component of a primary key can have a null value.

    If a foreign key is defined in one table any value in it must exist as a primary key in another table.

  • Codd's rules

    Rule 11: Distribution Independence: A user should be

    totally unaware of whether or not the database is distributed (whether parts of the database exist in multiple locations).

    A variety of reasons make this rule difficult to implement.

  • Codd's rules Rule 12: Non subversion Rule: There should be no way

    to modify the database structure other than through the multiple row database language.

    Most databases today support administrative tools that allow some direct manipulation of the data structure.

  • Question & Answer

    1. How many Integrity Rules are there and what are those?.

    2. Distribution Independence Means: .

    3. Differentiate between Table and View

  • Question & Answer

    3. Systematic Treatment of null values Means

    .

    4. Physical Data Independence Means

    ..

    5. Logical Data Independence Means

  • Question & Answer

    6. Updatable View

    .

    7. Non- Updatable View .

  • Entity-Relationship Diagrams (ERD)

    An entity-relationship ( ER ) diagram is a specialized graphical method that illustrates the interrelationships between entities in a database.

    ER diagrams often use symbols to represent three different types of information in designing database.

  • Entity-Relationship Diagrams (ERD)

    Boxes are commonly used to represent entities.

    ovals are used to represent attributes.

    Diamonds are normally used to represent relationships.

  • Symbols for Drawing E-R-D

  • EntityA person, place, object, event or concept in the user environment about which the organization wishes to maintain data

    Represented by a rectangle in E-R diagrams Entity Type / Set

    A collection of entities that share common properties or characteristics.

    i.e. Student Entity Set , Customer Entity Set Attribute

    A named property or characteristic of an entity that is of interest to an organization.

    i.e registration_no , customer_id , customer_emailid

  • Entities are of Two Types:

    Strong Entities.

    Week Entities.

  • Strong and Weak Entities

  • Relationship

    An association between the instances of one or more entity types that is of interest to the organization.

    Relationships are always labeled with verb phrases

  • Relationship

    Avoid vague names Guidelines for defining relationships

    Definition explains what action is being taken and why it is important

    Give examples to clarify the action Explain reasons for any maximum cardinality.

  • Few more Symbols used for drawing E-R-D

  • Few more Symbols used for drawing E-R-D

  • Example:( Types of Attributes)

    Simple Attributes: Phone_No , Email_Id Multi-Valued attributes:

    Employee Skill set. , Hobbies.

    Derived Attributes: Age from DOB ,Gross salary from basic salary.

    Composite Attributes: Address comprises of Name , Locality & House No Name comprises of First Name , Middle Name , Last Name

  • Question & Answer

    WHAT IS DIFFERENCE BETWEEN SIMPLE ATTRIBUTE AND COMPOSITE

    ATTRIBUTE?

    WHAT IS DIFFERENCE BETWEEN DERIVED ATTRIBUTE AND MULTI VALUED

    ATTRIBUTE?

  • Guidelines for Defining relationships

    Explain any restrictions on participation in the relationship

    Explain extent of the history that is kept in the relationship

    Explain whether an entity instance involved in a relationship instance can transfer participation to another relationship instance.

  • Example: One to One Relationship Type:

    Example:

    DEPARTMENT DIRECTOR

  • Example: One to Many Relationship Type:

    PRODUCT VENDOR

  • Example: One to Many Relationship Type:

    course STUDENT

  • Many to Many Relationship Type:

    Example:

    INSTRUCTOR STUDENT

  • Many to Many Relationship Type:

    INSTRUCTOR COURSE

  • Resolving Many-to-Many Relationships

    Many-to-many relationships should be avoided.

    We can resolve a many-to-many relationship by dividing it into two one-to-many relationships.

  • Employees of a large company, e.g., IBM, where an employee reports to a manager. The manager is also an employee who reports to another manager. This chain of command continues to the very top where the CEO is the only employee who is not reporting to a manager. Draw the ER diagram for this example.

  • Primary keys:Emp: SS#Works-for: (empSS#, mgrSS#)

    WorksforEmp

    SS#

    nameaddress

  • RELATIONSHIPS (Cont)

    Example: A library database contains a listing of authors that have written books on various subjects (one author per book).

    It also contains information about libraries that carry books on various subjects.

    Entity sets: authors, subjects, books, librariesRelationship sets: wrote, carry, indexed.

  • RELATIONSHIPS (Cont)

    carry

    books indexwrote subjectauthorsSS#

    name

    title

    librariesaddress

    isbnSubjectmatter

  • Keys

    carry

    books indexwrote subjectauthorsSS#

    name

    title

    libraries

    quantity

    address

    isbnSubjectmatter

    Entities and relationships are distinguishable using various keys:

    A key is a combination of one or more than one attributes that uniquely identifies the instances of entity set or relationship.

    e.g., social-security number, Member-id, Combination of order_id and Product_id

  • Candidate key A candidate key is that uniquely

    identifies either an entity or a relationship. , e.g., social-security number, phone number, employment_id , email_id.

  • Alternate Key

    Alternate Key: An Entity Set can have various candidate Key. Among them only one can be selected as primary key. Remaining are known as alternate key.

    Alternate Key = Candidate Key- Primary Key

  • Keys

    A primary key is a candidate key that is chosen by the database designer to identify the entities of an entity set.

    Simple Primary Key Composite Primary Key

  • Criteria for Selecting Primary Key.

    Only those column should be selected as primary key which are permanent in nature or very less likely to change.

    Columns value are must

    Example: employment_no , registration_no

  • Question and Answer

    Example: A Employee Data has to be stored in a Table containing following information.

    i.e. emp_id , emp_name .emp_city ,emp_age ,emp_emailid ,emp_designation ,emp_salary ,emp_phone_no.

    Explain the possible primary key i.e. Candidate Key

  • Question and Answer

    Which column should be considered as primary key and why?

    Considering the previous example Mention all the Alternate keys.

  • Keys

    A foreign key is a set of one or more attributes of a strong entity set that are employed to construct the discriminator of a weak entity set.

    The primary key of a weak entity set is formed by the primary key of the strong entity set on which it is existence-dependent.

  • : EMPLOYEE DEPARTMENT

    NameEmp_Id

    D_ID

    D_IDD_Name

    D-Address

  • Question & Answer Differentiate between Foreign Key and Primary

    Key.

    Differentiate between Unique Key and Primary Key.

  • carry

    books indexwrote subjectauthorsSS#

    name

    title

    libraries

    quantity

    address

    isbnSubjectmatter

  • Example

    Consider the example of a database that contains information on the residents of a city. The ER diagram shown in the image bellow contains two entities -- people and cities. There is a single "Lives In" relationship.

  • Cardinality: The number of instances of entity B that can be

    associated with each instance of entity A Minimum Cardinality

    The minimum number of instances of entity B that may be associated with each instance of entity A

    This is also called modality. Maximum Cardinality

    The maximum number of instances of entity B that may be associated with each instance of entity A

  • How do we start an ERD?

    Define Entities: These are usually nouns used in descriptions of the system, in the discussion of business rules, or in documentation.

    Example: Customer ,Supplier ,Faculty , Student.

    Add attributes to the relations; these are determined by the queries e.g. grade; or they may suggest the need for keys or identifiers.

    Registered , Supplied ,orders

  • How do we start an ERD?

    Define Relationships: these are usually verbs used in descriptions of the system or in discussion of the business rules .

    Add cardinality to the relations.

    ERD, but they can be used with clients to discuss business rules.

  • Goal Capture as much of the meaning of the data as

    possible If you know the rules of normalization, referential

    integrity, foreign keys, etc., this is good but not as important now.

    Much more important is to get the organizational data model correct, i.e. to understand the actual data requirements for the organization.

    Result A better design that is scalable and easier to

    maintain

  • Database Modeling and Implementation Process

    Ideas ER Design Relational Schema

    Relational DBMSImplementation

  • Database optimization

    Database optimization is to maximize the use of system resources to perform work as efficiently and rapidly as possible.

    Put the most unique data element first in the index, the element that has the biggest variety of values.

    The index will find the correct page faster. Keep indexes small.

  • Database optimization

    It's better to have an index on just zip code or postal code. (Simple Attributes)

    The smaller the index, the better the response time. For high frequency functions (thousands of times per

    day) it can be wise to have a very large index, so the system does not even need the table for the read function.

  • Database optimization Indexes are used to find rows with specific column

    values fast.

    Without an index, MySQL has to start with the first record and then read through the whole table to find the relevant rows.

    The larger the table, the more this costs.

  • Database optimization

    If the table has an index for the columns in question, MySQL can quickly determine the position to seek to in the middle of the data file without having to look at all the data.

    If a table has 1,000 rows, this is at least 100 times faster than reading sequentially.

  • Database optimization

    For small tables an index is disadvantageous.

    An index slows down additions, modifications and deletes. It's not just the table that needs an update, but the index as well. So, preferably, add an index for values that are often used for a search, but that do not change much. An index on bank account number is better than one on balance.

  • Question & Answer

    1. Primary key column of the Table can accept null values

    a) Trueb) False.

    2. Composite primary key consist of more than one column.

    a) Trueb) False

  • Question & Answer

    3. Primary key is a possible candidate keya) Trueb) False4. Relationships are defined as :a) Diamond symbolb) Rectanglec) Double Rectangled) Oval

  • Question & Answer5. Independence of data / Application program on physical

    storage method of database is called.(a) Logical Data Independence(b) Null Value Treatment(c) Physical Data Independence(d) Table

    6. Multi-Valued Attributes are defined as(A). Double Rectangle(B). Double ellipse(C). Arrow symbol(D). Diamond

  • Question & Answer7. Independence of data/ Application program on

    logical storage of database is called.(a) Logical Data Independence(b) View(c) Physical Data Independence(d) Table.

    8. Foreign Key Meansa) Same as primary keyb) Column of a Table which is related to primary key column of another Table. C) Cant accept null values d) None of the above.

  • Question & Answer9. Attributes are

    i) Properties of relationship ii) Degree to entities iii) Properties of members of an entity set

    (a ) i (b) i and ii (c) i and iii (d) iii10 Candidate Key

    a) Possible Primary key b) Set of similar attributesc) a collection of related entities d) related data

  • Thank You

    Please forward your query

    To: [email protected]: [email protected]

  • PAN African e-Network ProjectPGDIT

    DBMSSemester - II

    Session - 3

    Ms. Archana Singh

  • Module : Database design

    What is Functional Dependency (FD)

    Types of Functional Dependency

    Full FD

    Partial FD

    Transitive DF

  • Definition of Normalization

    Different Normal Forms (NF)

    1 NF : First Normal Form

    2 NF : Second Normal Form

    3 NF : Third Normal Form

  • BCNF: Byoce Codd Normal Forms

    4 NF: Fourth Normal Form 5 NF : Fifth Normal Form

    Examples

    Question & Answer

    Module : Database design

  • Definition: Functional Dependency

    In a given Relation R , including attribute A and B, B is said to be functional dependent on A if, for every valid occurrence, the value A determines the value B.

    Functional Dependency is symbolically represented as :

    A B .

    It is read as B is functionally dependent on A

  • Definition: Functional Dependency FD is an acronyms for Functional Dependency. It represents integrity constraints.

    FDs are checked by the database management system (DBMS) at every update.

    So we are interested in finding the smallest set of FDs that capture the intended meaning of the data.

  • Definition: Functional Dependency

    A more formal definition:Given Relation R, an instance of a relation, and X and Y, arbitrary attribute subsets of R, then Y is functionally dependent on X:X YIf and only if each X-value in R is associated with precisely one Y-value in R.

  • Use functional dependencies:

    We use functional dependencies to test a relations to see if they are legal under a given set of functional dependencies.

    If a relation R is legal under a set S of functional dependencies, we say that R satisfies S.

    Specify constraints on the set of legal relations: we say that S holds on R if all legal relations on R satisfy the set of functional dependencies S.

  • ExampleLet us consider a STUDENT Relation with following set of attributes:

    (S_ID , S_Name , S_Age , S_City ,S_Course )

    How can we select the Determinant of the above Relation.

    Attribute which can uniquely identify each Tupple of the Relation can be considered as Determinant

  • Example

    From the above Relation we can say that

    S_ID S_NAMES_ID S_AGES_ID S_CITYS_ID S_COURSE

    S_ID is determinant

  • Example

    In the previous example S_ID is called the Determinant of the Relation.

    It is also known as Prime attribute or Primary key.

    Remaining attribute of the relation is known as non-prime attribute

  • Functional Dependency: Types

    Functional Dependency are categorized as follows:

    1. Full Functional Dependency

    2. Partial Functional Dependency

    3. Transitive Functional Dependency

  • 1. Full Functional Dependency: A given relation (R ) is said to possess Full functional dependency if with the given two set of attributes:

    One set of attributes are represented by (X ) Another set of attributes are represented by (Y)if for a given values for the set of attribute X attribute set Y has unique value and there is no Z where Z is subset of X on which Y is dependent.

    It is represented as: X Y

  • Example

    Let us Consider a Relation ORDER_INFOWith list of attributes:

    ( ORDER_ID , ITEM_ID, ITEM_DESCRIPTION ,PRICE ,

    Quantity_ORDERED , TOTAL_ITEM_PRICE )

  • Example In previous example Set X is represented by : ORDER_ID , ITEM_ID

    Set Y is represented by : ITEM_DESCRIPTION , PRICE , Quantity_ORDERED , TOTAL_ITEM_PRICE.

    There also exist a Z where Z is subset of X Set Z is represented Item_ID on which

    Item_Price and ITEM_Description is dependent

  • Example ORDER_INFO( ORDER_ID , ITEM_ID, ITEM_DESCRIPTION ,PRICE ,

    QNTY_ORDERED , TOTAL_ITEM_PRICE )

    Therefore in this relation Full functional dependency is not exist.There exist Partial dependency in this relation

  • Example

    Relation ORDER_INFO can be converted into Full FDONLY BY DECOMPOSING the Relation into Two

    R1 (ORDER_ID , ITEM_ID, QNTY_ORDERED , TOTAL_ITEM_PRICE )

    R2 ( ITEM_ID, ITEM_DESCRIPTION ,PRICE )

  • Partial Dependency2. Partial Functional Dependency: A given relation (R ) is said to possess Partial Functional Dependency if with the given two set of attributes:

    One set of attributes are represented by (X ) Another set of attributes are represented by (Y)if for a given values for the set of attribute X attribute set Y has unique value and there exist Z where Z is subset of X on which Y is dependent.

  • Example

    ORDER_INFO( ORDER_ID , ITEM_ID, ITEM_DESCRIPTION ,PRICE ,

    QNTY_ORDERED , TOTAL_ITEM_PRICE )There exist Partial dependency in this relation.

    ITEM_DESCRIPTION and PRICE are partially dependent on Item_id.

  • Example

    Let us think of this relation and Types of Dependency.

    ( S_ID , S_Name , S_City , Subject_ID ,GRADE )

    ?

  • Transitive Dependency

    3. Transitive Functional Dependency:

    A given relation (R ) is said to possess Transitive Functional Dependency if a nonprime attribute is dependent on another non prime attribute.

    A B C

    It means C is transitively dependent on A WHERE B and C are nonprime attributes

  • Example

    Let us think of the relation name Student with list of attributes

    ( S_ID , S_Name , S_City , Dept_ID , HOD)

    S_ID Dept_ID HOD

  • Example Let us think of another relation name

    Faculty with list of attributes

    ( F_ID , F_Name , F_Salary , Dept_ID Dept_Location)

    F_ID Dept_ID Dept_Location

  • Example In the previous example HOD is transitively

    dependent on S_ID

    ?Because HOD is dependent on DEPT_ID and DEPT_ID is dependent on S_ID

  • Example

    In the next example Dept_Location is transitively dependent on F_ID

    Because Dept_Location is dependent on DEPT_ID and DEPT_ID is dependent on F_ID

  • In any Relation There must exist a Full Functional Dependency.

    Transitive and Partial Functional Dependency must be avoided.

    Such Dependency must cause lots of redundancy and data Inconsistency

  • Normalization

    Database normalization It is the step by step process of removing redundant data from the database in order to improve storage efficiency, data integrity, and consistency.

    Normalization generally involves splitting existing tables into multiple ones, which must be re-joined or linked each time a query is issued.

  • Normalization

    Decomposition process in Normalization are of two types:

    Lossless Decomposition.Lossy Decomposition.

    Decomposition always should be Lossless Decomposition

  • Normalization Edgar F. Codd originally established three normal

    forms: 1NF, 2NF and 3NF.

    There are now others that are generally accepted, but 3NF is widely considered to be sufficient for most applications.

    Most tables when reaching 3NF are also in BCNF (Boyce-Codd Normal Form)

  • Normalization

    2NF

    3NF

    4NF

    5NF

    1NF

    RedundancyRedundancy

    Redundancy

  • Normalization

    Normalization is based on the idea that an attribute may depend on another attribute in some way.

    There are 2 different kinds of dependencies involved up to 5 NF Functional dependency Multivalued dependence

  • First Normal Form

    First Normal Form: A relation R is said to be in First Normal Form (1NF) if for a given row each column should have atomic value.

    Atomic value means column should have one and only one value

    Atomic value in the column will help in easy access of data

  • Example: Table 1

    Title Author1 Author2

    ISBN Subject Pages Publisher

    Database System Concepts

    Abraham Silberschatz

    Henry F. Korth

    0072958863 MySQL, Computers

    1168 McGraw-Hill

    Operating System Concepts

    Abraham Silberschatz

    Henry F. Korth

    0471694665 Computers 944 McGraw-Hill

  • Limitations with Table1

    This table is not very efficient with storage.

    This design does not protect data integrity.

    Third, this table does not scale well.

  • In Table 1,

    We have two violations of First Normal Form:

    First : We have more than one author field,

    Second: Our subject field contains more than one piece of information.

    With more than one value in a single field, it would be very difficult to search for all books on a given subject.

  • First Normal Form Table (Table 2)

    Title Author ISBN Subject Pages Publisher

    Database System Concepts

    Abraham Silberschatz

    0072958863 MySQL 1168 McGraw-Hill

    Database System Concepts

    Henry F. Korth 0072958863 Computers 1168 McGraw-Hill

    Operating System Concepts

    Henry F. Korth 0471694665 Computers 944 McGraw-Hill

    Operating System Concepts

    Abraham Silberschatz

    0471694665 Computers 944 McGraw-Hill

  • First Normal Form Table (Table 2)

    Table 2 is in I NF.

    Every column of the Table has atomic value.

    Data Integrity will be maintained in the Table. Problem in Table 1 was removed

  • Example: 1NF

    Order (OrderNumber, OrderDate, {PartNumber, {Supplier}})

    Order (OrderNumber, OrderDate)

    Order-Part (OrderNumber, PartNumber)

    Part (PartNumber, {Supplier})

  • A Relation or Table is said to be in 2 NF ,If no non key attributes are dependent on part of the primary key .

    Table or Relation should not possess Partial Dependency in 2NF.

    Second Normal Form : 2 NF

  • 2nd Normal Form

    No partial dependencies.

    No attribute depends on only some of the attributes of a concatenated key.

    Order-Part[OrderNumber | PartNumber | PartDescription]

    Create a new table with Part Number key.

  • ENO Name Dno DeptName ProjNo ProjName

    E001 Somchai D01 Physic P01 NMR

    E001 Somchai D01 Physic P02 Laser

    E002 Sompong D01 Physic P03 Medical Image processing

    E003 Somchay D02 Computer Science

    P05 Voice ordering

    E003 Somchay D02 Computer Science

    P04 Speech Coding

    E004 SomSiri D02 Computer Science

    P04 Voice ordering

    E004 SomSiri D02 Computer Science

    P06 Speech Synthesis

    KEY = ENO + ProjNo

    Answer is No. Because ProjNo is dependent on

    ProjNo. (not all part of Key)

  • Problem

    ENO Name Dno DeptName ProjNo ProjName

    E001 Somchai D01 Physic P01 NMR

    E001 Somchai D01 Physic P02 Laser

    E002 Sompong D01 Physic P03 Medical Image processing

    E003 Somchay D02 Computer Science P05 Voice ordering

    E003 Somchay D02 Computer Science P04 Speech Coding

    E004 SomSiri D02 Computer Science P04 Voice ordering

    E004 SomSiri D02 Computer Science P06 Speech Synthesis

    We can not insert Project if have not yet assigned project to any employee

  • Result

    ENO Name Dno DeptName

    E001 Somchai D01 Physic

    E003 Somchay D02 Computer Science

    E004 SomSiri D02 Computer Science

    ProjNo

    ProjName

    P01 NMR

    P02 Laser

    P03 Medical Image processing

    P04 Speech Coding

    P05 Voice ordering

    P06 Speech Synthesis

    ProjectPERSON ENO ProjN

    oE001 P01

    E001 P02

    E002 P03

    E003 P04E004 P05E004 P06

    PERSON_Proj

    PERSON(ENO,NAME,Dno,DeptName)PROJECT(ProjNo,ProjName)PERSON_PROJ(ENO,ProjNo)

  • Difference Between 1NF & 2NF

    Relation in 1 NF only removes the non atomicity among the attributes.

    It means Attributes in the relation shoul have atomic value only

    Where as a Relation in 2 NF also takes care of Partial Dependency among the attributes

    Non Key attributes should only dependent on whole primary key .

  • Third Normal Form : 3NF

    3rd Normal Form: No transitive dependencies should exist.

    Relation must be at least in 2 NF

    Transitive dependency means that a non-keyattribute depends on another non-key attribute(s).

  • Transitive dependent

    R(A,B,C,D) ; A is Key, others are non- key

    If A B and B Ccan say

    A B C (C transitive dependent on A)

  • Third Normal Form : 3NF

    Definition:

    A relation is said to be in 3 NF if it is in second normal form and no transitive dependency should exist.

  • 3NF?

    ENO Name Dno DeptName

    E001 Somchai D01 Physic

    E003 Somchay D02 Computer Science

    E004 SomSiri D02 Computer Science

    ProjNo

    ProjName

    P01 NMR

    P02 Laser

    P03 Medical Image processing

    P04 Speech Coding

    P05 Voice ordering

    P06 Speech Synthesis

    ProjectPERSON ENO ProjN

    oE001 P01

    E001 P02

    E002 P03

    E003 P04E004 P05E004 P06

    PERSON_Proj

    Answer is NoBecause DeptName is dependent on Dno

    (has transitive dependent on key)

  • Result

    ENO Name Dno

    E001 Somchai D01

    E003 Somchay D02

    E004 SomSiri D02

    ProjNo

    ProjName

    P01 NMR

    P02 Laser

    P03 Medical Image processing

    P04 Speech Coding

    P05 Voice ordering

    P06 Speech Synthesis

    Project

    PERSON

    ENO ProjNo

    E001 P01

    E001 P02

    E002 P03

    E003 P04E004 P05E004 P06

    PERSON_Proj

    Dno DeptName

    D01 PhysicD02 Computer

    Science

    D02 Computer Science

    Department

  • Example Let us think of another relation name

    Faculty with list of attributes

    ( F_ID , F_Name , F_Salary , Dept_ID Dept_Location)

    Above Relation is not in 3 NF because there exist a Transitive dependency

  • Example Relation can be converted into 3NF by

    Decomposing the above relation into two relation R1 and R2

    R1 (F_ID , F_Name , F_Salary , Dept_ID )R2 ( Dept_ID Dept_Location)

    Transitivity has been removed

  • Difference Between 2NF & 3NF

    Relation in 2 NF only removes the Partial dependency among the attributes.

    Where as a Relation in 3 NF also takes care of Transitivity.

  • Note

    The third normal form is often reached in practice by inspection, in a single step.

    Its meaning seems intuitively clear; it represents a formalization of designers common sense.

    This level of normalization is widely accepted as the initial target for a design which eliminates redundancy.

    However, there are higher normal forms which, although less frequently invoked, highlight further redundancy problems which may affect the designer

  • Boyce-Codd Normal Form : BCNF

    BCNF: Every Determinant is a candidate key.Determinant: any attribute(s) that functionally determine another attributeBCNF means that there are no transitive dependencies involving key or non-key attributes.

    BCNF is a refinement to third normal form, and tightens its duration.

  • Difference Between 3NF & BCNF

    3 NF only checks for Transitivity If a Relation is in 3NF then Transitive

    Dependency should not exit.

    Where as in some complex situation there exist Dependency among the determinant of the RELATION.

    BCNF removes such Dependency

  • Fourth Normal Form: 4 NF

    No multi valued dependenciesA multi valued dependency of column B on column A occurs when a table has a key with three or more attributes, (A, B, C) and each value of A is associated with a collection

    of values of B this collection of values is independent of C

  • B A

    c1c2c3

    a

    b1b2b3b4

  • Example: Multi valued dependence

    One Part has many suppliers. One Part is used in multiple projects. One supplies is supplying in multiple projects

    [Pard_Id | Supplier_ID | Project_ID]

    [ [Part_ID | Supplier_ID] [Part-ID | Project_ID]

  • Fifth Normal Form: 5 NF A relation is said to be in 5 NF , If it is in 4 NF and

    Decomposition of Relation should be lossless.

    It means natural join of all the decomposed Relation should produce the original Relation

  • Example: 5 NF

    [Pard_Id | Supplier_ID | Project_ID]

    [ [Part_ID | Supplier_ID] [Part-ID | Project_ID]We can add one more relation [ Supplier_ID |

    Project_ID]

  • 5 NF

    Thus if we take natural join of all the decomposition relation , It will produce the original Relation.

    Else there will be loss of information

    Such Decomposition is called Lossless Decomposition .

  • 5 NF

    If after joining all the decomposed Relation original Relation is not produced then the decomposition is called lossy decomposition.

    Lossy decomposition or Relation or Table will cause loss of information

  • De Normalization

    De Normalization is reverse of Normalization: Some times to improve the performance of the system we need to de-normalize the RelationsBefore De Normalization following must be considered:-

    Use with caution Normalize first, then de-normalize Use only when you cannot optimize

  • Data Integrity1. Data into the database must be as per predefined

    set of rules, as determined by: The DBA or Application developer.

    2. When an integrity constraint applies to a table, all data in the table must conform to the corresponding rule.

    3. When you issue a SQL statement that modifies data in the table, Oracle Database ensures that the new data satisfies the integrity constraint, without the need to do any checking within your program.

  • Data Integrity

    1. You can enforce rules by defining integrity constraints more reliably than by adding logic to your application.

    2. Oracle Database can check that all the data in a table obeys an integrity constraint faster than an application can

  • Data Integrity

    Example of data integrity:Consider the tables employees and departments and the business rules for the information in each of the tables, As illustrated in Figure : ensure that each employee works for a valid department, first create a rule that all values in the department table are unique and value in the foreign key column must be same as value in primary key column or NULL

  • Types of Data Integrity

    Primary Key Values A rule defined on a column or set of columns that

    specifies that each row in the table can be uniquely identified by the values in the key.

    Referential Integrity Rules A referential integrity rule is a rule defined on a key (a

    column or set of columns) in one table that guarantees that the values in that key match the values in a key in a related table (the referenced value).

  • Question & Answer

    1. Normalization is step by step process of decomposing: a Table b) Database c) Group Data item d) All of the above2. The keys that can have NULL values areA). Primary KeyB). Unique KeyC). Foreign KeyD). Both b and c

  • Question & Answer3. Rows of a relation are called

    a) Tuples b) Column c) a data structure d) an entity

    4. A relation is said to be in 2 NF if i) It is in 1 NF ii) Non-key attributes dependent on key attribute iii) Non-key attributes are independent of one another iv) If it has a composite key, no non-key attribute should

    be dependent on part of the composite key. (a) i, ii, iii (b) i and ii (c) i, ii, iv (d) i, iv

  • Question & Answer5. A relation is said to be in BCNF when

    a) It has overlapping composite keys b) It has no composite keys c) It has no multi valued dependencies d) It has no overlapping composite keys which have

    related attributes

    6. Fourth Normal form (4 NF) relations are needed when. there are multi valued dependencies between

    attributes in composite key there are more than one composite key there are two or more overlapping composite keys there are multi valued dependency between non-key

    attributes

  • Question & Answer7. Transitive Dependency explains :a) Dependency of Key attributes to another Key attributesb) Dependency of Key attributes to another non Key attributesc) Dependency of non Key attributes to another non Key

    attributesd) All of the above8. Partial Dependency exist in a realtion whena) A Non key attribute depends on part of the primary keyb) Priamary Key doesnt existc) A non key attribute is dependent on another non key attriburted) None of the above

  • Question & Answer

    9) Full functional dependency is always desirable in a Relation

    a) Trueb) False

    10) Process of Normalization increases data redundancy and reduces data consistency

    a) Trueb) False

  • Question & Answer

    11. Composite primary key consist of two or more than two non key attributes

    a) True B) False

    12.Partial dependency exist only when primary key is composite in nature.

    a) True B) False

  • Question & Answer13. Decomposition of Relation should always be

    Losslessa) True.b) False.

    14 Multi value Dependency is most desirable Dependency

    a) True.b) False.

  • Case Study A H R Consultancy Firm has hired a Database Designer to

    store manipulate and retrieve the various data related of day to day operations of Recruitment Process. Data includes the various information related to Applicants , Jobs, Companies and Interviewers.

    Database designer has created a single table to perform various operations over the database.

    a) What are the problems with existing design of the Database? b) Suggest a suitable database design which can solve all the

    problems.

  • Thank You

    Please forward your query

    To: [email protected]: [email protected]

  • PAN African e-Network ProjectPGDIT

    DBMSSemester - II

    Session - 4

    By- Mr. Gaurav Dubey

  • Data Recovery & Protection

    CASE STUDY

    Data Recovery

    Types of Recovery :---

    Transaction Recovery

    System Recovery

    Media Recovery

  • Concurrency Control in DBMS

    Concurrency Control Techniques

    Locking

    Types of Locking :

    READ Lock & Write lock

  • Definition: Serializability

    Serial Schedule & Non- Serial Schedule. Examples. Database Security

    Question & Answer

    Data Recovery & Protection

  • CASE STUDY

  • CASE STUDY

    To maintain and operate on various Information related to Student , Faculty , Course & Result the management of a training Institute has hired a Database Designer .

    Data related to the training Institute includes: ( S_Id , S_Name , S_Age , S_Course , C_Id ,

    C_Fee ,C_Duration ,Faculty_Name & S_Grade)

  • CASE STUDY

    Database designer has created a single table to perform various operations over the database.

    a) What are the problems with existing design of the Database?

    b) Suggest a suitable database design which can solve all the problems.

  • CASE STUDY

    Data Duplicity or Data Redundancy Anomalies: --- Insert , Update Delete

    Data Inconsistency

  • Limitation of Existing Design

    If a student has join for multiple coursethen student detail and will be repeated.

    If multiple students has join the same courseInformation about the course will be repeated.

    Course info cant be stored until and unless students are registered for the course

  • Limitation of Existing Design

    Multiple Updating of data is required. There may be loss of data in some cases

    if deletion operation is performed. Data may become inconsistent in case of

    multiple updating.

  • Proposed Solution

    Existing database design should be further decomposed into more than one relation.

    Data into the database should be normalized.

    Decomposition must be lossless.

  • R1 (S_Id , S_Name , S_Age ) S_ID (Primary Key) R2 (C_Id , C_Fee ,C_Duration ,Faculty_Name ) C_ID (Primary Key) R3 (S_ID, C_Id , Grade) S_ID + C_ID (Primary Key) S_ID (Foreign Key) & C_ID (Foreign_Key)

  • Definition:-- Transaction

    .

    A transaction is the basic logical unit of execution in an information system.

    A transaction is a sequence of operations that must be executed as a whole.

    It is process of taking one consistent (& correct) database state into another consistent (& correct) database state.

  • Definition:-- Transaction A collection of actions that make consistent

    transformations of system states while preserving system consistency

    Example: RAED A READ B

    A=A-5000 WRITE A

    B = B+ 5000 WRITE B

  • Schedules of Transactions

    A schedule S of n transactions is a sequential ordering of the operations of the n transactions.

    The transactions are interleaved

  • Schedules of Transactions

    A schedule maintains the order of operations within the individual transaction. For each transaction T if operation a is

    performed in T before operation b, then operation a will be performed before operation b in Schedule S.

    The operations are in the same order as they were before the transactions were interleaved

  • Schedules of Transactions

    Two operations conflict if they belong to different transactions, AND access the same data item AND one of them is a write.

  • ExampleT 1 : T 2 :r e a d _ i t e m ( X ) ;X : = X - N ;

    r e a d _ i t e m ( X ) ;X : = X + M ;

    w r i t e _ i t e m ( X ) ;r e a d _ i t e m ( Y ) ;

    w r i t e _ i t e m ( X ) ;Y : = Y + N ;w r i t e _ i t e m ( Y ) ;

  • Serial Schedules

    Schedule S is said to be serial if: ----

    For every transaction T participating in the schedule, all of T's operations are executed consecutively in the schedule.

    Otherwise it is called non-serial.

  • Example: Serial Schedule

    T 1 : T 2 :r e a d _ i t e m ( X ) ;X := X - N ;w r i t e _ i t e m ( X ) ;r e a d _ i t e m ( Y ) ;Y := Y + N ;w r i t e _ i t e m ( Y ) ;

    r e a d _ i t e m ( X ) ;X := X + M ;w r i t e _ i t e m ( X ) ;

  • Example: Serial Schedule

    T 1 : T 2 :r e a d _ ite m ( X ) ;X := X + M ;w r i te _ i te m ( X ) ;

    r e a d _ ite m ( X ) ;X := X - N ;w r i te _ i te m ( X ) ;r e a d _ ite m ( Y ) ;Y := Y + N ;w r i te _ i te m ( Y ) ;

  • Non-serial Schedules

    Non-serial schedules mean that transactions are interleaved.

    There are many possible orders or schedules.

    Conflicting operations must be taken care of

  • Example: Non-serial Schedules

    T 1 T 2r e a d _ i t e m ( X ) ;X : = X - 1 0 ;w r i t e _ i t e m ( X ) ;

    r e a d _ i t e m ( Y ) ;Y : = Y - 2 0 ;w r i t e _ i t e m ( Y ) ;

    r e a d _ i t e m ( Y ) ;Y : = Y + 1 0 ;w r i t e _ i t e m ( Y ) ;

  • Example: Non-serial SchedulesT 1 : T 2 :r e a d _ i t e m ( X ) ;X : = X - N ;

    r e a d _ i t e m ( X ) ;X : = X + M ;

    w r i t e _ i t e m ( X ) ;r e a d _ i t e m ( Y ) ;

    w r i t e _ i t e m ( X ) ;Y : = Y + N ;w r i t e _ i t e m ( Y ) ;

  • Theory of Serializability

  • Serial and Non-serial Schedules

    Serializability theory attempts to determine the 'correctness' of the schedules.

    A schedule S of n transactions is serialisable if it is equivalent to some serial schedule of the same n transactions.

  • Serializability:---

    Conflicting Serializability

    Non-Conflicting Serializability

  • Example Serializability (Conflicting)

    T 1 : T 2 :r e a d _ i t e m ( X ) ;X : = X - N ;

    r e a d _ i t e m ( X ) ;X : = X + M ;

    w r i t e _ i t e m ( X ) ;r e a d _ i t e m ( Y ) ;

    w r i t e _ i t e m ( X ) ;Y : = Y + N ;w r i t e _ i t e m ( Y ) ;

  • Example of Non Serial Schedules (Not Conflicting)

    T 1 T 2r e a d _ i t e m ( X ) ;X : = X - 1 0 ;w r i t e _ i t e m ( X ) ;

    r e a d _ i t e m ( Y ) ;Y : = Y - 2 0 ;w r i t e _ i t e m ( Y ) ;

    r e a d _ i t e m ( Y ) ;Y : = Y + 1 0 ;w r i t e _ i t e m ( Y ) ;

  • The Transaction Manager

    The transaction manager enforces the ACID properties It schedules the

    operations of transactions

    COMMIT and ROLLBACK are used to ensure atomicity

    Locks or timestamps are used to ensure consistency and isolation for concurrent transactions (next lectures)

    A log is kept to ensure durability in the event of system failure (this lecture)

  • Properties of Transaction

    A Atomicity: a transaction is an atomic unit of processing and it is either performed entirely or not at all

    C Consistency Preservation: a transaction's correct execution must take the database from one correct state to another.

    I Isolation/Independence: the updates of a transaction must not be made visible to other transactions until it is committed

  • Example

    T 1 : T 2 :r e a d _ ite m ( X ) ;X := X - N ;

    r e a d _ ite m ( X ) ;X := X + M ;

    w r ite _ ite m ( X ) ;r e a d _ ite m ( Y ) ;

    w r ite _ ite m ( X ) ;Y := Y + N ;w r ite _ ite m ( Y ) ;

  • Properties of Transaction

    D Durability (or Permanency): if a transaction changes the database and is committed, the changes

    must never be lost because of subsequent failure

    READ AA=A-1500WRITEAREAD B

    B= B+1500WRITE B

  • Concurrency Control

    Most DBMS are multi-user systems.

    The concurrent execution of many different transactions submitted by various users must be organized such that each transaction does not interfere with another transaction with one another in a way that produces incorrect results.

    The concurrent execution of transactions must be such that each transaction appears to execute in isolation

  • Concurrency Problems

    In order to run transactions concurrently we interleave their operations

    Each transaction gets a share of the computing time

    This leads to several sorts of problems Lost updates Uncommitted updates Incorrect analysis

    All arise because isolation is broken

  • Example

  • Locking Techniques for Concurrency Control

    The concept of locking data items is one of the main techniques used for controlling the concurrent execution of transactions.

    A lock is a variable associated with a data item in the database. Generally there is a lock for each data item in the database.

  • Locking Techniques for Concurrency Control

    A lock describes the status of the data item with respect to possible operations that can be applied to that item.

    It is used for synchronizing the access by concurrent transactions to the database items.

    A transaction locks an object before using it When an object is locked by another transaction,

    the requesting transaction must wait

  • Types of Locks

    Binary locks have two possible states:

    1. locked (lock_item(X) operation) and

    2. unlocked (unlock_item(X) operation

  • Types of Locks

    Multiple-mode locks allow concurrent access to the same item by several transactions.

    Three possible states: 1. read locked or shared locked (other transactions are

    allowed to read the item) 2. write locked or exclusive locked (a single

    transaction exclusively holds the lock on the item) and

    3. unlocked.

  • Two-Phasing Locking Basic 2PL

    When a transaction releases a lock, it may not request another lock

    Conservative 2PL or static 2PL A transaction locks all the items it accesses

    before the transaction begins execution Pre-declaring read and write sets

  • Two-Phasing Locking

    obtain lock

    release lock

    lock point

    Phase 1 Phase 2

    BEGIN END

    number of locks

  • Two-Phasing Locking

    Strict 2PL a transaction does not release any of its locks until after it commits or aborts

    leads to a strict schedule for recovery

  • Two-Phasing Locking

    obtain lock

    release lock

    BEGIN ENDTransaction durationperiod of data

    item use

    number of locks

  • Deadlocks and Live locks

    Deadlock prevention protocol:

    conservative 2PL

    transaction stamping (younger transactions aborted)

    no waiting cautious waiting time outs

  • Deadlocks and Live locks

    Deadlock detection (if the transaction load is light or transactions are short and lock only a few items)

    wait-for graph for deadlock detection victim selection cyclic restarts

  • Deadlocks and Live locks

    Live lock: a transaction cannot proceed for an indefinite period of time while other transactions in the system continue normally. fair waiting schemes (i.e. first-come-first-

    served)

  • Locking Granularity

    A database item could be a database record a field value of a database record a disk block the whole database

  • Locking Granularity

    Trade-offs Coarse granularity

    the larger the data item size, the lower the degree of concurrency

    Fine granularity the smaller the data item size, the more locks to be

    managed and stored, and the more lock/unlock operations needed.

  • Database Backup and Recovery Concepts

    Backup of Database means to make single or multiple copies of data files, control file, and archived redo logs

    Restoring a Database means copying the physical files that make up the database from a backup medium, typically disk or tape, to their original or to new locations

  • Database Backup and Recovery Concepts

    A backup is either consistent or inconsistent. To make a consistent backup, database must have been shut

    down cleanly and remain closed for the duration of the backup. All committed changes in the redo log are written to the data

    files, so the data files are in a transaction-consistent state. When restoring data files from a consistent backup, you can

    open the database immediately

  • COMMIT and ROLLBACK

    COMMIT signals the successful end of a transaction Any changes made by

    the transaction should be saved

    These changes are now visible to other transactions

    ROLLBACK signals the unsuccessful end of a transaction Any changes made by

    the transaction should be undone

    It is now as if the transaction never existed

  • Recovery

    Transactions should be durable, but we cannot prevent all sorts of failures: System crashes Power failures Disk crashes User mistakes Sabotage Natural disasters

    Prevention is better than cure Reliable OS Security UPS and surge protectors RAID arrays

    Cant protect against everything though

  • Forwards and Backwards

    Backwards recovery We need to undo some

    transactions Working backwards

    through the log we undo any operation by a transaction on the UNDO list

    This returns the database to a consistent state

    Forwards recovery Some transactions need to

    be redone Working forwards through

    the log we redo any operation by a transaction on the REDO list

    This brings the database up to date

  • The Transaction Log

    The transaction log records the details of all transactions Any changes the

    transaction makes to the database

    How to undo these changes

    When transactions complete and how

    The log is stored on disk, not in memory If the system crashes it is

    preserved Write ahead log rule

    The entry in the log must be made before COMMIT processing can complete

  • System Failures

    A system failure means all running transactions are affected

    Software crashes Power failures

    The physical media (disks) are not damaged

    At various times a DBMS takes a checkpoint

    All committed transactions are written to disk

    A record is made (on disk) of the transactions that are currently running

  • Types of Transactions

    Last Checkpoint System Failure

    T1

    T2

    T3

    T4

    T5

  • Transaction Recovery

    T1T2

    T3T4

    T5Checkpoint Failure

    UNDO: T2, T3

    REDO: Last Checkpoint

    Active transactions: T2, T3

  • System Recovery

    Any transaction that was running at the time of failure needs to be undone and restarted

    Any transactions that committed since the last checkpoint need to be redone

    Transactions of type T1 need no recovery

    Transactions of type T3 or T5 need to be undone and restarted

    Transactions of type T2 or T4 need to be redone

  • Transaction as a Recovery Unit

    The database is restored to some state from the past so that a correct stateclose to the time of failurecan be reconstructed from the past state.

    A DBMS ensures that if a transaction executes some updates and then a failure occurs before the transaction reaches normal termination, then those updates are undone.

    The statements COMMIT and ROLLBACK (or their equivalent) ensure Transaction Atomicity

  • Media Failures

    System failures are not too severe Only information since

    the last checkpoint is affected

    This can be recovered from the transaction log

    Media failures (disk crashes etc) are more serious The data stored to disk

    is damaged The transaction log

    itself may be damaged

  • Recovery from Media Failure

    Restore the database from the last backup

    Use the transaction log to redo any changes made since the last backup

    If the transaction log is damaged you cant do step 2 Store the log on a

    separate physical device to the database

    The risk of losing both is then reduced

  • Recovery Methods:1. Mirroring

    keep two copies of the database and maintain them simultaneously

    2. Backupperiodically dump the complete state of the database to some form of tertiary storage

  • Recovery

    3. System Logging the log keeps track of all transaction operations affecting the

    values of database items. The log is kept on disk so that it is not affected by failures except for disk and catastrophic failures

  • Recovery from Transaction FailuresCatastrophic failure Restore a previous copy of the database from archival backup Apply transaction log to copy to reconstruct more current state by

    redoing committed transaction operations up to failure point Incremental dump + log each transaction

    Non-catastrophic failure Reverse the changes that caused the inconsistency by undoing the

    operations and possibly redoing legitimate changes which were lost The entries kept in the system log are consulted during recovery. No need to use the complete archival copy of the database

  • Transaction States

    For recovery purposes the system needs to keep track of when a transaction :

    Starts, terminates and commits.

  • Transaction States Begin_Transaction: Marks the beginning of a transaction

    execution.

    End_Transaction: Specifies that the read and write operations have ended and marks the end limit of transaction execution (but may be aborted because of concurrency control).

    Commit_Transaction: Signals a successful end of the transaction. Any updates executed by the transaction can be safely committed to the database and will not be undone.

  • Transaction States

    Rollback (or Abort): signals that the transaction has ended unsuccessfully. Any changes that the transaction may have applied to the database must be undone.

    Undo: similar to ROLLBACK but it applies to a single operation rather than to a whole transaction.

    Redo: specifies that certain transaction operations must be redone to ensure that all the operations of a committed transaction have been applied successfully to the database.

  • Transaction Execution

    A transaction reaches its commit point when all operations accessing the database are completed and the result has been recorded in the log.

    It then writes a [commit, transaction-id].

    If a system failure occurs, searching the log and rollback the transactions that have written into the log a [start_transaction, transaction-id] [write_item, transaction-id, X, old_value, new_value]

    but have not recorded into the log a [commit, transaction-id]

  • Transaction execution

    Transaction Execution:-------

    active partially committed

    committed

    failed terminated

    BEGINTRANSACTION

    READ, WRITE

    ENDTRANSACTION

    ROLLBACK ROLLBACK

    COMMIT

  • Question & Answer

    1. Transaction is defines as (a) Single logical unit of work (b) Repeating same data item (c) Correcting database (d) Removing data duplicity

    2. In 2PL When a transaction releases a lock, it may not request another lock.

    a) Trueb) False

  • Question & Answer3. The concept of locking data items is one of the main

    techniques used for :a) Controlling the concurrent execution of transactionsb) Controlling the data redundancy c) Avoid Transitivityd) None of these

    4. A transaction's correct execution must take the database from one correct state to another is known as:

    a) Atomicityb) Isolationc) Both a and bd) Consistency

  • Question & Answer5. A schedule S is serial if, for every transaction T

    participating in the schedule, all of T's operations are executed consecutively in the schedulea) Trueb) False

    6. A schedule S of n transactions is serializable if it is equivalent to some serial schedule of the same n transactionsa) Trueb) False

  • Question & Answer

    7. Which among the following is types of LOCK is known as exclusive lock

    a) Read Lock b) Write LOCK c) Shared Lock d) All of the above

    8. A transaction reaches its commit point when all operations accessing the database are completed and the result has been recorded in the log

    a) True b) False

  • Question & Answer 9. Which among the following is data recovery method. a) System Logging b) Mirroring c) Both a and b d) None

    10 A transaction cannot proceed for an indefinite period of time while other transactions in the system continue normally is called

    a) Dead Lock b) Live Lock c) Binary Lock d) Both a and b

  • Thank You

    Please forward your query

    To: [email protected]: [email protected]

  • PAN African e-Network Project

    PGDITData Base Management System

    Semester - II

    Session - 5By Mr. Gaurav Dubey

  • SQL Environment

    D D L Statements

    Data Types

    Constraints

  • VIEW

    D M L Statements

    INSERT , UPDATE

    DELETE

  • A simplified schematic of a typical SQL environment, as described by the SQL-2003 standard

  • 5Writing SQL Statements SQL statements are not case sensitive

    (but criteria within quotation marks are for some RDBMS) SQL statements can be on one or more lines Clauses are usually placed on separate lines Keywords cannot be split across lines Tabs and spaces are allowed to enhance readability Each SQL statement (not line) ends with a semicolon (;)

  • 6Components

    Front end application

    VB, Developer ,Access

    ODBC DriverOracle SybaseAccess

    Oracle Server

    SybaseServer

    Access DB

  • 7Core Database Engine

    ORACLE RDBMS (Oracle Universal server) Integrated Data Dictionary: manage tables

    owned by all users in a system SQL: language to access and manipulate

    data PL/SQL: a procedural extension to SQL

    language

  • 8SQL*Plus

    Command line tool that process users SQL statements Requires Oracle account

    SQLDDL

    DML

    DCL

    Data Definition

    Data Manipulation

    Data Control

  • 9Help command

  • 10

    Typing a SQL command

    Editing SQL command in a file

    Saving SQL command in a file

  • Structured Query Language features

    DQL (Data Query Language)SELECT

    Used to get data from the database and impose ordering upon it.

    DML (Data Manipulation Language)DELETE, INSERT, UPDATE

    Used to change database data. DDL (Data Definition Language)

    DROP, TRUNCATE, CREATE, ALTERUsed to manipulate database structures and definitions.

    RIGHTSREVOKE, GRANT

    Used to give and take access rights to database objects.

  • 12

    DDL, DML, DCL, and the database development process

  • DDL (Data Definitions Language)

    SQL commands are divided into a number of categories, of which the DDL commands are but one part:

    Data Definition Language commands Data Manipulation Language commands Transaction Control commands Session Control commands

  • DDL (Data Definitions Language)

    Data Definition Language commands allow you to perform these tasks:

    Create, Alter, and Drop objects

  • Data Definition examples

    CREATE TABLE TABLE_NAME( COLUMN_NAME DATA_TYPE [(SIZE)]

    COLUMN_CONSTRAINT,[, other column definitions,...][, primary key constraint])

  • Data Definition examples

    ALTER TABLE TABLE_NAME ADD | DROP | MODIFY

    ( COLUMN_NAME DATA_TYPE [(SIZE)] COLUMN_CONSTRAINT,

    [, other column definitions,...])

  • 17

    Create, modify, drop Tables, views, and sequences

    Table empName Type----------------------------------------------------------------EMPID NUMBER(5)FNAME

    VARCHAR2(20)LNAME

    VARCHAR2(20)SEX

    VARCHAR2(1)SSN

    VARCHAR2(9)SALARY NUMBER(8)DEPTNO NUMBER(5)

  • Create, modify, drop Tables, views, and sequences

    CREATE TABLE emp ( empid NUMBER(5), fname VARCHAR2(20), lname VARCHAR2(20), sex VARCHAR2(1), ssn VARCHAR2(9), salary NUMBER(8), deptno NUMBER(5) );

  • 19

    Another table Dept

    CREATE TABLE dept (deptno NUMBER(5) NOT NULL, name

    VARCHAR2(20) NOT NULL, building VARCHAR2(20),

    CONSTRAINT pk_deptno PRIMARY KEY (deptno) );

  • 20

    Insert

    INSERT INTO dept VALUES (4001, 'SHOES', 'BUILDING I');

    INSERT INTO dept VALUES (4002, 'WOMAN CLOTHING', 'BUILDING II');

    INSERT INTO dept VALUES (4003, 'MEN CLOTHING', 'BUILDING II');

    INSERT INTO dept VALUES (4004, 'KITCHEN APPLIANCES', 'MAIN BUILDING');

  • Data Definition examples

    CREATE VIEW VIEW_NAME AS QUERY_NAME ( Select col1 , col2 . From Table_Name Where .. )

    DROP TABLE TABLE_NAME

    DROP INDEX INDEX_NAME ON TABLE_NAME

  • Create Table This command allows the user to create a table, the

    basic structure to hold user data, by specifying the following information:

    column definitions integrity constraints the table's table space storage characteristics data from an arbitrary query

  • 23

    Naming conventions for Table/Fields

    Naming conventions for table names and attributes names Illegal

    spaces hyphens

    Legal letters [a-z A-Z ] + digits _, #, $ first character must be a letter no reserved words in SQL [no attribute called 'by or table]

  • 24

    Data Types

    A table is made up of one or more columns Each column is given a name and a data type that

    reflects the kind of data it will store. Oracle supports four basic data types

    CHAR NUMBER DATE RAW. There are also a few additional variations on the RAW

    and CHAR data types.

  • 25

    VARCHAR2 Character data type. Can contain letters, numbers and punctuation. The syntax : VARCHAR2(size) where size is the maximum number of alphanumeric

    characters the column can hold. In Oracle8, the maximum size of a VARCHAR2 column

    is 4,000 bytes.

    Data Types

  • NUMBER Numeric data type. Can contain integer or floating point numbers

    only. The syntax : NUMBER(precision, scale) where precision is the total size of the number

    including decimal point and scale is the number of places to the right of the decimal.

    For example, NUMBER(6,2) can hold a number between -999.99 and 999.99.

    Data Types

  • 27

    DATE Date and Time data type. Can contain a date and time portion in the

    format: DD-MON-YY HH:MI:SS. No additional information needed when

    specifying the DATE data type. the time of 00:00:00 is used as a default. The output format of the date and time can be

    modified

    Data Types

  • RAW Free form binary data. Can contain binary data up to 255 characters. Data type LONG RAW can contain up to 2

    gigabytes of binary data. RAW and LONG RAW data cannot be indexed

    and can not be displayed or queried in SQL*Plus.

    Only one RAW column is allowed per table.

    Data Types

  • LOB Large Object data types. These include BLOB (Binary Large OBject)

    and CLOB (Character Large OBject). More than one LOB column can appear in a

    table. These data types are the prefferred method

    for storing large objects such as text documents (CLOB), images, or video (BLOB).

    Data Types

  • Create Table (cont)

  • Create Table (cont.)

    schema - is the schema to contain the table.

    If you omit schema, ORACLE creates the table in your own schema.

    table - is the name of the table to be created.

    column - specifies the name of a column of the table. The number of columns in a table can range from 1 to 254.