

    M.Tech-I Semester Regular/Supply Examination, February 2016

    Subject Code: G0503/R13

    DATABASE MANAGEMENT SYSTEMS

    Detailed Answers

    1.

a) i) These foreign key constraints are necessary:

Works: FOREIGN KEY (did) REFERENCES Dept(did)
Dept:  FOREIGN KEY (managerid) REFERENCES Emp(eid)

When deleting a Dept tuple, we need to remove the corresponding Works tuple(s). This can be done with the ON DELETE CASCADE rule.

    2M
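A minimal DDL sketch of these constraints, assuming simplified Emp, Dept and Works schemas (column names follow the answer above; the data types are illustrative assumptions, and the NOT NULL on managerid corresponds to part (ii)):

-- Employees must exist before a department can name its manager.
CREATE TABLE Emp (
    eid     INTEGER PRIMARY KEY,
    ename   VARCHAR(40),
    salary  DECIMAL(10,2)
);

CREATE TABLE Dept (
    did        INTEGER PRIMARY KEY,
    dname      VARCHAR(40),
    managerid  INTEGER NOT NULL REFERENCES Emp(eid)
);

-- Works tuples are removed automatically when their Dept tuple is deleted.
CREATE TABLE Works (
    eid  INTEGER REFERENCES Emp(eid),
    did  INTEGER REFERENCES Dept(did) ON DELETE CASCADE,
    PRIMARY KEY (eid, did)
);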

    ii) managerId INT NOT NULL 2M

iii) UPDATE Emp E SET E.salary = E.salary * 1.10   2M

b) The user of SQL has no idea how the data is physically represented in the machine. He or she relies entirely on the relation abstraction for querying. Physical data independence is therefore assured. Since a user can define views, logical data independence can also be achieved by using view definitions to hide changes in the conceptual schema.

    6M
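For instance, a view can preserve the old relational interface after the conceptual schema changes; the split of salary into a separate Salaries table below is a hypothetical illustration, not part of the question:

-- Applications keep querying EmpView(eid, ename, salary) even though
-- salaries have been moved into their own table.
CREATE VIEW EmpView (eid, ename, salary) AS
SELECT E.eid, E.ename, S.amount
FROM   Emp E JOIN Salaries S ON E.eid = S.eid;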

    2.

    a) Explanation with Cardinality of the following operations.

    Union

    Intersection

Set Difference

Cross Product

Union: R ∪ S returns a relation instance containing all tuples that occur in either relation instance R or relation instance S (or both). R and S must be union-compatible, and the schema of the result is defined to be identical to the schema of R.

Intersection: R ∩ S returns a relation instance containing all tuples that occur in both R and S. The relations R and S must be union-compatible, and the schema of the result is defined to be identical to the schema of R.

Set-difference: R - S returns a relation instance containing all tuples that occur in R but not in S. The relations R and S must be union-compatible, and the schema of the result is defined to be identical to the schema of R.

Cross-product: R × S returns a relation instance whose schema contains all the fields of R (in the same order as they appear in R) followed by all the fields of S (in the same order as they appear in S).

    6M
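These operations map directly onto SQL set operators; a minimal sketch assuming two union-compatible tables R and S (hypothetical names), with the cardinality of each result noted in the comments:

SELECT * FROM R UNION     SELECT * FROM S;   -- Union: at most |R| + |S| tuples
SELECT * FROM R INTERSECT SELECT * FROM S;   -- Intersection: at most min(|R|, |S|) tuples
SELECT * FROM R EXCEPT    SELECT * FROM S;   -- Set difference: at most |R| tuples (MINUS in Oracle)
SELECT * FROM R CROSS JOIN S;                -- Cross product: exactly |R| * |S| tuples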


    b) Any three differences with examples and explanation.

Relational algebra (RA) is a procedural language that can be used to tell the DBMS how to build a new relation from one or more relations in the database, whereas relational calculus (RC) is a non-procedural language that can be used to formulate the definition of a relation in terms of one or more database relations.

1. Relational algebra operations manipulate relations and build expressions in the form of queries, whereas relational calculus forms queries as logical formulas over tuple or domain variables.

2. RA has operators such as selection, projection, join, union, intersection, difference and division, whereas RC has tuple-oriented and domain-oriented expressions.

3. The expressive power of RA and (safe) RC is equivalent: any query that can be expressed in RA can be expressed by a formula in RC.

4. Relational algebra is easier to manipulate and understand than RC.

5. Unrestricted RC can express unsafe queries that RA cannot; only the safe subset of RC is exactly as expressive as RA.

6. RC queries are written as well-formed formulas (WFFs), whereas RA does not form logical formulas.

7. The evaluation of an RA query depends on the order of its operations, whereas an RC query does not depend on any order of operations.

8. RA expressions are domain independent, whereas RC expressions can be domain dependent.

9. RA specifies operations performed on existing relations to obtain new relations; in RC, operations on the relations are expressed in the form of formulas.

    6M

    3.

    a) Explanation with examples.

A nested query (subquery) is a query within another SQL query, usually embedded within the WHERE clause.

A nested query is used to return data that will be used in the main query as a condition to further restrict the data to be retrieved.

Nested queries can be used with the SELECT, INSERT, UPDATE, and DELETE statements along with comparison operators such as =, <, >, >=, <= and the IN operator.


A subquery cannot be immediately enclosed in a set (aggregate) function. The BETWEEN operator cannot be applied to a subquery; however, the BETWEEN operator can be used within the subquery.
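A small illustration, assuming hypothetical Emp(eid, ename, salary, did) and Dept(did, dname, location) tables (the names are illustrative, not taken from the question paper):

-- Employees who earn more than the average salary of department 10
SELECT E.ename, E.salary
FROM   Emp E
WHERE  E.salary > (SELECT AVG(salary) FROM Emp WHERE did = 10);

-- Employees who work in a department located in 'Hyderabad'
SELECT E.ename
FROM   Emp E
WHERE  E.did IN (SELECT D.did FROM Dept D WHERE D.location = 'Hyderabad');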

    b) Explanation with importance of Trigger, Syntax and Example

A trigger is a statement that is executed automatically by the system as a side effect of a modification to the database. Triggers are fired implicitly, not called by the user like a procedure or function. To design a trigger mechanism, we must specify two things:

    Specify the conditions under which the trigger is to be executed

    Specify the actions to be taken when the trigger executes

    Use of Database Triggers

To restrict access to a table to regular business hours or to predetermined weekdays

To keep track of modifications of data along with the user name, the operation performed and the time when the operation was performed

To prevent invalid transactions

To enforce complex security authorizations

    Types of Triggers

Row Triggers: A row trigger is fired each time a row in the table is affected by the triggering statement. If the triggering statement affects no rows, the trigger is not executed at all.

Statement Triggers: A statement trigger is fired once on behalf of the triggering statement, independent of the number of rows affected by the triggering statement.

    Creating a Trigger

Syntax:

CREATE OR REPLACE TRIGGER [schema.]<trigger_name>
{ BEFORE | AFTER }
{ DELETE | INSERT | UPDATE [ OF column1, ... ] }
ON [schema.]<table_name>
[ REFERENCING { OLD AS old, NEW AS new } ]
[ FOR EACH ROW [ WHEN (condition) ] ]
DECLARE
    <declarations>;
BEGIN
    <PL/SQL sub-program body>;
EXCEPTION
    <exception handling PL/SQL block>;
END;

Example:

CREATE OR REPLACE TRIGGER t_Audit_trail
BEFORE DELETE OR UPDATE ON customer   -- the table name after ON is missing in the original; 'customer' is assumed

    6M


FOR EACH ROW
DECLARE
    oper VARCHAR2(8);
BEGIN
    IF UPDATING THEN
        oper := 'Update';
    END IF;
    IF DELETING THEN
        oper := 'Delete';
    END IF;
    INSERT INTO audit_cust
    VALUES (:OLD.custno, :OLD.fname, :OLD.lname, :OLD.address, oper, USER, SYSDATE);
END;
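For this trigger to compile, the base table and the audit table must already exist. A hedged sketch of what they might look like, with the column names inferred from the trigger body and the data types assumed:

CREATE TABLE customer (
    custno   NUMBER PRIMARY KEY,
    fname    VARCHAR2(30),
    lname    VARCHAR2(30),
    address  VARCHAR2(80)
);

CREATE TABLE audit_cust (
    custno     NUMBER,
    fname      VARCHAR2(30),
    lname      VARCHAR2(30),
    address    VARCHAR2(80),
    operation  VARCHAR2(8),
    done_by    VARCHAR2(30),
    done_on    DATE
);

With these in place, an UPDATE or DELETE on customer inserts one row into audit_cust per affected customer row, recording the old values, the operation, the user and the time.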

4 Normalization of Database

Normalization is a systematic approach of decomposing tables to eliminate data redundancy and undesirable characteristics like insertion, update and deletion anomalies. It is a step-by-step process that puts data into tabular form by removing duplicated data from the relation tables. Normalization is used mainly for two purposes:

Eliminating redundant (useless) data.

Ensuring data dependencies make sense, i.e., data is logically stored.

    Problem Without Normalization

Without normalization, it becomes difficult to handle and update the database without facing data loss. Insertion, updation and deletion anomalies are very frequent if the database is not normalized. To understand these anomalies, let us take the example of a Student table.

S_id   S_Name   S_Address   Subject_opted

401    Adam     Noida       Bio

402    Alex     Panipat     Maths

403    Stuart   Jammu       Maths

404    Adam     Noida       Physics

Updation Anomaly: To update the address of a student who occurs twice or more in the table, we will have to update the S_Address column in all those rows; otherwise the data will become inconsistent.

Insertion Anomaly: Suppose for a new admission we have the id (S_id), name and address of a student, but the student has not opted for any subject yet; then we have to insert NULL there, leading to an insertion anomaly.

Deletion Anomaly: If S_id 401 has only one subject and temporarily drops it, then when we delete that row the entire student record is deleted along with it.

    Normalization Rule

Normalization rules are divided into the following normal forms.

    12M


1. First Normal Form
2. Second Normal Form
3. Third Normal Form
4. BCNF

First Normal Form (1NF)

A row of data cannot contain a repeating group of data, i.e., each column must hold a single atomic value. Each row of data must have a unique identifier, i.e., a primary key. For example, consider a table which is not in First Normal Form.

    Student Table :

    S_id S_Name subject

    401 Adam Biology

    401 Adam Physics

    402 Alex Maths

    403 Stuart Maths

You can clearly see here that the student name Adam is used twice in the table and the subject Maths is also repeated. This violates First Normal Form. To reduce the above table to First Normal Form, break the table into two different tables.

    New Student Table :

    S_id S_Name

    401 Adam

    402 Alex

    403 Stuart

    Subject Table :

    subject_id student_id subject

10 401 Biology

11 401 Physics

    12 402 Math

    12 403 Math

In the Subject table, the concatenation of subject_id and student_id is the primary key. Now both the Student table and the Subject table are normalized to First Normal Form.
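A hedged DDL sketch of this 1NF decomposition (the data types and the foreign key are assumptions added for completeness):

CREATE TABLE Student (
    S_id    INTEGER PRIMARY KEY,
    S_Name  VARCHAR(30)
);

CREATE TABLE Subject (
    subject_id  INTEGER,
    student_id  INTEGER REFERENCES Student(S_id),
    subject     VARCHAR(30),
    PRIMARY KEY (subject_id, student_id)
);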

Second Normal Form (2NF)

A table to be normalized to Second Normal Form should meet all the requirements of First Normal Form, and there must not be any partial dependency of any column on the primary key. This means that for a table with a concatenated (composite) primary key, each column in the table that is not part of the primary key must depend upon the entire concatenated key for its existence. If any column depends only on one part of the concatenated key, then the table fails Second Normal Form. For example, consider a table which is not in Second Normal Form.

    Customer Table :

    customer_id Customer_Name Order_id Order_name Sale_detail

    101 Adam 10 order1 sale1


    101 Adam 11 order2 sale2

    102 Alex 12 order3 sale3

    103 Stuart 13 order4 sale4

In the Customer table, the concatenation of customer_id and Order_id is the primary key. This table is in First Normal Form but not in Second Normal Form because there are partial dependencies of columns on the primary key: Customer_Name depends only on customer_id, Order_name depends only on Order_id, and there is no link between Sale_detail and Customer_Name. To reduce the Customer table to Second Normal Form, break it into the following three tables.

Customer_Detail Table:

    customer_id Customer_Name

    101 Adam

    102 Alex

    103 Stuart

Order_Detail Table:

    Order_id Order_Name

    10 Order1

    11 Order2

    12 Order3

    13 Order4

Sale_Detail Table:

    customer_id Order_id Sale_detail

    101 10 sale1

101 11 sale2

102 12 sale3

    103 13 sale4

Now all three tables comply with Second Normal Form.
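A hedged DDL sketch of this 2NF decomposition (the data types and foreign keys are assumptions added for completeness):

CREATE TABLE Customer_Detail (
    customer_id    INTEGER PRIMARY KEY,
    Customer_Name  VARCHAR(30)
);

CREATE TABLE Order_Detail (
    Order_id    INTEGER PRIMARY KEY,
    Order_Name  VARCHAR(30)
);

CREATE TABLE Sale_Detail (
    customer_id  INTEGER REFERENCES Customer_Detail(customer_id),
    Order_id     INTEGER REFERENCES Order_Detail(Order_id),
    Sale_detail  VARCHAR(30),
    PRIMARY KEY (customer_id, Order_id)
);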

    Third Normal Form (3NF)

Third Normal Form requires that every non-prime attribute of the table be dependent on the primary key directly; transitive functional dependencies should be removed from the table. The table must also be in Second Normal Form. For example, consider a table with the following fields.

Student_Detail Table:

    Student_id Student_name DOB Street city State Zip

In this table Student_id is the primary key, but street, city and state depend upon Zip. The dependency between Zip and the other fields is called a transitive dependency. Hence, to apply 3NF, we need to move street, city and state to a new table, with Zip as its primary key.

New Student_Detail Table:

    Student_id Student_name DOB Zip


    Address Table :

    Zip Street city state

The advantages of removing transitive dependency are:

Amount of data duplication is reduced.

Data integrity is achieved.
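A hedged DDL sketch of the resulting 3NF schema (the data types are illustrative assumptions):

CREATE TABLE Address (
    Zip     VARCHAR(10) PRIMARY KEY,
    Street  VARCHAR(50),
    City    VARCHAR(30),
    State   VARCHAR(30)
);

CREATE TABLE Student_Detail (
    Student_id    INTEGER PRIMARY KEY,
    Student_name  VARCHAR(30),
    DOB           DATE,
    Zip           VARCHAR(10) REFERENCES Address(Zip)
);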

    Boyce and Codd Normal Form (BCNF)

Boyce-Codd Normal Form is a stricter version of Third Normal Form. This form deals with a certain type of anomaly that is not handled by 3NF. A 3NF table which does not have multiple overlapping candidate keys is automatically in BCNF. For a table to be in BCNF, the following conditions must be satisfied:

R must be in Third Normal Form and, for each functional dependency (X -> Y), X should be a super key.
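A classic illustration (not from the question paper): consider a relation Enrolment(student, course, teacher) with the dependencies {student, course} -> teacher and teacher -> course, where each teacher teaches exactly one course. The relation is in 3NF, since course is a prime attribute, but it is not in BCNF because the determinant of teacher -> course (teacher) is not a super key.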

    5 Concurrent Execution of Transactions

The DBMS interleaves the actions of different transactions to improve performance, in terms of increased throughput or improved response times for short transactions, but not all interleavings should be allowed.

    12M


The schedule shown in the figure represents an interleaved execution of the two transactions. Ensuring transaction isolation while permitting such concurrent execution is difficult, but is necessary for performance reasons. First, while one transaction is waiting for a page to be read in from disk, the CPU can process another transaction. This is because I/O activity can be done in parallel with CPU activity in a computer. Overlapping I/O and CPU activity reduces the amount of time disks and processors are idle, and increases system throughput (the average number of transactions completed in a given time). Second, interleaved execution of a short transaction with a long transaction usually allows the short transaction to complete quickly. In serial execution, a short transaction could get stuck behind a long transaction, leading to unpredictable delays in response time, the average time taken to complete a transaction.

Serializability

A serializable schedule over a set S of committed transactions is a schedule whose effect on any consistent database instance is guaranteed to be identical to that of some complete serial schedule over S. That is, the database instance that results from executing the given schedule is identical to the database instance that results from executing the transactions in some serial order.

Executing the transactions serially in different orders may produce different results, but all are presumed to be acceptable; the DBMS makes no guarantees about which of them will be the outcome of an interleaved execution.

    Reading Uncommitted Data (WR Conflicts)

The first source of anomalies is that a transaction T2 could read a database object A that has been modified by another transaction T1, which has not yet committed. Such a read is called a dirty read.


A simple example illustrates how such a schedule could lead to an inconsistent database state. Consider two transactions T1 and T2, each of which, run alone, preserves database consistency: T1 transfers $100 from A to B, and T2 increments both A and B by 6 percent (e.g., annual interest is deposited into these two accounts). Suppose that their actions are interleaved so that (1) the account transfer program T1 deducts $100 from account A, then (2) the interest deposit program T2 reads the current values of accounts A and B and adds 6 percent interest to each, and then (3) the account transfer program credits $100 to account B.
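For concreteness (the balances below are hypothetical, not from the question paper): suppose A = 1000 and B = 2000, so the combined balance is 3000. With the interleaving above, T1 deducts 100 (A = 900); T2 then adds 6 percent interest to both accounts (A = 954, B = 2120); finally T1 credits 100 to B (B = 2220). The final total is 3174, whereas either serial order (T1 then T2, or T2 then T1) yields 3180. The interleaved result matches no serial schedule: the 6 percent interest on the transferred $100 has been lost.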

Unrepeatable Reads (RW Conflicts)

The second way in which anomalous behavior could result is that a transaction T2 could change the value of an object A that has been read by a transaction T1, while T1 is still in progress. If T1 tries to read the value of A again, it will get a different result, even though it has not modified A in the meantime. This situation could not arise in a serial execution of two transactions; it is called an unrepeatable read.

    Overwriting Uncommitted Data (WW Conflicts)

The third source of anomalous behavior is that a transaction T2 could overwrite the value of an object A, which has already been modified by a transaction T1, while T1 is still in progress. Even if T2 does not read the value of A written by T1, a potential problem exists.

6 When the system is restarted after a crash, the recovery manager proceeds in three phases:

    12M


1. Analysis Phase
2. Redo Phase
3. Undo Phase

The Analysis phase has three main objectives:

It determines the point in the log at which to start the Redo pass.

It determines (a conservative superset of) the pages in the buffer pool that were dirty at the time of the crash.

It identifies transactions that were active at the time of the crash and must be undone.

The Analysis phase begins by examining the most recent begin_checkpoint record, whose LSN is denoted as C in Fig 1, and proceeds forward in the log until the last log record.

The Redo phase follows Analysis and redoes all changes to any page that might have been dirty at the time of the crash; this set of pages and the starting point for Redo (the smallest recLSN of any dirty page) are determined during Analysis. The Undo phase follows Redo and undoes the changes of all transactions that were active at the time of the crash; again, this set of transactions is identified during the Analysis phase.

    Redo Phase

During the Redo phase, ARIES reapplies the updates of all transactions, committed or otherwise. Further, if a transaction was aborted before the crash and its updates were undone, as indicated by CLRs, the actions described in the CLRs are also reapplied. This repeating-history paradigm distinguishes ARIES from other proposed WAL-based recovery algorithms and causes the database to be brought to the same state that it was in at the time of the crash. The Redo phase begins with the log record that has the smallest recLSN of all pages in


the dirty page table constructed by the Analysis pass, because this log record identifies the oldest update that may not have been written to disk prior to the crash. Starting from this log record, Redo scans forward until the end of the log. For each redoable log record (update or CLR) encountered, Redo checks whether the logged action must be redone.

    Undo Phase

The Undo phase, unlike the other two phases, scans backward from the end of the log. The goal of this phase is to undo the actions of all transactions that were active at the time of the crash, that is, to effectively abort them. This set of transactions is identified in the transaction table constructed by the Analysis phase.

The Undo Algorithm

Undo begins with the transaction table constructed by the Analysis phase, which identifies all transactions that were active at the time of the crash, and includes the LSN of the most recent log record (the lastLSN field) for each such transaction. Such transactions are called loser transactions. All actions of losers must be undone, and furthermore, these actions must be undone in the reverse of the order in which they appear in the log.

7 Pseudo code for the delete operation with example and tracing. (Algorithm 6M + Tracing with explanation 6M)

Deleting a Data Entry from a B+ Tree

1. Start at the root and find the leaf L where the entry belongs.
2. Remove the entry.
   If L is at least half-full, done.
   If L has only d-1 entries:
   a) Try to re-distribute, borrowing from a sibling (an adjacent node with the same parent as L).
   b) If re-distribution fails, merge L and the sibling.
3. If a merge occurred, the entry (pointing to L or the sibling) must be deleted from the parent of L.
4. The merge could propagate to the root, decreasing the height.

Example: B+ Tree

    12M


    Deleting 19

    Deleting 20

    Deleting 19* is easy.

    Deleting 20* is done with re-distribution. Notice how middle key is copied up.

    8 Importance of Active Databases 3M


    ACTIVE DATABASES

A trigger is a procedure that is automatically invoked by the DBMS in response to specific changes to the database, and is typically specified by the DBA. A database that has a set of associated triggers is called an active database. An active database is a database that includes an event-driven architecture (often in the form of ECA rules) which can respond to conditions both inside and outside the database.

    Fig 1: ADBMS Architecture

An ADBMS provides regular DBMS primitives, definition of application-defined situations, and triggering of application-defined reactions, as shown in the figure.

    Active databases in Postgres

Rules allow the definition of extra or alternate actions on updates.

Triggers allow the association of user-supplied procedures (functions) with database events.
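A minimal PostgreSQL sketch of both mechanisms, assuming a hypothetical emp table, an emp_archive table with matching columns, and a trigger function log_emp_change() that the user has already defined (none of these names are from the question paper):

-- Rule: every DELETE on emp also archives the removed row
CREATE RULE archive_emp AS
    ON DELETE TO emp
    DO ALSO INSERT INTO emp_archive VALUES (OLD.eid, OLD.ename, OLD.salary);

-- Trigger: run a user-supplied function after each row-level update
-- (PostgreSQL 11+; earlier versions use EXECUTE PROCEDURE)
CREATE TRIGGER emp_audit
    AFTER UPDATE ON emp
    FOR EACH ROW
    EXECUTE FUNCTION log_emp_change();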

Importance of hash-based indexing and explanation of any one of the three techniques.

The basic idea of hash-based indexing is to use a hashing function, which maps values in a search field into a range of bucket numbers, to find the page on which a desired data entry belongs.

The Static Hashing scheme, like ISAM, suffers from the problem of long overflow chains, which can affect performance. Two solutions to the problem are Extendible Hashing and Linear Hashing.

The Extendible Hashing scheme uses a directory to support inserts and deletes efficiently without any overflow pages. The Linear Hashing scheme uses a clever

    5M


policy for creating new buckets and supports inserts and deletes efficiently without the use of a directory.

Fig: Static Hashing

The Static Hashing scheme is illustrated in the figure. The pages containing the data can be viewed as a collection of buckets, with one primary page and possibly additional overflow pages per bucket. A file consists of buckets 0 through N-1, with one primary page per bucket initially.
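In systems that expose hash indexes directly, such an index can be requested in SQL; a PostgreSQL sketch on a hypothetical emp table (the table and column names are assumptions):

-- Equality lookups such as SELECT * FROM emp WHERE eid = 42 can then be
-- answered by hashing 42 to a bucket instead of scanning the whole file.
CREATE INDEX emp_eid_hidx ON emp USING hash (eid);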

    Explaining ISAM Index Structure with an example

    Fig 1: Index Page

    Fig 2: One Level Index Structure

    4M


    Example of ISAM Tree

ISAM stands for Indexed Sequential Access Method, a method for indexing data for fast retrieval. Fig 1 shows an index page and Fig 2 shows a one-level index structure.

Indexing permits access to selected records without searching the entire file.

    Advantages:

Permits efficient and economical use of sequential processing techniques when the activity rate is high.

Permits quick access to records in a relatively efficient way when this activity is a fraction of the workload.

    Disadvantages:

    Slow retrieval, when compared to other methods.

    Does not use the storage space efficiently.

    Hardware and software used are relatively expensive.

    Prepared by

    Dr P.Kiran Sree, Professor, Dept of CSE, SVECW, BVRM
