imp questions for second internal

1. What are the two approaches to access SQL from a general purpose programming language?

Sol: The two approaches are as follows Dynamic SQL: A general-purpose program can connect to and communicate with a database server using a collection of functions (for procedural languages) or methods (for object-oriented languages). Dynamic SQL allows the program to construct an SQL query as a character string at runtime, submit the query, and then retrieve the result into program variables a tuple at a time. ThedynamicSQLcomponent of SQL allows programs to construct andsubmitSQLqueries at runtime. We look at two standards for connecting to an SQL database and performing queries and updates. One, JDBC), is an application program interface for the Java language. The other, ODBC is an application program interface originally developed for the C language, and subsequently extended to other languages such as C++, C#, and Visual Basic.

Embedded SQL: Like dynamic SQL, embedded SQ L provides a means by which a program can interact with a database server. However, under embedded SQL, the SQL statements are identified at compile time using a preprocessor. The preprocessor submits the SQL statements to the database system for pre compilation and optimization; then it replaces the SQL statements in the application program with appropriate code and function calls before invoking the programming-language compiler

2. What are prepared statements?

Sol: We can create a prepared statement in which some values are replaced by”?” There by specifying that actual values will be provided later. The database systems compile s the query when it is prepared. Each time the query is executed (with new values to replace the “?”s), the database system can reuse the previously compiled form of the query and apply the new values. The code fragment below shows how prepared statements can be used.

PreparedStatement pStmt = conn.prepareStatement(

"insert into instructor values(?,?,?,?)");

pStmt.setString(1, "88877");

pStmt.setString(2, "Perry");

pStmt.setString(3, "Finance");

pStmt.setInt(4, 125000);

pStmt.executeUpdate();

pStmt.setString(1, "88878");

pStmt.executeUpdate();

3. Explain the concept of recursive Queries.

Sol: We can use recursion to define the set of courses that are prerequisites of a particular course, say CS-347, as follows. The courses that are prerequisites (directly or indirectly) of CS-347 are:

1. Courses that are prerequisites for CS-347.

2. Courses that are prerequisites for those courses those are prerequisites (directly or indirectly) for CS-347.

Note that case 2 is recursive, since it defines the set of courses that are prerequisites of CS-347 in terms of the set of courses that are prerequisites of CS-347. Other examples of transitive closure, such as finding all subparts (direct or indirect) of a given part can also be defined in a similar manner, recursively. Since the SQL: 1999 version, the SQL standard supports a limited form of recursion, using the with recursive clause, where a view (or temporary view) is expressed in terms of itself. Recursive queries can be used, for example, to express transitive closure concisely. Any recursive view must be defined as the union of two sub queries: a base query that is non-recursive and a recursive query that uses the recursive view.

4. Define 1NF.

Sol: A domain is atomic if elements of the domain are considered to be indivisible units. We say that a relation schema R is in first normal form (1NF )if the domains of all attributes of R are atomic.

5. Define BCNF.

Sol: A relation schema R is in BCNF with respect to a set F of functional dependencies if, for all functional dependencies in F+ of the formα→β, where α⊆R and β⊆R, at least one of the following holds:

•α→β is a trivial functional dependency (that is, β⊆α).

•α is a super key for schema R.

A database design is in BCN if each member of the set of relation schemas that

Constitutes the design is in BCNF.

6. Define 3NF

Sol: A relation schema R is in third normal form with respect to a set F of functional dependencies if, for all functional dependencies in F+ of the formα→β, where

α⊆Randβ⊆R, at least one of the following holds:

•α→β is a trivial functional dependency.

•α is a super key for R.

•Each attribute A in β−αis contained in a candidate key for R.

Note that the third condition above does not say that a single candidate key must

contain all the attributes inβ−α; each attribute A inβ−αmay be contained in

a different candidate key.

7. State Armstrong axioms.

Sol: We can use the following three rules to find logically implied functional

dependencies. By applying these rules repeatedly, we can find all of F+, given F.

Reflexivity rule. If αis a set of attributes and β⊆α, then α→β holds.

•Augmentation rule .Ifα →βholds and ϒ is a set of attributes, then

ϒα→ϒβ holds.

•Transitivity rule. Ifα→βholds andβ→ϒ holds, and then α→ϒ holds

Armstrong’s axioms are sound, because they do not generate any incorrect functional dependencies. They are complete, because, for a given set F of functional dependencies, they allow us to generate all F+.

Union rule. Ifα→β holds andα→ϒ holds, thenα→βϒ holds.

•Decomposition rule. I f α→βϒ holds, thenα→βholds andα→ϒ holds.

•Pseudo transitivity rule. Ifα→βholds and ϒβ→δ holds, then αϒ→δ holds.

8.Define the following

a) Hash indices: Based on a uniform distribution of values across a range ofbuckets. The bucket to which a value is assigned is determined by a function, called a hash function.

b) Dense index: In a dense index, an index entry appears for every search-keyValue in the file. In a dense clustering index, the index record contains theSearch-key value and a pointer to the first data record with that search-keyvalue. The rest of the records with the same search-key value would be storedsequentially after the first record, since, because the index is a clustering one,records are sorted on the same search key.In a dense non clustering index, the index must store a list of pointers toall records with the same search-key value.•Sparse index: In a sparse index, an index entry appears for only some of theSearch-key values. Sparse indices can be used only if the relation is stored insorted order of the search key, that is, if the index is a clustering index. Asis true in dense indices, each index entry contains a search-key value and apointer to the first data record with that search-key value. To locate a record,

we find the index entry with the largest search-key value that is less than orequal to the search-key value for which we are looking. We start at the recordpointed to by that index entry, and follow the pointers in the file until we findthe desired record.

c) Transaction: A transaction is a unit of program execution that accesses and possibly updatesvarious data items. A transaction is delimited by statements (or function calls) of the formbegin transaction and end transaction. The transaction consists of all operationsexecuted between the begin transaction and end transaction.

d) A recoverable schedule is one where, for each pair of transactions Ti and Tj such that Tj reads a data item previously written by Ti, the commit operation of Ti appears before the commit operation of Tj.

e) Cascade less schedules :Cascading rollback is undesirable, since it leads to the undoing of a significant amount of work. It is desirable to restrict the schedules to those where cascading rollbacks cannot occur. Such schedules are called cascade less schedules. Formally ,a cascade less schedule is one where, for each pair of transactions Ti and Tj such that Tj reads a data item previously written by Ti, the commit operation of Tiappears before the read operation of Tj. It is easy to verify that every cascade lessschedule is also recoverable.

9. Suppose that there is a database system that never fails. Is a recovery manager required for this system?Sol: Even in this case the recovery manager is needed to perform roll-back of aborted transactions

10. Use Armstrong’s axioms to prove the soundness of the union rule. (Hint: Use the augmentation rule to show that, if α → β, then α → αβ. Apply the augmentation rule again, using α → ϒ , and then apply the transitivityrule.)

Answer: To prove that: if α →β and α → ϒ then α → βϒFollowing the hint, we derive:α → β givenαα →αβ augmentation ruleα → αβ union of identical setsα →ϒ givenαβ → ϒ b augmentation ruleα → βϒ transitivity rule and set union commutativity.

11. Use Armstrong’s axioms to prove the soundness of the pseudo transitivity rule.Answer: Proof using Armstrong’s axioms of the Pseudo transitivity Rule: if α → β and ϒβ → δ, thenαϒ → δ. α → β given αϒ →ϒ β augmentation rule and set union commutativity ϒβ →δ given αϒ → δ transitivity rule.

12. Why are certain functional dependencies called trivial functional dependencies?

13. Differentiate BCNF and 3NF.Sol: Of the two normal forms for relational database schemas,3NF andBCNFthere are advantages to3NFin that we know that it is always possible to obtain a3NFdesign without sacrificing lossless ness or dependency preservation. Nevertheless, there are disadvantages to3NF: We may have to use null values to represent some of the possible meaningful relationships among data items, and there is the problem of repetition of information. Our goals of database design with functional dependencies are:1. BCNF.2. Lossless ness.3. Dependency preservation.

Since it is not always possible to satisfy all three, we may be forced to choose Between BCNF and dependency preservation with3NF.

14. What is the difference between a clustering index and a secondary index?

15. What would the occupancy of each leaf node o f a B+-tree be, if index entries are inserted in sorted order? Explain why.Sol: If the index entries are inserted in ascending order, the new entries get directed to the last leaf node. When this leaf node gets filled, it is split into two. Of the two nodes generated by the split, the left node is left untouched and the insertions take place on the right node. This makes the occupancy of the leaf nodes to about 50 percent, except the las t leaf.If keys that are inserted are sorted in descending order, the above situation would still occur, but symmetrically, with the right node of a split never getting touched again, and occupancy would again be 50 percent for allnodes other than the first leaf.

16. What are the causes of bucket overflow in a hash file organization? Whatcan be done to reduce the occurrence of bucket overflows?Sol: when a record is inserted, the bucket to which it ismapped has space to store the record. Ifthe bucket does not have enough space,abucket overflowis said to occur. Bucket overflow can occur for several reasons:•Insufficient buckets. The number of buckets, which we denote nB,mustbechosen such thatnB >nr/fr,wherenr denotes the total number of recordsthat will be stored and fr denotes the number of records that will fit in abucket. This designation, of course, assumes that the total number of recordsis known when the hash function is chosen.•Skew. Some buckets are assigned more records than are others, so a bucketmay overflow even when other buckets still have space. This situation iscalled bucketskew. Skew can occur for two reasons:

1.Multiple records may have the same search key.2.The chosen hash function may result in nonuniform distribution ofsearch keys.

Essay Questions:

1. Explain the distinction between closed and open hashing. Discuss the relative merits of each technique in database applications.2. Construct a B+-tree for the following set of key values:(2, 3, 5, 7, 11, 17, 19, 23, 29, 31)Assume that the tree is initially empty and values are added in ascendingorder. Construct B+-trees for the cases where the number of pointers thatwill fit in one node is Four. Do the operations for inserting elements a) Insert 9.b. Insert 10. And deletions for Delete 23. Delete 19. 3. Suppose that we are using extendable hashing on a file that contains records with the following search-key values: 2, 3, 5, 7, 11, 17, 19, 23, 29, 31 Show the extendable hash structure for this file if the hash function is h(x) =xmod8andbucketscanholdthreerecords.4. List the ACID properties. Explain the usefulness of each.5. Explain the distinction between the terms serial schedule and serializable schedule.6. Define serializability. Give an example of a serializable schedule with two transactions such that the order in which the transactions commit is different from the serialization order.7. Explain the algorithm to compute closure set of attributes F+ using

functional dependencies

Sol: We say that an attribute B is functionally determined byαif α→β. To test whether a setαis a super key, we must devise an algorithm for computing the set of attributes functionally determined by α. One way of doing this is to compute F+, take all functional dependencies with α as the left-hand side, and take the union of the right-hand sides of all

such dependencies. However, doing so can be expensive, since F+ can be large. An efficient algorithm for computing the set of attributes functionally determined by αis useful not only for testing whether α is a super key, but also for several other tasksF+=Frepeatfor each functional dependency f in F+apply reflexivity and augmentation rules on fadd the resulting functional dependencies to F+for each pair of functional dependenciesf1andf2in F+if f1andf2can be combined using transitivityadd the resulting functional dependency to F+until F+does not change any furtherresult:=α;repeatfor each functional dependencyβ→ϒ in F dobeginifβ⊆result then result:=result ∪ ϒ;enduntil(result does not change)An algorithm to compute+, the closure ofunder F.Let α be a set of attributes. We call the set of all attributes functionally determined by α under a set F of functional dependencies the closure ofαunder F; we denote it by α+. The algorithm, written in pseudo code, to computeα+.The input is a set F of functional dependencies and the setαof attributes. The output is stored in the variable result.

.

imp questions for second internal

Documents