
Upload: amit-sangale

Post on 17-Nov-2014


Amit

Topic Name: Distributed Database System

Q1: What do you mean by Distributed Database Management System?
Ans: A distributed database (DDB) is a collection of multiple, logically interrelated databases distributed over a computer network. A distributed database management system (distributed DBMS) is the software system that permits the management of the distributed database and makes the distribution transparent to the users. The term “distributed database system” (DDBS) is typically used to refer to the combination of the DDB and the distributed DBMS.

Q2: Which architecture does it use?

Ans: Client-server database architecture.

Q3: What is the difference between a distributed file system and a distributed database system?
Ans: Distributed file systems simply allow users to access files located on machines other than their own. These files have no explicit structure (i.e., they are flat), and the relationships among data in different files are not managed by the system; they are the user's responsibility. A distributed database, in contrast, is organized according to a schema that defines both the structure of the distributed data and the relationships among the data.


Q4: What are the advantages and disadvantages of a Distributed Database System?
Ans:
Advantages:
- Users can be geographically separate. This is important for large corporations, where business decisions must be made by people in different locations, but those decisions must be based on company-wide data.
- Multiple machines can improve performance and scalability. Because a client-server system is distributed over several machines, performance and scalability can be improved in several ways. For example, there might be multiple replicas of a server running on separate machines, so that each handles only a fraction of the total number of clients.
- Heterogeneous systems can use the best tools for each task. Different components of an application can run on hardware that is optimized for a specific task.
- Distributed systems can reduce maintenance costs. For example, by upgrading an application image on a single server, it is possible to upgrade thousands of clients.
Disadvantages:
- Software: it is difficult to develop software for distributed systems.
- Network: saturation and lossy transmissions.
- Security: easy access also applies to secret data.

Q5: What are the goals of a Distributed Database System?
Ans:
1) Transparency:
- Access: hides differences in data representation and invocation mechanisms.
- Location: hides where an object resides.
- Migration: hides from an object the ability of a system to change that object's location.
- Relocation: hides from a client the ability of a system to change the location of an object to which the client is bound.
- Replication: hides the fact that an object or its state may be replicated and that replicas reside at different locations.

2) Openness: Be able to interact with services from other open systems, irrespective of the underlying environment.

3) Scalability:
- Number of users and/or processes (size scalability).
- Maximum distance between nodes (geographical scalability).
- Number of administrative domains (administrative scalability).

4) Replication: Make copies of data available at different machines:
- Replicated file servers (mainly for fault tolerance)
- Replicated databases
- Mirrored Web sites
- Large-scale distributed shared memory systems

Topic Name: Distributed Data Storage

Q11. What are the different approaches to storing data in a distributed database?
Ans: There are three main approaches to storing a relation in a distributed database:

i. Replication: The system maintains several identical replicas of the relation and stores each replica at a different site.

ii. Fragmentation: The system partitions the relation into several fragments and stores each fragment at a different site.

iii. Transparency: The user should not be required to know where the data is physically located or how the data can be accessed at a specific local site.

Q12. What are the advantages of data replication in distributed data storage?
Ans:
i. Availability: If one site fails, the data can be found at another site, so the system can continue to work.
ii. Increased parallelism: Queries that only read the relation can be processed at several sites in parallel.

Q13. What are the disadvantages of data replication in distributed data storage?
Ans: Increased overhead on update: an update made at one site must be propagated to every site that holds a replica, so that all replicas agree.

Q14. What is fragmentation?
Ans: Fragmentation consists of breaking a relation into smaller relations, or fragments, and storing the fragments (instead of the relation itself), possibly at different sites.

Q15. Distinguish between horizontal and vertical fragmentation in distributed data storage?
Ans:

Horizontal fragmentation | Vertical fragmentation
1. Each fragment consists of a subset of the rows of the original relation. | 1. Each fragment consists of a subset of the columns of the original relation.
2. Horizontal fragments are identified by a selection query. | 2. Vertical fragments are identified by a projection query.
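The two kinds of fragmentation can be illustrated with a small in-memory sketch. The `account` relation, its column names and the helper functions below are hypothetical and chosen only for illustration; a real distributed DBMS would store each fragment at a different site. The sketch also assumes that every vertical fragment repeats the key column so the relation can be rebuilt.

```python
# Hypothetical relation: each row is a dict, and "id" is the key.
account = [
    {"id": 1, "branch": "Pune", "balance": 500},
    {"id": 2, "branch": "Mumbai", "balance": 900},
    {"id": 3, "branch": "Pune", "balance": 250},
]


def horizontal_fragment(relation, predicate):
    """Selection: keep whole rows that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


def vertical_fragment(relation, columns):
    """Projection: keep only the named columns of every row."""
    return [{c: row[c] for c in columns} for row in relation]


def by_id(rows):
    """Sort rows by key so results can be compared with the original."""
    return sorted(rows, key=lambda r: r["id"])


# Horizontal fragments, one per branch (each could live at its own site).
pune = horizontal_fragment(account, lambda r: r["branch"] == "Pune")
mumbai = horizontal_fragment(account, lambda r: r["branch"] == "Mumbai")

# Vertical fragments; both keep the key column "id".
v1 = vertical_fragment(account, ["id", "branch"])
v2 = vertical_fragment(account, ["id", "balance"])

# Reconstruction: union for horizontal fragments, join on the key for vertical ones.
from_horizontal = by_id(pune + mumbai)
balances = {r["id"]: r for r in v2}
from_vertical = by_id([{**r, **balances[r["id"]]} for r in v1])
```

Both reconstructions give back the original rows, which is the property a fragmentation scheme has to preserve.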

Topic Name:

1. Characteristics of a distributed database.

- One of the goals in using a distributed database is high availability; that is, the database must function almost all the time.
- For the distributed system to be robust, it must detect failures.

2. What is the difference in function between the coordinator and its backup?

- The difference is that the backup does not take any action that affects other sites.

3. What is the function of an election algorithm?

- An election algorithm enables the sites to choose the site for the new coordinator in a decentralized manner.
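The notes do not say which election algorithm is meant. As one possible illustration, the sketch below simulates the well-known bully algorithm, in which the live site with the highest identifier becomes the new coordinator. The function name, the site identifiers and the `alive` set are assumptions made for this example; real sites would exchange messages and use timeouts instead of reading a shared set.

```python
def bully_election(initiator, site_ids, alive):
    """Return the id of the new coordinator for an election started by `initiator`.

    A site challenges every higher-numbered site. If any of them is alive,
    one of them takes over the election; if none answers, the current site wins.
    """
    if initiator not in alive:
        raise ValueError("a failed site cannot start an election")
    current = initiator
    while True:
        higher_alive = [s for s in site_ids if s > current and s in alive]
        if not higher_alive:
            # Nobody with a higher id answered, so this site is the coordinator.
            return current
        # A live higher-numbered site takes over; the lowest one is picked here
        # only to show the hand-off chain step by step.
        current = min(higher_alive)


sites = [1, 2, 3, 4, 5]
alive = {1, 2, 4}  # sites 3 and 5 (the old coordinator) have failed
new_coordinator = bully_election(1, sites, alive)
```

Whichever live site starts the election, the outcome is the same, so no central authority is needed to pick the coordinator.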

4. What are the advantages of coordinator selection?

- Ability to continue processing immediately.
- The backup coordinator approach avoids a substantial amount of delay while the distributed system recovers from a coordinator failure.

5. What are the disadvantages of coordinator selection?

- There is the overhead of duplicate execution of the coordinator's tasks.
- A coordinator and its backup need to communicate regularly to ensure that their activities are synchronized.

Topic Name: Distributed Query Processing

Q1: What do you mean by Query Processing?
Ans: Query processing is the process by which a declarative query is translated into low-level data manipulation operations.

Q2: What is the objective of query processing in Distributed Systems?
Ans: Easy retrieval of data: to ensure that the user query, which is posed as if the database were centralized (i.e., logically integrated), executes correctly and efficiently over data that is distributed.

Q3: Discuss various steps in Query processing.
Ans: The query parser parses and translates a given high-level language query into an intermediate form such as relational algebra expressions. The parser needs to check the syntax of the query and also its semantics (that is, verify that the relation names and attribute names in the query are the names of relations and attributes in the database). A parse tree of the query is constructed and then translated into a relational algebra expression.

Q4: In a distributed system, which issues must be taken into account when choosing a strategy for query processing?
Ans:
- The cost of data transmission over the network. This cost depends on the speed of the disk and the type of network.
- The potential gain in performance from having several sites process parts of the query in parallel.

Q5: What are the general approaches to query optimization?
Ans:
Heuristic-based query optimization:
- Given a query expression, perform selection and projection as early as possible.
- Eliminate duplicate computations.
Cost-based query optimization:
- Estimate the cost of different query expressions using heuristics and algebraic manipulation, and choose the execution plan with the lowest cost estimate.
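The heuristic "perform selection as early as possible" can be seen in a small sketch. The `employees` and `departments` relations and the helper functions are hypothetical. Both plans return the same answer, but the second one joins far fewer rows, which in a distributed setting would also mean fewer rows shipped between sites.

```python
# Hypothetical relations, assumed to be stored at different sites.
employees = [{"emp_id": i, "dept_id": i % 3, "salary": 1000 * i} for i in range(1, 7)]
departments = [{"dept_id": d, "name": n} for d, n in enumerate(["HR", "IT", "Ops"])]


def join(left, right, key):
    """Nested-loop equi-join of two lists of rows on a common column."""
    return [{**l, **r} for l in left for r in right if l[key] == r[key]]


def select(rows, predicate):
    """Relational selection: keep the rows that satisfy the predicate."""
    return [r for r in rows if predicate(r)]


def high_paid(row):
    return row["salary"] > 4000


# Plan A: join first, then select. All six employee rows take part in the join.
joined = join(employees, departments, "dept_id")
plan_a = select(joined, high_paid)

# Plan B: select first, then join. Only the matching rows take part in the join.
filtered = select(employees, high_paid)
plan_b = join(filtered, departments, "dept_id")
```

A cost-based optimizer would reach the same choice by estimating the size of each intermediate result and picking the cheaper plan.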

Topic Name: DIRECTORY SYSTEM

Q.1 What is a directory?
Ans: A directory is a listing of information about some class of objects. Directories are also used to store other information; for example, web browsers store personal bookmarks.

Q.2 What is the use of a directory?
Ans: Directories can be used to find information about a specific object or to find objects that meet certain requirements. They also store the necessary information.

Q.3 What are the ways of accessing directory information?
Ans: Directory information can be made available through web interfaces. People access directory information this way, and sometimes programs also access it.

Q.4 What are the reasons for having protocols for accessing directory information?
Ans:
1. Directory access protocols are simplified protocols that cater to a limited type of access to data.
2. They can be implemented with database access protocols.
3. They provide a simple mechanism for naming objects in a hierarchical fashion.

Q.5 Where is the DAP protocol used?
Ans: The DAP protocol is used in a distributed directory system to specify what information is stored in each of the directory servers.

Topic Name: LDAP

1> What is LDAP?
LDAP (Lightweight Directory Access Protocol) is a protocol for accessing online directories. It governs communication between LDAP servers and LDAP clients: LDAP servers store "directories", which are accessed by LDAP clients.

2> Why is LDAP called lightweight?
LDAP is called lightweight because it is a smaller and simpler protocol derived from the X.500 DAP (Directory Access Protocol) defined in the OSI network protocol stack.

3> What is the use of LDIF?
LDIF is the LDAP Data Interchange Format, used for storing and exchanging directory information.

4> How does communication between an LDAP server and a client take place?
A client starts an LDAP session by connecting to an LDAP server, called a Directory System Agent (DSA), by default on TCP port 389. The client then sends an operation request to the server, and the server sends responses in return. With some exceptions, the client need not wait for a response before sending the next request, and the server may send the responses in any order.


5> What are the different operation requests made by a client when it is connected to an LDAP server?
The client may request the following operations:
- Start TLS: use the LDAPv3 Transport Layer Security (TLS) extension for a secure connection
- Bind: authenticate and specify the LDAP protocol version
- Search: search for and/or retrieve directory entries
- Compare: test if a named entry contains a given attribute value
- Add: add a new entry
- Delete: delete an entry
- Modify: modify an entry
- Unbind: close the connection (not the inverse of Bind)

6> What is the difference between LDAP and a database?
The largest general difference between directories and databases is complexity. Databases are capable of storing almost any arbitrary set of information and can be greatly customized for a specific purpose. They also provide a complex query interface, allowing for flexible searches returning customized results. Directories, on the other hand, tend to have very specific implementations that follow a strict pattern or schema. This allows them to be extremely fast, and allows for easy organization and comprehension of the data they store.

7> What is the DIT?
Directories are viewed as a tree, like a computer's file system. This overall tree structure is called the Directory Information Tree (DIT).

8> What are objects? What are the different objects of the DIT?
Each entry in a directory is called an object. These objects are of two types, containers and leaves. A container is like a folder: it contains other containers or leaves. A leaf is simply an object at the end of a tree. A tree cannot contain any arbitrary set of containers and leaves. It must match the schema defined for the directory.

9> What are the applications of LDAP?
Internet applications:
- Centralized or distributed white pages
- ISP online subscriber directory
Intranet applications:
- Internal white pages
- Certificate and CRL distribution
- System/network management database

10> What are the contents of an LDAP query?
- Base: a node within the DIT, specified by giving its distinguished name.
- Search condition: a Boolean combination of conditions on individual attributes.
- Scope: just the base, the base and its children, or the entire subtree beneath the base.
- Attributes: the names of the attributes to be returned.
- Limits on the number of results and on resource consumption.
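The parts of a query listed above map naturally onto a small data structure. The sketch below is illustrative only: the `LdapQuery` class, its field names and the host name are assumptions, not part of any LDAP library. It renders the base, attributes, scope and search condition in the LDAP URL layout `ldap://host:port/base?attributes?scope?filter`. Limits are not part of that URL layout and would travel in the search request itself, so `size_limit` is kept only as a field.

```python
from dataclasses import dataclass, field
from urllib.parse import quote


@dataclass
class LdapQuery:
    """Hypothetical container for the parts of an LDAP search."""
    base: str                                 # distinguished name of the base node
    scope: str = "sub"                        # "base", "one" (children) or "sub" (subtree)
    search_filter: str = "(objectClass=*)"    # Boolean condition on attributes
    attributes: list = field(default_factory=list)  # attributes to return
    size_limit: int = 0                       # 0 means no client-requested limit

    def to_url(self, host, port=389):
        """Render the query as an LDAP URL (limits are not representable here)."""
        if self.scope not in ("base", "one", "sub"):
            raise ValueError("scope must be 'base', 'one' or 'sub'")
        dn = quote(self.base, safe="=,")
        attrs = ",".join(quote(a) for a in self.attributes)
        flt = quote(self.search_filter, safe="=()&|!*")
        return f"ldap://{host}:{port}/{dn}?{attrs}?{self.scope}?{flt}"


query = LdapQuery(
    base="ou=People,dc=example,dc=com",
    scope="one",
    search_filter="(cn=Babs Jensen)",
    attributes=["cn", "mail"],
    size_limit=10,
)
url = query.to_url("ldap.example.com")
```

Here `url` comes out as `ldap://ldap.example.com:389/ou=People,dc=example,dc=com?cn,mail?one?(cn=Babs%20Jensen)`.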

Topic Name: Commit Protocol

1: What are the types of Commit Protocol?
Ans: Two-phase commit protocol (2PC) and three-phase commit protocol (3PC).

2: Explain two phase commit protocol.
Ans: When transaction T completes its execution, that is, when all the sites at which T has executed inform the transaction coordinator Ci that T has completed, Ci starts the 2PC protocol.

3: What is the main disadvantage of the 2-phase commit protocol?
Ans: Coordinator failure may result in blocking, where a decision either to commit or to abort the transaction may have to be postponed until Ci recovers.
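The answers above say when the coordinator Ci starts 2PC and why it can block, but not what the two phases are. The following is a minimal single-process sketch of the usual voting scheme, under the assumption that each participating site can be modelled as a callable returning its vote. The function name and the log format are hypothetical, and coordinator failure (the blocking case) is not modelled.

```python
def two_phase_commit(participants):
    """Run a simplified 2PC round for transaction T.

    `participants` maps a site name to a callable that returns True when the
    site is ready to commit and False when it votes to abort.
    """
    log = ["prepare T"]

    # Phase 1: the coordinator asks every site to prepare and collects votes.
    votes = {}
    for site, vote in participants.items():
        try:
            votes[site] = bool(vote())
        except Exception:
            # A site that fails or cannot answer is treated as voting abort.
            votes[site] = False

    # Phase 2: commit only if every site voted ready; otherwise abort everywhere.
    decision = "commit" if all(votes.values()) else "abort"
    log.append(f"{decision} T")
    return decision, votes, log


def failing_site():
    raise ConnectionError("site unreachable")


all_ready = two_phase_commit({"S1": lambda: True, "S2": lambda: True})
one_refuses = two_phase_commit({"S1": lambda: True, "S2": lambda: False})
one_down = two_phase_commit({"S1": lambda: True, "S2": failing_site})
```

The first round commits, while the other two abort, because a single abort vote or an unreachable site is enough to abort T at every site.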

4: Explain three phase commit protocol.
Ans: This protocol avoids the blocking problem under the assumptions that no network partition occurs and that no more than k sites fail, where k is a predetermined number. Under these assumptions, the protocol avoids blocking by introducing an extra third phase in which multiple sites are involved in the decision to commit.

5: What are the assumptions in three phase commit, and what is its disadvantage?
Ans:
Assumptions: No network partition occurs, and no more than k sites fail, where k is a predetermined number. Under these assumptions, the protocol avoids blocking by introducing an extra third phase in which multiple sites are involved in the decision to commit.
Disadvantage: The protocol has to be carefully implemented to ensure that network partitioning does not result in inconsistencies, where a transaction is committed in one partition and aborted in another. For this reason, the 3PC protocol is not widely used.

Topic Name: Distributed Directory Trees

1) Why is a directory object used?
Ans:- The directory object is used to store and retrieve information about objects.

2) Why is the naming tree called the Directory Information Tree (DIT)?
Ans:- Because a directory entry is associated with each vertex of this tree, and the entry holds information about the object that has the corresponding name.

3) What is a Directory System Agent (DSA)?
Ans:- A system that maintains and communicates directory information is called a Directory System Agent.

4) What is a Relative Distinguished Name (RDN)?
Ans:- The name component added as we move one step down the naming tree is called the Relative Distinguished Name for the corresponding entry.
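The relation between a distinguished name and its RDNs can be shown with a short sketch. The example name and both helper functions are hypothetical, and the split is deliberately naive: it assumes the usual LDAP string form, which writes the leaf RDN first, and it does not handle escaped commas inside values.

```python
def split_dn(dn):
    """Split a distinguished name into its RDNs, leaf first.

    Naive sketch: assumes no escaped commas inside attribute values.
    """
    return [part.strip() for part in dn.split(",") if part.strip()]


def path_from_root(dn):
    """List the DN of every vertex on the path from the root down to the entry."""
    rdns = split_dn(dn)
    # Walking down the tree adds one RDN at each step.
    return [",".join(rdns[i:]) for i in range(len(rdns) - 1, -1, -1)]


dn = "cn=Alice,ou=People,dc=example,dc=com"
rdns = split_dn(dn)
path = path_from_root(dn)
```

Here `rdns[0]` is the entry's own RDN (`cn=Alice`), and each element of `path` names one vertex of the DIT, ending with the full distinguished name.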

5) All directory information will be part of one "global directory". True or false?
Ans:- True. All directory information will be part of one "global directory": global in the sense that it is worldwide, and global in the sense that it will be common to all directory users.

6) How is an object represented?
Ans:- Every object represented in an X.500 directory has a structured, so-called distinguished name.

Q: Draw LDAP architecture.
