ibm db2 case study

34
IBM DB2 CASE STUDY

Upload: ishvitha-badhri

Post on 14-Jun-2015

321 views

Category:

Technology


20 download

TRANSCRIPT

  • 1. CASE STUDY

2. Free version of DB2 Express-C server The C in the name stands for community Free to develop, deploy and distributeno limits No user limits No DB2 instance limits No database size limit What is DB2 Express-C ? 3. Where can DB2 Express-C run? Windows Windows workstation platforms(XP, Vista, Windows 7) Windows server platforms (2003,2008) Linux Red Hat, Suse, Ubuntu and more Mac OS X(beta) Solaris Supported on 32-bit and 64-bit systems 4. What are DB2 Express-C Requirements? Minimum Requirements Memory: (256MB, 512MB(DB2 GUI tools), 1GB( recommended) Disk: -Depends on number and size of database to create -On Linux, 2GB on /tmp is recommended Maximum requirements: System of any size Uses at most 2 cores and 2 GB of RAM 5. DB2 ENTERPRISE EDITION DB2 WORKGROUP EDITION DB2 EXPRESS EDITION DB2 EXPRESS-C + some functionality + some functionality + some functionality DB2 SERVERS 6. DB2 clients and drivers DB2 Runtime Client Merge Modules for Windows: mainly used to embed a DB2 runtime client as part of a Windows application installation IBM Data Server Driver for JDBC and SQLJ: allows Java applications to connect to DB2 servers without having to install a full client IBM Data Server Driver for ODBC and CLI: allows ODBC and CLI applications to connect to a DB2 server without the large footprint of having to install a client IBM Data Server Driver Package: Includes a Windows-specific driver with support for .NET environments in addition to ODBC, CLI and open source. This driver was previously known as the IBM Data Server Driver for ODBC, CLI and .NET. 7. Concurrency and Locking What is a Transaction? Savings account Checking account Balance =$1000 Balance =$200 Transfer $100 from Savings to Checking account Debit $100 from Savings account Credit $100 from Checking account 8. Concurrency Overview APP 1 APP 2 APP 3 APP 4 ID NAME AGE JAMES4 8 12 2 PETER MARY 22 30 28 31 TOM 9. Locking Overview APP 1 APP 2 APP 3 APP 4 ID NAME AGE JAMES4 8 12 2 PETER MARY 22 30 28 31 TOM 10. Concurrency Db2 was designed as a multi-user database Access to data must be coordinated properly and transparently Without some form of concurrency control the following problems may be encountered. Lost update Uncommitted read Non-repeatable read Phantom read 11. LOST UPDATE FLIGHT SEAT P_NAME 412 3A ---------------- 412 3B ---------------- .. .. .. .. .. .. Update Reservations set P_NAME = TOM where SEAT=3B and FLIGHT=412 and P_NAME is NULL 412 3B TOM Update Reservations set P_NAME = JOHN where SEAT=3B and FLIGHT=412 and P_NAME is NULL 412 3B JOHN Reservations 12. UNCOMMITTED READ FLIGHT SEAT P_NAME 412 3A ---------------- 412 3B ---------------- .. .. .. .. .. .. Update Reservations set P_NAME = TOM where SEAT=3B and FLIGHT=412 and P_NAME is NULL 412 3B TOM Reservations Select SEAT from Reservations where P_NAME is NULL 1 43 2 Rollback Incorrect Results 13. NON-REPEATABLE READ FLIGHT SEAT NAME DESTINATIO N ORIGIN 212 7S --------- DENVER DALLAS 309 6A --------- SANJOSE DENVER 134 4B --------- HONOLU SANJOSE . Implies that you cannot obtain the same result after performing the same read after some other operations like deletion. 14. PHANTOM READ FLIGHT SEAT P_NAME 412 3A ---------------- 412 3B TOM .. .. .. .. .. .. 412 3B TOM Reservations Select SEAT from Reservations where P_NAME is NULL 1 3 2 Repetition of one row (3B is now available) Update Reservations set P_NAME = NULL where SEAT=3B and FLIGHT=412 and P_NAME =TOM 15. LOCKING DB2 uses locking to maintain data integrity. Locks are acquired automatically as a need to support transaction and are released when a transaction terminates(commit/rollback) Locks can be acquired on tables and rows Two basic types of locks: Share locks(S locks): acquired when an application wants to read and prevent others from updating the same row Exclusive locks(X locks): acquired when an application updates, deletes or inserts any row 16. ISOLATION LEVELS DB2 provides different levels of protection to isolate data: Uncommitted Read (UR) Cursor Stability (CS) Read Stability (RS) Repeatable Read (RR) Isolation levels can be specified at many levels Session Connection Statement 17. Isolation levels - UNCOMMITTED READ Uncommitted read is also known as DIRTY READ o Lowest level of isolation o Provides highest degree of concurrency No row locks are obtained on read operations Unless other application attempts to drop or alter table Update operations act as if using cursor stability Possible situations o Uncommitted Read o Non-Repeatable Read o Phantom Read Situations Prevented o Loss of Update 18. Isolation levels - Cursor Stability Cursor stability is the default isolation level Minimal degree of locking Locks the current row of a cursor If the row is only read The lock is held until a new row is fetched or the unit of work is terminated If the row is updated The lock is held until the unit of work is terminated Possible Situations Non-Repeatable Read Phantom Read Prevented Situations Loss of update Uncommitted Read 19. Isolation levels Read Stability Locks all the rows, an application retrieves within a unit of work o For a given cursor, it locks all rows that qualify for the result set o Moderate degree of locking Possible Situations o Phantom read Prevented Situations o Loss of update o uncommitted Read o Non-repeatable Read 20. Isolation levels Repeatable Read Highest isolation level, least concurrency Some query issued by the application more than once in a unit of work will give the same result each time High degree of locking Locks held on all rows processed to build the result set i.e. rows not necessarily in the final result set may be locked No other application can update, delete, or insert a row that would affect the result set until the unit of work completes Possible situations None Prevented Situations Loss of Update Uncommitted Read Non-repeatable Read Phantom Read 21. Comparing Isolation Levels 22. Comparison of Isolation level terminology 23. Example Scenario: Application needs to get a rough count of how many rows are in table. Performance is of utmost importance. Cursor stability isolation level is required With the exception of one SQL statement 24. Lock Escalation 25. Lock Escalation When optimizer thinks it is better to have one lock on the entire table rather than multiple row locks Database configuration parameters that affect lock escalation: 1. LOCKLIST- the amount of memory(4k pages) to manage locks for all connected applications (default is 50 * 4k pages on Windows 2. MAXLOCKS Max percentage of the entire lock list that a single application can use up Default is 22 percent The DB2 diagnostic log file (db2diag.log) can be used to determine whether lock escalation is occurring. Lock escalation is not good for performance as it reduces concurrency. 26. LOCK MONITORING View locks currently held by an application UPDATE MONITOR SWITCHES USING LOCK ON GET SNAPSHOT FOR LOCKS FOR APPLICATION AGENT ID 27. Lock Wait By default an application waits indefinitely to obtain any needed locks LOCKTIMEOUT (db cfg): Default value is -1 or infinite wait Change to specify the number of seconds to wait for a lock A database connection also has a user-definable CURRENT LOCK TIMEOUT register By default inherits its value from the LOCKTIMEOUT parameter Use the SET LOCK TIMEOUT statement to change its value Once it is set for a connection, it persists across transactions e.g. SET LOCK TIMEOUT-WAIT n 28. Deadlock causes and detection A deadlock occurs when two or more applications connected to the same database wait indefinitely for a resource. The waiting is never resolved because each application is holding a resource that the other needs. Deadlocks are an application design issue most of the time. 29. Deadlock Settings Deadlocks are an application design issue most of the time DLCHKTIME(db cfg) sets the time interval for checking for deadlocks It defines the frequency that the database manager checks for deadlocks among all the applications connected to a database If you are experiencing many deadlocks, you should re- examine your existing transactions and see if any re- structuring is possible 30. Best Practices Keep transactions as short as possible Issue frequent COMMIT statements even for read-only transactions Log transaction information only when required Purge data quick using: ALTER TABLE ACTIVATE NOT LOGGED INITIALLY WITH EMPTY TABLE Perform data modifications in batches/groups DELETE FORM (SELECT * FROM ABCD WHERE C1=..FETCH FIRST 3000 ROWS ONLY) Use concurrency features in DB2 data movement tools Set the database level LOCKTIMEOUT parameter (usually between 30- 120 seconds) Can also use session-based lock timeout Do not retrieve more data than is required Use the FETCH FIRST n ROWS ONLY clause in SELECT statements 31. How IBM DB2 helpful in Parallel Processing? Parallelism in SMP environments intra-partition parallelism is the subdivision of a single database operation into multiple parts, which are then executed in parallel within a single database partition. use the CURRENT DEGREE special register, or the DEGREE bind option to specify number of queries that can be executed concurrently. Database manager configuration parameters Intra-parallel : to enable or disable parallelism Max_querydegree: sets an upper limit on the degree of parallelism for any query in the database. dft_degree: sets the default value of degree of parallelism Parallelism in non-SMP environments Degree of parallelism can be specified without having an SMP machine The processor might not have to wait for I/O tasks to complete before starting to process a new query. 32. How IBM DB2 helpful in Distributed query processing? The database partitioning feature (DPF) provides distributed query processing across a cluster of database servers. It is only available with InfoSphere Warehouse Editions and allows data to be spread across multiple database partitions or nodes which can reside in several different servers. DPF is based on a shared-nothing architecture where each database partition has a subset of the overall data on its own independent disks. Each computer, as it is added to the database cluster, brings additional data processing power with its own CPUs, memory, and disks, allowing for large tasks and complex queries to be broken down into smaller pieces and distributed across the various database nodes to be executed in parallel. This results in higher concurrency and faster response times than would be possible if the database resided on a single server. 33. BIBLIOGRAPHY Getting_Started_with_DB2_Express_v9.7_p4 INTERNET 34. END OF DOCUMENT THANKS M.ISHVITHA