database administration: the complete guide to practices and procedures

52
Database Administration: The Complete Guide to Practices and Procedures Chapter 5 Application Design

Upload: vea

Post on 23-Feb-2016

88 views

Category:

Documents


8 download

DESCRIPTION

Database Administration: The Complete Guide to Practices and Procedures. Chapter 5 Application Design. Agenda. Database Application Development & SQL Defining Transactions Locking Batch Processing Questions. Database Application Development and SQL. - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Database Administration: The Complete Guide to Practices and Procedures

Database Administration:The Complete Guide to Practices and Procedures

Chapter 5Application Design

Page 2: Database Administration: The Complete Guide to Practices and Procedures

Agenda

• Database Application Development & SQL• Defining Transactions• Locking• Batch Processing• Questions

Page 3: Database Administration: The Complete Guide to Practices and Procedures

Database Application Development and SQL

To properly design an application that relies on databases for persistent data storage, the system designer at a minimum will need to understand the following issues:• How data is stored in a relational database• How to code SQL statements to access and modify data in the

database• How SQL differs from traditional programming languages• How to embed SQL statements into a host programming language• How to optimize database access by changing SQL and indexes• Programming methods to avoid potential database processing

problems

Page 4: Database Administration: The Complete Guide to Practices and Procedures

SQL

• SQL is the de facto standard for accessing relational databases

• SQL is a high-level language that provides a greater degree of abstraction than do traditional procedural languages

• SQL is designed such that programmers specify what data is needed. – It does not—indeed it cannot—specify how to

retrieve it

Page 5: Database Administration: The Complete Guide to Practices and Procedures

SQL: English-like

• SQL can be used to retrieve data easily with an English-like syntax.

• It is easier to understand this:

• Than it is to understand C, Java, or most typical programming languages.

Page 6: Database Administration: The Complete Guide to Practices and Procedures

Set-a-Time Processing

• SQL operates on sets• Multiple rows can be retrieved, modified, or

removed in one fell swoop by using a single SQL statement

• Every operation performed on a relational database operates on a table (or set of tables) and results in another table

• This is called relational closure

Page 7: Database Administration: The Complete Guide to Practices and Procedures

SQLStatement

Database Tables

Result Set (table)

Relational Closure

Page 8: Database Administration: The Complete Guide to Practices and Procedures

Embedding SQL in a Program

• Host language code– COBOL, FORTRAN, Assembler, etc.– C/C++, Java, PHP, Visual Basic, etc.

• API– ODBC, JDBC

• IDE– Integrated Development Environment

• Code generators

Page 9: Database Administration: The Complete Guide to Practices and Procedures

SQL Middleware and APIs

• ODBC– ODBC is a call level interface, or CLI– Instead of directly embedding SQL in the program,

ODBC uses callable routines. • to allocate and deallocate resources• control connections to the database• execute SQL statements• obtain diagnostic information• control transaction termination• obtain information about the implementation

Page 10: Database Administration: The Complete Guide to Practices and Procedures

SQL Middleware and APIs

• JDBC– JDBC enables Java to access relational databases. – Similar to ODBC, JDBC consists of a set of classes and

interfaces that can be used to access relational data. – There are several types of JDBC middleware,

including the JDBC-to-ODBC bridge, as well as direct JDBC connectivity to the relational database.

– Anyone familiar with application programming and ODBC (or any call-level interface) can get up and running with JDBC quickly

Page 11: Database Administration: The Complete Guide to Practices and Procedures

Drivers

• ODBC and JDBC rely on drivers– A driver provides an optimized interface for a particular

DBMS implementation• Programs can make use of the drivers to communicate

with any JDBC- or ODBC-compliant database. • The drivers enable a standard set of SQL statements in

any Windows application to be translated into commands recognized by a remote SQL-compliant database.

• There are multiple types of JDBC drivers

Page 12: Database Administration: The Complete Guide to Practices and Procedures

SQL Middleware and APIs

• SQLJ– SQLJ enables developers to embed SQL

statements in Java programs.– A precompiler translates the embedded SQL into

Java code. – The Java program is then compiled into bytecodes,

and a database bind operation creates packaged access routines for the SQL.

Page 13: Database Administration: The Complete Guide to Practices and Procedures

SQL Middleware and APIs

• OLE DB– Object Linking and Embedding Database– OLE DB presents an object-oriented interface for generic data

access.– Based on the COM architecture– OLE DB provides greater flexibility than ODBC because it can

be used to access both relational and nonrelational data. – OLE DB is conceptually divided into consumers and providers.

• consumers are the applications that need access to the data• providers are the software components that implement the

interface and thereby provide the data to the consumer.

Page 14: Database Administration: The Complete Guide to Practices and Procedures

Application Infrastructure

• Application infrastructure is the combined hardware and software environment that supports and enables the application.

• The application infrastructure will vary from organization to organization, and even from application to application within an organization.

Page 15: Database Administration: The Complete Guide to Practices and Procedures

Application Infrastructure

Mainframe• IBM z Series hardware • Running z/OS, DB2, CICS, with

application programs written in COBOL. • Typically, applications consist of both

batch and online workload. • A modern mainframe infrastructure

adds interfaces to non-mainframe clients, as well as WebSphere Application Server and Java programs.

• Most new mainframe development uses IDEs to code modern applications instead of relying upon COBOL programmers.

Distributed• Most modern, distributed,

non-mainframe application development projects typically rely upon application development frameworks.

• The two most commonly-used frameworks are Microsoft .NET and J2EE.

Page 16: Database Administration: The Complete Guide to Practices and Procedures

Microsoft .NET

• ... is a set of Microsoft technologies for connecting people, systems, and devices

• ... allows Internet Servers to expose functions to any client named as .NET web services

• … enables software to be delivered as a service over the web

• … is designed to let many different services and systems interact

Page 17: Database Administration: The Complete Guide to Practices and Procedures

Visual C# Visual Basic Visual J# Visual C+

+ JScript Third Party

Microsoft .NET Framework

ADO.NET ASP.NET User Interfaces

.NET Framework Class Library

CLR (Common Language Runtime)

Microsoft .NET Framework

Page 18: Database Administration: The Complete Guide to Practices and Procedures

Java Alphabet Soup

• J2EE - Java 2 Enterprise Edition– Standard services and specifications for making

Java highly available, secure, reliable, and scalable for enterprise adoption

• EJB - Enterprise Java Beans– Components that contain the business logic for a

J2EE application

Page 19: Database Administration: The Complete Guide to Practices and Procedures

EIS TierClient Tier

Java StandaloneRuntime

Database

Web Tier

JSP Pages

Browser

Business Tier

Servlets

Enterprise JavaBeans

Business Componentsfor Java

JavaApplication

Pure HTML

Applet

J2EE and Java

Page 20: Database Administration: The Complete Guide to Practices and Procedures

Impact of Java on DBA

• Application tuning– Must understand Java

• To provide guidance during design reviews– Is the problem in the SQL or the application

• How can you tune the application if you do not understand the language (Java)?

– Optimizing SQL is not enough since it may be embedded in poor application code

– Must understand the SQL techniques used• JDBC and SQLJ

Page 21: Database Administration: The Complete Guide to Practices and Procedures

Java versus .Net

• ...designed to enable applications to be deployed on any platform as long as they are written in Java

• …designed to enable development in multiple languages as long as the application is deployed on Windows

Page 22: Database Administration: The Complete Guide to Practices and Procedures

Other Application Choices

• There are other choices, including – Ruby on Rails– Ajax– PHP– C/C++– And so on…

• This is not an exhaustive list…

Page 23: Database Administration: The Complete Guide to Practices and Procedures

Object Orientation

• OO programming advantages:– faster program development time– reduced maintenance costs– resulting in a better ROI

• Piecing together reusable objects and defining new objects based on similar object classes can dramatically reduce development time and costs.

Page 24: Database Administration: The Complete Guide to Practices and Procedures

OO, SQL and Databases

• OO and relational databases are not inherently compatible

• The set-based nature of SQL is anathema to the OO techniques practiced by Java and C++ developers.

• All too often insufficient consideration has been given to the manner in which data is accessed, resulting in poor design and faulty performance

Page 25: Database Administration: The Complete Guide to Practices and Procedures

Impedance Mismatch

• When OO programming language is used to access a relational database, you must map objects to relations.– OO programs deal with objects– RDBMSs deal with relations, (that is, tables)

• Applications will not be object-oriented in the “true” sense of the word because the data will not be encapsulated within the method (that is, the program).

Page 26: Database Administration: The Complete Guide to Practices and Procedures

Making OO Programs Work with Relational Databases

• Serialization– Saving data using a flat file representation of the

object. This approach can be slow and difficult to use across applications.

• XML– can be stored natively in many relational database

systems. But XML adds a layer of complexity and requires an additional programming skillset.

• Object-Relational Mapping (ORM)– Most common approach

Page 27: Database Administration: The Complete Guide to Practices and Procedures

Object Relational Mapping• With ORM an object’s attributes are stored in one or more

columns of a relational table. Hibernate is a popular ORM library for Java; NHibernate is an adaptation of Hibernate for the .NET framework.

• Both Hibernate and NHibernate provide capabilities for mapping objects to a relational database by replacing direct persistence-related database accesses with high-level object handling functions.

• Another option is Microsoft LINQ, which stands for Language Integrated Query. LINQ provides a set of .NET framework and language extensions for object-relational mapping.

Page 28: Database Administration: The Complete Guide to Practices and Procedures

Types of SQL• SQL can be planned or unplanned.

– A planned SQL request is typically embedded into an application program, but it might also exist in a query or reporting tool. At any rate, a planned SQL request is designed and tested for accuracy and efficiency before it is run in a production system. Contrast this with the characteristics of an unplanned SQL request. Unplanned SQL, also called ad hoc SQL, is created “on the fly” by end users during the course of business. Most ad hoc queries are created to examine data for patterns and trends that impact business. Unplanned, ad hoc SQL requests can be a significant source of inefficiency and are difficult to tune. How do you tune requests that are constantly written, rewritten, and changed?

• SQL can either be embedded in a program or issued stand-alone. – Embedded SQL is contained within an application program, whereas stand-alone SQL

is run by itself or within a query, reporting, or OLAP tool. • SQL can be dynamic or static.

– A dynamic SQL statement is optimized at run time. Depending on the DBMS, a dynamic SQL statement may also be changed at run time. Static SQL, on the other hand, is optimized prior to execution and cannot change without reprogramming. Favor static SQL to minimize the possibility of SQL injection attacks.

Page 29: Database Administration: The Complete Guide to Practices and Procedures

SQL Usage Considerations

Situation Execution type Program requirement Dynamism

Columns and predicates of the SQL statement

can change during execution.

Planned Embedded Dynamic

SQL formulation does not change. Planned Embedded Static

Highly concurrent, high-performance

transactions.

Planned Embedded Dynamic or static

Ad hoc one-off queries. Unplanned Stand-alone Dynamic

Repeated analytical queries. Planned Embedded or stand-alone Dynamic or static

Quick one-time “fix” programs. Unplanned Embedded or stand-alone Dynamic or static

Page 30: Database Administration: The Complete Guide to Practices and Procedures

SQL Coding for Performance

• It is important to learn how to code SQL for performance

• Generally a good idea to rely on the DBMS to optimize the code

• Let SQL do the work instead of coding it in host language program– The less data brought from the DBMS to the program the

better performance will be• More performance guidelines come later in the

course!

Page 31: Database Administration: The Complete Guide to Practices and Procedures

What is XML?

• XML stands for eXtensible Markup Language. – Like HTML, XML is based on SGML

– HTML uses tags to describe the appearance of data on a page, whereas XML uses tags to describe the data itself, instead of its appearance.

– Allows documents to be self-describing, through the specification of tag sets and the structural relationships between the tags.

• XML is actually a meta language (a language used to define other languages).– These languages are collected

in dictionaries called DTDs Document Type Definitions.

– The DTD stores definitions of tags for specific industries or fields of knowledge.

– The DTD for an XML document can be either part of the document or stored in an external file.

Page 32: Database Administration: The Complete Guide to Practices and Procedures

XML Data• XML uses tags to describe the data itself

<CUSTOMER> <first_name>Craig</first_name> <middle_initial>S.</middle_initial> <last_name>Mullins</last_name> <company_name>Mullins Consulting, Inc.</company_name> <street_address>15 Coventry Ct.</street_address> <city>Sugar Land</city> <state>TX</state> <zip_code>77479</zip_code> <country>USA</country> </CUSTOMER>

http://www.xml.org

Page 33: Database Administration: The Complete Guide to Practices and Procedures

Querying XML

• XQUERY– FLWOR

• FOR, LET, WHERE, ORDER BY, and RETURN.– Not just for querying, it also allows for new XML

documents to be constructed• SQL/XML

– Uses functions to access XML data• XMLDOCUMENT, XMLELEMENT, XMLCONCAT, XMLAGG,

XMLQUERY, XMLTABLE

Page 34: Database Administration: The Complete Guide to Practices and Procedures

Defining Transactions

• A transaction is an atomic unit of work with respect to recovery and consistency.

• When all the steps that make up a specific transaction have been accomplished, a COMMIT is issued.– ROLLBACK before COMMIT to undo transaction’s

work• DBMS maintains transaction log

Page 35: Database Administration: The Complete Guide to Practices and Procedures

ACID Properties of Transactions

• Defining Transactions– Atomicity– Consistency– Isolation– Durability

• Unit of Work– Ensure proper definition and coding

Page 36: Database Administration: The Complete Guide to Practices and Procedures

Unit of Work

• A UOW is a series of instructions and messages that guarantees data integrity.

• Example: bank transaction– Withdrawal of $20– The transaction must involve both the subtraction

of $20 from your account and the delivery of $20 to you

– Only doing one or the other is not a complete unit of work

Page 37: Database Administration: The Complete Guide to Practices and Procedures

TP System Versus DBMS (Stored Procs)

RelationalDBMS

StoredProcedures

Presentation(Client)

DiskRelationalDBMS (1)

TransactionServer

WorkflowController

Presentation(Client)

Disk

RelationalDBMS (2)

Disk

Page 38: Database Administration: The Complete Guide to Practices and Procedures

Application Servers

• An application server combines the features of a transaction server with additional functionality to assist in building, managing, and distributing database applications.

• Examples:– WebSphere (IBM)– Zend Server– Base4 Application Server (open source)

Page 39: Database Administration: The Complete Guide to Practices and Procedures

Transactions and Locking• The DBMS uses a locking mechanism to enable multiple,

concurrent users to access and modify data in the database.

• By using locks, the DBMS automatically guarantees the integrity of data. The DBMS locking strategies permit multiple users from multiple environments to access and modify data in the database at the same time.

• Locking Granularity– Row– Page (or Block)– Table– Table Space– Database

Page 40: Database Administration: The Complete Guide to Practices and Procedures

Granularity of Lock

Acc

ess

Con

curr

ency

Column Row Page Table Tablespace Database

High

Low

Level of Lock Granularity

Page 41: Database Administration: The Complete Guide to Practices and Procedures

Types of Locks• The following types of locks can be taken on database pages or

rows:– Shared Lock

• Taken when data is read with no intent to update it. • If a shared lock has been taken on a row, page, or table, other processes or users

are permitted to read the same data. – eXclusive Lock

• Taken when data is modified. • If an exclusive lock has been taken on a row, page, or table, other processes or

users are generally not permitted to read or modify the same data. – Update Lock

• Taken when data must first be read before it is changed or deleted. • The update lock indicates that the data may be modified or deleted in the

future. • If the data is actually modified or deleted, the DBMS will promote the update

lock to an exclusive lock.

Page 42: Database Administration: The Complete Guide to Practices and Procedures

Intent Locks

• Intent locks are placed on higher-level database objects when a user or process takes locks on the data pages or rows. – Table or Table Space

• An intent lock stays in place for the life of the lower-level locks.

Page 43: Database Administration: The Complete Guide to Practices and Procedures

Lock Compatibility

Page 44: Database Administration: The Complete Guide to Practices and Procedures

Lock Timeouts

error

Page 45: Database Administration: The Complete Guide to Practices and Procedures

Table X

data…data…data...

Process A Process B

. . .Request row 3 . . . .Request row 7

lock

. . .Request row 7 . . .Request row 3

data…data…data...lock

Process A is waiting on Process BProcess B is waiting on Process A

Deadlocks

Page 46: Database Administration: The Complete Guide to Practices and Procedures

Lock Duration

• Lock duration refers to the length of time that a lock is held by the DBMS.

• Two parameters impact lock duration:– Isolation level– Acquire/Release

Page 47: Database Administration: The Complete Guide to Practices and Procedures

Isolation Level• Read uncommitted

– aka dirty read• Read committed

– aka cursor stability• Repeatable read• Serializable

Page 48: Database Administration: The Complete Guide to Practices and Procedures

Acquire/Release Specification

• Controls when Intent locks are acquired and released– Intent locks can be acquired either immediately

when the transaction is requested or iteratively as needed while the transaction executes.

– Intent locks can be released when the transaction completes or when each intent lock is no longer required for a unit of work.

Page 49: Database Administration: The Complete Guide to Practices and Procedures

Lock Escalation

• Lock escalation is the process of increasing the lock granularity for a process or program.

• Typically controlled by system parameters and DDL parameters in CREATE statements.

• For example:– If a threshold is hit for the number of locks being held by

a process (or by the entire DBMS), page locks (or row locks) can be escalated to table locks.

– Can cause concurrency issues• If the entire table is locked other processes cannot access the

data

Page 50: Database Administration: The Complete Guide to Practices and Procedures

Programming Techniques to Minimize Locking Problems

• Avoid deadlocks by coding updates in the same sequence regardless of program– For example, alphabetical order by table name

• Issue data modification SQL statements as close to the end of the UOW as possible– The later in the UOW the update occurs, the

shorter the duration of the lock

Page 51: Database Administration: The Complete Guide to Practices and Procedures

Batch Processing

• Batch Processing– Where programs are scheduled to run at predetermined

times without any user input

• Batch programmers sometimes tend to treat tables like flat file… that is NOT a good idea.– Think relationally instead of file processing

• Plan and implement a COMMIT strategyin all batch application programs– Instead of holding locks until the end of the program– Otherwise you will experience a lot of lock timeouts

Page 52: Database Administration: The Complete Guide to Practices and Procedures

Questions