data retrieval with sql goal: to issue a database query using the select command

58
DATA RETRIEVAL WITH SQL Goal: To issue a database query using the SELECT command

Upload: ethelbert-strickland

Post on 13-Dec-2015

229 views

Category:

Documents


0 download

TRANSCRIPT

DATA RETRIEVAL WITH SQL

Goal:

To issue a database query using the SELECT command

The SELECT statement retrieves data.

SELECT column name(s)FROM table name

---------------------------------------------------------------+ ¦ ¦ ¦ This statement SELECTS columns of values ¦ ¦ FROM a table of values ¦ ¦ ¦ ¦ The SELECT clause is entered first and is followed ¦ ¦ by the FROM clause. ¦ ¦ ¦ +---------------------------------------------------------------+

+------------------------------------------------------------+ ¦ ¦ ¦ The simplest form of data retrieval is to list ¦ ¦ the entire contents of a table. ¦ ¦ ¦ ¦ The * notation tells DB2 that all columns are ¦ ¦ to be retrieved. ¦ ¦ ¦ +------------------------------------------------------------+

SELECT *FROM EMPLOYEE

+-------------------------------------------------------------+ ¦ SQL allows us to select and order the columns by ¦ ¦ listing the column names in the SELECT clause. ¦ +------------------------------------------------------------+

Instead of using the asterisk (*) to list the contents of table we will list the columns we want in the order we want them to appear.

For example, this query...

SELECT LNAME, DEPTNO, SEX, SALARY FROM EMPLOYEE

...gives the table:+---------------------------------------------------+¦ LNAME ¦DEPTNO¦SEX¦ SALARY ¦+-----------------+----------+-----+--------------¦¦JOHNSON ¦ 100 ¦ F ¦48000.00 ¦¦JONES ¦ 500 ¦ M ¦18000.00 ¦¦DYLER ¦ 300 ¦ F ¦30000.00 ¦¦LEBLANC ¦ 100 ¦ M ¦42000.00 ¦¦RYDZIK ¦ 300 ¦ M ¦28000.00 ¦¦BAKER ¦ 100 ¦ F ¦38000.00 ¦¦SHELDON ¦ 200 ¦ M ¦31000.00 ¦

The result of the query...

SELECT DEPTNO FROM EMPLOYEE gives the following table: +----------+ ¦DEPTNO¦ Note that duplicate rows +----------+ appear in the result ¦ 100 ¦ table. ¦ 500 ¦ ¦ 300 ¦ SQL allows us to eliminate ¦ 100 ¦ duplicate rows from a table. ¦ 300 ¦ ¦ 100 ¦ ¦ 200 ¦ ¦ 500 ¦ ¦ 100 ¦ +-----------+

SELECT DISTINCT DEPTNO FROM EMPLOYEE

+-----------+ We can avoid these duplications ¦DEPTNO ¦ by using the DISTINCT keyword. +-----------+ ¦ 100 ¦ ¦ 200 ¦ Now there is only one entry for ¦ 300 ¦ each DEPTNO. ¦ 500 ¦ +-----------+

+----------------------------------------------+ ¦ Let's look at another way SQL ¦ ¦ allows us to manipulate columns. ¦ ¦ - ARITHMETIC CALCULATIONS ¦ +---------------------------------------------- + SQL provides four arthmetic symbols that can be used in an expression. They are:

+ addition

- subtraction

* multiplication

/ division

SELECT column name(s) FROM table name

So far we have used the SELECT statement to retrieve all or specified columns from a table.

We have also learned how to eliminate duplicate rows by using the DISTINCT keyword. +-----------------------------------------------------------+ ¦ But what if we are only interested in a subset of ¦ ¦ all the rows in a table? ¦ +----------------------------------------------------------+

SELECT column name(s) FROM table name WHERE

The WHERE clause works as a filter to choose only those rows in a table that satisfy the search condition.

For example: If we wanted to list only the employees that are in department 300, from the employee table, the entry would be …

SELECT * FROM EMPLOYEE WHERE DEPTNO = 300

SELECT EMPNO, LNAME, SEX, SALARY FROM EMPLOYEE WHERE SEX = 'F'

gives the table...

+-------------------------------------------------+ ¦EMPNO ¦ LNAME ¦SEX¦ SALARY ¦ +----------+-----------------+----+-------------+ Only those rows where ¦307117 ¦JOHNSON ¦ F ¦48000.00 ¦ SEX = 'F' are included ¦911723 ¦DYLER ¦ F ¦30000.00 ¦ in the table ¦737466 ¦BAKER ¦ F ¦38000.00 ¦ ¦871330 ¦ZWIRNER ¦ F ¦26000.00 ¦ +-------------------------------------------------¦

The comparison operators available to use with the WHERE clause as a search condition are:

= equal to

<> or ¬= not equal to

> greater than

>= greater than or equal to

< less than

<= less than or equal to

For Example, SELECT EMPNO, LNAME, SEX, SALARY FROM EMPLOYEE WHERE SALARY >= 30000

...gives the table:

+----------------------------------------------+ ¦EMPNO ¦ LNAME ¦SEX ¦ SALARY ¦ +----------+-------------+-----+-------------+ Only those rows where ¦307117 ¦JOHNSON ¦ F ¦48000.00 ¦ salary >= $30,000 are ¦911723 ¦DYLER ¦ F ¦30000.00 ¦ included in the table. ¦827611 ¦LEBLANC ¦ M ¦42000.00 ¦ ¦737466 ¦BAKER ¦ F ¦38000.00 ¦ ¦215012 ¦SHELDON ¦ M ¦31000.00 ¦ ¦711674 ¦MAYR ¦ M ¦33000.00 ¦ +-----------------------------------------------+

Sometimes you will need to link more than one search conditon in a WHERE command.

+------------------------------------------------------------------------+ ¦ ¦ ¦ Multiple search conditions are linked by: ¦ ¦ ¦ ¦ AND ? meaning All conditions MUST be met ¦ ¦ ¦ ¦ ¦ ¦ OR ? meaning AT LEAST ONE condition must be met¦ ¦ ¦ +------------------------------------------------------------------------+

for AND SELECT LNAME, SEX, SALARY FROM EMPLOYEE WHERE SALARY >= 30000 AND SEX = 'F'

for OR SELECT EMPNO, LNAME, DEPTNO, SALARY FROM EMPLOYEE WHERE DEPTNO = 200 OR DEPTNO = 300 OR DEPTNO = 500 We can list the rows in a table that meet one of several criteria using the OR operator.

We can achieve the same result (WHERE DEPTNO = 200 OR DEPTNO = 300OR DEPTNO = 500 ) by using the set comparison operator IN.

SELECT EMPNO, LNAME, DEPTNO, SALARY FROM EMPLOYEE WHERE DEPTNO IN (200,300,500)

SELECT EMPNO, LNAME, SALARY FROM EMPLOYEE WHERE SALARY >= 30000 AND SALARY <= 50000

We can list the rows in a table that fall within a specified range of values by using the AND operator.

BETWEEN has the same effectas using >= AND <=.

The BETWEEN operator lets us select rows that contain values within a specified range.

SELECT EMPNO, LNAME, SALARY FROM EMPLOYEE WHERE SALARY BETWEEN 30000 AND 50000

The salaries of $30,000 and$50,000 are included.

+------------------------------------------------------------+ ¦ ¦ ¦ Another form of selection criteria which is useful ¦ ¦ is to compare a column with a specific part of a ¦ ¦ constant. ¦ ¦ ¦ ¦ The SQL operator for this is... LIKE. ¦ ¦ ¦ +------------------------------------------------------------+

The syntax for using LIKE is ...

SELECT ... FROM ... WHERE column name LIKE quoted-string

A quoted-string may contain any string of characters. Special meanings are reserved for the characters _ and %.

_ (underscore) represents any single character.

% represents any string of zero or more characters.

SELECT * FROM EMPLOYEE WHERE LNAME LIKE 'J%'

In order to list all employees whose name begins with a 'J' from the EMPLOYEE table we would use the LIKE command.

+---------------------------------------------------------------------+ ¦ ¦ ¦ We can expand the SELECT statement further by telling ¦ ¦ SQL to order the rows. ¦ ¦ ¦ ¦ To accomplish this we append an ordering criteria to ¦ ¦ the query. ¦ ¦ ¦ +---------------------------------------------------------------------+

SELECT DEPTNO, EMPNO, LNAME, SALARY/12 FROM EMPLOYEE WHERE SEX = 'F'

this gives the entire listing of the selected columns without specificordering

SELECT DEPTNO, EMPNO, LNAME, SALARY/12 FROM EMPLOYEE WHERE SEX = 'F' ORDER BY DEPTNO, LNAME

...gives the table

+-----------------------------------------------------+ Notice that DEPTNO is ¦DEPTNO¦EMPNO ¦ LNAME ¦SALARY/12¦ now ordered by +----------+----------+-------------+---------------+ department number in ¦ 100 ¦737466 ¦BAKER ¦ 3166.67 ¦ in ascending order ¦ 100 ¦307117 ¦JOHNSON ¦ 4000.00 ¦ ¦ 300 ¦911723 ¦DYLER ¦ 2500.00 ¦ ¦ 500 ¦871330 ¦ZWIRNER ¦ 2166.67 ¦ +----------------------------------------------------+

+--------------------------------------------------------------------+¦ SELECT DEPTNO, EMPNO, LNAME, SALARY/12 ¦¦ FROM EMPLOYEE ¦¦ WHERE SEX = 'F' ¦¦ ORDER BY 1, 3 ¦¦ ¦+--------------------------------------------------------------------+

It is possible to use the position of the column in place of its name in an ORDER BY clause. Whether the name or column number is used the result is the same.

The ORDER BY clause can be used to display information in either ascending (ASC) or descending (DESC) order. The default is ASC. SQL assumes ASC if you do not specify a value.

If we had wanted to list the department numbers in descending order the command would be …

SELECT DEPTNO, EMPNO, LNAME, SALARY/12 FROM EMPLOYEE WHERE SEX = 'F' ORDER BY DEPTNO DESC

In all of the examples we have done so far we have performed queries on only one table. But what do we do if the information we want is found in two or more tables? SQL allows you to select data from more than one table. In relational terms, this is called an intersection of tables.

Suppose we wanted the name of the department where SHELDON works.

If we look at the two tables EMPLOYEE and DEPARTMENT some of the required information is contained in each table.

EMPLOYEE+-----------------------------------------------------------------------------+¦EMPNO ¦ FNAME ¦ LNAME ¦DEPTNO¦ WPHONE ¦STARTDT¦SEX¦ SALARY ¦

+-------+----------+-----------+-------+----------+---------+---+---------+ DEPARTMENT+---------------------------------------+¦DEPTNO¦ DEPTNAME ¦MANAGER ¦

+-------+------------------+-----------+

SQL provides us with a JOIN of tables feature to combine information from more than one table into a single result table. The tables must have a common column of data. The columns need not have the same name, but they must contain the same type of data.

SELECT LNAME, DEPTNAME FROM EMPLOYEE, DEPARTMENT WHERE EMPLOYEE.DEPTNO = DEPARTMENT.DEPTNO

Note that the tables to be joined are listed in the FROM clause.

WHERE specifies the join criteria,the common link between the tables.The data types in the columns fromthe two tables must be the same.

The general form of a select from two tables is:

SELECT column-names FROM table1-name, table2-name WHERE table1-name.column-name = table2-name.column-name

________________________________________________ You can perform calculations using SQL's group functions

SQL supports the group functions

AVERAGE [AVG] COUNT MAXIMUM [MAX] MINIMUM [MIN] SUM

They all return a single value for the rows specified in the argument.

________________________________________________

SELECT group function(column name)

The group function goes in the SELECT statement and is followed by the column to which it applies. The column name must be enclosed in parentheses.

The AVERAGE function-------------------------------

¦ Averages the specified values in the column.

¦ Column selected MUST contain numeric values.

¦ The syntax for the Average function is :

SELECT AVG(column name) FROM table

The AVERAGE Function

If we wanted to find the average salary of all employees from the EMPLOYEE table, the command would be:

SELECT AVG(SALARY) FROM EMPLOYEE You can also find the average of a specified subgroup from the table by including a WHERE clause.

SELECT AVG(SALARY) FROM EMPLOYEE WHERE DEPTNO = 100

The COUNT function--------------------------- ¦ Must be followed by the specification DISTINCT or *.

¦ COUNT(DISTINCT) returns the number of values in the column. No duplicates are counted.

¦ COUNT(*) counts all rows without eliminating duplicates.

¦ The syntax for the Count function is :

SELECT COUNT(DISTINCT column name) or

SELECT COUNT(*)

SELECT COUNT(DISTINCT DEPTNO) FROM EMPLOYEE This is the command we would use to find the number of departments in the EMPLOYEE table. Here is part of the EMPLOYEE table.

DEPTNO Result of the Query will be : ? ------ 100 500 300 100 300 100 200 500 100

MAX and MIN functions--------------------- | The MAX function returns the largest value in a column.

| The syntax for the MAX function is :

SELECT MAX(column name) FROM table

| The MIN function returns the smallest value in a column.

| The syntax for the MIN function is :

SELECT MIN(column name) FROM table

SELECT MIN(SALARY) FROM EMPLOYEE This is the command we would use to find the minimumsalary in the EMPLOYEE table. Here is the EMPLOYEE table. SALARY -------- What will be the result of this query? 48000.00 18000.00 30000.00 42000.00 28000.00 38000.00 31000.00 26000.00 33000.00

The SUM function----------------

¦ Returns the sum of the values in the column.

¦ Column selected MUST contain numeric values.

¦ The syntax for the SUM function is :

SELECT SUM(column name) FROM table

Type a query to get the total salaries in the SALARY column for DEPTNO 100. These are found in the EMPLOYEE table.

SELECT SUM(SALARY)FROM EMPLOYEEWHERE DEPTNO = 100

Let's summarize each built-in function AVG averages the specified values for a given column

COUNT (*) counts all rows selected by a query

COUNT (DISTINCT column name) counts only those rows that have a unique value in the specified column

MAX and MIN display either the highest or lowest value found in the rows welected by the query

SUM calculates the total value for a given column

Forming a summary-only query ---------------------------------------

SELECT ... ¦ must specify a built in function ¦ may select the column in the GROUP BY clause

FROM ... ¦ code these as usual WHERE ...

GROUP BY ... ¦ must follow FROM and WHERE and precede ORDER BY

¦ must specify the column you want to summarize ¦ can only be used when SELECT specifies

a built-in function

SELECT DEPTNO, AVG(SALARY) FROM EMPLOYEE GROUP BY DEPTNO When GROUP BY is used only one row is returned for each group.

The result of this query is a table that gives the average salary for each department. +-------------------------+ ¦DEPTNO¦ SALARY ¦ +----------+-------------+ ¦ 100 ¦40250.00 ¦ ¦ 200 ¦31000.00 ¦ ¦ 300 ¦29000.00 ¦ ¦ 500 ¦22000.00 ¦ +-------------------------+

If the command does not contain a WHERE clause the GROUP BY follows the FROM clause.

SELECT DEPTNO, AVG(SALARY) FROM EMPLOYEE WHERE SEX = 'F' GROUP BY DEPTNO

Otherwise the GROUP BY clause follows the WHERE clause.

Q : Type a query that returns the number of employees in each DEPTNO from the EMPLOYEE table.

SELECT DEPTNO, COUNT(*)FROM EMPLOYEEGROUP BY DEPTNO

You must use a GROUP BY statement when you want a summary table and you can only use GROUP BY if you are summarizing.

GROUP BY followed by a column name tells SQL how the data will be grouped for summarizing.

+---------------------------------------------------------------------------+ ¦ GROUP BY ¦ ¦ ¦ ¦ ¦ can only be used along with built-in functions ¦ ¦ ¦ ¦ ¦ produces a summary report -- one line for each group ¦ ¦ ¦ ¦ ¦ follows FROM and WHERE and precedes ORDER BY ¦ +---------------------------------------------------------------------------+ GROUP BY can also be followed by a HAVING clause, which is in some ways comparable to a WHERE clause.

The GROUP BY clause returns all groups in the result.

Often we are only interested in groups that meet certain criteria. To see only the groups of interest, you can specify a search condition in a HAVING clause.

SELECT DEPTNO, AVG(SALARY) FROM EMPLOYEE GROUP BY DEPTNO HAVING AVG(SALARY) > 30000.00

HAVING determines which groups will be included

SELECT DEPTNO, AVG(SALARY) FROM EMPLOYEE WHERE SEX = 'F' GROUP BY DEPTNO HAVING AVG(SALARY) > 30000.00

WHERE determines which individual rows will be included

Let's review the rules for using a HAVING clause------------------------------------------------

¦ HAVING can only be used with GROUP BY

¦ HAVING must immediately follow the GROUP BY clause

¦ HAVING can only compare built-in functions, not individual columns

The below query will display the average salary for depart- ments with more than 3 employees. SELECT DEPTNO, AVG(SALARY) FROM EMPLOYEE GROUP BY DEPTNO HAVING COUNT (*) > 3

+----------------------------------------------------+ ¦ Subqueries are powerful features of SQL ¦ ¦ that allow you to express complex queries ¦ ¦ out of several simple ones. ¦ +----------------------------------------------------+

The subquery is evaluated first and the results are applied to the main query where they are used to find the answer to your question.

+---------------------------------------------------------------------------------+ ¦ Let's produce a table containing the last name, department number ¦ ¦ and salary of female employees whose SALARY is greater than the¦ ¦ average salary of male employees. ¦ +---------------------------------------------------------------------------------+

2 separate queries.

The first step is to find the average salary of male employees from the EMPLOYEE table :

SELECT AVG(SALARY)FROM EMPLOYEEWHERE SEX = 'M’Say result is 304000.00

The next step is to use this result by placing it in another query to find out which female employees have a salary greater than the average salary for male employees.

SELECT LNAME, DEPTNO, SALARY FROM EMPLOYEE WHERE SEX = 'F' AND SALARY > 30400.00

Instead of such complexity SUBQUERIES allow us to combine these two simple queries.

Recap of Q: +---------------------------------------------------------------------------------+ ¦ Lets produce a table containing the last name, department number ¦ ¦ and salary of female employees whose SALARY is greater than the ¦ ¦ average salary of male employees. ¦ +---------------------------------------------------------------------------------+

Using Subqueries: SELECT LNAME, DEPTNO, SALARY FROM EMPLOYEE WHERE SEX = 'F' AND SALARY > (SELECT AVG(SALARY) The subquery is enclosed in FROM EMPLOYEE parentheses and placed in WHERE SEX = 'M’) the WHERE clause.

SQL Subqueries -------------- ¦ allow you to form complex queries out of several simple queries. ¦ must be enclosed in parentheses. ¦ follow the same general format as normal queries

SELECT ... FROM ... WHERE ...

¦ allows only one column-name in its SELECT clause.

¦ processes the subquery first and passes the result to the main-query which then computes the entire answer.

Subqueries may contain multiple levels. When using more thanone subquery, the same rules apply and the format is the same. SELECT ... FROM ... WHERE ...

Sub- ¦ (SELECT ... query ¦ FROM ... ¦ WHERE ...

Sub- ¦ (SELECT ... query ¦ FROM ... ¦ WHERE ...))

If we wanted to find the manager of the department where Mayr works, these are the steps we would follow:

SELECT LNAME We want to retrieve the last FROM EMPLOYEE name of the manager from the WHERE EMPNO IN EMPLOYEE table, but ...

Sub- ¦ (SELECT MANAGER since the EMPLOYEE table only query ¦ FROM DEPARTMENT lists employee numbers we need ¦ WHERE DEPTNO IN to retrieve the department manager's number, but ...

Sub- ¦ (SELECT DEPTNO before we can do that we query¦ FROM EMPLOYEE must find the department ¦ WHERE LNAME = 'MAYR')) number that MAYR works in.

SELECT LNAME Result FROM EMPLOYEE -------- WHERE EMPNO IN BAKER

Sub- ¦ (SELECT MANAGER Result query ¦ FROM DEPARTMENT -------- ¦ WHERE DEPTNO IN 737466

Sub- ¦ (SELECT DEPTNO Result query ¦ FROM EMPLOYEE -------- ¦ WHERE LNAME = 'MAYR')) 100