rtdb

110
Remove Delete Duplicate Records Or Rows - Sql Server Posted by amiT jaiN Remove or Delete duplicate records or rows from ms sql server database table. In this post i am going to describe different methods of deleting duplicate records or rows from sql server database table. I am using Employees table with FirstName and Department columns. First Method. Delete duplicate records/rows by creating identity column.

Upload: rhvenkat

Post on 21-Jul-2016

8 views

Category:

Documents


0 download

DESCRIPTION

gfg

TRANSCRIPT

Page 1: RTDB

Remove Delete Duplicate Records Or Rows - Sql Server

Posted by amiT jaiN

Remove or Delete duplicate records or rows from ms sql server database table.

In this post i am going to describe different methods of deleting duplicate records or rows from sql server database table.

I am using Employees table with FirstName and Department columns.

First Method.

Delete duplicate records/rows by creating identity column.

duplicate records in table looks like shown in first image.

First of all we need to create a identity column in our table by using code mentioned below.

Page 2: RTDB

And table will look like image on the left.

1ALTER TABLE dbo.Employees ADD ID INT IDENTITY(1,1)

Now write this query to delete duplicate rows.

1DELETE FROM dbo.Employees2WHERE ID NOT IN (SELECT MIN(ID)3FROM dbo.Employees GROUP BY FirstName,Department)

This should remove all duplicate records from table.

Second Method.

Delete duplicate records using Row_Number()

If you do not want to make any changes in table design or don't want to create identity column on table then you can remove duplicate records using Row_Number in sql server 2005 onwards.

for this write below mentioned code and execute.

1WITH DuplicateRecords AS2(3SELECT *,row_number() OVER(PARTITION BY FirstName,Department ORDER BY 4 5FirstName) 6AS RowNumber FROM dbo.Employees7)8DELETE FROM DuplicateRecords WHERE RowNumber>1

This should remove all duplicate records from table.

Third Method.

Remove duplicate rows/Records using temporary table

Use below mentioned code to delete duplicates by moving them to temporary table using DISTINCT.

1SELECT DISTINCT * INTO TempTable FROM dbo.Employees

Page 3: RTDB

2GROUP BY FirstName,Department3HAVING COUNT(FirstName) > 14 5DELETE dbo.Employees WHERE FirstName6IN (SELECT FirstName FROM TempTable)7 8INSERT dbo.Employees SELECT * FROM TempTable9DROP TABLE TempTable

And result will be as shown.

Have fun.

Query to delete duplicate Records in SQL

Suppose we have the situation to delete some duplicate records in our table. Suppose consider one table

create table #Test(EmpID int,EmpName varchar(50))

–Insert the Records into #Test table

insert into #Test values(1,’Daya‘)insert into #Test values(1,’Daya‘)insert into #Test values(1,’Daya‘)

Now i have two duplicate records inserted and i want to delete those records. The following query will delete the duplicate records

–Query to Delete Duplicate Records

WITH Emp AS (SELECT ROW_NUMBER ( ) OVER ( PARTITION BY EmpID, EmpName ORDER BY EmpID ) AS RNUM FROM #Test )DELETE FROM Emp WHERE RNUM > 1

Page 4: RTDB

Max value among three columnsThe following query will fetch you the max value among three columnsCreate Table #test (Name Char(10), Qty1 INT, QTY2 INT, QTY3 INT)

Insert into #Test (Name, QTY1, QTY2, QTY3) Values (‘Monitor’,12,13,11)

Insert into #Test (Name, QTY1, QTY2, QTY3) Values (‘KeyBoard’,8,7,6)

Insert into #Test (Name, QTY1, QTY2, QTY3) Values (‘Mouse’,3,2,5)

–Query to get the max among three columns

select [name],(select max(Qty)

from (select QTY1 as Qty

union all

select QTY2 as Qty

union all

select QTY3) as Qty) as Maxi

from #test

How to find out recently run queries in SQL Server 2005?

Select dmStats.last_execution_time as ‘Last Executed Time’,dmText.text as ‘Executed Query’ from sys.dm_exec_query_stats as dmStats Cross apply sys.dm_exec_sql_text(dmStats.sql_handle) as dmText Order BydmStats.last_execution_time desc

Run the following query to export data from sql server to excel

USE [AdventureWorks]

INSERT INTO OPENROWSET (‘Microsoft.Jet.OLEDB.4.0′, ‘Excel 8.0;Database=c:\Test.xls;’,‘Select * from [Emp$]‘)

Select EmployeeID, Title FROM HumanResources.Employee

Query to find Object Owner?

Page 5: RTDB

SELECTA.NAME AS [OBJECT],B.NAME AS [SCHEMA],USER_NAME(ISNULL(A.PRINCIPAL_ID,B.PRINCIPAL_ID)) as [OWNER]FROM SYS.ALL_OBJECTS A INNER JOIN SYS.SCHEMAS BON A.SCHEMA_ID = B.SCHEMA_IDWHERE A.NAME = ‘object name here’

Query to select distinct records without duplicate

select EmpId,EmpName from Emp group by EmpID,EmpName having count(*)>1

PIVOT

select * from productsProductID ProductName Year TotalDue702 MoutainBlack 2005 124566703 Pen 2005 45321704 Gloppy 2003 454875705 Watch 2003 456586706 Disc 2004 4565456707 Time 2004 4562164

Now you want to Cross-tab year based results

select ProductID,ProductName,SUM(case TheYear when 2003 then TotalDue else 0 end) as [2003],SUM(case TheYear when 2004 then TotalDue else 0 end) as [2003],SUM(case TheYear when 2005 then TotalDue else 0 end) as [2005]from productsgroup by ProductID,ProductName

(or same we can achive using PIVOT )

–pivot

select ProductID,ProductName,[2003],[2004],[2005] from products

pivot

(sum(TotalDue) for TheYear in([2003],[2004],[2005])) as PVT

–Results

ProductID ProductName 2003 2004 2005702 MoutainBlack NULL NULL 124566

Page 6: RTDB

703 Pen NULL NULL 45321704 Gloppy 454875 NULL NULL705 Watch 456586 NULL NULL706 Disc NULL 4565456 NULL707 Time NULL 4562164 NULL

To Find all dependency objects and its type

select ObjectName = object_name(id),b.Type_Desc Object_type, DependObjectName = object_name(depid),c.Type_Desc DependObject_Type from sys.sysdepends a inner join sys.objects b on a.id = b.object_id inner join sys.objects c on a.depid = c.object_id

Few Interesting Questions and concepts

There are 3 tables Titles, Authors and Title-Authors (check PUBS db). Write the query to get the author name and the number of books written by that author, the result should start from the author who has written the maximum number of books and end with the author who has written the minimum number of books.

SELECT authors.au_lname, COUNT(*) AS BooksCount FROM authors INNER JOIN titleauthor ON authors.au_id = titleauthor.au_id INNER JOIN titles ON titles.title_id = titleauthor.title_id GROUP BY authors.au_lname ORDER BY BooksCount DESC

Write a SQL Query to find first day of month?

SELECT DATENAME(dw, DATEADD(dd, - DATEPART(dd, GETDATE()) + 1, GETDATE())) AS FirstDay

There is a table day_temp which has three columns dayid, day and temperature. How do I write a query to get the difference of temperature among each other for seven days of a week?

SELECT a.dayid, a.dday, a.tempe, a.tempe - b.tempe AS Difference FROM day_temp a INNER JOIN day_temp b ON a.dayid = b.dayid + 1

or this query

Select a.day, a.degree-b.degree from temperature a, temperature b where a.id=b.id+1

There is a table which contains the names like this. a1, a2, a3, a3, a4, a1, a1, a2 and their salaries. Write a query to get grand total salary, and total salaries of individual employees in one query.

SELECT empid, SUM(salary) AS salaryFROM employeeGROUP BY empid WITH ROLLUP ORDER BY empid

Update With Case

Page 7: RTDB

EmpID EmpName Gender1 Raja Male2 Rani Female

In the above table using one query u need to change Gender male to female and who is female need to change male.

UPDATE Emp1 SET Gender=CASE Gender WHEN ‘Male’ THEN ‘Female’WHEN ‘female’ THEN ‘Male’END;

Query to find the maximum salary of an employee

Select * from Employee where salary = (Select max(Salary) from Employee)

Query to Find the Nth Maximum Salary

Select * From Employee E1 Where (3-1) = (Select Count(Distinct(E2.Salary)) From Employee E2 Where E2.Salary > E1.Salary)

Query to Find the 2nd Maximum Salary

SELECT SALARY FROM EMPLOYEEWHERE SALARY=(SELECT MAX(SALARY) FROM EMPLOYEE WHERE SALARY <> (SELECT MAX (SALARY) FROM EMPLOYEE))

select max(salary) as Salary from Emplo where salary!=(select max(salary) from Emplo)

SELECT MAX(E1.salary) FROM emplo E1 , emplo E2WHERE E1.salary< E2.salary

Creating a foreign-key constraint between columns of two tables defined with two different datatypes will produce an error

Ans : Yes

Important concepts

What’s the difference between a primary key and a unique key? Both primary key and unique enforce uniqueness of the column on which they are defined. But by default primary key creates a clustered index on the column, where are unique creates a nonclustered index by default. Another major difference is that, primary key doesn’t allow NULLs, but unique key allows one NULL only.

Define candidate key, alternate key, composite key. A candidate key is one that can identify each row of a table uniquely. Generally a candidate key becomes the primary key of the table. If the table has more than one candidate key, one of them will become the primary key, and the rest are called alternate keys. A key formed by combining at least two or more columns

Page 8: RTDB

is called composite key.What is a transaction and what are ACID properties? A transaction is a logical unit of work in which, all the steps must be performed or none. ACID stands for Atomicity, Consistency, Isolation, Durability. These are the properties of a transaction. For more information and explanation of these properties, see SQL Server books online or any RDBMS fundamentals text book. Explain different isolation levels An isolation level determines the degree of isolation of data between concurrent transactions. The default SQL Server isolation level is Read Committed. Here are the other isolation levels (in the ascending order of isolation): Read Uncommitted, Read Committed, Repeatable Read, Serializable. See SQL Server books online for an explanation of the isolation levels. Be sure to read about SET TRANSACTION ISOLATION LEVEL, which lets you customize the isolation level at the connection level. Read Committed – A transaction operating at the Read Committed level cannot see changes made by other transactions until those transactions are committed. At this level of isolation, dirty reads are not possible but nonrepeatable reads and phantoms are possible. Read Uncommitted – A transaction operating at the Read Uncommitted level can see uncommitted changes made by other transactions. At this level of isolation, dirty reads, nonrepeatable reads, and phantoms are all possible. Repeatable Read – A transaction operating at the Repeatable Read level is guaranteed not to see any changes made by other transactions in values it has already read. At this level of isolation, dirty reads and nonrepeatable reads are not possible but phantoms are possible. Serializable – A transaction operating at the Serializable level guarantees that all concurrent transactions interact only in ways that produce the same effect as if each transaction were entirely executed one after the other. At this isolation level, dirty reads, nonrepeatable reads, and phantoms are not possible.What’s the difference between DELETE TABLE and TRUNCATE TABLE commands? DELETE TABLE is a logged operation, so the deletion of each row gets logged in the transaction log, which makes it slow. TRUNCATE TABLE also deletes all the rows in a table, but it won’t log the deletion of each row, instead it logs the deallocation of the data pages of the table, which makes it faster. Of course, TRUNCATE TABLE can be rolled back. TRUNCATE TABLE is functionally identical to DELETE statement with no WHERE clause: both remove all rows in the table. But TRUNCATE TABLE is faster and uses fewer system and transaction log resources than DELETE. The DELETE statement removes rows one at a time and records an entry in the transaction log for each deleted row. TRUNCATE TABLE removes the data by deallocating the data pages used to store the table’s data, and only the page deallocations are recorded in the transaction log. TRUNCATE TABLE removes all rows from a table, but the table structure and its columns, constraints, indexes and so on remain. The counter used by an identity for new rows is reset to the seed for the column. If you want to retain the identity counter, use DELETE instead. If you want to remove table definition and its data, use the DROP TABLE statement. You cannot use TRUNCATE TABLE on a table referenced by a FOREIGN KEY constraint; instead, use DELETE statement without a WHERE clause. Because TRUNCATE TABLE is not logged, it cannot activate a trigger. TRUNCATE TABLE may not be used on tables participating in an indexed view

What are the steps you will take to improve performance of a poor performing query? This is a very open ended question and there could be a lot of reasons behind the poor performance of a query. But some general issues that you could talk about would be: No indexes, table scans, missing or out of date statistics, blocking, excess recompilations of stored procedures, procedures and triggers without SET NOCOUNT ON, poorly written query with unnecessarily complicated joins, too much normalization, excess usage of cursors and temporary tables. Some of the tools/ways that help you troubleshooting performance

Page 9: RTDB

problems are: SET SHOWPLAN_ALL ON, SET SHOWPLAN_TEXT ON, SET STATISTICS IO ON, SQL Server Profiler, Windows NT /2000 Performance monitor, Graphical execution plan in Query Analyzer

What is a deadlock and what is a live lock? How will you go about resolving deadlocks? Deadlock is a situation when two processes, each having a lock on one piece of data, attempt to acquire a lock on the other’s piece. Each process would wait indefinitely for the other to release the lock, unless one of the user processes is terminated. SQL Server detects deadlocks and terminates one user’s process. A livelock is one, where a request for an exclusive lock is repeatedly denied because a series of overlapping shared locks keeps interfering. SQL Server detects the situation after four denials and refuses further shared locks. A livelock also occurs when read transactions monopolize a table or page, forcing a write transaction to wait indefinitely. Check out SET DEADLOCK_PRIORITY and “Minimizing Deadlocks” in SQL Server books online

What are statistics, under what circumstances they go out of date, how do you update them? Statistics determine the selectivity of the indexes. If an indexed column has unique values then the selectivity of that index is more, as opposed to an index with non-unique values. Query optimizer uses these indexes in determining whether to choose an index or not while executing a query. Some situations under which you should update statistics: 1) If there is significant change in the key values in the index 2) If a large amount of data in an indexed column has been added, changed, or removed (that is, if the distribution of key values has changed), or the table has been truncated using the TRUNCATE TABLE statement and then repopulated 3) Database is upgraded from a previous version. Look up SQL Server books online for the following commands: UPDATE STATISTICS, STATS_DATE, DBCC SHOW_STATISTICS, CREATE STATISTICS, DROP STATISTICS, sp_autostats, sp_createstats, sp_updatestats

Index Optimization tips• Every index increases the time in takes to perform INSERTS, UPDATES and DELETES, so the number of indexes should not be very much. Try to use maximum 4-5 indexes on one table, not more. If you have read-only table, then the number of indexes may be increased.• Keep your indexes as narrow as possible. This reduces the size of the index and reduces the number of reads required to read the index.• Try to create indexes on columns that have integer values rather than character values.• If you create a composite (multi-column) index, the order of the columns in the key are very important. Try to order the columns in the key as to enhance selectivity, with the most selective columns to the leftmost of the key.• If you want to join several tables, try to create surrogate integer keys for this purpose and create indexes on their columns.• Create surrogate integer primary key (identity for example) if your table will not have many insert operations.• Clustered indexes are more preferable than nonclustered, if you need to select by a range of values or you need to sort results set with GROUP BY or ORDER BY.• If your application will be performing the same query over and over on the same table, consider creating a covering index on the table.• You can use the SQL Server Profiler Create Trace Wizard with “Identify Scans of Large Tables” trace to determine which tables in your database may need indexes. This trace will show which tables are being

Page 10: RTDB

scanned by queries instead of using an index.• You can use sp_MSforeachtable undocumented stored procedure to rebuild all indexes in your database. Try to schedule it to execute during CPU idle time and slow production periods.sp_MSforeachtable @command1=”print ‘?’ DBCC DBREINDEX (‘?’)”

Explain about Clustered and non clustered index? How to choose between a Clustered Index and a Non-Clustered Index?There are clustered and nonclustered indexes. A clustered index is a special type of index that reorders the way records in the table are physically stored. Therefore table can have only one clustered index. The leaf nodes of a clustered index contain the data pages.A nonclustered index is a special type of index in which the logical order of the index does not match the physical stored order of the rows on disk. The leaf nodes of a nonclustered index does not consist of the data pages. Instead, the leaf nodes contain index rows.Consider using a clustered index for:o Columns that contain a large number of distinct values.o Queries that return a range of values using operators such as BETWEEN, >, >=, <, and <=.o Columns that are accessed sequentially.o Queries that return large result sets.Non-clustered indexes have the same B-tree structure as clustered indexes, with two significant differences:o The data rows are not sorted and stored in order based on their non-clustered keys.o The leaf layer of a non-clustered index does not consist of the data pages. Instead, the leaf nodes contain index rows. Each index row contains the non-clustered key value and one or more row locators that point to the data row (or rows if the index is not unique) having the key value.o Per table only 249 non clustered indexes

How many types of Joins?Joins can be categorized as:• Inner joins (the typical join operation, which uses some comparison operator like = or <>). These include equi-joins and natural joins.Inner joins use a comparison operator to match rows from two tables based on the values in common columns from each table. For example, retrieving all rows where the student identification number is the same in both the students and courses tables.• Outer joins. Outer joins can be a left, a right, or full outer join.Outer joins are specified with one of the following sets of keywords when they are specified in the FROM clause:• LEFT JOIN or LEFT OUTER JOIN -The result set of a left outer join includes all the rows from the left table specified in the LEFT OUTER clause, not just the ones in which the joined columns match. When a row in the left table has no matching rows in the right table, the associated result set row contains null values for all select list columns coming from the right table.• RIGHT JOIN or RIGHT OUTER JOIN – A right outer join is the reverse of a left outer join. All rows from the right table are returned. Null values are returned for the left table any time a right table row has no matching row in the left table.• FULL JOIN or FULL OUTER JOIN – A full outer join returns all rows in both the left and right tables. Any time a row has no match in the other table, the select list columns from the other table contain null values. When

Page 11: RTDB

there is a match between the tables, the entire result set row contains data values from the base tables.• Cross joins – Cross joins return all rows from the left table, each row from the left table is combined with all rows from the right table. Cross joins are also called Cartesian products. (A Cartesian join will get you a Cartesian product. A Cartesian join is when you join every row of one table to every row of another table. You can also get one by joining every row of a table to every row of itself.

What are the difference between a function and a stored procedure? Functions can be used in a select statement where as procedures cannot Procedure takes both input and output parameters but Functions takes only input parameters Functions cannot return values of type text, ntext, image & timestamps where as procedures can Functions can be used as user defined datatypes in create table but procedures cannot***Eg:-create table <tablename>(name varchar(10),salary getsal(name))Here getsal is a user defined function which returns a salary type, when table is created no storage is allotted for salary type, and getsal function is also not executed, But when we are fetching some values from this table, getsal function get’s executed and the returnType is returned as the result set.

What is the basic functions for master, msdb, tempdb databases?Microsoft® SQL Server 2000 systems have four system databases:• master – The master database records all of the system level information for a SQL Server system. It records all login accounts and all system configuration settings. master is the database that records the existence of all other databases, including the location of the database files.• tempdb – tempdb holds all temporary tables and temporary stored procedures. It also fills any other temporary storage needs such as work tables generated by SQL Server. tempdb is re-created every time SQL Server is started so the system starts with a clean copy of the database.By default, tempdb autogrows as needed while SQL Server is running. If the size defined for tempdb is small, part of your system processing load may be taken up with autogrowing tempdb to the size needed to support your workload each time to restart SQL Server. You can avoid this overhead by using ALTER DATABASE to increase the size of tempdb.• model – The model database is used as the template for all databases created on a system. When a CREATE DATABASE statement is issued, the first part of the database is created by copying in the contents of the model database, then the remainder of the new database is filled with empty pages. Because tempdb is created every time SQL Server is started, the model database must always exist on a SQL Server system.• msdb – The msdb database is used by SQL Server Agent for scheduling alerts and jobs, and recording operators.

1st Normal Form (1NF)Definition: A table (relation) is in 1NF if1. There are no duplicated rows in the table.2. Each cell is single-valued (i.e., there are no repeating groups or arrays).3. Entries in a column (attribute, field) are of the same kind.Note: The order of the rows is immaterial; the order of the columns is immaterial.Note: The requirement that there be no duplicated rows in the table means that the table has a key (although the key might be made up of more than one column—even, possibly, of all the columns).

Page 12: RTDB

Rule 1: Eliminate Repeating Groups. Make a separate table for each set of related attributes, and give each table a primary key.2nd Normal Form (2NF)Definition: A table is in 2NF if it is in 1NF and if all non-key attributes are dependent on all of the key.Note: Since a partial dependency occurs when a non-key attribute is dependent on only a part of the (composite) key, the definition of 2NF is sometimes phrased as, “A table is in 2NF if it is in 1NF and if it has no partial dependencies.”Rule 2: Eliminate Redundant Data. If an attribute depends on only part of a multi-valued key, remove it to a separate table.3rd Normal Form (3NF)Definition: A table is in 3NF if it is in 2NF and if it has no transitive dependencies.Rule 3: Eliminate Columns Not Dependent on Key. If attributes do not contribute to a description of the key, remove them to a separate table.Boyce-Codd Normal Form (BCNF)Definition: A table is in BCNF if it is in 3NF and if every determinant is a candidate key.4th Normal Form (4NF)Definition: A table is in 4NF if it is in BCNF and if it has no multi-valued dependencies.Rule 4: Isolate Independent Multiple Relationships. No table may contain two or more l:n or n:m relationships that are not directly related.5th Normal Form (5NF)Definition: A table is in 5NF, also called “Projection-Join Normal Form” (PJNF), if it is in 4NF and if every join dependency in the table is a consequence of the candidate keys of the table.Rule 5: Isolate Semantically Related Multiple Relationships. There may be practical constraints on information that justify separating logically related many-to-many relationships.Domain-Key Normal Form (DKNF)Definition: A table is in DKNF if every constraint on the table is a logical consequence of the definition of keys and domains.

I have an interesting query to find nth MAX/MIN Salary etc. It goes like this,

SELECT DISTINCT (a.Salary) FROM Emp1 a WHERE &N =(SELECT COUNT (DISTINCT (b.Salary)) FROM Emp1 b WHERE a.Salary<=b.Salary)

Enter value for '&N'

What is the use of COALESCE in SQL Server

Let us understand the use of COALESCE with the help of an example.

In this example, the Candidate table is shown to include three columns with information about a Candidate: 1. Candidate_id2. PrimaryEmail3. SecondaryEmail

Page 13: RTDB

COALESCE in the SELECT statement below, selects the PrimaryEmail if it is not null. If the PrimaryEmail is null then SecondaryEmail will be selected. If both PrimaryEmail and SecondaryEmail is present then only PrimaryEmail is selected. So, COALESCE returns the first nonnull column among the list of columns passed. If both PrimaryEmail and SecondaryEmail is NULL, COALESCE returns NULL.

COALESCE can also be used in joins as shown in the example below. If the Candidate table has a non null value in the Email column, then the value is selected. If the Email column is null in the Candidate Table then, CompanyEmail from CandidateCompany Table is selected.

Page 14: RTDB

Advantages of stored procedures

This is a very common sql server interview question. There are several advantages of using stored procedures over adhoc queries, as listed below.

1. Better Performance : Stored procedures are precompiled and hence run much faster than adhoc queries

2. Better Security : Applications making use of dynamically built adhoc sql queries are highly

susceptible to sql injection attacks, where as Stored Procedures can avoid SQL injection attacks completely.

3. Reduced Network Traffic: Stored procedures can reduce network traffic to a very great extent when compared with adhoc sql queries. With stored procedures, you only need to send the name of the procedure between client and server. Imagine the amount of network bandwith that can be saved especially if the stored procedure contains 1000 to 2000 lines of SQL.

4. Better Maintainance and Reusability: Stored procedures can be used any where in the application. It is easier to maintain a stored procedure that is used on several pages as the modfifcations just need to be changed at one place where the stored procedure is defined. On the other hand, maintaining an adhoc sql

Page 15: RTDB

query that's used on several pages is tedious and error prone, as we have to make modifications on each and every page.

SQL Server Interview Questions on triggers

What is a Trigger in SQL Server?A Trigger is a database object that is attached to a table. In many aspects it is similar to a stored procedure. As a matter of fact, triggers are often referred to as a "special kind of stored procedure." The main difference between a trigger and a stored procedure is that the former is attached to a table and is only fired when an INSERT, UPDATE or DELETE occurs.

What are the two types of Triggers in SQL Server?1. After Triggers : Fired after Insert, Update and Delete operations on a table.2. Instead of Triggers: Fired instead of Insert, Update and Delete operations on a table.

What are the special tables used by Triggers in SQL Server?Triggers make use of two special tables called inserted and deleted. The inserted table contains the data referenced in an INSERT before it is actually committed to the database. The deleted table contains the data in the underlying table referenced in a DELETE before it is actually removed from the database. When an UPDATE is issued both tables are used. More specifically, the new data referenced in the UPDATE statement is contained in inserted table and the data that is being updated is contained in deleted table.

Give a real time example for triggers usage?It is recomended to avoid triggers in a real time environment. There is one scenario I can think of why you may want to use triggers in a real time environment. Let us use an example to understand this.

I have 2 tables, tblPerson and tblGender as shown below. GenderId is the foriegn key in tblPerson table.

Now create a view based on the above 2 tables as shown below.

Page 16: RTDB

Select * from vWPersons will give us the result as shown below.

Now update the view the following query. This will change the Gender Text to Female in tblGender table for Id = 1. This is not what we have expected.

Update tblGender Set Gender='Female' where Id=1

The base tables are updated incorrectly. So, Select * from vWPersons will give us the result as shown below.

To update the base tables correctly, you can create an INSTEAD OF trigger on the view as shown below.

Page 17: RTDB

Now run the query below which will update the underlying base tables correctly.Update vWPersons Set Gender='Female' Where Id=1

Select * from vWPersons, will show the correct result set as shown below. The INSTEAD OF trigger has correctly updated the GenderId in tblPerson table.

So, Instead of triggers can be used to facilitate updating Views that are based on multiple base tables.

SQL Server Interview Questions on Temporary Tables

What are the 2 types of Temporary Tables in SQL Server?1. Local Temporary Tables2. Global Temporary Tables

What is the difference between Local and Global Temporary Tables?Local Temporary Tables:1. Prefixed with a single pound sign (#). 2. Local temporary tables are visible to that session of SQL Server which has created it. 3. Local temporary tables are automatically dropped, when the session that created the temporary tables is closed.

Global Temporary Tables:1. Prefixed with two pound signs (##). 2. Global temporary tables are visible to all the SQL server sessions. 3. Global temporary tables are also automatically dropped, when the session that created the temporary tables is closed.

Can you create foreign key constraints on temporary tables?No

Do you have to manually delete temporary tables?No, temporary tables are automatically dropped, when the session that created the temporary tables is

Page 18: RTDB

closed. But if you maintain a persistent connection or if connection pooling is enabled, then it is better to explicitly drop the temporary tables you have created.However, It is generally considered a good coding practice to explicitly drop every temporary table you create.

In which database, the temporary tables get created?TEMPDB database.

How can I check for the existence of a temporary table?

Usually it is best to create a temp table through the create function.

For instance if you were to check if that temp table existed and then drop/create a new one and insert data it would go something like this.

IF object_ID('tempdb..##temptable') IS NOT NULLDROP TABLE ##TEMPTABLECREATE TABLE ##TEMPTABLE

INSERT ##TEMPTABLESELECT e.EMPLOYEEfrom employee ewhere e.employeename = frank

This would check for the table and drop it if it exists then would create a new global temp table (indicated by the ##) and insert all employees with the name frank into this temp table.

Basic SQL Server Interview Questions

Explain DML, DDL, DCL and TCL statements with examples?DML: DML stands for Data Manipulation Language. DML is used to retrieve, store, modify, delete, insert and

Page 19: RTDB

update data in database.Examples of DML statements: SELECT, UPDATE, INSERT, DELETE statements.

DDL: DDL stands for Data Definition Language. DDL is used to create and modify the structure of database objects.

Examples: CREATE, ALTER, DROP statements.

DCL: DCL stands for Data Control Language. DCL is used to create roles, grant and revoke permissions, establish referential integrity etc.Examples: GRANT, REVOKE statements

TCL: TCL stands for Transactional Control Language. TCL is used to manage transactions within a database.Examples: COMMIT, ROLLBACK statements

What is the difference between Drop, Delete and Truncate statements in SQL Server?Drop, Delete and Truncate - All operations can be rolled back.

Delete is a logged operation, which means deleted rows are written to the transaction log.Truncate is not a logged operation, which means deleted rows are not written to the transaction log.

Hence, truncate is a little faster than Delete. You can have a where clause in Delete statement where as Truncate statement cannot have a where clause. Truncate will delete all the rows in a Table, but the structure of the table remains. Drop would delete all the rows including the structure of the Table.

Please refer to the screen shot below for the differences summary snapshot between Drop, Delete and Truncate statements in SQL Server.

What is Cascading referential integrity constraint?Cascading referential integrity constraints allow you to define the actions Microsoft SQL Server should take when a user attempts to delete or update a key to which an existing foreign keys point.

You can instruct SQL Server to do the following:

Page 20: RTDB

1. No Action: This is the default behaviour. No Action specifies that if an attempt is made to delete or update a row with a key referenced by foreign keys in existing rows in other tables, an error is raised and the DELETE or UPDATE is rolled back.

2. Cascade: Specifies that if an attempt is made to delete or update a row with a key referenced by foreign keys in existing rows in other tables, all rows containing those foreign keys are also deleted or updated.

3. Set NULL: Specifies that if an attempt is made to delete or update a row with a key referenced by foreign keys in existing rows in other tables, all rows containing those foreign keys are set to NULL.

4. Set Default: Specifies that if an attempt is made to delete or update a row with a key referenced by foreign keys in existing rows in other tables, all rows containing those foreign keys are set to default values.

DIfference between primary key and unique key in SQL Server?1. A table can have only one primary key. On the other hand a table can have more than one unique key.2. Primary key column does not accept any null values, where as a unique key column accept one null value.what is the importance of index in a table? Re: Index are only way to make fast retrieve of data.As like in book if we know index Number we can find page easy.Same way if our table contain millions of records it makes easy and fast search.There are two types of indexes cluster and non cluster.Most of the web sites say Truncate cannot be rolled back. That is not true. I have tested this on SQL Server 2008. Truncate can be rolled back. If you want try truncate yourself follow these steps.

Step 1: Truncate the table as part of the transaction, but do not commit.Begin TranTruncate Table tblProduct

Step 2: Now issue a select statement. You will see no rows in the table.Select * from tblProduct

Step 3: Now issue a rollback statement, to rollback the truncate operation.Rollback

Step 4: Now issue a select statement. You will see all the rows in the table. This proves that Truncate can be rolled back.Select * from tblProduct

What is the difference between a User Defined Function (UDF) and a Stored Procedure (SP) in SQL Server

1. Stored Procedure support deffered name resolution where as functions do not support deffered name resolution.

Page 21: RTDB

2. User Defined Function can be used in a select statement where as you cannot use a stored procedure in a select statement.

3. UDF's cannot return Image, Text where as a StoredProcedure can return any datatype.

4. In general User Defined Functions are used for computations where as Stored Procedures are used for performing business logic.

5. UDF should return a value where as Stored Procedure need not.

6. User Defined Functions accept lesser number of input parameters than Stored Procedures. UDF can have upto 1023 input parameters where as a Stored Procedure can have upto 21000 input parameters.

7. Temporary Tables can not be used in a UDF where as a StoredProcedure can use Temporary Tables.

8. UDF can not Execute Dynamic SQL where as a Stored Procedure can execute Dynamic SQL.

9. User Defined Function does not support error handling where as Stored Procedure support error handling. RAISEERROR or @@ERROR are not allowed in UDFs.

You can not write PRINT Statement in UDFYou can not write EXEC('any sql statement') inside UDFThe core difference is that function has no side effect which means that it cannot change anything outside the function body.Funcation can take only input parameter Where as Stored prcedure can take input and ouput parameters.Stored procedures are compiled objects where as UDF are notGETDATE can be used as the input parameter, but could not be used inside the function itself. Other built in SQL functions that cannot be used inside a User Defined Function include: RAND, NEWID, @@CONNCECTIONS, @@TIMETICKS, and @@PACK_SENT. Any built in function that is non-deterministic cannot be used inside the function itself.

Functions can be called from procedure whereas procedures cannot be called from function.

SQL Server Interview Questions on Views

Page 22: RTDB

What is a View in SQL Server?You can think of a view either as a compiled sql query or a virtual table. As a view represents a virtual table, it does not physically store any data. When you query a view, you actually retrieve the data from the underlying base tables.

What are the advantages of using views?Or When do you usually use views?1. Views can be used to implement row level and column level security.

Example 1: Consider the tblEmployee table below. I don't want some of the users to have access to the salary column, but they should still be able to access ID, NAME and DEPT columns. If I grant access to the table, the users will be able to see all the columns. So, to achieve this, I can create a view as shown in Listing 1 below. Now, grant access to the view and not the table. So using views we can provide column level security.

tblEmployee

Listing 1Create View vWEmployeeAsSelect ID, Name, DeptFrom tblEmployee

Example 2: Let us say, we have a few users who should be able to access only IT employee details and not any other dept. To do this, I can create a view as shown in Listing 2 below. Now, grant access only to the view and not the table. So using views we can provide row level security as well.

Listing 2Create View vWITEmployeesAsSelect ID, Name, DeptFrom tblEmployeeWhere Dept = 'IT'

Page 23: RTDB

2. Simplify the database schema to the users. You can create a view based on multiple tables which join columns from all these multiple tables so that they look like a single table.

3. Views can be used to present aggregated and summarized data.

Example 1: Consider the tblEmployee table above. I want to aggregate the data as shown in the image below. To do this I can create a view as shown in Listing 3. Now, you can simply issue a select query against the view rather than writing a complex query every time you want to retrieve the aggregated data.

Listing 3Select Dept, Count(*) As TotalFrom tblEmployeeGroup By Dept

Can you create a view based on other views?Yes, you can create a view based on other views. Usually we create views based on tables, but it also possible to create views based on views.

Can you update views?Yes, views can be updated. However, updating a view that is based on multiple tables, may not update the underlying tables correctly. To correctly update a view that is based on multiple tables you can make use INSTEAD OF triggers in SQL Server.

What are indexed views?OrWhat are materialized views?

A view is a virtual table, it does not contain any physical data. A view is nothing more than compiled SQL query. Every time, we issue a select query against a view, we actually get the data from the underlying base tables and not from the view, as the view itself does not contain any data.

When you create an index on a view, the data gets physically stored in the view. So, when we issue a select query against an indexed view, the data is retrieved from the index without having to go to the underlying table, which will make the select statement to work slightly faster. However, the disadvantage is, INSERT, UPDATE and DELETE operations will become a little slow, because every time you insert or delete a row from the underlying table, the view index needs to be updated. Inshort, DML operations will have negative impact on performance.

Page 24: RTDB

Oracle refers to indexed views as materialized views.

Only the views created with schema binding, can have an Index. Simply adding WITH SCHEMABINDING to the end of the CREATE VIEW statement will accomplish this. However, the effect is that any changes to the underlying tables which will impact the view are not allowed. Since the indexed view is stored physically, any schema changes would impact the schema of the stored results set. Therefore, SQL Server requires that schema binding be used to prevent the view's schema (and therefore the underlying tables) from changing.

The first index for a view must be a UNIQUE CLUSTERED INDEX, after which, it's possible to create non-clustered indexes against the view.

Indexed Views are heavily used in data warehouses and reporting databases that are not highly transactional.

What are the limitations of a View?1. You cannot pass parameters to a view.

2. Rules and Defaults cannot be associated with views.

3. The ORDER BY clause is invalid in views unless TOP or FOR XML is also specified.

4. Views cannot be based on temporary tables.

Basic SQL Server Interview Questions on Joins

What are the different types of joins available in sql server?There are 3 different types of joins available in sql server, and they are1. Cross Join 2. Inner Join or Join 3. Outer Join

Outer Join is again divided into 3 types as shown below.1. Left Outer Join or Left Join 2. Right Outer Join or Right Join 3. Full Outer Join or Full Join

You might have heard about self join, but self join is not a different type of join. A self join means joining a table with itself. We can have an inner self join or outer self join. Read this sql server interview question, to understand self join in a greater detail.

What is cross join. Explain with an example?Let us understand Cross Join with an example. Create 2 tables Company and Candidate. Use the script below to create these tables and populate them. CompanyId column in Candidate Table is a foreign key referencing

Page 25: RTDB

CompanyId in Company Table.

CREATE TABLE Company( CompanyId TinyInt Identity Primary Key, CompanyName Nvarchar(50) NULL)GO

INSERT Company VALUES('DELL')INSERT Company VALUES('HP')INSERT Company VALUES('IBM')INSERT Company VALUES('Microsoft')GO

CREATE TABLE Candidate( CandidateId tinyint identity primary key, FullName nvarchar(50) NULL, CompanyId tinyint REFERENCES Company(CompanyId))GO

INSERT Candidate VALUES('Ron',1)INSERT Candidate VALUES('Pete',2)INSERT Candidate VALUES('Steve',3)INSERT Candidate VALUES('Steve',NULL)INSERT Candidate VALUES('Ravi',1)INSERT Candidate VALUES('Raj',3)INSERT Candidate VALUES('Kiran',NULL)GO

A cross join produces the Cartesian product of the tables involved in the join. The size of a Cartesian product result set is the number of rows in the first table multiplied by the number of rows in the second table. A query involving a CROSS JOIN for the Candidate and Company Table is shown below.

SELECT Cand.CandidateId,Cand.FullName,Cand.CompanyId, Comp.CompanyId,Comp.CompanyNameFROM Candidate CandCROSS JOIN Company Comp

If we run the above query, we produce the result set shown in the image below.

Page 26: RTDB

Key Points to remember about CROSS JOIN. 1. A cross join produces the Cartesian product of the tables involved in the join.This mean every row in the Left Table is joined to every row in the Right Table. Candidate is LEFT Table and Company is RIGHT Table. In our example we have 28 total number of rows in the result set. 7 rows in the Candidate table multiplied by 4 rows in the Company Table.

2. In real time scenarios we rarley use CROSS JOIN. Most often we use either INNER JOIN or LEFT OUTER JOIN.

3. CROSS JOIN does not have an ON clause with a Join Condition. All the other JOINS use ON clause with a Join Condition.

4. Using an ON clause on a CROSS JOIN would generate a syntax error.

Note: Understanding the above key points will help you answer any follow up interview questions on cross join in sql server.

Page 27: RTDB

Inner Join and left join are the most commonly used joins in real time projects. We will talk about left join in a later article. Now, let us understand Inner join with an example.

Create 2 tables Company and Candidate. Use the script below to create these tables and populate them. CompanyId column in Candidate Table is a foreign key referencing CompanyId in Company Table.

CREATE TABLE Company( CompanyId TinyInt Identity Primary Key, CompanyName Nvarchar(50) NULL)GO

INSERT Company VALUES('DELL')INSERT Company VALUES('HP')INSERT Company VALUES('IBM')INSERT Company VALUES('Microsoft')GO

CREATE TABLE Candidate( CandidateId tinyint identity primary key, FullName nvarchar(50) NULL, CompanyId tinyint REFERENCES Company(CompanyId))GO

INSERT Candidate VALUES('Ron',1)INSERT Candidate VALUES('Pete',2)INSERT Candidate VALUES('Steve',3)INSERT Candidate VALUES('Steve',NULL)INSERT Candidate VALUES('Ravi',1)INSERT Candidate VALUES('Raj',3)INSERT Candidate VALUES('Kiran',NULL)GO

If you want to select all the rows from the LEFT table(In our example Candidate Table) that have a non null foreign key value(CompanyId in Candidate Table is the foreign key) then we use INNER JOIN. A query involving an INNER JOIN for the Candidate and Company Table is shown below.

SELECT Cand.CandidateId, Cand.FullName, Cand.CompanyId, Comp.CompanyId, Comp.CompanyNameFROM Candidate Cand

Page 28: RTDB

INNER JOIN Company CompON Cand.CompanyId = Comp.CompanyId

If we run the above query the output will be as shown in the image below. If you look at the out put, we only got 5 rows. We did not get the 2 rows which has NULL value in the CompanyId column. So an INNER JOIN would get all the rows from the LEFT Table that has non null foreign key value.

Inner Join Result

Instead of using INNER JOIN keyword we can just use JOIN keyword as shown below. JOIN or INNER JOIN means the same.

SELECT Cand.CandidateId, Cand.FullName, Cand.CompanyId, Comp.CompanyId, Comp.CompanyNameFROM Candidate CandJOIN Company CompON Cand.CompanyId = Comp.CompanyIdInner Join and left join are the most commonly used joins in real time projects. Click here to read about Inner Join in SQL Server. Now, let us understand Left join with an example.

Create 2 tables Company and Candidate. Use the script below to create these tables and populate them. CompanyId column in Candidate Table is a foreign key referencing CompanyId in Company Table.

CREATE TABLE Company( CompanyId TinyInt Identity Primary Key, CompanyName Nvarchar(50) NULL)GO

INSERT Company VALUES('DELL')INSERT Company VALUES('HP')INSERT Company VALUES('IBM')INSERT Company VALUES('Microsoft')GO

CREATE TABLE Candidate( CandidateId tinyint identity primary key,

Page 29: RTDB

FullName nvarchar(50) NULL, CompanyId tinyint REFERENCES Company(CompanyId))GO

INSERT Candidate VALUES('Ron',1)INSERT Candidate VALUES('Pete',2)INSERT Candidate VALUES('Steve',3)INSERT Candidate VALUES('Steve',NULL)INSERT Candidate VALUES('Ravi',1)INSERT Candidate VALUES('Raj',3)INSERT Candidate VALUES('Kiran',NULL)GO

If you want to select all the rows from the LEFT table ( In our example Candidate Table ) including the rows that have a null foreign key value ( CompanyId in Candidate Table is the foreign key ) then we use LEFT OUTER JOIN. A query involving a LEFT OUTER JOIN for the Candidate and Company Table is shown below.

SELECT Cand.CandidateId, Cand.FullName, Cand.CompanyId, Comp.CompanyId, Comp.CompanyNameFROM Candidate CandLEFT OUTER JOIN Company CompON Cand.CompanyId = Comp.CompanyId

If we run the above query the output will be as shown in below. If you look at the out put, we now got all 7 rows ( All the rows from the Candidate Table ) including the row that has a null value for the CompanyId column in the Candidate Table. So, LEFT OUTER JOIN would get all the rows from the LEFT Table including the rows that has null foreign key value.

Left Join Result

Instead of using LEFT OUTER JOIN keyword we can just use LEFT JOIN keyword as shown below. LEFT OUTER JOIN or LEFT JOIN means the same.

SELECT Cand.CandidateId, Cand.FullName, Cand.CompanyId, Comp.CompanyId, Comp.CompanyName

Page 30: RTDB

FROM Candidate CandLEFT JOIN Company CompON Cand.CompanyId = Comp.CompanyIdNow, let us understand Right Outer join with an example.

Create 2 tables Company and Candidate. Use the script below to create these tables and populate them. CompanyId column in Candidate Table is a foreign key referencing CompanyId in Company Table.

CREATE TABLE Company( CompanyId TinyInt Identity Primary Key, CompanyName Nvarchar(50) NULL)GO

INSERT Company VALUES('DELL')INSERT Company VALUES('HP')INSERT Company VALUES('IBM')INSERT Company VALUES('Microsoft')GO

CREATE TABLE Candidate( CandidateId tinyint identity primary key, FullName nvarchar(50) NULL, CompanyId tinyint REFERENCES Company(CompanyId))GO

INSERT Candidate VALUES('Ron',1)INSERT Candidate VALUES('Pete',2)INSERT Candidate VALUES('Steve',3)INSERT Candidate VALUES('Steve',NULL)INSERT Candidate VALUES('Ravi',1)INSERT Candidate VALUES('Raj',3)INSERT Candidate VALUES('Kiran',NULL)GO

If you want to select all the rows from the LEFT Table ( In our example Candidate Table) that have non null foreign key values plus all the rows from the RIGHT table ( In our example Company Table) including the rows that are not referenced in the LEFT Table, then we use RIGHT OUTER JOIN. A query involving a RIGHT OUTER JOIN for the Candidate and Company Table is shown below.

Page 31: RTDB

SELECT Cand.CandidateId, Cand.FullName, Cand.CompanyId, Comp.CompanyId, Comp.CompanyNameFROM Candidate CandRIGHT OUTER JOIN Company CompON Cand.CompanyId = Comp.CompanyId

If we run the above query the output will be as shown in below. If you look at the out put, we now got 6 rows. All the rows from the Candidate Table that has non null foreign key value plus all the rows from the Company Table including the row that is not referenced in the Candidate Table.

Right Outer Join Results

Instead of using RIGHT OUTER JOIN keyword we can just use RIGHT JOIN keyword as shown below. RIGHT OUTER JOIN or RIGHT JOIN means the same.

SELECT Cand.CandidateId, Cand.FullName, Cand.CompanyId, Comp.CompanyId, Comp.CompanyNameFROM Candidate CandRIGHT JOIN Company CompON Cand.CompanyId = Comp.CompanyIdInner Join and left join are the most commonly used joins in real time projects. It is very important that you understand the basics of joins before reading this article. Please read the articles below if you have not done so already.1. Basics of Joins in SQL Server2. Inner Join3. Left Join4. Right JoinNow, let us understand Full Outer join with an example.

Create 2 tables Company and Candidate. Use the script below to create these tables and populate them. CompanyId column in Candidate Table is a foreign key referencing CompanyId in Company Table.

CREATE TABLE Company( CompanyId TinyInt Identity Primary Key, CompanyName Nvarchar(50) NULL)

Page 32: RTDB

GO

INSERT Company VALUES('DELL')INSERT Company VALUES('HP')INSERT Company VALUES('IBM')INSERT Company VALUES('Microsoft')GO

CREATE TABLE Candidate( CandidateId tinyint identity primary key, FullName nvarchar(50) NULL, CompanyId tinyint REFERENCES Company(CompanyId))GO

INSERT Candidate VALUES('Ron',1)INSERT Candidate VALUES('Pete',2)INSERT Candidate VALUES('Steve',3)INSERT Candidate VALUES('Steve',NULL)INSERT Candidate VALUES('Ravi',1)INSERT Candidate VALUES('Raj',3)INSERT Candidate VALUES('Kiran',NULL)GO

If you want to select all the rows from the LEFT Table ( In our example Candidate Table ) plus all the rows from the RIGHT table ( In our example Company Table ) , then we use FULL OUTER JOIN. A query involving a FULL OUTER JOIN for the Candidate and Company Table is shown below.

SELECT Cand.CandidateId, Cand.FullName, Cand.CompanyId, Comp.CompanyId, Comp.CompanyNameFROM Candidate CandFULL OUTER JOIN Company CompON Cand.CompanyId = Comp.CompanyId

If we run the above query the output will be as shown in below. If you look at the out put, we now got 8 rows. All the rows from the Candidate Table and all the rows from the Company Table.

Page 33: RTDB

Full Outer Join Result

Instead of using FULL OUTER JOIN keyword we can just use FULL JOIN keyword as shown below. FULL OUTER JOIN or FULL JOIN means the same.

SELECT Cand.CandidateId, Cand.FullName, Cand.CompanyId, Comp.CompanyId, Comp.CompanyNameFROM Candidate CandFULL JOIN Company CompON Cand.CompanyId = Comp.CompanyIdThere are 3 different types of joins available in sql server, and they are1. Cross Join 2. Inner Join or Join 3. Outer Join

Outer Join is again divided into 3 types as shown below.1. Left Outer Join or Left Join 2. Right Outer Join or Right Join 3. Full Outer Join or Full Join

I strongly recomend to learn about the basics and types of joins, before reading this article. Read the articles below, before proceeding with self join.1. Basics of Joins2. Inner Join3. Left Outer Join4. Right Outer Join5. Full Outer Join

Self join is not a different type of join. Self join means joining a table with itself. We can have an inner self join or outer self join. Let us try to understand with an example.

To set up the data for the example, use the script below to create Employee Table and populate it with some sample data. We will be using Employee Table to understand Self Join.

CREATE TABLE EMPLOYEE(

Page 34: RTDB

[EMPLOYEEID] INT PRIMARY KEY,[NAME] NVARCHAR(50),[MANAGERID] INT)GO

INSERT INTO EMPLOYEE VALUES(101,'Mary',102)INSERT INTO EMPLOYEE VALUES(102,'Ravi',NULL)INSERT INTO EMPLOYEE VALUES(103,'Raj',102)INSERT INTO EMPLOYEE VALUES(104,'Pete',103)INSERT INTO EMPLOYEE VALUES(105,'Prasad',103)INSERT INTO EMPLOYEE VALUES(106,'Ben',103)GO

We use Self Join, if we have a table that references itself. For example, In the Employee Table below MANAGERID column references EMPLOYEEID column. So the table is said to referencing itself. This is the right scenario where we can use Self Join. Now I want to write a query that will give me the list of all Employee Names and their respective Manager Names. In order to achieve this I can use Self Join. In the Table below,Raj is the manager for Pete,Prasad and Ben. Ravi is the manager for Raj and Mary. Ravi does not have a manager as he is the president of the Company.

The query below is an example of Self Join. Both E1 and E2 refer to the same Employee Table. In this query we are joining the Employee Table with itself.

SELECT E1.[NAME],E2.[NAME] AS [MANAGER NAME]FROM EMPLOYEE E1 INNER JOIN EMPLOYEE E2 ON E2.EMPLOYEEID =E1.MANAGERID

If we run the above query we only get 5 rows out of the 6 rows as shown below.

Inner Self Join

Page 35: RTDB

This is because Ravi does not have a Manager. MANAGERID column for Ravi is NULL. If we want to get all the rows then we can use LEFT OUTER JOIN as shown below.

SELECT E1.[NAME],E2.[NAME] AS [MANAGER NAME]FROM EMPLOYEE E1 LEFT OUTER JOIN EMPLOYEE E2 ON E2.EMPLOYEEID =E1.MANAGERID

If we execute the above query we get all the rows, including the row that has a null value in the MANAGERID column. The results are shown below. The MANAGERNAME for 2nd record is NULL as Ravi does not have a Manager.Left Outer Self Join

Let us now slightly modify the above query using COALESCE as shown below. Read COALESCE function in SQL Server to understand COALESCE in a greater detail.

SELECT E1.[NAME],COALESCE(E2.[NAME],'No Manager') AS [MANAGER NAME]FROM EMPLOYEE E1 LEFT JOIN EMPLOYEE E2 ON E2.EMPLOYEEID =E1.MANAGERID

If we execute the above query the output will be as shown in the image below. This is how COALESCE can be used.

Left Outer Self Join with COALESCE

Page 36: RTDB

Index Scan:Index Scan scans each and every record in the index. Table Scan is where the table is processed row by row from beginning to end. If the index is a clustered index then an index scan is really a table scan. Since a scan touches every row in the table whether or not it qualifies, the cost is proportional to the total number of rows in the table. Hence, a scan is an efficient strategy only if the table is small.

Index Seek:Since a seek only touches rows that qualify and pages that contain these qualifying rows, the cost is proportional to the number of qualifying rows and pages rather than to the total number of rows in the table.

ExampleI have an employee table as shown in the diagram below. EmployeeId is the primary key. We have a clustered index on the employeeId column.

Query 1 : Select * from Employee where FirstName='Ben'Query 2 : Select * from Employee where EmployeeId=2

Query 1 will do an Index scan (Table Scan) to retrieve the record as there is no Index on the FirstName column. Query 2 will do an Index seek to retrieve the record as there is an Index on the EmployeeId column.

So from this example, you should have understood that, a query will result into an index seek, only if there is an index on the table to help they query to retrieve the data.

Delete from tables involved in a SQL Join

This question is not asked that often in an interview, but you may have to write a query to delete from tables that are involved in a SQL join in your day to day job routine as a SQL Developer.Let me explain you the question first. Consider the Employees and Departments tables below.

Page 37: RTDB

I want you to write a query using joins which will give me the list of all HR Department employees. The output should be as shown below.

To do this we will write a query using joins as shown below.Select E.Name as [Employee Name],D.Name as [Department Name]from Employees EInner Join Departments DOn E.DeptId = D.IdWhere D.Name = 'HR'

Now I want you to write a query, that deletes all the HR department employees. To do this we have to join Employees and Departments tables and usually we will be tempted to write the query as shown below, but this query will not execute and will give a syntax error.

Incorrect SQL Delete Query to delete from multiple tables invloved in a SQL joinDelete From Employees EInner Join Departments DOn E.DeptId = D.IdWhere D.Name = 'HR'

The query below shows the correct syntax for issuing a SQL delete that involves a SQL join.Delete E From (Employees EInner Join Departments DOn E.DeptId = D.Id

Page 38: RTDB

and D.Name = 'HR')

The following syntax is also valid and has the same effect.Delete Employees From(Employees join Departmentson Employees.DeptId = Departments.Id)where Departments.Name = 'HR'

Advantages and Disadvantages of Stored ProcedureOctober 14, 2005 — mukund

Advantages of stored procedures:

the procedures/functions are stored in the database and are, therefore, executed on the database server which is likely to me more powerful than the clients which in turn means that stored procedures should run faster;the code is stored in a pre-compiled form which means that it is syntactically valid and does not need to be compiled at run-time, thereby saving resources;each user of the stored procedure/function will use exactly the same form of queries which means the queries are reused thereby reducing the parsing overhead and improving the scalability of applications;as the procedures/functions are stored in the database there is no need to transfer the code from the clients to the database server or to transfer intermediate results from the server to the clients. This results in much less network traffic and again improves scalability;when using PL/SQL packages, as soon as one object in the package is accessed, the whole package is loaded into memory which makes subsequent access to objects in the package much fasterstored procedures/functions can be compiled into “native” machine code making them even faster (available with Oracle 9i and above)

Disadvantages:

there is an overhead involved in switching from SQL to PL/SQL, this may be significant in terms of performance but usually this overhead is outweighed by performance advantages of using PL/SQLmore memory may be required when using packages as the whole package is loaded into memory as soon as any object in the package is accessednative compilation can take twice as long as normal compilation

Despite the advantages listed above, there are some situations where the use of stored procedures is not recommended or may be infeasible.

Disadvantages

Page 39: RTDB

Applications that involve extensive business logic and processing could place an excessive load on the server if the logic was implemented entirely in stored procedures. Examples of this type of processing include data transfers, data traversals, data transformations and intensive computational operations. You should move this type of processing to business process or data access logic components, which are a more scalable resource than your database server.

Do not put all of your business logic into stored procedures. Maintenance and the agility of your application becomes an issue when you must modify business logic in T-SQL. For example, ISV applications that support multiple RDBMS should not need to maintain separate stored procedures for each system.

Writing and maintaining stored procedures is most often a specialized skill set that not all developers possess. This situation may introduce bottlenecks in the project development schedule.

State transition testing is used where some aspect of the system can be described in what is called a “finite state machine”. This simply means that the system can be in a (finite) number of different states, and the transitions from one state to another are determined by the rules of the “machine”. This is the model on which the system and the tests are based. Any system where you get a different output for the same input, depending on what has happened before, is a finite state system.

For example, if you request to withdraw £100 from a bank ATM, you may be given cash. Later you may make exactly the same request but be refused the money (because your balance is insufficient). This later refusal is because the state of your bank account had changed from having sufficient funds to cover the withdrawal to having insufficient funds. The transaction that caused your account to change its state was probably the earlier withdrawal. Another example is a word processor. If a document is open, you are able to Close it. If no document is open, then “Close” is not available. After you choose “Close” once, you cannot choose it again for the same document unless you open that document. A document thus has two states: open and closed.

A state transition model has four basic parts:

The states that the software may occupy (open/closed or funded/insufficient funds); The transitions from one state to another (not all transitions are allowed); The events that cause a transition (withdrawing money, closing a file); The actions that result from a transition (an error message, or being given your cash).

Note that a transition does not need to change to a different state; it could stay in the same state. In fact, trying to input an invalid input would be likely to produce an error message as the action, but the transition would be back to the same state the system was in before.

Deriving test cases from the state transition model is a black box approach. Measuring how much you have tested (covered) is getting close to a white box perspective. However, state transition testing is generally regarded as a black box technique.

You can design tests to test every transition shown in the model. If every (valid) transition is tested, this is known as “0-switch” coverage. You could also test a series of transitions through more than one state. If you

Page 40: RTDB

covered all of the pairs of two valid transitions, you would have “1-switch” coverage, covering the sets of 3 transitions would give “2-switch” coverage, etc.

However, deriving tests only from the model may omit the negative tests, where we could try to generate invalid transitions. In order to see the total number of combinations of states and transitions, both valid and invalid, a state table can be used.

Five Ways to Think about Black Box Testing

To help understand the different ways that software testing can be divided between black box and white box techniques, I’ll use the Five-Fold Testing System. It lays out five dimensions that can be used for examining testing: 1. People (who does the testing) 2. Coverage (what gets tested) 3. Risks (why you are testing) 4. Activities (how you are testing) 5. Evaluation (how you know you’ve found a bug) Let’s use this system to understand and clarify the characteristics of black box and white box testing. People: Who does the testing? Some people know how software works (developers) and others just use it (users). Accordingly, any testing by users or other nondevelopers is sometimes called “black box” testing. Developer testing is called “white box” testing. The distinction here is based on what the person knows or can understand. Coverage: What is tested? If we draw the box around the system as a whole, “black box” testing becomes another name for system testing. And testing the units inside the box becomes white box testing. This is one way to think about coverage. Another is to contrast testing that aims to cover all the requirements with testing that aims to cover all the code. These are the two most commonly used coverage criteria. Both are supported by extensive literature and commercial tools. Requirements-based testing could be called “black box” because it makes sure that all the customer requirements have been verified. Code-based testing is often called “white box” because it makes sure that all the code (the statements, paths, or decisions) is exercised. Risks: Why are you testing? Sometimes testing is targeted at particular risks. Boundary testing and other attack-based techniques are targeted at common coding errors. Effective security testing also requires a detailed understanding of the code and the system architecture. Thus, these techniques might be classified as “white box.” Another set of risks concerns whether the software will actually provide value to users. Usability testing

Page 41: RTDB

focuses on this risk, and could be termed “black box.” Activities: How do you test? A common distinction is made between behavioral test design, which defines tests based on functional requirements, and structural test design, which defines tests based on the code itself. These are two design approaches. Since behavioral testing is based on external functional definition, it is often called “black box,” while structural testing—based on the code internals—is called “white box.” Indeed, this is probably the most commonly cited definition for black box and white box testing. Another activity-based distinction contrasts dynamic test execution with formal code inspection. In this case, the metaphor maps test execution (dynamic testing) with black box testing, and maps code inspection (static testing) with white box testing. We could also focus on the tools used. Some tool vendors refer to code-coverage tools as white box tools, and tools that facilitate applying inputs and capturing inputs—most notably GUI capture replay tools—as black box tools. Testing is then categorized based on the types of tools used. Evaluation: How do you know if you’ve found a bug? There are certain kinds of software faults that don’t always lead to obvious failures. They may be masked by fault tolerance or simply luck. Memory leaks and wild pointers are examples. Certain test techniques seek to make these kinds of problems more visible. Related techniques capture code history and stack information when faults occur, helping with diagnosis. Assertions are another technique for helping to make problems more visible. All of these techniques could be considered white box test techniques, since they use code instrumentation to make the internal workings of the software more visible. These contrast with black box techniques that simply look at the official outputs of a program. To summarize, black box testing can sometimes describe user-based testing (people); system or requirements-based testing (coverage); usability testing (risk); or behavioral testing or capture replay automation (activities). White box testing, on the other hand, can sometimes describe developer-based testing (people); unit or code-coverage testing (coverage); boundary or security testing (risks); structural testing, inspection or code-coverage automation (activities); or testing based on probes, assertions, and logs (evaluation). Puzzles So now that we’ve examined some ways to think about the differences between black box and white box testing, let me leave you with a few puzzles. Let’s hear what you think. A. A programmer tests a class to ensure that it meets its functional requirements. Is this black box or white box testing? B. Your company develops software under a contract that stipulates that both white box and black box test techniques will be used. What tests are you obliged to execute? C. A nonprogrammer uses a test tool that automatically instruments the code and then generates tests to

Page 42: RTDB

ensure that a maximal number of lines of code are executed. The tests are considered to pass as long as the software doesn’t crash or hang. Is this black box or white box testing? D. What could it mean to perform “gray box” testing?

In Black Box Testing, the tester tests an application without knowledge of the internal workings of the functionalities being tested. Data are inputted into the application and the outcome is compared with the expected results; what the program does with the input data or how the program arrives at the output data is not a concern for the tester performing black box testing. All that is tested is the behavior of the functionalities being tested.

This is why black box testing is also known as functional testing which tests the functionality of a program. Note we can also have non-functional black box testing, such as performance testing which is a type of black box testing but instead of verifying the behavior of the system, it tests how long it takes for a function to respond to user’s inputs and how long it takes to process the data and generate outputs.

Because black box testing is not concerned with the underlying code, then the techniques can be derived from the requirement documents or design specifications and hence testing can start as soon as the requirements are written.

Some Black Box Testing Techniques are:

Equivalence Partitioning Boundary Value Analysis State Transition Testing Cause / Effect Graphing Classification Tree Method

Advantages of Black Box Testing are:

The test is unbiased because the designer and the tester are independent of each other The tester does not need knowledge of any specific programming languages The test is done from the point of view of the user, not the designer Test cases can be designed as soon as the specifications are complete

Disadvantages of Black Box Testing are:

The test can be redundant if the software designer has already run a test case The test cases are difficult to design Testing every possible input stream is unrealistic because it would take a inordinate amount of time;

therefore, many program paths will go untested

The benefits of Stored Procedures

Page 43: RTDB

Posted on Sunday, 24 June, 2007 by Colin Mackay

There are a number of ways to access data in SQL Server, or any enterprise DBMS. There are lots of books that discuss getting data in and out of databases and the best ways to do that. Many advocate the use of stored procedures to ensure the safety of the data.

The tree main benefits that I see on stored procedures are:

Abstraction Security Performance

Stored Procedures add an extra layer of abstraction in to the design of a software system. This means that, so long as the interface on the stored procedure stays the same, then the underlying table structure can change with no noticable consequence to the application that is using the database.

For instance, if the database has to be denormalised to get a little extra performance in certain situations then the stored procedures can handle the additional updates and inserts necessary to ensure the integrity of the data across the tables. Without this the each of the callers would have ensure that these changes had taken place. Of course, the use of stored procedures does not in anyway grant a waiver from properly designing the data model, but it can help if the perfect normalised model has to give way for performance improvements.

This layer of abstraction also helps put up an extra barrier to would be intruders. If access to the data in SQL Server is only ever permitted via stored procedures then permission does not need to be explicitly set on any of the tables. Therefore none of the tables should ever need to be exposed directly to outside applications. For an outside application to modify the database, it must go through stored procedures.

Stored procedures can be written to validate any input that is sent to them to ensure the integrity of the data beyond the simple constraints otherwise available on the tables. Parameters can be checked for valid ranges. Information can be cross checked with data in other tables.

Even if it is thought that someone attempting to crack into a website will never get this far in, from a security perspective, anything that can reduce the attack surface is beneficial.

Performance can be improved by the use of stored procedures. They are precompiled so when they are run there is no additional lag as the SQL is parsed, compiled, execution plans drawn up and then run, they just run because all that extra work is done at the time the CREATE PROCEDURE or ALTER PROCEDURE commands are run rather than when procedures themselves are run.

Another area in which stored procedures improve performance is that is pushes all the work onto the server in one go. A stored procedure can perform a series of queries and return many tables in, what is to the outside world, one operation. This saves the calling process from making many requests and the additional time of several network roundtrips. It also means that, if the contents of one set of data being returned is

Page 44: RTDB

dependent on the results of a previous set of data that is being retrieved through the same stored procedure, that the data only has to flow from the database server to the application. If stored procedures were not being used it would mean that the data from the first database call has to get sent back to the database for the second call in order for it to continue retrieving the information needed by the application.

For instance. Lets say that Northwind traders send out a quarterly statement to its customers, and that for each statement certain information needs to be extracted from the database. The tables Customer, Order and Order Details are used. This information could be retrieved in several steps by calling the database for each set of information as it is needed to generate the statements. First with a SELECT * FROM Customers WHERE CustomerID = @CustomerID. This gets the details for the head of the statement. Then a SELECT * FROM Orders WHERE CustomerID = @CustomerID AND OrderDate>=@StartDate AND OrderDate<=@EndDate to get the details for each individual order by that customer. And finally a series of calls (one for each of the Order records that were retrieved) like SELECT * FROM [Order Details] WHERE OrderID = @OrderID

Assuming that the customer in question is “Rattlesnake Canyon Grocery” and the period for the statement is Q1 1998 then that is 5 roundtrips to the database and 5 times the database has to parse some SQL. This could be done by a single stored procedure that takes only one trip to the database and is precompiled.

CREATE PROCEDURE GetQuarterlyStatement@CustomerID nvarchar(5),@StartDate datetime,@EndDate datetimeASSELECT * FROM Customers WHERE CustomerID=@CustomerIDSELECT * FROM Orders WHERE CustomerID=@CustomerID AND OrderDate>=@StartDate AND OrderDate<=@EndDate ORDER BY OrderDate DESCSELECT [Order Details].* FROM [Order Details] INNER JOIN Orders ON [Order Details].OrderID = Orders.OrderID WHERE CustomerID=@CustomerID AND OrderDate>=@StartDate AND OrderDate<=@EndDate ORDER BY OrderDate DESCGO

SQL SERVER – Stored Procedures Advantages and Best Advantage

April 13, 2007 by pinaldave

Page 45: RTDB

There are many advantages of Stored Procedures. I was once asked what do I think is the most important feature of Stored Procedure? I have to pick only ONE. It is tough question.I answered : Execution Plan Retention and Reuse (SP are compiled and their execution plan is cached and used again to when the same SP is executed again)

Not to mentioned I received the second question following my answer : Why? Because all the other advantage known (they are mentioned below) of SP can be achieved without using SP. Though Execution Plan Retention and Reuse can only be achieved using Stored Procedure only.

Execution plan retention and reuse Query auto-parameterization Encapsulation of business rules and policies Application modularization Sharing of application logic between applications Access to database objects that is both secure and uniform Consistent, safe data modification Network bandwidth conservation Support for automatic execution at system start-up Enhanced hardware and software capabilities Improved security Reduced development cost and increased reliability Centralized security, administration, and maintenance for common routines

tored procedures offer several distinct advantages over embedding queries in your Graphical User Interface (GUI). Your first thought may be: "Why tolerate the added development overhead?" After seeing the advantages, you may change your mind.

Advantage 1: Stored procedures are modular. This is a good thing from a maintenance standpoint. When query trouble arises in your application, you would likely agree that it is much easier to troubleshoot a stored procedure than an embedded query buried within many lines of GUI code.

Advantage 2: Stored procedures are tunable. By having procedures that handle the database work for your interface, you eliminate the need to modify the GUI source code to improve a query's performance. Changes can be made to the stored procedures--in terms of join methods, differing tables, etc.--that are transparent to the front-end interface.

Advantage 3: Stored procedures abstract or separate server-side functions from the client-side. It is much easier to code a GUI application to call a procedure than to build a query through the GUI code.

Advantage 4: Stored procedures are usually written by database developers/administrators. Persons holding these roles are usually more experienced in writing efficient queries and SQL statements. This frees the GUI application developers to utilize their skills on the functional and graphical presentation pieces of the application.

What is a Trigger in SQL Server?A Trigger is a database object that is attached to a table. In many aspects it is similar to a stored

Page 46: RTDB

procedure. As a matter of fact, triggers are often referred to as a "special kind of stored procedure." The main difference between a trigger and a stored procedure is that the former is attached to a table and is only fired when an INSERT, UPDATE or DELETE occurs.

What are the two types of Triggers in SQL Server?1. After Triggers : Fired after Insert, Update and Delete operations on a table.2. Instead of Triggers: Fired instead of Insert, Update and Delete operations on a table.

What are the special tables used by Triggers in SQL Server?Triggers make use of two special tables called inserted and deleted. The inserted table contains the data referenced in an INSERT before it is actually committed to the database. The deleted table contains the data in the underlying table referenced in a DELETE before it is actually removed from the database. When an UPDATE is issued both tables are used. More specifically, the new data referenced in the UPDATE statement is contained in inserted table and the data that is being updated is contained in deleted table.

Give a real time example for triggers usage?It is recomended to avoid triggers in a real time environment. There is one scenario I can think of why you may want to use triggers in a real time environment. Let us use an example to understand this.

I have 2 tables, tblPerson and tblGender as shown below. GenderId is the foriegn key in tblPerson table.

Now create a view based on the above 2 tables as shown below.

Page 47: RTDB

Select * from vWPersons will give us the result as shown below.

Now update the view the following query. This will change the Gender Text to Female in tblGender table for Id = 1. This is not what we have expected.

Update tblGender Set Gender='Female' where Id=1

The base tables are updated incorrectly. So, Select * from vWPersons will give us the result as shown below.

To update the base tables correctly, you can create an INSTEAD OF trigger on the view as shown below.

Now run the query below which will update the underlying base tables correctly.Update vWPersons Set Gender='Female' Where Id=1

Page 48: RTDB

Select * from vWPersons, will show the correct result set as shown below. The INSTEAD OF trigger has correctly updated the GenderId in tblPerson table.

So, Instead of triggers can be used to facilitate updating Views that are based on multiple base tables.

SQL Tuning or SQL OptimizationSql Statements are used to retrieve data from the database. We can get same results by writing different sql queries. But use of the best query is important when performance is considered. So you need to sql query tuning based on the requirement. Here is the list of queries which we use reqularly and how these sql queries can be optimized for better performance.

SQL Tuning/SQL Optimization Techniques:

1) The sql query becomes faster if you use the actual columns names in SELECT statement instead of than '*'.

For Example: Write the query as

SELECT id, first_name, last_name, age, subject FROM student_details;

Instead of:

SELECT * FROM student_details;

 

2) HAVING clause is used to filter the rows after all the rows are selected. It is just like a filter. Do not use HAVING clause for any other purposes. For Example: Write the query as

SELECT subject, count(subject) FROM student_details WHERE subject != 'Science' AND subject != 'Maths' GROUP BY subject;

Instead of:

Page 49: RTDB

SELECT subject, count(subject) FROM student_details GROUP BY subject HAVING subject!= 'Vancouver' AND subject!= 'Toronto';

 

3) Sometimes you may have more than one subqueries in your main query. Try to minimize the number of subquery block in your query. For Example: Write the query as

SELECT name FROM employee WHERE (salary, age ) = (SELECT MAX (salary), MAX (age) FROM employee_details) AND dept = 'Electronics';

Instead of:

SELECT name FROM employeeWHERE salary = (SELECT MAX(salary) FROM employee_details) AND age = (SELECT MAX(age) FROM employee_details) AND emp_dept = 'Electronics';

 

4) Use operator EXISTS, IN and table joins appropriately in your query. a) Usually IN has the slowest performance. b) IN is efficient when most of the filter criteria is in the sub-query. c) EXISTS is efficient when most of the filter criteria is in the main query.

For Example: Write the query as

Select * from product p where EXISTS (select * from order_items o where o.product_id = p.product_id)

Instead of:

Select * from product p where product_id IN (select product_id from order_items

 

5) Use EXISTS instead of DISTINCT when using joins which involves tables having one-to-many relationship. For Example: Write the query as

Page 50: RTDB

SELECT d.dept_id, d.dept FROM dept d WHERE EXISTS ( SELECT 'X' FROM employee e WHERE e.dept = d.dept);

Instead of:

SELECT DISTINCT d.dept_id, d.dept FROM dept d,employee e WHERE e.dept = e.dept;

 

6) Try to use UNION ALL in place of UNION. For Example: Write the query as

SELECT id, first_name FROM student_details_class10 UNION ALL SELECT id, first_name FROM sports_team;

Instead of:

SELECT id, first_name, subject FROM student_details_class10 UNION SELECT id, first_name FROM sports_team;

 

7) Be careful while using conditions in WHERE clause. For Example: Write the query as

SELECT id, first_name, age FROM student_details WHERE age > 10;

Instead of:

SELECT id, first_name, age FROM student_details WHERE age != 10;

Write the query as

SELECT id, first_name, age FROM student_details WHERE first_name LIKE 'Chan%';

Instead of:

Page 51: RTDB

SELECT id, first_name, age FROM student_details WHERE SUBSTR(first_name,1,3) = 'Cha';

Write the query as

SELECT id, first_name, age FROM student_details WHERE first_name LIKE NVL ( :name, '%');

Instead of:

SELECT id, first_name, age FROM student_details WHERE first_name = NVL ( :name, first_name);

Write the query as

SELECT product_id, product_name FROM product WHERE unit_price BETWEEN MAX(unit_price) and MIN(unit_price)

Instead of:

SELECT product_id, product_name FROM product WHERE unit_price >= MAX(unit_price) and unit_price <= MIN(unit_price)

Write the query as

SELECT id, name, salary FROM employee WHERE dept = 'Electronics' AND location = 'Bangalore';

Instead of:

SELECT id, name, salary FROM employee WHERE dept || location= 'ElectronicsBangalore';

Use non-column expression on one side of the query because it will be processed earlier.

Write the query as

SELECT id, name, salary FROM employee WHERE salary < 25000;

Instead of:

Page 52: RTDB

SELECT id, name, salary FROM employee WHERE salary + 10000 < 35000;

Write the query as

SELECT id, first_name, age FROM student_details WHERE age > 10;

Instead of:

SELECT id, first_name, age FROM student_details WHERE age NOT = 10;

8) Use DECODE to avoid the scanning of same rows or joining the same table repetitively. DECODE can also be made used in place of GROUP BY or ORDER BY clause. For Example: Write the query as

SELECT id FROM employee WHERE name LIKE 'Ramesh%' and location = 'Bangalore';

Instead of:

SELECT DECODE(location,'Bangalore',id,NULL) id FROM employee WHERE name LIKE 'Ramesh%';

9) To store large binary objects, first place them in the file system and add the file path in the database.

10) To write queries which provide efficient performance follow the general SQL standard rules.

a) Use single case for all SQL verbsb) Begin all SQL verbs on a new linec) Separate all words with a single space d) Right or left aligning verbs within the initial SQL verb

Important Microsoft SQL Server Queries

Here i am going to tell you all the important queries used in Microsoft SQL Server. Just go through this page and you will become master in Microsoft SQL Server because all the queries used in daily programming are briefly explained here.

To create own database

create database demostart

Page 53: RTDB

To view the database details

sp_helpdb demostart

To open database

use demostart

To create table in the database

create table emp(eno int,ename varchar(10),hiredate datetime,salary int)

To see the table information

sp_help emp

sp_help:-

It is System defined stored procedure which gives us the information about a table.

To insert data in a table

insert into emp values (100,’king’,’10-may-09′,5000)insert into emp values(200,’dd’,’20-april-08′,4300)insert into emp values (102,’ee’,getdate(),6700) — default print date & time

To insert values in a particular column

insert into emp(eno,ename) values (300,’xyx’)

To see all the records

select *from emp

To delete table

drop table emp

To update ename with space where eno=100

Page 54: RTDB

update emp set ename=’ ‘ where eno=100

To add additional column (address and phno) in a table

alter table emp add address varchar(30),phno varchar(15)

To see all the records

select *from emp

To modify column size

alter table emp alter column address varchar(45)

To see the table information

sp_help emp

To modify column name(or rename)

sp_rename ‘emp.eno’,'empno’———– (old name)-(new name)

To rename table

sp_rename ‘emp’,'employee’

To rename of database name

sp_renamedb ‘demostart’,'demoversion’

Drop table column

alter table employee drop column address,phno

To see all the records

select *from employee

To update ename with ‘amir’ where salary is NULL

update employee set ename=’amir’ where salary is NULL

To add value 1000 in salary field (adding temporarly)

Page 55: RTDB

select salary+1000, salary from employee where empno=100

To add value 1000 in salary field (adding permanent)

update employee set salary= salary+1000 where empno=100

–operator in sql server–

1)Arithmatic :

+,-,/,*

2)comparison :

=,<,>,<=,>=,<>,like,between,in, is null.

3)logical :

not,and,or

Comparison operator

To see the list those employee who’s empno is equal to 100

select *from employee where empno=100

To see the list those employee who’s empno is not-equal to 100

select *from employee where empno<>100

To insert one more record

insert into employee values(101,’anuj’,’01-jan-10′,2500)

To see all the records

select *from employee

List of those employee whose salary is less than 6000

select *from employee where salary < 6000

in operator to show only 100 and 300 record not show in between

Page 56: RTDB

select *from employee where empno in(100,300)in :we can search for the data based upon the values that are specified in the brackets.the data will be retrieved if the employee number are 100 or 300)

between operator

The data is retrieved from the table based on a certain range of values starting fromthe lowerbound to the upperbound.select *from employee where empno between 100 and 200

like operator

used for pattern matching.Example:- names starting with a particular character or ending with a particular character.select *from employee where ename like ‘%r’ –’r’ should be the last char. of enameselect *from employee where ename like ‘a%’ –’a’ should be the first char. of ename

pattern matching with exact no. of characters

use _ (underscore operator)one _ sign means exactly 1 character.select *from employee where ename like ‘a___’

Logical operators:-

To show the list of all employeea whose empno is not equal to 100select *from employee where not empno=100

To show the list of all employees whose empno is 200 or ename equal to amir

select *from employee where empno=200 or ename=’amir’

To show the list of all employeea whose empno is 300 and ename equal to amir

select *from employee where empno=300 and ename=’amir’

Identity column

These are those column which are used for autogetneration of interger values in a particular sequence.Ex. we can use identity columns to autogenrate the ID’s in the categoryid column of the category table.There can only be 1 identity column per table.

create table categories(catid int identity(1,1), –first 1 : starting value

Page 57: RTDB

catname varchar(20), –2nd 1 : increamenting valuedescription varchar(20))

To insert records

insert into categories values(‘beverages’,'teas and coffees’)insert into categories values(‘cold drinks’,'fanta pepsi’)

To see all the records

select *from categories

Insert randomly.

set identity_insert categories on — when we identity_insert set is on then we cannot insert dublicate records in a table.set identity_insert categories off — when we identity_insert set is off then we can insert dublicate records in a table.

To know how many datatypes

select *from systypes

To view the names of the user defined tables in a database

select name from sysobjects where xtype=’u’

Constraints

These are the objects used to validate the data Entry in tables columns.Type of constraints :-   

1) primary key :-

checks for dublicate and null values.       primary key can be defined for single or composite columns.       one primary per table.   

2) unique keys :-

checks for dublicate values. one null value is allowed.   

Page 58: RTDB

3) checks :-

The data must obey the rules of comparison opeators(=,<,>,<=,>=,in,between,like).   

4) Rules :-

They are like the checks constraints but have certains differences.

      

Difference b/w check and Rules

       Check constraints are applied with the column definitons.       Rules are created separately and then can be attached with different columns of different tables.       checks constraints check for existing data.       Rules do not check for the existing data.

   

5) Not Null :-

we have to Enter the values on the columns where not null constraints is defined.   

6) Foreign key :-

used for Referential integrity.       if we insert a particular data in a table column, then matching data should be inserted       in the column where the foreign key is defined. The foreign key will check for data by       referencing the column where data is inserted. The Referenced column have primary key or       unique key associated with it.

       constraints can be created using the create table or alter table clause.

To define Constraints :-

create table consdemo(cno int primary key,cname varchar(10) unique,balance int check(balance >5000))

insert records

Page 59: RTDB

insert consdemo values (100,’abc’,4000) –cannot be accepted coz applied check i.e balance>5000.insert consdemo values (100,’abc’,6000) — accepted

insert consdemo values (101,’abc’,7000) –not accepted coz cno is primary key and cname applied, unique (i.e dublicate value not allowed)

To view the names of constraints define on a table

sp_helpconstraint consdemo

To drop the Constraints

alter table consdemo drop constraint — not run

insert records

insert consdemo values (102,’xyz’,54) — not run

put check constraint

alter table consdemo add constraint chk check(balance>50) –not run

use of foreign key

create table publishers(pubid int primary key,pubname varchar(20))

insert publishers values(1,’book bazar’)insert publishers values(2,’wiley Eastern’)

select *from publisherscreate table author(authid int primary key,pubid int references publishers(pubid),authorname varchar(20))insert author values(1,2,’Bill Evjen’)

select *from publishers

select *from author

Page 60: RTDB

insert author values(2,2,’Bill Evjen’)insert author values(3,1,’Bill Evjen’)

creating constraints by giving user defined names :–

create table consdd(cno int constraint pk primary key,cname varchar(10) constraint uk unique,salary int constraint chk1 check(salary>5000))

To view the names of constraints define on a table

sp_helpconstraint consdd

To drop the constraint chk in the table of consdd

alter table consdd drop constraint chk1

To view the names of constraints define on a table

sp_helpconstraint consdd

RULES:–

create & Apply RULE–

create rule xyz as @a>5000— @a : It denotes a variable in sql server known as a.

To Apply the Rule–

sp_bindrule xyz,’consdd.salary’–Note : we assign a rule i.e(@a>5000), so we cannot insert salary<5000)select *from consddinsert consdd values(101,'opk1',6100)-- acceptedinsert consdd values(102,'opk2',100) --not accepted(not follow the cond. i.e @a>5000)sp_helpconstraint consdd

To unbind the rule

sp_unbindrule ‘consdd.salary’insert consdd values(102,’opk2′,100)– after unbind the rule, then the query will be accepted.

Page 61: RTDB

To drop the Rule

drop rule xyz

To view the names of constraints define on a table

sp_helpconstraint consdd

–Another Example of create and applying Rulecreate rule adb as @a between ‘a’ and ‘d’— ‘a’ and ‘d’ is the ist char of cname.sp_bindrule adb, ‘consdd.cname’

insert consdd values(400,’edf’,3000)– not accepted coz starting char should a and d.insert consdd values(400,’daf’,3000)– not accepted coz starting char should a and d.

insert consdd values(500,’aef’,3000)– acceptedinsert consdd values(300,’adf’,3000)– acceptedinsert consdd values(600,’brf’,3000)– acceptedinsert consdd values(700,’cef’,3000)– accepted

select *from consdd

To unbind the rule

sp_unbindrule ‘consdd.cname’insert consdd values(400,’edf’,3000)–accepted (unbind rule)

To drop the Rule

drop rule adb

composite primary key–

— It checks that the particular row that contains a set of values (ex-> 100,abc)cannot be created again.create table comkey(cno int,cname varchar(10),address varchar(40),constraint pkk primary key(cno,cname))

To change the table name(i.e rename)

Page 62: RTDB

sp_rename ‘comkey’, ‘compkey’select *from compkeyinsert compkey values(100,’abd’,’23/sds’)insert compkey values(101,’abd’,’45/df’)

select *from compkey

insert compkey values(101,’bd’,’45/df’)insert compkey values(100,’abd’,’45/df’)

–Note: when we use composite key:- you cannot insert the same record at a time.–ex. we put the record cno=100 and cname=abd at ist time it will be accepted but— 2nd time cannot be accepted. but we put cno=100 and cname=xyz(i.e cname is different)— it will be accepted, or we put cno=200 (i,e cno is diff. value) and cname is same i.e cname=abd— it will be accepted.

Drop primary key

alter table compkey drop constraint pkk

on delete cascade clause–

— It is used on foreign keys when we are assigning the foreign key on a particular column.— It automatically deletes the values from the foreign key cloumn if the primary key value is deleted.

–Ex. If 100 is there in primary key column and also there in the foreign key column, then if 100 is–deleted on primary key column then 100 is automatically deleted from foreign key column.

create table casededemo(sno int constraint pkm primary key,sname varchar(10))

create table cascdata(sno int constraint fk foreign key references casededemo(sno) on delete cascade,sales int)–note: ‘on delete’ used for automatically deleted.

insert into casededemo values(200,’fgh’)insert into casededemo values(100,’xyz’)

select *from casededemo

Page 63: RTDB

insert into cascdata values(200,6000)insert into cascdata values(100,4000)select *from cascdata

delete from casededemo where sno=100 –Note : delete 100 from both table.

select *from cascdataselect *from casededemo

use of order by and group by clause—

–ORDER BY :- It is used to sort data either in ascending or descending order.select *from consdemo

–insert two records more in consdemo tableinsert into consdemo values(104,’xxx’,7000)insert into consdemo values(102,’sss’,8000)insert into consdemo values(103,’bbb’,7600)insert into consdemo values(101,’ccc’,7600)insert into consdemo values(105,’ddd’,7600)insert into consdemo values(106,’eee’,7600)

select *from consdemo

To apply ORDER BY in desc order.

select *from consdemo order by cno desc

GROUP BY—

— It is used to display the data from the aggregate function as well as the data from the table.

To count total balance

select balance,count(balance) from consdemo group by balance

–To count total cno with own column name i.e ‘ count of cno’select cno,count(cno) as “count of cno” from consdemo group by cno

RESTRICTION OF GROUP BY CLAUSES –

–HAVING clause–

select balance,count(balance)as “result” from consdemo group by balance having count(balance)>1

Page 64: RTDB

–use the top keyword in sql queries–

–Top keyword is used to display the data either from the begining or from the end or from any— given position in a table.

–Ex:- If we want to view the first 3 records or the last 3 records or the details of records— in a given range.

–To print the details of employee who earn the three highest salaries,we can use–Top keyword.

select *from consdemo–show top 2 records–select top 2 *from consdemo

–show last 2 recordselect top 2 *from consdemo order by cno desc

–Set operators in sql server–

— These are used to combine the result of 2 or more queries in a single output.–Ex: If we want to view to the details of a column known as salary from first table and also from the— 2nd table, we can use the set operators.

– Two union operators:-

1) union :- All the union values are displayed in the result.2) union All :- The different values (unique or non unique) are displayed in the result.

– To use the set operators:-

–1)Data type must be same for the columns mentioned in the query.–2)The number of columns mentioned in the query.

create table salesdetails(sno int,salary int)insert salesdetails values(300,5000)select *from salesdetails

– Now create another table which is a copy of the salesdetails table.create table salesunion(sno int,

Page 65: RTDB

salary int)insert salesunion values(200,1000)

select * into sales_union from salesdetails

select *from salesunionselect *from salesdetails

–To copy records from another table

select *into sales_details from salesdetailsselect *from sales_details

–To show salary of salesdetails and sno of salesunionselect salary from salesdetails union select sno from salesunion

–To show salary of salesdetails and cno of consdemoselect salary from salesdetails union select cno from consdemo

–temporary tables–create table #emp(eno int,ename varchar(10))insert #emp values(100,’abc’)select *from #empsp_help #emp

–SUBQUERIES–

–These are the sql statements which are Embedded inside another sql statements.–Ex: 1) To find the details of the person who earn the maximum salary or 3rd highest salary–we have to use the subqueries.–subqueries always written in parenthesis.–subqueries evaluate a particular data which can then be used as the input to the main query.–subqueries evaluate first.–subqueries can be nested 21 times.

–TYPES OF SUBQUERIES ––1) Single row subqueries.–2) Multiple Row subqueries.–3) subqueries using the exist operator.

–only 1 row is returned by the subquery.

Page 66: RTDB

–MULTIPLE ROW SUBQUERIES–

–subqueries–in a single row—select *from salesdetails where salary=(select max(salary) from salesdetails)

–subqueries–in a multiple row—select *from salesdetails where salary in(select salary from sales_details)insert sales_details values(300,3000)select *from sales_details

—subquery using exits operatorselect *from salesdetails where exists(select salary from salesunion where sno=100)select *from salesdetails where not exists(select salary from salesunion where sno=1600)

—joins—

create table table1(sno int,salary int)

insert table1 values(100,2000)insert table1 values(200,4000)insert table1 values(300,5000)insert table1 values(400,6000)insert table1 values(500,7000)

select *from table1

create table table2(sno int,salary int)insert table2 values(600,7000)insert table2 values(100,2000)insert table2 values(200,4000)insert table2 values(300,5000)

select *from table2–CROSS JOIN–select *from table1 cross join table2

–INNER JOIN—select s.sno,s.salary,t.sno,t.salary from table1 s inner join table2 t on s.sno=t.sno

Page 67: RTDB

–outer join–select s.sno,s.salary,t.sno,t.salary from table1 s left outer join table1 t on s.sno=t.sno

–right outer join–select s.sno,s.salary,t.sno,t.salary from table1 s right outer join table2 t on s.sno=t.sno

–full outer join–select s.sno,s.salary,t.sno,t.salary from table1 s full outer join table2 t on s.sno=t.sno

–self join–alter table table1 add sname varchar(10)update table2 set salary=300 where salary=2000update table1 set sname=’ddy’ where salary=300select *from table1select *from table2–query of self joinselect s.sno,t.salary,s.sname from table1 s inner join table2 t on s.sno=t.salary

–views–

select *from salesdetailsalter table salesdetails add sname varchar(10)insert salesdetails values(456,800,’ffg’)update salesdetails set sname=’ddy’ where sno=300

create view vm as select *from salesdetails where sno=100select *from vminsert vm values(456,800,’ffg’)–not allowed to add record in salesdetailsalter view vm as select *from salesdetails where sno=100 with check optioninsert vm values(456,800,’ffg’)sp_helptext vmdrop view vm

–PARTITION VIEWS–create view vm as select sno,salary from salesdetails union all select *from salesunionselect *from vm

–READONLY VIEW–create view vwd as select count(*) as “count”from salesdetailsselect *from vwdinsert vwd values (23) –can’t be insert coz it’s read only

—show all database viewsselect *from sysobjects where xtype=’v’

Page 68: RTDB

–to show view in one table

sp_depends salesdetails

–STORED PROCEDURE–

–1)create procedure proceasselect *from empproce

–2)create procedure demo(@a int,@b int)asdeclare@c intset @c=@a+@bprint ‘sum=’+ convert(varchar(10), @c)demo 2,3drop procedure demo–3)create procedure ques1(@a int ,@b int)asdeclare@c intset @c=@a+@bprint @c

ques1 2 ,3sp_helptext demo–4)create procedure ques2(@a varchar(10))asdeclare@b varchar(10)set @b=@aprint ‘b= ‘+@b

ques2 ‘hello pro’

–5)area of circlecreate procedure ques3(@pi float ,@r float )asdeclare@area float

Page 69: RTDB

set @area=@pi*@r*@rprint ‘area of cir=’+convert(varchar(10),@area)

ques3 3.14,2

–6)area of trianglecreate procedure ques4(@b int ,@h int)asdeclare@area intset @area=(@b*@h)/2print ‘area of tri=’+convert(varchar(10),@area)

ques4 3,5

drop procedure ques4sp_helptext ques4

–IF ELSE–

–1)w.a.proc to find the greatest b/w two nos.create procedure ques5(@a int,@b int)as

if @a>@bprint convert(varchar(5),@a)+’ a is greater’elseprint convert(varchar(5),@b)+’ b is greater’

ques5 3,5

–2)w.a.proc to find the greatest b/w three nos.create procedure ques6(@a int ,@b int ,@c int)asif @a>@bif @a>@cprint ‘ a is g’elseprint ‘c is g’elseif @b>@cprint ‘b is g’elseprint ‘c is g’

ques6 2,4,3

Page 70: RTDB

–3)extracting to databasecreate procedure ques7(@a int)asdeclare@nm varchar(10)select @nm= ename from emp where eno=@aprint @nm

ques7 101

select *from empdrop procedure ques7–4)extracting to databasecreate procedure ques8(@a varchar(10))asdeclare@nm varchar(10)select @nm= ename from emp where ename=@aif @nm=@aprint’found’elseprint’not found’

ques8 ‘radhika’

—LOOPING—

–1)while loopcreate procedure whloop (@a int)aswhile @a<20beginset @a=@a+1print @aendwhloop 10

--2)print rev ordercreate procedure whloop1(@no int)asdeclare@rev int,@rem intset @rev=0while @no>0beginset @rem=@no%10

Page 71: RTDB

set @rev=@rev*10+@remset @no=@no/10end

print @revwhloop1 1234 drop procedure whloop1

–3)NESTED PROCEDURE–create procedure firstasprint ‘first procedure’

create procedure secondasprint ’2nd procedure’exec first

second–FREE TEXT SEARCHINGcreate table ftextsearch(fno int primary key,ename varchar(20),address varchar(40))select *from ftextsearch

insert ftextsearch values(50,’xx’,’0-2LN’)select *from ftextsearch where contains (address,’0-2LN’)–select address from ftextsearch where address=’0-2LN’sp_help ftextsearchdrop table ftextsearch

I want an SQL query - select top 10 records except for the top 10 records.Asked by: Jerry_Pang

have some theory, i think it should work, i havnt figure it out yet.  I couldnt get my sql running.

I just need a SQL statement that

SELECT TOP 10 recordsFROM table

Page 72: RTDB

WHERE records NOT (SELECT TOP 10 from table)

its not working and it wont run as i expect it.

select top 10 * employeesfrom employeeswhere NOT employee_id IN (select top 10 employee_id from employee)

no views pls. just sql statement.

basically the query means select the next 10 records after the first top 10 records.

What is normalization? Explain different levels of normalization?

Check out the article Q100139 from Microsoft knowledge base and of course, there's much more information available in the net. It will be a good idea to get a hold of any RDBMS fundamentals text book, especially the one by C. J. Date. Most of the times, it will be okay if you can explain till third normal form.

What is de-normalization and when would you go for it?

As the name indicates, de-normalization is the reverse process of normalization. It is the controlled introduction of redundancy in to the database design. It helps improve the query performance as the number of joins could be reduced.

How do you implement one-to-one, one-to-many and many-to-many relationships while designing tables?

One-to-One relationship can be implemented as a single table and rarely as two tables with primary and foreign key relationships. One-to-Many relationships are implemented by splitting the data into two tables with primary key and foreign key relationships. Many-to-Many relationships are implemented using a junction table with the keys from both the tables forming the composite primary key of the junction table.

It will be a good idea to read up a database designing fundamentals text book.

Page 73: RTDB

What's the difference between a primary key and a unique key?

Both primary key and unique enforce uniqueness of the column on which they are defined. But by default primary key creates a clustered index on the column, where as unique creates a non-clustered index by default. Another major difference is that, primary key does not allow NULLs, but unique key allows one NULL only.

What are user defined data types and when you should go for them?

User defined data types let you extend the base SQL Server data types by providing a descriptive name, and format to the database. Take for example, in your database, there is a column called Flight_Num which appears in many tables. In all these tables it should be varchar(8). In this case you could create a user defined data type called Flight_num_type of varchar(8) and use it across all your tables.

See sp_addtype, sp_droptype in books online.

What is bit data type and what's the information that can be stored inside a bit column?

Bit data type is used to store Boolean information like 1 or 0 (true or false). Until SQL Server 6.5 bit data type could hold either a 1 or 0 and there was no support for NULL. But from SQL Server 7.0 onwards, bit data type can represent a third state, which is NULL.

Define candidate key, alternate key, composite key.

A candidate key is one that can identify each row of a table uniquely. Generally a candidate key becomes the primary key of the table. If the table has more than one candidate key, one of them will become the primary key, and the rest are called alternate keys.

Page 74: RTDB

A key formed by combining at least two or more columns is called composite key.

What are defaults? Is there a column to which a default cannot be bound?

A default is a value that will be used by a column, if no value is supplied to that column while inserting data. IDENTITY columns and timestamp columns can't have defaults bound to them. See CREATE DEFAULT in books online.

What is a transaction and what are ACID properties?

A transaction is a logical unit of work in which, all the steps must be performed or none. ACID stands for Atomicity, Consistency, Isolation, Durability. These are the properties of a transaction. For more information and explanation of these properties, see SQL Server books online or any RDBMS fundamentals text book.

Explain different isolation levels

An isolation level determines the degree of isolation of data between concurrent transactions. The default SQL Server isolation level is Read Committed. Here are the other isolation levels (in the ascending order of isolation): Read Uncommitted, Read Committed, Repeatable Read, Serializable. See SQL Server books online for an explanation of the isolation levels. Be sure to read about SET TRANSACTION ISOLATION LEVEL, which lets you customize the isolation level at the connection level.

CREATE INDEX myIndex ON myTable (myColumn)

What type of Index will get created after executing the above statement?

Page 75: RTDB

Non-clustered index. Important thing to note: By default a clustered index gets created on the primary key, unless specified otherwise.

What is the maximum size of a row?

8060 bytes. Do not be surprised with questions like 'What is the maximum number of columns per table'. Check out SQL Server books online for the page titled: "Maximum Capacity Specifications".

Explain Active/Active and Active/Passive cluster configurations

Hopefully you have experience setting up cluster servers. But if you do not, at least be familiar with the way clustering works and the two clustering configurations Active/Active and Active/Passive. SQL Server books online has enough information on this topic and there is a good white paper available on Microsoft site.

Explain the architecture of SQL Server

This is a very important question and you better be able to answer it if consider yourself a DBA. SQL Server books online is the best place to read about SQL Server architecture. Read up the chapter dedicated to SQL Server Architecture.

What is Lock Escalation?

Lock escalation is the process of converting a lot of low level locks (like row locks, page locks) into higher level locks (like table locks). Every lock is a memory structure too many locks would mean, more memory being occupied by locks. To prevent this from happening, SQL Server escalates the many fine-grain locks to fewer coarse-grain locks. Lock escalation threshold was definable in SQL Server 6.5, but from SQL Server 7.0 onwards it's dynamically managed by SQL Server.

Page 76: RTDB

What's the difference between DELETE TABLE and TRUNCATE TABLE commands?

DELETE TABLE is a logged operation, so the deletion of each row gets logged in the transaction log, which makes it slow. TRUNCATE TABLE also deletes all the rows in a table, but it will not log the deletion of each row, instead it logs the de-allocation of the data pages of the table, which makes it faster. Of course, TRUNCATE TABLE can be rolled back.

Explain the storage models of OLAP

Check out MOLAP, ROLAP and HOLAP in SQL Server books online for more information.

What are the new features introduced in SQL Server 2000 (or the latest release of SQL Server at the time of your interview)? What changed between the previous version of SQL Server and the current version?

This question is generally asked to see how current is your knowledge. Generally there is a section in the beginning of the books online titled "What's New", which has all such information. Of course, reading just that is not enough, you should have tried those things to better answer the questions. Also check out the section titled "Backward Compatibility" in books online which talks about the changes that have taken place in the new version.

What are constraints? Explain different types of constraints.

Constraints enable the RDBMS enforce the integrity of the database automatically, without needing you to create triggers, rule or defaults.

Types of constraints: NOT NULL, CHECK, UNIQUE, PRIMARY KEY, FOREIGN KEY

Page 77: RTDB

For an explanation of these constraints see books online for the pages titled: "Constraints" and "CREATE TABLE", "ALTER TABLE"

What is an index? What are the types of indexes? How many clustered indexes can be created on a table? I create a separate index on each column of a table. what are the advantages and disadvantages of this approach?

Indexes in SQL Server are similar to the indexes in books. They help SQL Server retrieve the data quicker.

Indexes are of two types. Clustered indexes and non-clustered indexes. When you create a clustered index on a table, all the rows in the table are stored in the order of the clustered index key. So, there can be only one clustered index per table. Non-clustered indexes have their own storage separate from the table data storage. Non-clustered indexes are stored as B-tree structures (so do clustered indexes), with the leaf level nodes having the index key and it's row locater. The row located could be the RID or the Clustered index key, depending up on the absence or presence of clustered index on the table.

If you create an index on each column of a table, it improves the query performance, as the query optimizer can choose from all the existing indexes to come up with an efficient execution plan. At the same time, data modification operations (such as INSERT, UPDATE, DELETE) will become slow, as every time data changes in the table, all the indexes need to be updated. Another disadvantage is that, indexes need disk space, the more indexes you have, more disk space is used.

What is RAID and what are different types of RAID configurations?

RAID stands for Redundant Array of Inexpensive Disks, used to provide fault tolerance to database servers. There are six RAID levels 0 through 5 offering different levels of performance, fault tolerance. MSDN has some information about RAID levels and for detailed information, check out the RAID advisory board's homepage

What are the steps you will take to improve performance of a poor performing query?

Page 78: RTDB

This is a very open ended question and there could be a lot of reasons behind the poor performance of a query. But some general issues that you could talk about would be: No indexes, table scans, missing or out of date statistics, blocking, excess recompilations of stored procedures, procedures and triggers without SET NOCOUNT ON, poorly written query with unnecessarily complicated joins, too much normalization, excess usage of cursors and temporary tables.

Some of the tools/ways that help you troubleshooting performance problems are:

SET SHOWPLAN_ALL ON, SET SHOWPLAN_TEXT ON, SET STATISTICS IO ON, SQL Server Profiler, Windows NT /2000 Performance monitor, Graphical execution plan in Query Analyzer.

Download the white paper on performance tuning SQL Server from Microsoft web site.

What are the steps you will take, if you are tasked with securing an SQL Server?

Again this is another open ended question. Here are some things you could talk about: Preferring NT authentication, using server, database and application roles to control access to the data, securing the physical database files using NTFS permissions, using an unguessable SA password, restricting physical access to the SQL Server, renaming the Administrator account on the SQL Server computer, disabling the Guest account, enabling auditing, using multi-protocol encryption, setting up SSL, setting up firewalls, isolating SQL Server from the web server etc.

Read the white paper on SQL Server security from Microsoft website. Also check out My SQL Server security best practices

What is a deadlock and what is a live lock? How will you go about resolving deadlocks?

Deadlock is a situation when two processes, each having a lock on one piece of data, attempt to acquire a lock on the other's piece. Each process  would wait indefinitely for the other to release the lock, unless one of the user processes is terminated. SQL Server detects deadlocks and terminates one user's process.

Page 79: RTDB

A livelock is one, where a  request for an exclusive lock is repeatedly denied because a series of overlapping shared locks keeps interfering. SQL Server detects the situation after four denials and refuses further shared locks. A livelock also occurs when read transactions monopolize a table or page, forcing a write transaction to wait indefinitely.

Check out SET DEADLOCK_PRIORITY and "Minimizing Deadlocks"  in SQL Server books online. Also check out the article Q169960 from Microsoft knowledge base.

What is blocking and how would you troubleshoot it?

Blocking happens when one connection from an application holds a lock and a second connection requires a conflicting lock type. This forces the second connection to wait, blocked on the first.

Read up the following topics in SQL Server books online: Understanding and avoiding blocking, Coding efficient transactions.

Explain CREATE DATABASE syntax

Many of us are used to creating databases from the Enterprise Manager or by just issuing the command:

CREATE DATABASE MyDB.

But what if you have to create a database with two file groups, one on drive C and the other on drive D with log on drive E with an initial size of 600 MB and with a growth factor of 15%? That's why being a DBA you should be familiar with the CREATE DATABASE syntax. Check out SQL Server books online for more information.

Page 80: RTDB

How to restart SQL Server in single user mode? How to start SQL Server in minimal configuration mode?

SQL Server can be started from command line, using the SQLSERVR.EXE. This EXE has some very important parameters with which a DBA should be familiar with. -m is used for starting SQL Server in single user mode and -f is used to start the SQL Server in minimal configuration mode. Check out SQL Server books online for more parameters and their explanations.

As a part of your job, what are the DBCC commands that you commonly use for database maintenance?

DBCC CHECKDB, DBCC CHECKTABLE, DBCC CHECKCATALOG, DBCC CHECKALLOC, DBCC SHOWCONTIG, DBCC SHRINKDATABASE, DBCC SHRINKFILE etc.

But there are a whole load of DBCC commands which are very useful for DBAs. Check out SQL Server books online for more information.

What are statistics, under what circumstances they go out of date, how do you update them?

Statistics determine the selectivity of the indexes. If an indexed column has unique values then the selectivity of that index is more, as opposed to an index with non-unique values. Query optimizer uses these indexes in determining whether to choose an index or not while executing a query.

Some situations under which you should update statistics:

1. If there is significant change in the key values in the index

Page 81: RTDB

2. If a large amount of data in an indexed column has been added, changed, or removed (that is, if the distribution of key values has changed), or the table has been truncated using the TRUNCATE TABLE statement and then repopulated

3. Database is upgraded from a previous version

Look up SQL Server books online for the following commands:

UPDATE STATISTICS, STATS_DATE, DBCC SHOW_STATISTICS, CREATE STATISTICS, DROP STATISTICS, sp_autostats, sp_createstats, sp_updatestats

What are the different ways of moving data/databases between servers and databases in SQL Server?

There are lots of options available, you have to choose your option depending upon your requirements. Some of the options you have are:

BACKUP/RESTORE, Detaching and attaching databases, Replication, DTS, BCP, logshipping, INSERT...SELECT, SELECT...INTO, creating INSERT scripts to generate data.

Explain different types of BACKUPs available in SQL Server? Given a particular scenario, how would you go about choosing a backup plan?

Types of backups you can create in SQL Sever 7.0+ are Full database backup, differential database backup, transaction log backup, filegroup backup. Check out the BACKUP and RESTORE commands in SQL Server

Page 82: RTDB

books online. Be prepared to write the commands in your interview. Books online also has information on detailed backup/restore architecture and when one should go for a particular kind of backup.

What is database replication? What are the different types of replication you can set up in SQL Server?

Replication is the process of copying/moving data between databases on the same or different servers. SQL Server supports the following types of replication scenarios:

    * Snapshot replication    * Transactional replication (with immediate updating subscribers, with queued updating subscribers)    * Merge replication

See SQL Server books online for in-depth coverage on replication. Be prepared to explain how different replication agents function, what are the main system tables used in replication etc.

How to determine the service pack currently installed on SQL Server?

The global variable @@Version stores the build number of the sqlservr.exe, which is used to determine the service pack installed. To know more about this process visit SQL Server service packs and versions.

What are cursors? Explain different types of cursors. What are the disadvantages of cursors? How can you avoid cursors?

Cursors allow row-by-row processing of the resultsets.

Types of cursors:

Page 83: RTDB

Static, Dynamic, Forward-only, Keyset-driven.

See books online for more information.

Disadvantages of cursors: Each time you fetch a row from the cursor, it results in a network roundtrip, where as a normal SELECT query makes only one round trip, however large the resultset is. Cursors are also costly because they require more resources and temporary storage (results in more IO operations). Further, there are restrictions on the SELECT statements that can be used with some types of cursors.

Most of the times, set based operations can be used instead of cursors. Here is an example:

If you have to give a flat hike to your employees using the following criteria:

Salary between 30000 and 40000 -- 5000 hikeSalary between 40000 and 55000 -- 7000 hikeSalary between 55000 and 65000 -- 9000 hike

In this situation many developers tend to use a cursor, determine each employee's salary and update his salary according to the above formula. But the same can be achieved by multiple update statements or can be combined in a single UPDATE statement as shown below:

UPDATE tbl_emp SET salary =CASE WHEN salary BETWEEN 30000 AND 40000 THEN salary + 5000WHEN salary BETWEEN 40000 AND 55000 THEN salary + 7000WHEN salary BETWEEN 55000 AND 65000 THEN salary + 10000END

Page 84: RTDB

Another situation in which developers tend to use cursors: You need to call a stored procedure when a column in a particular row meets certain condition. You don't have to use cursors for this. This can be achieved using WHILE loop, as long as there is a unique key to identify each row.

Write down the general syntax for a SELECT statements covering all the options.

Here's the basic syntax: (Also checkout SELECT in books online for advanced syntax).

SELECT select_list[INTO new_table_]FROM table_source[WHERE search_condition][GROUP BY group_by__expression][HAVING search_condition][ORDER BY order__expression [ASC | DESC] ]

What is a join and explain different types of joins?

Joins are used in queries to explain how different tables are related. Joins also let you select data from a table depending upon data from another table.

Types of joins:

INNER JOINs, OUTER JOINs, CROSS JOINs

OUTER JOINs are further classified as

Page 85: RTDB

LEFT OUTER JOINS, RIGHT OUTER JOINS and FULL OUTER JOINS.

For more information see pages from books online titled: "Join Fundamentals" and "Using Joins".

Can you have a nested transaction?

Yes, very much. Check out BEGIN TRAN, COMMIT, ROLLBACK, SAVE TRAN and @@TRANCOUNT

What is an extended stored procedure? Can you instantiate a COM object by using T-SQL?

An extended stored procedure is a function within a DLL (written in a programming language like C, C++ using Open Data Services (ODS) API) that can be called from T-SQL, just the way we call normal stored procedures using the EXEC statement. See books online to learn how to create extended stored procedures and how to add them to SQL Server.

Yes, you can instantiate a COM (written in languages like VB, VC++) object from T-SQL by using sp_OACreate stored procedure.

Also see books online for sp_OAMethod, sp_OAGetProperty, sp_OASetProperty, sp_OADestroy.

What is the system function to get the current user's user id?

USER_ID(). Also check out other system functions like

Page 86: RTDB

USER_NAME(), SYSTEM_USER, SESSION_USER, CURRENT_USER, USER, SUSER_SID(), HOST_NAME().

What are triggers? How many triggers you can have on a table? How to invoke a trigger on demand?

Triggers are special kind of stored procedures that get executed automatically when an INSERT, UPDATE or DELETE operation takes place on a table.

In SQL Server 6.5 you could define only 3 triggers per table, one for INSERT, one for UPDATE and one for DELETE. From SQL Server 7.0 onwards, this restriction is gone, and you could create multiple triggers per each action. But in 7.0 there's no way to control the order in which the triggers fire. In SQL Server 2000 you could specify which trigger fires first or fires last using sp_settriggerorder

Triggers cannot be invoked on demand. They get triggered only when an associated action (INSERT, UPDATE, DELETE) happens on the table on which they are defined.

Triggers are generally used to implement business rules, auditing. Triggers can also be used to extend the referential integrity checks, but wherever possible, use constraints for this purpose, instead of triggers, as constraints are much faster.

Till SQL Server 7.0, triggers fire only after the data modification operation happens. So in a way, they are called post triggers. But in SQL Server 2000 you could create pre triggers also. Search SQL Server 2000 books online for INSTEAD OF triggers.

Also check out books online for 'inserted table', 'deleted table' and COLUMNS_UPDATED()

Page 87: RTDB

There is a trigger defined for INSERT operations on a table, in an OLTP system. The trigger is written to instantiate a COM object and pass the newly inserted rows to it for some custom processing.

What do you think of this implementation? Can this be implemented better?

Instantiating COM objects is a time consuming process and since you are doing it from within a trigger, it slows down the data insertion process. Same is the case with sending emails from triggers. This scenario can be better implemented by logging all the necessary data into a separate table, and have a job which periodically checks this table and does the needful.

 

Ads

 

What is a self join? Explain it with an example.

Self join is just like any other join, except that two instances of the same table will be joined in the query. Here is an example: Employees table which contains rows for normal employees as well as managers. So, to find out the managers of all the employees, you need a self join.

CREATE TABLE emp(empid int,mgrid int,empname char(10))

INSERT emp SELECT 1,2,'Vyas'INSERT emp SELECT 2,3,'Mohan'INSERT emp SELECT 3,NULL,'Shobha'INSERT emp SELECT 4,2,'Shridhar'INSERT emp SELECT 5,2,'Sourabh'

Page 88: RTDB

SELECT t1.empname [Employee], t2.empname [Manager]FROM emp t1, emp t2WHERE t1.mgrid = t2.empid

Here is an advanced query using a LEFT OUTER JOIN that even returns the employees without managers (super bosses)

SELECT t1.empname [Employee], COALESCE(t2.empname, 'No manager') [Manager]FROM emp t1LEFT OUTER JOINemp t2ONt1.mgrid = t2.empid