ado.net 2.0: advanced data access patterns pablo castro dat408 ado.net technical lead microsoft...

27
ADO.NET 2.0: ADO.NET 2.0: Advanced Data Access Advanced Data Access Patterns Patterns Pablo Castro Pablo Castro DAT408 DAT408 ADO.NET Technical Lead ADO.NET Technical Lead Microsoft Corporation Microsoft Corporation

Upload: dylan-harper

Post on 05-Jan-2016

230 views

Category:

Documents


8 download

TRANSCRIPT

Page 1: ADO.NET 2.0: Advanced Data Access Patterns Pablo Castro DAT408 ADO.NET Technical Lead Microsoft Corporation

ADO.NET 2.0: ADO.NET 2.0: Advanced Data Access Advanced Data Access PatternsPatterns

Pablo CastroPablo CastroDAT408 DAT408 ADO.NET Technical LeadADO.NET Technical LeadMicrosoft CorporationMicrosoft Corporation

Page 2: ADO.NET 2.0: Advanced Data Access Patterns Pablo Castro DAT408 ADO.NET Technical Lead Microsoft Corporation

2

PRESENTATION

ASP.NETSystem.Web (2.0)CompilationConfigurationHandlersHostingMailManagementSecurityUI (2.0)Util

Presentation FrameworkSystem.Windows (3.0)

AnnotationsAutomationControlsDataDocumentsInkInteropNavigationResourcesSerializationShapesThreading

System.Windows.Media (3.0)

3DAnimationImagine

Windows FormsSystem.Windows.Forms (2.0)

DesignLayoutVisualStyles

Printing SubsystemSystem.Printing (3.0)GDI+System.Drawing

DesignDrawing2DPrinting

ASP.NET Data ManagementSystem.WebCachingProfileSessionState

XML Data System.XmlSchemaSerializationXPathXsl

Windows File SystemSystem.Storage (F)CoreAudioCalendarContactsDocumentsImageMediaMessagesRulesSyncVideo

XPS DocumentsSystem.Windows.Xps (3.0)System.IO.Packaging (3.0)Speech IntegrationSystem.Speech (3.0)RecognitionSynthesis

Language Integrated QuerySystem.Query (F)System.Data.DLinq (F)System.Xml.XLinq (F)System.Expressions (F)

Windows Workflow FoundationSystem.Workflow (3.0)System.Workflow.Activities (3.0)

RulesSystem.Workflow.ComponentModel (3.0)

ComplierDesingerSerializaztion

System.Workflow.Runtime (3.0)HostingMessaging

Windows Communications FoundationSystem.ServiceModel (3.0)

ChannelsConfigurationDiagnosticsIntegrationQueueHelper

System.ServiceModel.Security(3.0)

ProtocolsTokens

Network Class LibrarySystem.Net CacheMail (2.0)Network Information (2.0)Security (2.0)Sockets

.NET RemotingSystem.Runtime.Remoting

ASMX Web ServicesSystem.Web.Services

Identity ManagementMicrosoft.InfoCards (3.0)

MSMQSystem.Messaging

Directory ServicesSystem.DirectoryServices

COMMUNICATION

FUNDAMENTALS

BASE CLASS LIBRARIESSystem

System.CodeDomSystem.ComponentModelSystem.Diagonostics

System.IO (2.0)

System.Resrouces

System.Text

System.ServiceProcess

System.ThreadingSystem.TimersSystem.EnterpriseServicesSystem.Transactions (2.0)

.NET RemotingSystem.Runtime.Remoting

System.TextGeneric (2.0)

System.Reflection

EmailSystem.Configuration

System.TextAccessControl (2.0)Cryptography (2.0)PermissionsPolicyPrincipal (2.0)

Managed Add-In FrameworkSystem.Addins (F)

Contact

Microsoft.Build (2.0)

System.RuntimeCompilerServicesConstrainedExecution(2.0)InteropServicesHostingSerializationVersioning

“ClickOnce” DeploymentSystem.Deployoment (2.0)

WINDOWS VISTA

(2.0) (3.0) - New

(2.0) - Substantially Improved

- Windows Presentation Foundation(formerly codenamed “Avalon”)

- Windows Communication Foundation

(formerly codenamed “Indigo”)

- Windows Workflow Foundation

KEY

ADO.NETSystem.DataCommonOdbcOleDbOracleClientSqlSqlClient

DATA

Page 3: ADO.NET 2.0: Advanced Data Access Patterns Pablo Castro DAT408 ADO.NET Technical Lead Microsoft Corporation

3

AgendaAgenda

It’s all about moving data around…It’s all about moving data around…

From the application to the databaseFrom the application to the databaseMoving tons of data fastMoving tons of data fast

From the database to the cacheFrom the database to the cacheDataSet as a cacheDataSet as a cache

Maintaining a cache in syncMaintaining a cache in sync

From the cache to the applicationFrom the cache to the applicationQuerying the cacheQuerying the cache

yes, query over DataSet :)yes, query over DataSet :)

Page 4: ADO.NET 2.0: Advanced Data Access Patterns Pablo Castro DAT408 ADO.NET Technical Lead Microsoft Corporation

4

Moving Tons of DataMoving Tons of Data

Getting data from the server fast is Getting data from the server fast is easyeasy

DataReader objects do most of the workDataReader objects do most of the work

Going back to the server fast is Going back to the server fast is harderharder

Sending INSERT/UPDATE/DELETE Sending INSERT/UPDATE/DELETE statements separately is very slowstatements separately is very slow

Batching statements helpsBatching statements helpsUpdateBatchSize property in the adapterUpdateBatchSize property in the adapter

SqlClient/OracleClient – DataSet-onlySqlClient/OracleClient – DataSet-only

Parameter arraysParameter arraysOn databases/providers that support itOn databases/providers that support it

Page 5: ADO.NET 2.0: Advanced Data Access Patterns Pablo Castro DAT408 ADO.NET Technical Lead Microsoft Corporation

5

Client Client Server: Server: BCP+DMLBCP+DML

Useful when need to send lots of Useful when need to send lots of changeschanges

Use SqlBulkCopy in 2.0Use SqlBulkCopy in 2.0

Update sequence:Update sequence:First, bulk-insert all the changes into the First, bulk-insert all the changes into the serverserver

Use a temporary tableUse a temporary table

Second, run a DML statement per Second, run a DML statement per change typechange type

One INSERTs, one for UPDATEs, one for One INSERTs, one for UPDATEs, one for DELETEsDELETEs

Do everything in the same transactionDo everything in the same transaction

Page 6: ADO.NET 2.0: Advanced Data Access Patterns Pablo Castro DAT408 ADO.NET Technical Lead Microsoft Corporation

6

Client Client Server: Server: BCP+DMLBCP+DML

Serious performance gainsSerious performance gainsEven several Even several timestimes faster in some faster in some scenariosscenarios

Fine printFine printFor SQL Server, the database needs to For SQL Server, the database needs to be in simple or bulk-logged recovery be in simple or bulk-logged recovery modemode

tempdb is in simple mode by default so it’s tempdb is in simple mode by default so it’s okok

It works in “full” mode, but BCP is slowerIt works in “full” mode, but BCP is slower

Talk to your DBA about the implicationsTalk to your DBA about the implications

Page 7: ADO.NET 2.0: Advanced Data Access Patterns Pablo Castro DAT408 ADO.NET Technical Lead Microsoft Corporation

7

Putting All TogetherPutting All Together

For example… in a web application:For example… in a web application:

Consume HTTP request as a streamConsume HTTP request as a streamBuild a DataReader over the requestBuild a DataReader over the request

Feed the DataReader to BCPFeed the DataReader to BCPSends data to the server in streaming Sends data to the server in streaming modemode

Kick off the DML statementsKick off the DML statementsUse an asynchronous ASP.NET pageUse an asynchronous ASP.NET page

Execute an asynchronous ADO.NET Execute an asynchronous ADO.NET commandcommand

Page 8: ADO.NET 2.0: Advanced Data Access Patterns Pablo Castro DAT408 ADO.NET Technical Lead Microsoft Corporation

8

AgendaAgenda

It’s all about moving data around…It’s all about moving data around…

From the application to the databaseFrom the application to the databaseMoving tons of data fastMoving tons of data fast

From the database to the cacheFrom the database to the cacheDataSet as a cacheDataSet as a cache

Maintaining a cache in syncMaintaining a cache in sync

From the cache to the applicationFrom the cache to the applicationQuerying the cacheQuerying the cache

yes, query over DataSet :)yes, query over DataSet :)

Page 9: ADO.NET 2.0: Advanced Data Access Patterns Pablo Castro DAT408 ADO.NET Technical Lead Microsoft Corporation

9

DataView update time

4000

8000

1600

0

3200

0

6400

0

1280

00

2560

00

Rows

Tim

e

.NET 1.1 .NET 2.0 B2

Using dataset as a cacheUsing dataset as a cache

DataSet has been DataSet has been greatly enhancedgreatly enhanced

Better scalability Better scalability across the board across the board Incremental Incremental index updatesindex updatesLow-overhead Low-overhead DataView DataView maintenancemaintenanceFaster, more Faster, more compact serializatiocompact serializationn

We can now handle We can now handle huge cacheshuge caches

Cache granularity Cache granularity becomes importantbecomes important

Page 10: ADO.NET 2.0: Advanced Data Access Patterns Pablo Castro DAT408 ADO.NET Technical Lead Microsoft Corporation

10

SqlDependency For SqlDependency For CachingCaching

SqlDependency can notify of changesSqlDependency can notify of changesTrack queries and fire an event when the Track queries and fire an event when the results would changeresults would change

It tracks whole result-setsIt tracks whole result-sets

SqlDependency and cachingSqlDependency and cachingHaving change notifications helps in Having change notifications helps in cachingcaching

No need for cache expiration policiesNo need for cache expiration policiesExpire cache when the underlying data Expire cache when the underlying data changedchanged

Page 11: ADO.NET 2.0: Advanced Data Access Patterns Pablo Castro DAT408 ADO.NET Technical Lead Microsoft Corporation

11

SqlDependency + DataSetSqlDependency + DataSet

Middle-tier cacheMiddle-tier cacheCache data in a DataSetCache data in a DataSet

Track the queries used to load the Track the queries used to load the DataSet using SqlDependencyDataSet using SqlDependency

When change notifications comeWhen change notifications comeSimply get rid of the DataSetSimply get rid of the DataSet

Load a new one on the next requestLoad a new one on the next request

Page 12: ADO.NET 2.0: Advanced Data Access Patterns Pablo Castro DAT408 ADO.NET Technical Lead Microsoft Corporation

12

Controlling GranularityControlling Granularity

For large caches, getting rid of all the For large caches, getting rid of all the data in the cache is not an optiondata in the cache is not an option

SqlDependency doesn’t have SqlDependency doesn’t have granularity control, but…granularity control, but…

Page 13: ADO.NET 2.0: Advanced Data Access Patterns Pablo Castro DAT408 ADO.NET Technical Lead Microsoft Corporation

13

Controlling GranularityControlling Granularity

Use a large Use a large DataSetDataSet

Load one Load one “segment” at a “segment” at a timetime

e.g., on cache misse.g., on cache miss

On change On change notificationnotification

Use .Select to find Use .Select to find the invalidated the invalidated segmentsegment

Remove it from the Remove it from the cachecache

Next request will fill Next request will fill it up againit up again

DataTableDataTable

SqlDependencSqlDependencyy

SqlDependencSqlDependencyy

SqlDependencSqlDependencyy

SqlDependencSqlDependencyy

ChangeChange SqlDependencSqlDependencyy

Page 14: ADO.NET 2.0: Advanced Data Access Patterns Pablo Castro DAT408 ADO.NET Technical Lead Microsoft Corporation

14

AgendaAgenda

It’s all about moving data around…It’s all about moving data around…

From the application to the databaseFrom the application to the databaseMoving tons of data fastMoving tons of data fast

From the database to the cacheFrom the database to the cacheDataSet as a cacheDataSet as a cache

Maintaining a cache in syncMaintaining a cache in sync

From the cache to the applicationFrom the cache to the applicationQuerying the cacheQuerying the cache

Yes, query over DataSet :)Yes, query over DataSet :)

Page 15: ADO.NET 2.0: Advanced Data Access Patterns Pablo Castro DAT408 ADO.NET Technical Lead Microsoft Corporation

15

Using DataSet as a CacheUsing DataSet as a Cache

DataSet enhancements enable huge DataSet enhancements enable huge cachescaches

e.g. millions of rowse.g. millions of rows

With lots of data…With lots of data…It’s hard to navigate through dataIt’s hard to navigate through data

It’s more practical to query over the dataIt’s more practical to query over the data

DataSet does some query…DataSet does some query………but it’s often not enoughbut it’s often not enough

Page 16: ADO.NET 2.0: Advanced Data Access Patterns Pablo Castro DAT408 ADO.NET Technical Lead Microsoft Corporation

16

Query Over DataSetQuery Over DataSet

We get requests for this every We get requests for this every week :)week :)

What does it mean?What does it mean?Today, you can filter/sort over a single Today, you can filter/sort over a single tabletable

Joins are the main missing pieceJoins are the main missing piece

What does it take to do it?What does it take to do it?Most base services are there in .NET 2.0Most base services are there in .NET 2.0

Enough to build basic scenarios, some Enough to build basic scenarios, some cases will be hard to do efficientlycases will be hard to do efficiently

Page 17: ADO.NET 2.0: Advanced Data Access Patterns Pablo Castro DAT408 ADO.NET Technical Lead Microsoft Corporation

17

Representing a QueryRepresenting a Query

““Algebra tree” is a simple optionAlgebra tree” is a simple optionNodes represent logical query operatorsNodes represent logical query operators

Free from syntax constructsFree from syntax constructs

Full query syntax is “nice” but not Full query syntax is “nice” but not neededneeded

Easy to extend this sample with syntax Easy to extend this sample with syntax supportsupport

Create a compiler that produces algebraic Create a compiler that produces algebraic treestrees

Page 18: ADO.NET 2.0: Advanced Data Access Patterns Pablo Castro DAT408 ADO.NET Technical Lead Microsoft Corporation

18

Algebraic OperatorsAlgebraic Operators

ProjectProject

FilterFilter

JoinJoin

SortSort

……

Project Project (name, date)(name, date)

Join (c.id = Join (c.id = o.customerId)o.customerId)

SELECT c.name, o.dateFROM Customers c INNER JOIN Orders oON c.id = o.customerIdWHERE o.status = ‘Pending’

Scan Scan (Customers)(Customers)

Filter Filter (status=‘…’)(status=‘…’)

ScanScan(Orders)(Orders)

Page 19: ADO.NET 2.0: Advanced Data Access Patterns Pablo Castro DAT408 ADO.NET Technical Lead Microsoft Corporation

19

Compiling A QueryCompiling A Query

Project Project (name, (name, date)date)

Join (c.id = Join (c.id = o.customerIo.customerI

d)d)

Scan Scan (Customers)(Customers)

Filter Filter (status=‘…’(status=‘…’

))

ScanScan(Orders)(Orders)

Projection Projection iteratoriterator

Nested-loop Nested-loop join iteratorjoin iterator

Base-table Base-table scan scan

iteratoriterator

Filtered-Filtered-scan scan

iteratoriterator

+ Index + Index (DataVie(DataVie

w)w)

Indexed join Indexed join iteratoriterator

Indexed Indexed accessaccess

Page 20: ADO.NET 2.0: Advanced Data Access Patterns Pablo Castro DAT408 ADO.NET Technical Lead Microsoft Corporation

20

Executing A QueryExecuting A Query

Traditional approach using “iterators”Traditional approach using “iterators”

Build an iterator tree from the logical Build an iterator tree from the logical treetree

We’re skipping the physical plan for We’re skipping the physical plan for simplicitysimplicity

Iterators represent physical Iterators represent physical operatorsoperators

I.E. how the operator is implementedI.E. how the operator is implemented

Consume from the top-level iteratorConsume from the top-level iterator

Page 21: ADO.NET 2.0: Advanced Data Access Patterns Pablo Castro DAT408 ADO.NET Technical Lead Microsoft Corporation

21

Operators: Operators: Projection/FilterProjection/Filter

Projection (“select”)Projection (“select”)Simply keep track of the required Simply keep track of the required columnscolumns

Expressions implemented with DataSet Expressions implemented with DataSet expressionsexpressions

Filter (“where”)Filter (“where”)Implemented with DataTable.SelectImplemented with DataTable.Select

Indexes created on-demand by .SelectIndexes created on-demand by .SelectIf not already thereIf not already there

A DataView can be used as a permanent A DataView can be used as a permanent indexindex

Page 22: ADO.NET 2.0: Advanced Data Access Patterns Pablo Castro DAT408 ADO.NET Technical Lead Microsoft Corporation

22

Operators: SortOperators: Sort

Sort (“order by”)Sort (“order by”)DataSet surfaces this together with filterDataSet surfaces this together with filter

DataTable.Select or a DataViewDataTable.Select or a DataView

Will also use indexesWill also use indexes

Need to collapse operatorsNeed to collapse operatorsOr do one (filter/sort) manuallyOr do one (filter/sort) manually

Page 23: ADO.NET 2.0: Advanced Data Access Patterns Pablo Castro DAT408 ADO.NET Technical Lead Microsoft Corporation

23

Operators: JoinOperators: Join

JoinJoinNot supported by DataSet out-of-the-boxNot supported by DataSet out-of-the-box

Different strategiesDifferent strategiesDo a merge join if inputs already sortedDo a merge join if inputs already sorted

Do a relationship-based sort if found a Do a relationship-based sort if found a DataRelationDataRelation

Do a index-based join if one side of the join Do a index-based join if one side of the join has an “index” (DataView)has an “index” (DataView)

Do a nested-loop join as the last resortDo a nested-loop join as the last resort

Optionally, create indexes on demandOptionally, create indexes on demand

Page 24: ADO.NET 2.0: Advanced Data Access Patterns Pablo Castro DAT408 ADO.NET Technical Lead Microsoft Corporation

24

Operators: The hard onesOperators: The hard ones

DataView/Select don’t take DataView/Select don’t take comparison operator as inputcomparison operator as input

Only “==“ or “!=“Only “==“ or “!=“

So indexed range queries cannot take So indexed range queries cannot take advantage of indexesadvantage of indexes

Not many workaroundsNot many workaroundsNon-indexed queriesNon-indexed queries

Roll your own index :(Roll your own index :(

Page 25: ADO.NET 2.0: Advanced Data Access Patterns Pablo Castro DAT408 ADO.NET Technical Lead Microsoft Corporation

25

SummarySummary

ADO.NET 2.0 provides tons of ADO.NET 2.0 provides tons of flexible servicesflexible services

Easy scenarios are straightforwardEasy scenarios are straightforward

Hard scenarios possible with some extra Hard scenarios possible with some extra codecode

ADO.NET continues to evolve to ADO.NET continues to evolve to match modern application match modern application requirementsrequirements

Page 26: ADO.NET 2.0: Advanced Data Access Patterns Pablo Castro DAT408 ADO.NET Technical Lead Microsoft Corporation

26

ResourcesResources

BlogsBlogsTeam blog: Team blog: http://http://blogs.msdn.com/dataaccessblogs.msdn.com/dataaccess

http://http://blogs.msdn.com/aconradblogs.msdn.com/aconrad

http://http://blogs.msdn.com/angelsbblogs.msdn.com/angelsb

http://http://blogs.msdn.com/sushilcblogs.msdn.com/sushilc

Data-access MSDN siteData-access MSDN sitehttp://http://msdn.microsoft.commsdn.microsoft.com/data/data

http://http://msdn.microsoft.com/data/DataAccess/Whidbeymsdn.microsoft.com/data/DataAccess/Whidbey

NewsgroupsNewsgroupsNNTP server: NNTP server: msnews.microsoft.commsnews.microsoft.com

Group: Group: microsoft.public.dotnet.framework.adonetmicrosoft.public.dotnet.framework.adonet

Page 27: ADO.NET 2.0: Advanced Data Access Patterns Pablo Castro DAT408 ADO.NET Technical Lead Microsoft Corporation

© 2005 Microsoft Corporation. All rights reserved.This presentation is for informational purposes only. Microsoft makes no warranties, express or implied, in this summary.