1 2 3 the result is all sales territory country

53

Upload: adela-booth

Post on 04-Jan-2016

235 views

Category:

Documents


5 download

TRANSCRIPT

Page 1: 1 2 3 The result is ALL Sales Territory Country
Page 2: 1 2 3 The result is ALL Sales Territory Country

Querying in DAX

Marco Russo

DBI-B315

Page 3: 1 2 3 The result is ALL Sales Territory Country

How Many of You…

Have used MDX on a solution?Love MDX? Well… at least like it! Understand cell-by-cell versus block mode evaluation?Can write a medium complex MDX query in a textbox?Understand the FREEZE statement?Believe that MDX is utterly complicated?

Page 4: 1 2 3 The result is ALL Sales Territory Country

Why Another Language?SQL Server 2012 Analysis Services

Multidimensional: MDXTabular: new engineRequires a new languageDAX is the answer

DAX is designed to be SimpleFastNot necessarily easy

Page 5: 1 2 3 The result is ALL Sales Territory Country

Evolution of DAXPowerPivot 2010

Introduction of DAX as a programming languageCalculated Columns, MeasuresFilter Context / Evaluation Context

Excel is the preferred clientMDX is the query language

PowerPivot 2012 / Analysis Services 2012DAX becomes a query languageSQL Server Profiler fires Query Plan eventsOptimization is now feasible, even for normal human beings!

Page 6: 1 2 3 The result is ALL Sales Territory Country

Agenda• Basic DAX Query Language Features• Query Functions in DAX• Handling Complex Filters• Basic Information on the DAX Query Plan• SQL Server Profiler• Optimization Examples

Page 7: 1 2 3 The result is ALL Sales Territory Country

DAX Query Syntax

Page 8: 1 2 3 The result is ALL Sales Territory Country

DAX as a query language

Only one statement

EVALUATEFollowed by any DAX table expressionReturns the result as a dataset

Looks simple!

Page 9: 1 2 3 The result is ALL Sales Territory Country

EVALUATE syntax[DEFINE

{ MEASURE <table>[<col>] = <expression> }]

EVALUATE <Table Expression>

[ORDER BY {<expression> [{ASC | DESC}]} [, …]

[START AT {<value>|<parameter>} [, …]] ]

Page 10: 1 2 3 The result is ALL Sales Territory Country

EVALUATE exampleEVALUATE

'Sales Territory'ORDER BY 'Sales Territory'[Sales Territory Group] ASC, 'Sales Territory'[Sales Territory Country] ASC, 'Sales Territory'[Sales Territory Region]START AT

"North America", "United States"

Page 11: 1 2 3 The result is ALL Sales Territory Country

EVALUATE and FILTEREVALUATE

FILTER ( FILTER ( 'Internet Sales', [Ship Date] > [Due Date] ), 'Internet Sales'[Unit Price] > 2 )

Page 12: 1 2 3 The result is ALL Sales Territory Country

CALCULATETABLEEVALUATE

CALCULATETABLE ( 'Sales Territory', [Sales Territory Country] = "United States"

)

ORDER BY 'Sales Territory'[Sales Territory Country]

Page 13: 1 2 3 The result is ALL Sales Territory Country

CALCULATE evaluation orderEVALUATE

CALCULATETABLE ( CALCULATETABLE ( 'Sales Territory', ALL ( [Sales Territory Country] ) ), [Sales Territory Country] = "United States" )

1

2

3

The result is ALL Sales Territory Country

Page 14: 1 2 3 The result is ALL Sales Territory Country

ADDCOLUMNSEVALUATE

ADDCOLUMNS ( 'Internet Sales', "Margin", ([Sales Amount] - [Total Product Cost]))

Page 15: 1 2 3 The result is ALL Sales Territory Country

SUMMARIZEEVALUATE

SUMMARIZE( 'Internet Sales', 'Internet Sales'[Order Date], "Sales Amount", SUM ( 'Internet Sales'[Sales Amount] ) )

Source Table

GROUP BY

Expression to evaluate

Page 16: 1 2 3 The result is ALL Sales Territory Country

SUMMARIZE with ROLLUPEVALUATE

SUMMARIZE( 'Internet Sales', ROLLUP ( 'Date'[Calendar Year] ), "Sales Amount", SUM('Internet Sales'[Sales Amount]) )

Compute sub-totals at the Year

level

Page 17: 1 2 3 The result is ALL Sales Territory Country

CROSSJOINEVALUATE

ADDCOLUMNS ( CROSSJOIN ( DISTINCT ( 'Product'[Color] ), DISTINCT ( 'Product'[Size] ) ), "Products", COUNTROWS ( RELATEDTABLE ( Product ) ) )

Page 18: 1 2 3 The result is ALL Sales Territory Country

Query measuresDEFINE MEASURE 'Product Category'[Subcategories Count] = COUNTROWS ( RELATEDTABLE ('Product Subcategory' ) ) MEASURE 'Product Category'[Products Count] = COUNTROWS ( RELATEDTABLE ('Product' ) )EVALUATE ADDCOLUMNS ( 'Product Category', "SubCategories", [Subcategories Count], "Products Count", 'Product Category'[Products Count] )

Page 19: 1 2 3 The result is ALL Sales Territory Country

Query measures are fast!MultidimensionalCalculated members defined in the query can be very slowLimitations in the cache usage

TabularNo performance lossCache always active, no query vs global cacheIf Formula Engine is involved, no parallelism happens

Very useful during development

Page 20: 1 2 3 The result is ALL Sales Territory Country

GENERATE and a complex exampleEVALUATE GENERATE ( VALUES ( 'Product Category'[Product Category Name] ), FILTER ( CALCULATETABLE ( ADDCOLUMNS ( TOPN ( 3, VALUES ( 'Product'[Product Name] ), [Internet Total Sales] ), "Sales", [Internet Total Sales] ) ), [Sales] > 0 ) )

Page 21: 1 2 3 The result is ALL Sales Territory Country

Querying Tabular – First recap

DAX as a query languageVery simple: one single instruction makes everything: EVALUATEA query is a function call that returns a tableAny DAX function can be used in the queryQuery-defined measures work very well

At this point, we know DAX as a query language…

Page 22: 1 2 3 The result is ALL Sales Territory Country

Optimizing DAX

Page 23: 1 2 3 The result is ALL Sales Territory Country

Optimizing DAX is important

Data ModelTabular is a simple data modelNot many options to optimize the modelDAX is the key to gain optimal performances

Query PlanMultidimensional has no query plansTabular makes the query plan visible

Page 24: 1 2 3 The result is ALL Sales Territory Country

DAX query architecture

Analysis Services 2012Tabular Model

In-Memory Mode VertiPaq StorageQuery

Storage Engine Query

DAX / MDX query

DirectQuery Mode External Data Sources

Query SQL Query

Process = Read Data from External Sources

Page 25: 1 2 3 The result is ALL Sales Territory Country

Query engines

FE - Formula EngineStill single threadedHandles Complex Expressions

SE - Storage EngineWhat?Storage Engine in an in-memory database?Yes, we like to call it this way! xVelocity in-memory Analytics Engineaka VertiPaq

Optimizing means using the SE and avoiding FE

Page 26: 1 2 3 The result is ALL Sales Territory Country

Microsoft view of Tabular: Rich & Fast DAX VertiPaq Query

Rich

Single thread per query

Designed for expressivity

Simple

One core per segment

Optimized for speed

Page 27: 1 2 3 The result is ALL Sales Territory Country

Formula engine

Page 28: 1 2 3 The result is ALL Sales Territory Country

Formula engine Vertipaq

28

Page 29: 1 2 3 The result is ALL Sales Territory Country

DAX query evaluation flowBuild DAX Expression Tree

Buld DAX Logical Plan

Simplify DAX Logical Plan

Build DAX Physical Plan

Execute DAX Physical Plan

Some Vertipaq queries can be executedin order to gather additional informationabout data that is to be queried

Fire Logical Plan Event

Fire Physical Plan Event

Page 30: 1 2 3 The result is ALL Sales Territory Country

SQL Server Profiler

Catches events from SSASQueries Events Query Processing

Page 31: 1 2 3 The result is ALL Sales Territory Country

DAX Studio – daxstudio.codeplex.comFuture release will capture Dax Query Plan

Page 32: 1 2 3 The result is ALL Sales Territory Country

Monitoring a queryEVALUATE

CALCULATETABLE ( SUMMARIZE ( 'Internet Sales', Geography[State Province Code], "Sales", SUM ( 'Internet Sales'[Sales Amount] ) ), FILTER ( Customer, Customer[Last Name] = "Anand" ))

Page 33: 1 2 3 The result is ALL Sales Territory Country

1° VertiPaq querySELECT [Customer].[CustomerKey]FROM [Customer]WHERE

( PFDATAID( [Customer].[LastName] ) = 81 )

Page 34: 1 2 3 The result is ALL Sales Territory Country

2° VertiPaq querySELECT [Geography].[StateProvinceCode] FROM [Internet Sales]

LEFT OUTER JOIN [Customer] ON [Internet Sales].[CustomerKey] = [Customer].[CustomerKey]

LEFT OUTER JOIN [Geography] ON [Customer].[GeographyKey] = [Geography].[GeographyKey]WHERE

[Customer].[CustomerKey] IN (11096, 11989, 17005, 22513, 28899, 15054, 19626, 20344, 25918, 27141... [74 total values, not all displayed]);

Page 35: 1 2 3 The result is ALL Sales Territory Country

DAX logical query planCalculateTable AddColumns Scan_VertiPaq GroupBy_VertiPaq Scan_VertiPaq Sum_VertiPaq Scan_VertiPaq 'Internet Sales'[Sales Amount] Filter_VertiPaq Scan_VertiPaq 'Customer'[Last Name] = Anand

Page 36: 1 2 3 The result is ALL Sales Territory Country

Query plans side by side

CalculateTable AddColumns Scan_VertiPaq GroupBy_VertiPaq Scan_VertiPaq Sum_VertiPaq Scan_VertiPaq [Sales Amount] Filter_VertiPaq Scan_VertiPaq 'Customer'[Last Name] = Anand

EVALUATECALCULATETABLE( SUMMARIZE( 'Internet Sales', Geography[State Province Code], "Sales", SUM( [Sales Amount] ) ), FILTER( Customer, Customer[Last Name] = "Anand" ))

Page 37: 1 2 3 The result is ALL Sales Territory Country

Query running – Step 1SELECT [Geography].[StateProvinceCode], SUM([Internet Sales].[SalesAmount])FROM [Internet Sales]

LEFT OUTER JOIN [Customer] ON [Internet Sales].[CustomerKey]=[Customer].[CustomerKey]

LEFT OUTER JOIN [Geography] ON [Customer].[GeographyKey]=[Geography].[GeographyKey]WHERE

[Customer].[CustomerKey] IN (11096, 11989, ...[74 total values, not all displayed]) VAND

[Geography].[StateProvinceCode] IN ('VIC', 'BC', ...[21 total values, not all displayed]);

Page 38: 1 2 3 The result is ALL Sales Territory Country

Query running – Step 2SELECT [Geography].[StateProvinceCode] FROM [Internet Sales]

LEFT OUTER JOIN [Customer] ON [Internet Sales].[CustomerKey] = [Customer].[CustomerKey]

LEFT OUTER JOIN [Geography] ON [Customer].[GeographyKey] = [Geography].[GeographyKey]WHERE

[Customer].[CustomerKey] IN (11096, 11989, 17005, 22513, 28899, 15054, 19626, 20344, 25918, 27141... [74 total values, not all displayed]);

Page 39: 1 2 3 The result is ALL Sales Territory Country

Simple query plan4 VertiPaq Queries

2 before execution2 during execution

1 performed GROUPBY and SUM in VertiPaq1 hit the cache

Intermediate results are spooledFinal JOIN performed by Formula Engine

This is a very good query plan

Page 40: 1 2 3 The result is ALL Sales Territory Country

Place extended columns in ADDCOLUMNSSlower query

EVALUATESUMMARIZE ( Product, Product[Product Category Name], "Products", COUNTROWS ( Product ))

Optimized queryEVALUATEADDCOLUMNS ( SUMMARIZE ( Product, Product[Product Category Name] ), "Products", CALCULATE( COUNTROWS( Product ) ))

Always put extended columns in ADDCOLUMNS parameters.Just use SUMMARIZE for its group by and when the evaluation context ofSUMMARIZE is required (i.e. when a TOPN has to be executed for each group)http://www.sqlbi.com/articles/best-practices-using-summarize-and-addcolumns/

Page 41: 1 2 3 The result is ALL Sales Territory Country

Currency conversion SUMX ( DimCurrency, SUMX ( DimDate, CALCULATE ( VALUES ( CurrencyRate[AverageRate] ) * SUM ( FactInternetSales[SalesAmount] ) ) ))

Page 42: 1 2 3 The result is ALL Sales Territory Country

Currency conversion optimizedSUMX( CROSSJOIN( DimCurrency, DimDate ), CALCULATE( VALUES( CurrencyRate[AverageRate] ) * SUM( FactInternetSales[SalesAmount] ) ))

Using one SUMX, the optimizer pushes the

calculation in VertiPaq

Page 43: 1 2 3 The result is ALL Sales Territory Country

Slow measures and filtersDEFINE MEASURE 'Internet Sales'[Sales] = CALCULATE (ROUND (SUM ('Internet Sales'[Sales Amount] ), 0 ) ) MEASURE 'Internet Sales'[YTD Sales] = TOTALYTD ( [Sales] , 'Date'[Date] ) MEASURE 'Internet Sales'[QTD Sales] = TOTALQTD ( [Sales] , 'Date'[Date] )EVALUATE

FILTER ( ADDCOLUMNS ( CROSSJOIN ( VALUES ( 'Date'[Calendar Year] ), VALUES ( 'Date'[Month] ), VALUES ( 'Date'[Month Name] ) ), "Sales", [Sales], "YTD Sales", [YTD Sales], "QTD Sales", [QTD Sales] ), NOT ISBLANK( [Sales] ))ORDER BY 'Date'[Calendar Year], 'Date'[Month]

These two measures are slow to compute

The filter is applied after computation

Page 44: 1 2 3 The result is ALL Sales Territory Country

Filter as soon as you canDEFINE MEASURE 'Internet Sales'[Sales] = CALCULATE (ROUND (SUM ('Internet Sales'[Sales Amount] ), 0 ) ) MEASURE 'Internet Sales'[YTD Sales] = TOTALYTD ( [Sales] , 'Date'[Date] ) MEASURE 'Internet Sales'[QTD Sales] = TOTALQTD ( [Sales] , 'Date'[Date] )EVALUATE

ADDCOLUMNS ( FILTER ( CROSSJOIN ( VALUES ( 'Date'[Calendar Year] ), VALUES ( 'Date'[Month] ), VALUES ( 'Date'[Month Name] ) ), NOT ISBLANK ( [Sales] ) ), "Sales", [Sales], "YTD Sales", [YTD Sales], "QTD Sales", [QTD Sales])ORDER BY 'Date'[Calendar Year], 'Date'[Month]44

This query runs 7 times faster, even if it computes [Sales]

twice

Page 45: 1 2 3 The result is ALL Sales Territory Country

Joins are expensiveEach join in the data model has a cost

But… can you remove joins?From snowflakesFrom lookup tablesAnd most important: from junk dimensions!

Don’t build junk dimensions, store strings in fact table!

Page 46: 1 2 3 The result is ALL Sales Territory Country

Cache usageDAX Query PlansCache all VertiPaq Query Results

MDX QueriesIdentical to SSAS MultidimensionalCaches FE calculated members

Using MDX to query TabularBoth caches work

Using DAX to query TabularOnly VertiPaq results are cached

If you heavily hit Formula Engine, the

cache will not be really useful.

Page 47: 1 2 3 The result is ALL Sales Territory Country

General rulesUse the simplest formulasPrefer ADDCOLUMNS to SUMMARIZEAvoid complex calculations in measuresDon’t use error-handling functions extensivelyThey are sloooooowGood only in measures

Build a correct data modelReducing distinct count of values for columns is the main targetKeep it simple, relational, clear

Push calculations down to the VertiPaq engineNot an easy task to do looking at query plansBut, after some experience, you will learn it

Page 48: 1 2 3 The result is ALL Sales Territory Country

ConclusionsDAX is a simple languageVery effective in expressing complex queriesLooks strange, at the beginning, but it is easier than MDX

Optimizing Tabular means optimizing DAXOptimizing DAX means pushing calculations to VertiPaqSaid in other words… trust the Rain Man, he’s your best friend!

Sometimes, you will change the data model for this

DAX is simple, it is not easy…but this is the fun part of it!

Page 50: 1 2 3 The result is ALL Sales Territory Country

msdn

Resources for Developers

http://microsoft.com/msdn

Learning

Microsoft Certification & Training Resources

www.microsoft.com/learning

TechNet

Resources

Sessions on Demand

http://channel9.msdn.com/Events/TechEd

Resources for IT Professionals

http://microsoft.com/technet

Page 51: 1 2 3 The result is ALL Sales Territory Country

Complete an evaluation on CommNet and enter to win!

Page 52: 1 2 3 The result is ALL Sales Territory Country

Evaluate this session

Scan this QR code to evaluate this session and be automatically entered in a drawing to win a prize

Page 53: 1 2 3 The result is ALL Sales Territory Country

© 2013 Microsoft Corporation. All rights reserved. Microsoft, Windows and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.