………………………………………………………………………………………
Post on 25-Feb-2016
29 Views
Preview:
DESCRIPTION
TRANSCRIPT
………………………………………………………………………………………
SQL Server Analysis Services
Khalid Abu QtaishSr. BI Consultant / Solution DesignerKhalidBI.wordpress.comKhalid.yq@gmail.com
Analysis Services
Agenda Introduction ETL OLAP Terms Storage Modes Queries (MDX) Tools Live Demo Q&A
SQL Server 2008 R2 BI Platform
Introduction Analysis Services provides access to large data
sets Running SQL queries against a 100 million row
table just doesn’t work When your data sets get large, you need a better way of
handling the load Online analytical processing (OLAP) provides all
your answers with speed! Analysis services is an OLAP implementation
Integration & Structure
Managing Information
Data Warehouse
Analysis Services
ERP
CRM
HRMS
BI Developer or Analyst
Integration Services
ETL The Extract, Transform and Load (ETL) process is
vital to the OLAP results If you input junk, you get junk back out! All data that enters the OLAP database must be valid or
your results could be exponentially wrong! Make sure that all possible means are employed
when ensuring only valid data is entered into the system and that it is entered only once! Common issue is data inserted more than once which
causes invalid results
ETL
ETL
Integration Challenge
Data Warehouse
Call Center
Web Apps
Inventory
ERPHR
Finance
CRM
SOA – Enterprise
Service Bus
Source Systems Process real-time transactions Optimized for data modifications
Normalized Limited decision support Commonly called:
Online transaction processing (OLTP) systems Operational systems
HR Finance Inventory
Data Warehouse Provides data for business analysis
Grouped in subject-specific stores called Data Marts
Optimized for rapid ad-hoc information retrieval
Integrates heterogeneous source systems Consistent historical data store
Star Schema
Star Schema Benefits Simple, not-so-normalized model High-performance queries
Especially with Star Join Query Optimization Mature and widely supported Low-maintenance
OLAP Terms
Data Source View Diagrams of star schemas – state cube,
perf cube
Dimensions Describe business entities Contain attributes that provide context to
numerical data Present data organised into hierarchies
Fact Table Fundamentals Collection of measurements associated with a
specific business process Specific column types
Foreign keys to dimensions Measures – numeric and additive Metadata and lineage
Consistent granularity – the most atomic level by which the facts can be defined
Cube (UDM)Unified Dimensional Model
Combination of measures (from facts) and dimensions as one conceptual model
Rich data model enhanced by Calculations Key Performance Indicators (KPIs) Actions Perspectives Translations Partitions
Formally, cube is called a UDM
Cube
2009
Q1
Jan
Feb
Mar
AccessoriesParts
Cars
MeasuresIte
ms Sold
CostSa
les $
Dates
Products
Dicing a Cube
1
3
2
6
252009
Q1
Jan
Feb
Mar
AccessoriesParts
Cars
MeasuresIte
ms Sold
CostSa
les $
Dates
Products
Conceptual Terms
Hierarchies Benefits
View of data at different levels of summarization Path to drill down or drill up
Implementation Denormalized star
schema dimension Normalized snowflake
dimension Self-referencing
relationship
Dimension Relationships Define interaction between dimensions
and measure groups Relationship types
Regular Reference Fact (Degenerate) Many-to-many Data mining
Measure Group Group of measures with same
dimensionality Analogous to a fact table Cube can contain more than one
measure group E.g. Sales, Inventory, Finance
Defined by dimension relationships
Calculations Expressions evaluated at query time for
values that cannot be stored in fact table Types of calculations
Calculated members Named sets Scoped assignments
Calculations are defined using MDX
Time and Date DimensionsCalendar dates and hours of day, with separate attributes for time-folded analysis (values by hour for the past week, for example)
• Date by Month (Year, Quarter, Month, Date)• Date by Week (Year, Week, Date)• Time (Hours, in 12 hour or 24 hour format)
Period CalculationsA dimension that contains dynamic, time-based calculations• Month-to-date• Quarter-to-date• Year-to-date• Prior Period• Year-to-date, Prior Year• Percentage change
Parent-Child Hierarchy A dimension that contains a parent
attribute A parent attribute describes a self-
referencing relationship, or a self-join, within a dimension table
Common examples Organizational charts General Ledger structures Bill of Materials
Parent-Child Hierarchy Example
Brian
Amy
Stacia
Stephen
Shu
Michael
Peter
José
Syed
Slowly Changing Dimensions
Maintain historical context as dimension data changes
Three common ways (there are more): Type 1: Overwrite the existing dimension
record Type 2: Insert a new ‘versioned’ dimension
record Type 3: Track limited history with attributes
Key Performance Indicators (KPI) KPIs are used to show very simply whether a
target is being accomplished Components include the Goal, Value, Status, and Trend Example: Sales and Quality targets
Analysis Services allows you to build MDX expressions off of Cube data to build KPIs KPI values can be queried from client applications Allows for visual display of meaningful data
Actions Actions
Allows client application users to be able to interact with what the data means
Example: Browse to a customer or product via URL Types of Actions:
CommandLine, DataSet, Drillthrough, Html, Custom, Report, URL
Define and assign to objects in the Cube
Perspectives Perspectives work similar to views in a relational
database Shows different users the data they need to see for a
particular role they may be in Used to reduce complexity of cube data Not meant to be security mechanism for data
Storage Modes Data in an Analysis Services database is stored
differently than a relational database Optimized storage provides the OLAP query performance
Partition MOLAP – multidimensional OLAP (fact data and
aggregations are stored in special format) ROLAP – Relational OLAP (fact data and aggregations
remain in relational database) HOLAP – Hybrid OLAP (fact data is relational, aggregations
are stored in special format) Dimension (dimension attributes only)
MOLAP – stored in special format ROLAP – stay in relational format
Querying Cubes SQL is not used in OLAP databases MDX (multidimensional queries) is used for
querying cubes DMX (Data Mining queries) is used for querying
data mining models
MDX Queries Multidimensional Expressions (MDX) are used to
query multidimensional data Some common terms are:
Cell – the space at an intersection of a measure and attribute
Tuple – a unique cell based on a set of attribute members
Set – an order set of tuples with same dimensionality T-SQL & MDX
I need to determine which month had the highest sales, then look at the products and customers that made that month so good
Calculated Members and Named Sets
Calculated Members are used when you need to determine something at query time Can be query or session scoped Value are only stored in memory not on disk
Named Sets are basically predefined MDX queries that can be reused in other queries Used to group dimension members
Partitioning Mechanism to break up large cube into
manageable chunks Partitions can be added, processed,
deleted independently Update to last month’s data does not affect prior months’
partitions Sliding window scenario easy to implement E.g. 24 month window add June 2006 partition and delete
June 2004 Partitions can have different storage settings
What could go wrong? Cubes really are simple things to build and utilize
As simple as they are, it is easy to create a cube that has the wrong data
Always validate that the data that is displayed in the Cube is valid and accurate Never hurts to be overly aggressive when testing Cube
data
Tools
Business Intelligence Development Studio
BI Studio is just Visual Studio with project templates installed
Project templates provide item templates and wizards New 2008 Wizards are much easier to use and more
powerful Context sensitive functionality keeps toolbars
and menus trimmed to what elements you need based on what you are looking at
Because it is Visual Studio it is fully customizable 3rd party add-ons can make it even more powerful
Designing an Analysis Services Database
BI Studio will be used to create new AS Databases
Steps include: Define data source Define data view Create a new cube Define fact tables Define dimensions (data and time) Define measures Aggregate/Run the cube
SSAS 2008 R2 New Features Develop solutions quickly with the new, streamlined Cube
Designer Take advantage of enhanced Dimension and Aggregation
Designers Create attribute relationships easily by using the new
Attribute Relationship Designer Avoid common design problems by using best practice Enable high-performance “what if” scenarios by using
MOLAP enabled write-back Take advantage of enhanced data mining structures and
improved Time Series support Monitor and optimize analytical solutions by using Analysis
SQL Server Analysis Services
Live Demo
SQL Server Analysis Services
Q & A
top related