geek sync i polybase and time travel (temporal tables)

24
Polybase and Time Travel (Temporal Tables) March 2 nd , 2016

Upload: idera-software

Post on 22-Jan-2018

388 views

Category:

Technology


1 download

TRANSCRIPT

Polybase andTime Travel(Temporal Tables)

March 2nd, 2016

About

• Sr. Product Manager with IDERA– Performance Monitoring of Microsoft BI stack

– Backup and Recovery of Microsoft SQL Server

• Geek Sync Presenter

• Blog Contributor

• HSSUG presenter

• Over 25 years experience– BI, Data Architect

– DBA

– Developer

– Data Analyst

Where in the world are we?

3

… data warehousing has reached

the most significant tipping point

since its inception. The biggest,

possibly most elaborate data

management system in IT is

changing.

– Gartner, “The State of Data Warehousing in 2012”

Data sources

ETL

Data warehouse

BI and analytics

The Cool Kid’s Data Warehouse

4

The Data Warehouse of the Future?

• Diverse Big Data

• Workload Centric Approach

• Data stored on multiple platforms

• Physically distributed data

warehouse– data warehouse appliances

– columnar RDBMSs

– NoSQL databases

– MapReduce tools, and HDFS.

5

The Data Warehouse of the Future…Its

Here!

6

SQL Server Technology Drivers

PolyBase

JSON Data

Temporal Tables

In Memory Table

ColumnStore Index

Polybase

8

Polybase

• Use T-SQL to store data in SQL Server from Hadoop or Azure as tables.

• Knowledge of Hadoop or Azure is not required to use.

• Pushes computation to where data resides

• Export relational data into Hadoop or Azure

9

PolyBase - External Tables, Data Sources & File Formats

10

SQL Server w/

PolyBase

Social

Apps

Sensor

&RFID

Mobile

AppsWeb

Apps

Data Scientists,

BI Users, DB Admins

Your

Apps

PowerPivot

PowerView

PolyBase

Split-Based Query

Processing

External Table

External

Data SourceExternal File

Format

Hadoop

Relational DW

PolyBase Scenarios• Querying

– Run T-SQL over HDFS

– Combine data from different Hadoop clusters

– Join relational with non-relational data

• ETL– Subset of Hadoop in Columnar Format

– Enable data aging scenarios to more economic storage

• Allows building of multi-temperate DW platforms– SQL Server acts as hot query engine processing most

recent data sets

– Aged data immediately accessible via external tables

– No need to groom data

• Hybrid (Azure Integration)– Mesh-up on-premise and cloud apps

– Bridge between on-premise and Azure

11

Polybase

1. Create external data source (Hadoop).

2. Create external file format (delimited text file).

3. Create external table pointing to file stored in Hadoop.

12

CREATE EXTERNAL TABLE [dbo].[CarSensor_Data] (

[SensorKey] int NOT NULL,

[CustomerKey] int NOT NULL,

[GeographyKey] int NULL,

[Speed] float NOT NULL,

[YearMeasured] int NOT NULL

)

WITH (LOCATION='/Demo/car_sensordata.tbl',

DATA_SOURCE = hdp2,

FILE_FORMAT = ff2,

REJECT_TYPE = VALUE,

REJECT_VALUE = 0

CREATE EXTERNAL DATA SOURCE hdp2 with (

TYPE = HADOOP,

LOCATION ='hdfs://10.xxx.xx.xxx:xxxx',

RESOURCE_MANAGER_LOCATION='10.xxx.xx.xxx:xxxx')

CREATE EXTERNAL FILE FORMAT ff2 WITH (

FORMAT_TYPE = DELIMITEDTEXT,

FORMAT_OPTIONS (FIELD_TERMINATOR ='|',

USE_TYPE_DEFAULT = TRUE)

Polybase

Who drives faster than 35 Miles > joining structured

customer data stored in SQL Server with sensor data

Ad-Hoc Query joining relational with Hadoopdata

SELECT DISTINCT

Insured_Customers.FirstName,

Insured_Customers.LastName,

Insured_Customers.YearlyIncome,

Insured_Customers.MaritalStatus

into Fast_Customers

from Insured_Customers INNER JOIN

(

select * from CarSensor_Data where Speed > 35

) as SensorD

ON Insured_Customers.CustomerKey = SensorD.CustomerKey

ORDER BY YearlyIncome

CREATE CLUSTERED COLUMNSTORE INDEX CCI_FastCustomers ON

Fast_Customers;

CHANGE THE CONFIG FILE!!!!!

14C:\Program Files\Microsoft SQL Server\MSSQL13.MSSQLSERVER\MSSQL\Binn\Polybase\Settings\Hadoop.config

The user that is used to access Hadoop has to

be configured in the Polybase configuration file

Demo

Temporal Tables

• Temporal Table is really two tables.

– Data Table

– Historical Table (PERIOD)

• A temporal table can be defined as a table for which

PERIOD definition exists comprising of system columns

• Slowly Changing Dimension

– Data Table is Type 1

– Historical Table is Type 2

• Recover accidental data changes

16

Temporal Tables

• Slowly Changing Dimensions– Type 1

• Records will be updated with no history log

– Type 2• Keep history with a From Date and To Date

• Set existing record as obsolete (with setting the To Date)

• Create a new record (with new From Date)

– Type 3• Limited History

– Type 4• Similar to Type 2 but broken out as 2 tables.

Temporal Tables

• Requirements/Limitations

– Primary Key

– Two columns (start and end date as datetime2)

– In-Memory tables cannot be used

– INSERT and UPDATE not allowed on SYSTEM_TIME period columns

– History Table data cannot be changed.

– Regular queries only affect data in the current table.

18

Temporal Tables

Example:

19

CREATE TABLE dbo.TestTemporal(ID int primary key,A int,B int,C AS A*B,SysStartTime datetime2 GENERATED ALWAYS AS ROW START NOT NULL,SysEndTime datetime2 GENERATED ALWAYS AS ROW END NOT NULL,PERIOD FOR SYSTEM_TIME (SysStartTime, SysEndTime)) WITH (SYSTEM_VERSIONING = ON);

Temporal Tables

20

• Creating through SSMS

Temporal Tables• The SELECT statement FROM <table> clause

has a new clause FOR SYSTEM_TIME with four temporal-specific sub-clauses to query data across the current and history tables.

– Point in time: AS OF <date_time>

– Exclusive bounds: FROM <start_date_time> TO <end_date_time>

– Inclusive lower bound, exclusive upper bound: BETWEEN <start_date_time> AND <end_date_time>

– Inclusive bounds: CONTAINED IN (<start_date_time> , <end_date_time>)

21

Temporal Tables

• For example, if you want to look at the values active for customer 27 on the first of the year:

… FROM Customer FOR SYSTEM_TIME AS OF '2015-1-1' WHERE CustomerID = 27

• If instead you want to see every version of the users records for that day you could write:

… FROM Customer FOR SYSTEM_TIME BETWEEN '2015-1-1' AND '2015-1-2'WHERE CustomerID = 27

22

Demo

Try any of our tools for free!

Twitter: @MSBI_Stan

Email: [email protected]

www.idera.com