accelerating performance with

Post on 15-Jan-2022

5 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

TRANSCRIPT

© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.

Accelerating performance with materialized views

A N T 4 1 7

Vuk Ercegovac

Principal Engineer

Amazon Web Services

Yannis Papakonstantinou

Senior Principal Scientist

Amazon Web Services

Amazon Redshift Materialized Views: Preview

1. Speed up queries by orders of magnitude

• For predictable workloads

• Save work with precomputed, materialized views

Speedup: Original Query

item_key store_key cust_key price

i1 s1 c1 12.00

i2 s2 c1 3.00

i3 s2 c2 7.00

store_key owner loc

s1 Joe SF

s2 Ann Chicago

s3 Lisa SF

Loc total_sales

SF 12.00

sales store_info

Join-Aggregate

Speedup: Materialized Views Precomputed Results

loc total_sales

SF 12.00

Chicago 10.00

loc_sales

item_key store_key cust_key price

i1 s1 c1 12.00

i2 s2 c1 3.00

i3 s2 c2 7.00

store_key owner loc

s1 Joe SF

s2 Ann Chicago

s3 Lisa SF

sales store_info

Materialized Views Speedup Query

loc total_sales

SF 12.00

loc total_sales

SF 12.00

Chicago 10.00

loc_sales

Amazon Redshift Materialized Views: Preview

1. Speed up

2. Simplify and accelerate maintenance of precomputed results

• Fast refresh: Efficient, incremental

• Example: ETL/BI pipelines

Fast Refresh: Amazon Redshift Incrementally Maintains

item_key store_key cust_key price

i1 s1 c1 12.00

i2 s2 c1 3.00

i3 s2 c2 7.00

store_key owner loc

s1 Joe SF

s2 Ann Chicago

s3 Lisa SF

loc total_sales

SF 12.00

Chicago 10.00 sales

store_info

loc_sales

i1 s3 c3 5.00

i2 s2 c4 8.00

Fast Refresh: Amazon Redshift Incrementally Maintains

item_key store_key cust_key price

i1 s1 c1 12.00

i2 s2 c1 3.00

i3 s2 c2 7.00

store_key owner loc

s1 Joe SF

s2 Ann Chicago

s3 Lisa SF

loc total_sales

SF 12.00

Chicago 10.00 sales

store_info

loc_sales

i1 s3 c3 5.00

i2 s2 c4 8.00

Fast Refresh: Amazon Redshift Incrementally Maintains

item_key store_key cust_key price

i1 s1 c1 12.00

i2 s2 c1 3.00

i3 s2 c2 7.00store_key owner loc

s1 Joe SF

s2 Ann Chicago

s3 Lisa SF

loc total_sales

SF 12.00+5.00

Chicago 10.00+8.00

sales

store_info

loc_sales

i1 s3 c3 5.00

i2 s2 c4 8.00

Fast Refresh: Amazon Redshift Incrementally Maintains

item_key store_key cust_key price

i1 s1 c1 12.00

i2 s2 c1 3.00

i3 s2 c2 7.00store_key owner loc

s1 Joe SF

s2 Ann Chicago

s3 Lisa SF

loc total_sales

SF 17.00

Chicago 18.00

sales

store_info

loc_sales

i1 s3 c3 5.00

i2 s2 c4 8.00

Example: Customer ETL Pipeline

1. Regular ingest of 3.8 M records per day

2. Strategy 1: CTAS (CREATE TABLE AS)

3. Strategy 2: Materialized Views

➢ Faster ingest times!

➢ Time to ingest new records is constant

➢ Smaller ingest batch → faster ingest

CTAS MV Speedup

Steady-State

Average371s 49s 7.9x

Materialized View Refresh Results

0

50

100

150

200

250

300

350

400

450

500

350 450 550 650 750 850 950 1,050

Qu

ery

Tim

e (

seco

nd

s)

Base Table Row Count (millions)

CTAS vs. MV Refresh Time (lower is better)

CTAS

RMV 16M

RMV 3.8M

Linear (CTAS)

Linear (RMV 16M)

Linear (RMV 3.8M)

Amazon Redshift Materialized Views: Preview

1. Speed up queries by orders of magnitude

3. Easier and faster migration to Amazon Redshift

Amazon Redshift Materialized Views: Functionality

➢ User creates materialized views that use one or more tables✓ CREATE MATERIALIZED VIEW loc_sales AS (

SELECT si.loc, SUM(s.price) FROM sales s, sales_info si WHERE s.store_key = si.store_keyGROUP BY si.loc);

➢ Speed up queries by accessing materialized views✓ SELECT * FROM loc_sales WHERE loc = ‘SF’;

➢ REFRESH command incrementally maintains materialized views✓ REFRESH MATERIALIZED VIEW loc_sales;

© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.

Learn big data with AWS Training and Certification

Visit aws.amazon.com/training/paths-specialty/

New free digital course, Data Analytics Fundamentals, introduces

Amazon S3, Amazon Kinesis, Amazon EMR,

AWS Glue, and Amazon Redshift

Validate expertise with the AWS Certified Big Data - Specialty exam or the new AWS Certified Data Analytics - Specialty beta exam

Resources created by the experts at AWS to help you build and validate data analytics skills

Classroom offerings, including Big Data on AWS, feature AWS expert instructors and hands-on labs

Thank you!

© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.

Yannis Papakonstantinou

yannip@amazon.com

Vuk Ercegovac

vercego@amazon.com

© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.

© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.

Additional Topics

➢How to use materialized views for physical

design tuning?

➢What part of SQL allows for incremental maintenance?

✓ INNER JOINs

✓ WHERE, HAVING, GROUP BY

✓ Associative Aggregate functions

✓ All other SQL is refreshed by recomputing the materialized view

Additional Topics: Coming Soon

➢Expand incremental maintenance to more SQL

➢More features coming soon. Stay tuned!

top related