aws atlanta meetup 2/ 2017 redshift wlm

28
Redshift WLM Work Load Management

Upload: adam-book

Post on 14-Apr-2017

79 views

Category:

Technology


1 download

TRANSCRIPT

Page 1: AWS Atlanta meetup  2/ 2017 Redshift WLM

Redshift WLM Work Load Management

Page 2: AWS Atlanta meetup  2/ 2017 Redshift WLM

Sponsors

Page 3: AWS Atlanta meetup  2/ 2017 Redshift WLM

Find me on LinkedIn

AWS Certifications

Presented by Adam Book

Page 4: AWS Atlanta meetup  2/ 2017 Redshift WLM

Redshift Use Cases

Enterprise Data Warehouse

• Reduce costs by extending DW rather then extending HW

• Migrate completely from existing DW systems

• Respond Faster to business

Companies with Big Data

• Improve performance by an order of magnitude

• Make more data available for analysis

• Access business data via standard reporting tools

SaaS Companies

• Add analytic functionality to applications

• Scale DW capacity as demand grows

• Reduce HW & SW costs by an order of magnitude

Page 5: AWS Atlanta meetup  2/ 2017 Redshift WLM

Amazon Redshift System Architecture

Page 6: AWS Atlanta meetup  2/ 2017 Redshift WLM

Amazon Redshift Node Types

3.7 vCPU ECU Mem Storage I/O

dc1.large 2 7 15 0.16 TB SSD 0.20 GB /s

dc1.8xlarge 32 104 244 2.56 TB SSD 3.70 GB/s

vCPU ECU Mem Storage I/O

ds2.xlarge 4 14 31 2TB HDD 0.40 GB /s

ds2.8xlarge 36 116 244 16 TB HDD 3.30 GB /s

Dense Compute

Dense Storage

Page 7: AWS Atlanta meetup  2/ 2017 Redshift WLM

Columnar Storage

SSN Name Age City ST

101259797 SMITH 88 JUNO AL

892375862 CHIN 37 POMONA CA

318370701 HANDU 12 CHICAGO IL

Columnar storage for database storage tables is an important factor in optimizing analytic query performance because it drastically reduces the overall disk I/O requirements and reduces the amount of data you need to load from disk

101259797 SMITH 88 JUNO AL 892375862 CHIN 37 POMONA CA 318370701 HANDU 12 CHICAGO IL

Block 1 Block 2 Block 3

Page 8: AWS Atlanta meetup  2/ 2017 Redshift WLM

What is WLM?

WLM (Work Load Management) is a feature for managing queues while running queries on Amazon Redshift.

Page 9: AWS Atlanta meetup  2/ 2017 Redshift WLM

Steps to WLM ?

• Defining Query Queues• Modifying the WLM Configuration • WLM Queue Assignment Rules• Assigning Queries to Queues• WLM Dynamic and Static Configuration

Properties• WLM System Tables and Views

Page 10: AWS Atlanta meetup  2/ 2017 Redshift WLM

Tuned Queries still running S-L-O-W?

Image by http://www.pexels.com/

Page 11: AWS Atlanta meetup  2/ 2017 Redshift WLM

The Query Submission

• Leader Node– Parses query and creates logical tree– Query rewritten for optimization– Compiled to C++– Submitted to Compute Nodes

Page 12: AWS Atlanta meetup  2/ 2017 Redshift WLM

Managing workloads

• How many queries can run concurrently in each queue• How much memory is allocated among the queues• How queries are routed to queues, based on criteria such as the

user who is running the query or the query label • Query timeout settings for a queue

Page 13: AWS Atlanta meetup  2/ 2017 Redshift WLM

WLM Queue Assignment Rules

When a user runs a query, WLM assigns the query to the first matching que based on the rules.

Page 14: AWS Atlanta meetup  2/ 2017 Redshift WLM

Redshift Queues

• Super User queue: For Administrators (1 concurrent query)

• Worker queue: For things like bi queries (5 concurrent queries)

• Processing queue: for background processing

Page 15: AWS Atlanta meetup  2/ 2017 Redshift WLM

WLM Queues

Queue Concurrency User Groups Query Groups

Superuser 1 superuser

1 4 UG1

2 4 QGB

3 4 UG3 QGC

Default 4

The following table shows a WLM configuration with the Superuser queue and four user-defined queues.

Page 16: AWS Atlanta meetup  2/ 2017 Redshift WLM

Creating Queues(WLM configuration)

• Add a new queue in the WLM configuration – Set the concurrently level – Add User group name – Allocate Memory Percentage– Set the Timeout in milliseconds (optional)

Page 17: AWS Atlanta meetup  2/ 2017 Redshift WLM

WLM Dynamic Memory Allocation

The workload manager uses the following process to manage the transition.

1. WLM recalculates the memory allocation for each new query slot.2. If a query slot is not actively being used by a running query, WLM removes the slot, which makes that

memory available for new slots.3. If a query slot is actively in use, WLM waits for the query to finish.4. As active queries complete, the empty slots are removed and the associated memory is freed.5. As enough memory becomes available to add one or more slots, new slots are added.6. When all queries that were running at the time of the change finish, the slot count equals the new

concurrency level, and the transition to the new WLM configuration is complete.

Page 18: AWS Atlanta meetup  2/ 2017 Redshift WLM

WLM Settings tips:

One of the cases where you can gain the most from setting WLM is when you have time-consuming queries filling your queue. Without WLM, such time-consuming queries will clog your queue. Even the queries that run fast will take longer if it has to wait longer if it has to wait for these slower queries to finish running.

Separate out time-consuming queries

Page 19: AWS Atlanta meetup  2/ 2017 Redshift WLM

WLM Settings tips:

Since the resource of each node is shared among concurrent processes, increasing the concurrency level total too high will limit the overall throughput. In the AWS development guide, they recommend setting the concurrency level total (the total of the concurrency levels for all queues) to 15 or lower.

Set the total concurrency levels to 15 or lower

Page 20: AWS Atlanta meetup  2/ 2017 Redshift WLM

WLM Settings tips:

With many queues, the amount of allocated memory for each queue becomes smaller. On top of that, there is the recommended limit of a total of 15 concurrency levels, so it becomes difficult ot maximize Redshift performance if you have too many queues.

Avoid adding too many queues

Page 21: AWS Atlanta meetup  2/ 2017 Redshift WLM

WLM Settings tips:

Redshift has reserved a “superuser” queue for running commands related to the system, troubleshooting, or for manual operations in emergency situations. So, you won’t need to set another queue for such purposes.

Use superuser for troubleshooting

set query_group to ‘superuser’; analyze; vacuum;reset query_group;

Page 22: AWS Atlanta meetup  2/ 2017 Redshift WLM

Query: Existing WLM config

SELECT service_class, num_query_tasks, evictable, eviction_threshold, name FROM stv_wlm_service_class_configWHERE service_class > 5;

Run the following query to check the existing WLM configuration.

Service_class | num_query_tasks| evictable|eviction_threshold| name-------------------------------------------------------------------------------- 6| 8| false| 0| Service class #1 7| 6| true| 125163520| Service class #2 8| 10| true| 300000000| Service class #3 9| 1| false| 0| Service class #4

Page 23: AWS Atlanta meetup  2/ 2017 Redshift WLM

Query: After last restart

SELECT service_class, num_executing_queries, num_executed_queries FROM stv_wlm_service_class_stateWHERE service_class > 5ORDER BY service_class;

Run the following query after the last cluster restart to show the number or queries that have either executed or have completed executing.

Service_class | num_executing_queries| num_executed_queries-------------------------------------------------------------------------------- 6| 2| 7023 7| 1| 2057 8| 1| 2970 9| 0| 0

Page 24: AWS Atlanta meetup  2/ 2017 Redshift WLM

Quick Redshift Tuning Tips

• Number of nodes, processors or slices• Node Types• Data Distribution• Data Sort Order• Dataset size• Concurrent operations• Query structure • Code compilation

Factors Affecting Query Performance:

Page 25: AWS Atlanta meetup  2/ 2017 Redshift WLM

Compression EncodingsEncoding Type Keyword in CREATE TABLE and

ALTER TABLEData Types

Raw (no compression)

RAW All

Byte Dictionary BYTEDICT All except Boolean

Delta Delta

Delta32K

SMALLINT, INT, BIGINT, DATE, TIMESTAMP, DECIMAL

INT, BIGINT, DATE, TIMESTAMP, DECIMALLZO LZO All except BOOLEAN, REAL, and DOUBLE

Precision

Run-length RUNLENGTH ALL

Page 26: AWS Atlanta meetup  2/ 2017 Redshift WLM

Compression EncodingsEncoding Type Keyword in CREATE TABLE and

ALTER TABLEData Types

Mostlyn MOSTLY8

MOSTLY16

MOSTLY32

SMALLINT, INT, BIGINT, DECIMAL

INT, BIGINT, DECIMAL

BIGINT, DECIMALText TEXT255

TEXT32K

VARCHAR only

VARCHAR only

Zstandard ZSTD All

Page 27: AWS Atlanta meetup  2/ 2017 Redshift WLM

Questions?

Image by http://www.gratisography.com/

Page 28: AWS Atlanta meetup  2/ 2017 Redshift WLM

Interested in SponsoringAWS Atlanta?

Image by http://www.gratisography.com/