Grant McAlister – Senior Principal Engineer - RDS
November 2015
Amazon RDS for PostgreSQL
Lessons Learned and Deep Dive on New Features
What’s new in storage
6 TB storage
• PIOPS has a 30K IOPS maximum
• GP2: increasing storage above 3 TB also increases throughput and IOPS
Encryption at rest
• Uses the AWS Key Management Service (KMS), part of AWS Identity and Access Management (IAM)
• Bring your own key, or use a default one
• Includes all data files, log files, log backups, and snapshots
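As a sketch of how this is enabled at instance creation (assuming the boto3 RDS client; the instance identifier, password, and key ARN below are placeholders):

```python
# Sketch: parameters for an encrypted RDS PostgreSQL instance, assuming the
# boto3 RDS client.  Omit KmsKeyId to use the default aws/rds KMS key.
params = {
    "DBInstanceIdentifier": "mydb",          # placeholder name
    "Engine": "postgres",
    "DBInstanceClass": "db.m3.large",
    "AllocatedStorage": 100,                 # GB
    "MasterUsername": "master",
    "MasterUserPassword": "change-me",       # placeholder
    "StorageEncrypted": True,                # covers data files, logs, backups, snapshots
    # "KmsKeyId": "arn:aws:kms:...:key/...", # your own key; default key if omitted
}
# import boto3
# boto3.client("rds").create_db_instance(**params)
```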
[Chart: pgbench read-only, in-memory: transactions per second (TPS, 0 to 45,000) vs. thread count (2 to 64 threads), regular vs. encrypted storage]
Encryption at rest overhead
No measurable overhead
[Chart: pgbench read & write: transactions per second (TPS, 0 to 4,500) vs. thread count (2 to 64 threads), regular vs. encrypted storage]
Encryption at rest overhead
5 to 10% overhead on write-heavy workloads
Unencrypted Snapshot Sharing
[Diagram: a snapshot of a DB instance in the Prod account can be shared with a specific account (e.g., a Test account, which restores it to its own DB instance) or shared publicly]
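A minimal sketch of the sharing call, assuming the boto3 RDS client (snapshot name and account ID are placeholders):

```python
# Sketch: sharing an unencrypted snapshot, assuming the boto3 RDS client.
# The "restore" attribute controls which accounts may restore the snapshot;
# the special value "all" makes it public.
share_with_account = {
    "DBSnapshotIdentifier": "prod-snapshot",   # placeholder snapshot name
    "AttributeName": "restore",
    "ValuesToAdd": ["123456789012"],           # target (test) account ID
}
share_public = dict(share_with_account, ValuesToAdd=["all"])
# import boto3
# boto3.client("rds").modify_db_snapshot_attribute(**share_with_account)
```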
Major version upgrade
[Diagram: pg_upgrade takes Prod from 9.3 to 9.4, with a backup taken before and after the upgrade; no PITR across the version boundary. Recommended flow: restore a backup to a Test instance, run pg_upgrade from 9.3 to 9.4 there, and do application testing first]
RDS Version updates
9.4.5
• Extensions – pgstattuple, pg_buffercache, ip4r
• Autovacuum Logging
• rds_superuser_reserved_connections
• New PLV8 & PostGIS versions
9.3.10
• New PLV8 & PostGIS versions
Additional Logging
rds.force_admin_logging_level
• Logs actions by the RDS admin user in the customer’s databases.
rds.force_autovacuum_logging_level
• Logs auto vacuum worker operations in all databases.
RDS autovacuum logging
log_autovacuum_min_duration = 5000 (i.e. 5 secs)
rds.force_autovacuum_logging_level = LOG
…[14638]:ERROR: canceling autovacuum task
…[14638]:CONTEXT: automatic vacuum of table "postgres.public.pgbench_tellers"
…[14638]:LOG: skipping vacuum of "pgbench_branches" --- lock not available
pgstattuple
sbtest=> select * from pgstattuple('public.sbtest30');
table_len | tuple_count |tuple_percent | dead_tuple_percent | free_percent
-----------+-------------+---------------+--------------------+--------------
234364928 | 1000000 | 91.31 | 2.04 | 3.45
sbtest=> vacuum public.sbtest30;
VACUUM
sbtest=> select * from pgstattuple('public.sbtest30');
table_len | tuple_count | tuple_percent | dead_tuple_percent | free_percent
-----------+-------------+---------------+--------------------+--------------
234364928 | 1000000 | 91.31 | 0 | 5.5
sbtest=> vacuum full public.sbtest30;
VACUUM
sbtest=> select * from pgstattuple('public.sbtest30');
table_len | tuple_count | tuple_percent | dead_tuple_percent | free_percent
-----------+-------------+---------------+--------------------+--------------
221413376 | 1000000 | 96.65 | 0 | 0.3
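The arithmetic behind the output above: plain VACUUM turns dead tuples into free space (dead 2.04% becomes 0, free 3.45% grows to 5.5%) without shrinking the table, while VACUUM FULL rewrites the table and returns space to the filesystem:

```python
# Numbers taken from the pgstattuple output above.
before, after_full = 234364928, 221413376  # table_len before and after VACUUM FULL
reclaimed_mb = (before - after_full) / 1024 / 1024
print(round(reclaimed_mb, 1))  # ~12.4 MB returned to the filesystem
```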
rds_superuser_reserved_connections
max_connections = 20
• With no reservations: 20 application connections
• RDS admin (superuser_reserved_connections = 3): 17 application connections
• rds_superuser (rds_superuser_reserved_connections = 3): 14 application connections
• Applications must connect as a non-rds_superuser role
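The arithmetic on the slide:

```python
# With max_connections = 20, three slots are held back for the internal RDS
# admin user and three more for rds_superuser roles, leaving 14 for
# ordinary application connections.
max_connections = 20
superuser_reserved = 3        # RDS admin user
rds_superuser_reserved = 3    # customer rds_superuser roles
app_connections = max_connections - superuser_reserved - rds_superuser_reserved
print(app_connections)  # 14
```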
pg_buffercache
select case
         when pg_buffercache.relfilenode is null then 'free'
         else 'used'
       end as status,
       round(count(*) / (select pg_settings.setting from pg_settings
                         where pg_settings.name = 'shared_buffers')::numeric, 4) * 100 as pct
from pg_buffercache
group by 1;
status | pct
--------+---------
used | 31.1200
free | 68.8800
(2 rows)
Operating System (OS) metrics
5 second granularity
Coming soon
cpuUtilization
• guest
• irq
• system
• wait
• idle
• user
• total
• steal
• nice
diskIO
• writeKbPS
• readIOsPS
• await
• readKbPS
• rrqmPS
• util
• avgQueueLen
• tps
• readKb
• writeKb
• avgReqSz
• wrqmPS
• writeIOsPS
memory
• writeback
• cached
• free
• inactive
• dirty
• mapped
• active
• total
• slab
• buffers
• pageTable
swap
• cached
• total
• free
tasks
• sleeping
• zombie
• running
• stopped
• total
• blocked
fileSys
• used
• usedFiles
• usedFilePercent
• maxFiles
• total
• usedPercent
loadAverageMinute
• fifteen
• five
• one
uptime
processList
• name
• cpuTime
• parentID
• memoryUsedPct
• cpuUsedPct
• id
• rss
• vss
Data movement
Move data to the same or different database engine
Keep your apps running during the migration
Start your first migration in 10 minutes or less
Replicate within, to, or from AWS EC2 or RDS
[Diagram: an AWS Database Migration Service replication instance sits between a source database on customer premises (with its application users) and an EC2 or RDS target, connected over the Internet or a VPN]
Start a replication instance
Connect to source and target databases
Select tables, schemas, or databases
Let the AWS Database Migration Service create tables and load data
Uses change data capture to keep them in sync
Switch applications over to the target at your convenience
AWS Database Migration Service - PostgreSQL
• Source - on premises or Amazon EC2 PostgreSQL (9.4)
• Destination can be EC2 or RDS
• Initial bulk copy via consistent select
• Uses PostgreSQL logical replication support to provide
change data capture
https://aws.amazon.com/dms/preview/
Loading data
• Disable backups: backup_retention=0
• Disable Multi-AZ and autovacuum
• pg_dump -Fc (compressed); pg_restore -j (parallel)
• Increase maintenance_work_mem
• Increase checkpoint_segments and checkpoint_timeout
• Disable fsync
• Disable synchronous_commit
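A sketch of the dump-and-restore step, assembled here with Python's subprocess module; the host names, database name, dump file, and job count are placeholders:

```python
# Sketch of the suggested load path: a compressed custom-format dump (-Fc)
# restored with parallel jobs (-j).  Commands are built but not executed here.
dump_cmd = ["pg_dump", "-Fc", "-h", "source-host", "-f", "mydb.dump", "mydb"]
restore_cmd = ["pg_restore", "-j", "8", "-h", "target-host", "-d", "mydb", "mydb.dump"]
# import subprocess
# subprocess.run(dump_cmd, check=True)
# subprocess.run(restore_cmd, check=True)
```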
[Chart: 32-thread insert, fsync vs. synchronous_commit: transactions per second (0 to 18,000) with both on, fsync=0, synchronous_commit=0, and both off, at 16 vs. 256 checkpoint segments]
[Chart: bulk load of 2 GB of data, fsync vs. synchronous_commit: load time in seconds (0 to 160) for the same four settings, at 16 vs. 256 checkpoint segments]
Scale and availability
shared_buffers parameter
On an instance with 244 GB of RAM, memory is split between the PostgreSQL processes, shared_buffers, and the Linux page cache, with EBS storage underneath. A select first checks for the buffer in shared_buffers; if it is not there, the page is loaded from the page cache or disk. RDS sets shared_buffers to 1/4 of instance memory by default; ideally shared_buffers covers the working set size.
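The default sizing above works out to:

```python
# RDS defaults shared_buffers to roughly 1/4 of instance memory; on an
# r3.8xlarge with 244 GB RAM that is about 61 GB, leaving the rest for
# the Linux page cache and PostgreSQL backends.
ram_gb = 244
shared_buffers_gb = ram_gb / 4
print(shared_buffers_gb)  # 61.0
```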
[Chart: pgbench write workload on r3.8xlarge, working set = 10% of memory: transactions per second (0 to 12,000) vs. shared_buffers as a percentage of system memory (3% to 75%), for 25 to 800 threads]
[Chart: the same pgbench write workload with working set = 50% of memory: transactions per second (0 to 12,000) vs. shared_buffers from 13% to 75% of system memory, for 25 to 800 threads]
Availability – Read and Write – Multi-AZ
[Diagram: Multi-AZ uses physical synchronous replication from the primary in AZ1 to the standby in AZ2; on failover, a DNS CNAME update repoints clients to the new primary]
Read Replicas = Availability
[Diagram: synchronous Multi-AZ replication between AZ1 and AZ2, plus asynchronous replication to a read replica in AZ3 that can be promoted]
Read Replicas = Scale
[Diagram: read replicas spread across AZ1, AZ2, and AZ3 to scale reads]
Replication parameters
wal_keep_segments
[Diagram: the primary retains WAL segments xlog1 through xlog99; wal_keep_segments controls how many old segments are kept so a lagging replica can still fetch them]
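A quick sizing check, assuming the standard 16 MB WAL segment size and the 99 segments shown in the illustration:

```python
# wal_keep_segments retains that many 16 MB WAL segments on the primary so
# a replica that falls behind can catch up before segments are recycled.
# Retained WAL disk space is simply segments * 16 MB.
wal_segment_mb = 16
wal_keep_segments = 99        # as in the slide's xlog1..xlog99 illustration
retained_mb = wal_keep_segments * wal_segment_mb
print(retained_mb)  # 1584 MB kept on the primary
```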
pg_stat_replication
benchdb=> select * from pg_stat_replication;
-[ RECORD 1 ]----+--------------------------------------------
pid | 40385
usesysid | 16388
usename | rdsrepladmin
application_name | walreceiver
client_addr | 10.22.132.253
client_hostname | ip-10-22-132-253.us-west-2.compute.internal
client_port | 22825
backend_start | 2014-10-29 21:44:58.080324+00
state | streaming
sent_location | 98/7A000900
write_location | 98/7A000900
flush_location | 98/7A000900
replay_location | 98/7A000900
sync_priority | 0
sync_state | async
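The sent/write/flush/replay locations are WAL LSNs of the form "hi/lo" in hex; byte lag between two of them can be derived as sketched below (a reimplementation, for illustration, of what 9.4's pg_xlog_location_diff() computes):

```python
# An LSN like '98/7A000900' is two hex numbers: a high word and a 32-bit
# byte offset.  Replication lag in bytes is the difference of the decoded
# values.
def lsn_to_bytes(lsn):
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)

lag = lsn_to_bytes("98/7A000900") - lsn_to_bytes("98/7A000900")
print(lag)  # 0 bytes: sent and replay match, so the replica is caught up
```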
Replication parameters – continued
vacuum_defer_cleanup_age
max_standby_archive_delay
max_standby_streaming_delay
hot_standby_feedback
vacuum_defer_cleanup_age
• Set on the primary; default is 0
• Unit is a number of transactions
• Defers vacuum cleanup of recently dead row versions (e.g., the source's table T1 already has updates t1 through t6 while the replica is still reading versions t1 through t5), reducing query cancellations on the replica
max_standby_archive/streaming_delay
• How long the standby delays applying WAL that conflicts with a running query before cancelling it
• Not all sessions will see the max delay
hot_standby_feedback
• The replica reports its oldest running query (e.g., a long select * from t1 running on both nodes) back to the primary, which then defers vacuum cleanup of rows that query still needs
pg_stat_database_conflicts
benchdb=> select * from pg_stat_database_conflicts;
datid | datname | confl_tablespace | confl_lock | confl_snapshot | confl_bufferpin | confl_deadlock
-------+-----------+------------------+------------+----------------+-----------------+----------------
12891 | template0 | 0 | 0 | 0 | 0 | 0
16384 | rdsadmin | 0 | 0 | 0 | 0 | 0
1 | template1 | 0 | 0 | 0 | 0 | 0
12896 | postgres | 0 | 0 | 0 | 0 | 0
16394 | benchdb | 0 | 0 | 0 | 0 | 0
32810 | bench2 | 0 | 0 | 1 | 0 | 0
pg_stat_statements
Change parameter shared_preload_libraries = pg_stat_statements
=> create extension pg_stat_statements;
=> select query, calls, total_time, rows, shared_blks_read from
pg_stat_statements where total_time > 100 and query like '%usertable%';
query | calls | total_time | rows | shared_blks_read
-------------------------------------------------------------------------------------------+----------+------------------+------------+-----------------
SELECT * FROM usertable WHERE YCSB_KEY = $1 | 71356782 | 8629119.24887683 | 71356780 | 28779668
SELECT * FROM usertable WHERE YCSB_KEY >= $1 LIMIT ? | 12068394 | 62530609.930002 | 1206839246 | 171093346
UPDATE usertable SET FIELD1=$1 WHERE YCSB_KEY = $2 | 7048967 | 35813107.3580354 | 7048967 | 3825857
analyze usertable; | 1 | 2129.84 | 0 | 15679
SELECT * FROM usertable WHERE YCSB_KEY >= $1 AND md5(YCSB_KEY) = md5(YCSB_KEY) LIMIT ? | 15441280 | 39356905.8080029 | 1544127640 | 230668106
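One useful number derived from the output above: the mean time per call for the primary-key lookup (total_time is in milliseconds):

```python
# From the first pg_stat_statements row above.
total_time_ms = 8629119.24887683
calls = 71356782
mean_ms = total_time_ms / calls
print(round(mean_ms, 3))  # ~0.121 ms per execution
```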
Burst mode: GP2 and T2
T2 – Amazon EC2 instance with burst capability
• Base performance + burst
• Earn credits per hour when below base performance
• Can store up to 24 hours worth of credits
• Amazon CloudWatch metrics to see credits and usage
GP2 – SSD-based Amazon EBS storage
• 3 IOPS per GB base performance
• Earn credits when usage below base
• Burst to 3000+ IOPS
T2 – CPU credits
Burst mode: what’s new
db.t2.large
• 60 CPU Initial Credit
• 36 CPU Credit earned per hour
• Base Performance – 60%
• 8 GB RAM
• Increased IO bandwidth
• Encryption at rest support
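A simplified model of the credit bucket described above (one credit is one vCPU-minute at 100%; the linear accrual and the 24-hour cap from the earlier T2 slide are the only rules modeled):

```python
# Simplified db.t2.large CPU-credit model from the slides: start with 60
# credits, earn 36 per hour while below the 60% baseline, cap at 24 hours
# of earnings (24 * 36 = 864).
def credits_after(hours_idle, initial=60, per_hour=36):
    cap = 24 * per_hour
    return min(initial + hours_idle * per_hour, cap)

print(credits_after(1))   # 96
print(credits_after(48))  # 864: the bucket is full
```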
Burst mode vs. classic vs. Provisioned IOPS
[Chart: 100% read workload on 20 GB of data, transactions per second (0 to 12,000) over 24 hours, for:
db.m1.medium + 200 GB standard – $0.58 per hour
db.m3.medium + 200 GB + 2000 IOPS – $0.40 per hour
db.m3.large + 200 GB + 2000 IOPS – $0.50 per hour
db.t2.medium + 200 GB gp2 – $0.10 per hour
db.t2.medium + 1 TB gp2 – $0.23 per hour
db.t2.large + 1 TB gp2 – $0.30 per hour]
Q & A