Grant McAlister – Senior Principal Engineer - RDS
November 2015
Amazon RDS for PostgreSQL
Lessons Learned and Deep Dive on New Features
What’s new in storage
6 TB storage
• PIOPS has a 30K IOPS maximum
• GP2: increasing storage above 3 TB also increases throughput and IOPS
Encryption at rest
• Uses the AWS Key Management Service (KMS), part of AWS Identity and Access Management (IAM)
• Bring your own key, or use a default one
• Includes all data files, log files, log backups, and snapshots
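As a sketch of how this is enabled at instance creation (assuming the boto3 RDS client; the instance identifier, password, and key ARN below are placeholders):

```python
# Sketch: parameters for an encrypted RDS PostgreSQL instance, assuming the
# boto3 RDS client.  Omit KmsKeyId to use the default aws/rds KMS key.
params = {
    "DBInstanceIdentifier": "mydb",          # placeholder name
    "Engine": "postgres",
    "DBInstanceClass": "db.m3.large",
    "AllocatedStorage": 100,                 # GB
    "MasterUsername": "master",
    "MasterUserPassword": "change-me",       # placeholder
    "StorageEncrypted": True,                # covers data files, logs, backups, snapshots
    # "KmsKeyId": "arn:aws:kms:...:key/...", # your own key; default key if omitted
}
# import boto3
# boto3.client("rds").create_db_instance(**params)
```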
[Chart: pgbench read-only, in-memory: transactions per second (TPS, 0 to 45,000) vs. thread count (2 to 64 threads), regular vs. encrypted storage]
Encryption at rest overhead
No measurable overhead
[Chart: pgbench read & write: transactions per second (TPS, 0 to 4,500) vs. thread count (2 to 64 threads), regular vs. encrypted storage]
Encryption at rest overhead
5 to 10% overhead on write-heavy workloads
Unencrypted Snapshot Sharing
[Diagram: a snapshot of a DB instance in the Prod account can be shared with a specific account (e.g., a Test account, which restores it to its own DB instance) or shared publicly]
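A minimal sketch of the sharing call, assuming the boto3 RDS client (snapshot name and account ID are placeholders):

```python
# Sketch: sharing an unencrypted snapshot, assuming the boto3 RDS client.
# The "restore" attribute controls which accounts may restore the snapshot;
# the special value "all" makes it public.
share_with_account = {
    "DBSnapshotIdentifier": "prod-snapshot",   # placeholder snapshot name
    "AttributeName": "restore",
    "ValuesToAdd": ["123456789012"],           # target (test) account ID
}
share_public = dict(share_with_account, ValuesToAdd=["all"])
# import boto3
# boto3.client("rds").modify_db_snapshot_attribute(**share_with_account)
```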
Major version upgrade
[Diagram: pg_upgrade takes Prod from 9.3 to 9.4, with a backup taken before and after the upgrade; no PITR across the version boundary. Recommended flow: restore a backup to a Test instance, run pg_upgrade from 9.3 to 9.4 there, and do application testing first]
RDS Version updates
9.4.5
• Extensions – pgstattuple, pg_buffercache, ip4r
• Autovacuum Logging
• rds_superuser_reserved_connections
• New PLV8 & PostGIS versions
9.3.10
• New PLV8 & PostGIS versions
Additional Logging
rds.force_admin_logging_level
• Logs actions by the RDS admin user in the customer’s databases.
rds.force_autovacuum_logging_level
• Logs auto vacuum worker operations in all databases.
RDS autovacuum logging
log_autovacuum_min_duration = 5000 (i.e. 5 secs)
rds.force_autovacuum_logging_level = LOG
…[14638]:ERROR: canceling autovacuum task
…[14638]:CONTEXT: automatic vacuum of table "postgres.public.pgbench_tellers"
…[14638]:LOG: skipping vacuum of "pgbench_branches" --- lock not available
pgstattuple
sbtest=> select * from pgstattuple('public.sbtest30');
table_len | tuple_count |tuple_percent | dead_tuple_percent | free_percent
-----------+-------------+---------------+--------------------+--------------
234364928 | 1000000 | 91.31 | 2.04 | 3.45
sbtest=> vacuum public.sbtest30;
VACUUM
sbtest=> select * from pgstattuple('public.sbtest30');
table_len | tuple_count | tuple_percent | dead_tuple_percent | free_percent
-----------+-------------+---------------+--------------------+--------------
234364928 | 1000000 | 91.31 | 0 | 5.5
sbtest=> vacuum full public.sbtest30;
VACUUM
sbtest=> select * from pgstattuple('public.sbtest30');
table_len | tuple_count | tuple_percent | dead_tuple_percent | free_percent
-----------+-------------+---------------+--------------------+--------------
221413376 | 1000000 | 96.65 | 0 | 0.3
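The arithmetic behind the output above: plain VACUUM turns dead tuples into free space (dead 2.04% becomes 0, free 3.45% grows to 5.5%) without shrinking the table, while VACUUM FULL rewrites the table and returns space to the filesystem:

```python
# Numbers taken from the pgstattuple output above.
before, after_full = 234364928, 221413376  # table_len before and after VACUUM FULL
reclaimed_mb = (before - after_full) / 1024 / 1024
print(round(reclaimed_mb, 1))  # ~12.4 MB returned to the filesystem
```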
rds_superuser_reserved_connections
max_connections = 20
• With no reservations: 20 application connections
• RDS admin (superuser_reserved_connections = 3): 17 application connections
• rds_superuser (rds_superuser_reserved_connections = 3): 14 application connections
• Applications must connect as a non-rds_superuser role
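The arithmetic on the slide:

```python
# With max_connections = 20, three slots are held back for the internal RDS
# admin user and three more for rds_superuser roles, leaving 14 for
# ordinary application connections.
max_connections = 20
superuser_reserved = 3        # RDS admin user
rds_superuser_reserved = 3    # customer rds_superuser roles
app_connections = max_connections - superuser_reserved - rds_superuser_reserved
print(app_connections)  # 14
```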
pg_buffercache
select case
         when pg_buffercache.relfilenode is null then 'free'
         else 'used'
       end as status,
       round(count(*) / (select pg_settings.setting from pg_settings
                         where pg_settings.name = 'shared_buffers')::numeric, 4) * 100 as pct
from pg_buffercache
group by 1;
status | pct
--------+---------
used | 31.1200
free | 68.8800
(2 rows)
Operating System (OS) metrics
5 second granularity
Coming soon
cpuUtilization
• guest
• irq
• system
• wait
• idle
• user
• total
• steal
• nice
diskIO
• writeKbPS
• readIOsPS
• await
• readKbPS
• rrqmPS
• util
• avgQueueLen
• tps
• readKb
• writeKb
• avgReqSz
• wrqmPS
• writeIOsPS
memory
• writeback
• cached
• free
• inactive
• dirty
• mapped
• active
• total
• slab
• buffers
• pageTable
swap
• cached
• total
• free
tasks
• sleeping
• zombie
• running
• stopped
• total
• blocked
fileSys
• used
• usedFiles
• usedFilePercent
• maxFiles
• total
• usedPercent
loadAverageMinute
• fifteen
• five
• one
uptime
processList
• name
• cpuTime
• parentID
• memoryUsedPct
• cpuUsedPct
• id
• rss
• vss
Data movement
Move data to the same or different database engine
Keep your apps running during the migration
Start your first migration in 10 minutes or less
Replicate within, to, or from AWS EC2 or RDS
[Diagram: an AWS Database Migration Service replication instance sits between a source database on customer premises (with its application users) and an EC2 or RDS target, connected over the Internet or a VPN]
Start a replication instance
Connect to source and target databases
Select tables, schemas, or databases
Let the AWS Database Migration Service create tables and load data
Uses change data capture to keep them in sync
Switch applications over to the target at your convenience
AWS Database Migration Service - PostgreSQL
• Source - on premises or Amazon EC2 PostgreSQL (9.4)
• Destination can be EC2 or RDS
• Initial bulk copy via consistent select
• Uses PostgreSQL logical replication support to provide
change data capture
https://aws.amazon.com/dms/preview/
Loading data
• Disable backups: backup_retention=0
• Disable Multi-AZ and autovacuum
• pg_dump -Fc (compressed); pg_restore -j (parallel)
• Increase maintenance_work_mem
• Increase checkpoint_segments and checkpoint_timeout
• Disable fsync
• Disable synchronous_commit
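A sketch of the dump-and-restore step, assembled here with Python's subprocess module; the host names, database name, dump file, and job count are placeholders:

```python
# Sketch of the suggested load path: a compressed custom-format dump (-Fc)
# restored with parallel jobs (-j).  Commands are built but not executed here.
dump_cmd = ["pg_dump", "-Fc", "-h", "source-host", "-f", "mydb.dump", "mydb"]
restore_cmd = ["pg_restore", "-j", "8", "-h", "target-host", "-d", "mydb", "mydb.dump"]
# import subprocess
# subprocess.run(dump_cmd, check=True)
# subprocess.run(restore_cmd, check=True)
```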
[Chart: 32-thread insert, fsync vs. synchronous_commit: transactions per second (0 to 18,000) with both on, fsync=0, synchronous_commit=0, and both off, at 16 vs. 256 checkpoint segments]
[Chart: bulk load of 2 GB of data, fsync vs. synchronous_commit: load time in seconds (0 to 160) for the same four settings, at 16 vs. 256 checkpoint segments]
Scale and availability
shared_buffers parameter
On an instance with 244 GB of RAM, memory is split between the PostgreSQL processes, shared_buffers, and the Linux page cache, with EBS storage underneath. A select first checks for the buffer in shared_buffers; if it is not there, the page is loaded from the page cache or disk. RDS sets shared_buffers to 1/4 of instance memory by default; ideally shared_buffers covers the working set size.
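The default sizing above works out to:

```python
# RDS defaults shared_buffers to roughly 1/4 of instance memory; on an
# r3.8xlarge with 244 GB RAM that is about 61 GB, leaving the rest for
# the Linux page cache and PostgreSQL backends.
ram_gb = 244
shared_buffers_gb = ram_gb / 4
print(shared_buffers_gb)  # 61.0
```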
[Chart: pgbench write workload on r3.8xlarge, working set = 10% of memory: transactions per second (0 to 12,000) vs. shared_buffers as a percentage of system memory (3% to 75%), for 25 to 800 threads]
[Chart: the same pgbench write workload with working set = 50% of memory: transactions per second (0 to 12,000) vs. shared_buffers from 13% to 75% of system memory, for 25 to 800 threads]
Availability – Read and Write – Multi-AZ
[Diagram: Multi-AZ uses physical synchronous replication from the primary in AZ1 to the standby in AZ2; on failover, a DNS CNAME update repoints clients to the new primary]
Read Replicas = Availability
[Diagram: synchronous Multi-AZ replication between AZ1 and AZ2, plus asynchronous replication to a read replica in AZ3 that can be promoted]
Read Replicas = Scale
[Diagram: read replicas spread across AZ1, AZ2, and AZ3 to scale reads]
Replication parameters
wal_keep_segments
[Diagram: the primary retains WAL segments xlog1 through xlog99; wal_keep_segments controls how many old segments are kept so a lagging replica can still fetch them]
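A quick sizing check, assuming the standard 16 MB WAL segment size and the 99 segments shown in the illustration:

```python
# wal_keep_segments retains that many 16 MB WAL segments on the primary so
# a replica that falls behind can catch up before segments are recycled.
# Retained WAL disk space is simply segments * 16 MB.
wal_segment_mb = 16
wal_keep_segments = 99        # as in the slide's xlog1..xlog99 illustration
retained_mb = wal_keep_segments * wal_segment_mb
print(retained_mb)  # 1584 MB kept on the primary
```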
pg_stat_replication
benchdb=> select * from pg_stat_replication;
-[ RECORD 1 ]----+--------------------------------------------
pid | 40385
usesysid | 16388
usename | rdsrepladmin
application_name | walreceiver
client_addr | 10.22.132.253
client_hostname | ip-10-22-132-253.us-west-2.compute.internal
client_port | 22825
backend_start | 2014-10-29 21:44:58.080324+00
state | streaming
sent_location | 98/7A000900
write_location | 98/7A000900
flush_location | 98/7A000900
replay_location | 98/7A000900
sync_priority | 0
sync_state | async
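The sent/write/flush/replay locations are WAL LSNs of the form "hi/lo" in hex; byte lag between two of them can be derived as sketched below (a reimplementation, for illustration, of what 9.4's pg_xlog_location_diff() computes):

```python
# An LSN like '98/7A000900' is two hex numbers: a high word and a 32-bit
# byte offset.  Replication lag in bytes is the difference of the decoded
# values.
def lsn_to_bytes(lsn):
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)

lag = lsn_to_bytes("98/7A000900") - lsn_to_bytes("98/7A000900")
print(lag)  # 0 bytes: sent and replay match, so the replica is caught up
```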
Replication parameters – continued
vacuum_defer_cleanup_age
max_standby_archive_delay
max_standby_streaming_delay
hot_standby_feedback
vacuum_defer_cleanup_age
• Set on the primary; default is 0
• Unit is a number of transactions
• Defers vacuum cleanup of recently dead row versions (e.g., the source's table T1 already has updates t1 through t6 while the replica is still reading versions t1 through t5), reducing query cancellations on the replica
max_standby_archive/streaming_delay
• How long the standby delays applying WAL that conflicts with a running query before cancelling it
• Not all sessions will see the max delay
hot_standby_feedback
• The replica reports its oldest running query (e.g., a long select * from t1 running on both nodes) back to the primary, which then defers vacuum cleanup of rows that query still needs
pg_stat_database_conflicts
benchdb=> select * from pg_stat_database_conflicts;
datid | datname | confl_tablespace | confl_lock | confl_snapshot | confl_bufferpin | confl_deadlock
-------+-----------+------------------+------------+----------------+-----------------+----------------
12891 | template0 | 0 | 0 | 0 | 0 | 0
16384 | rdsadmin | 0 | 0 | 0 | 0 | 0
1 | template1 | 0 | 0 | 0 | 0 | 0
12896 | postgres | 0 | 0 | 0 | 0 | 0
16394 | benchdb | 0 | 0 | 0 | 0 | 0
32810 | bench2 | 0 | 0 | 1 | 0 | 0
pg_stat_statements
Change parameter shared_preload_libraries = pg_stat_statements
=> create extension pg_stat_statements;
=> select query, calls, total_time, rows, shared_blks_read from
pg_stat_statements where total_time > 100 and query like '%usertable%';
query | calls | total_time | rows | shared_blks_read
-------------------------------------------------------------------------------------------+----------+------------------+------------+-----------------
SELECT * FROM usertable WHERE YCSB_KEY = $1 | 71356782 | 8629119.24887683 | 71356780 | 28779668
SELECT * FROM usertable WHERE YCSB_KEY >= $1 LIMIT ? | 12068394 | 62530609.930002 | 1206839246 | 171093346
UPDATE usertable SET FIELD1=$1 WHERE YCSB_KEY = $2 | 7048967 | 35813107.3580354 | 7048967 | 3825857
analyze usertable; | 1 | 2129.84 | 0 | 15679
SELECT * FROM usertable WHERE YCSB_KEY >= $1 AND md5(YCSB_KEY) = md5(YCSB_KEY) LIMIT ? | 15441280 | 39356905.8080029 | 1544127640 | 230668106
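One useful number derived from the output above: the mean time per call for the primary-key lookup (total_time is in milliseconds):

```python
# From the first pg_stat_statements row above.
total_time_ms = 8629119.24887683
calls = 71356782
mean_ms = total_time_ms / calls
print(round(mean_ms, 3))  # ~0.121 ms per execution
```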
Burst mode: GP2 and T2
T2 – Amazon EC2 instance with burst capability
• Base performance + burst
• Earn credits per hour when below base performance
• Can store up to 24 hours worth of credits
• Amazon CloudWatch metrics to see credits and usage
GP2 – SSD-based Amazon EBS storage
• 3 IOPS per GB base performance
• Earn credits when usage below base
• Burst to 3000+ IOPS
T2 – CPU credits
Burst mode: what’s new
db.t2.large
• 60 CPU Initial Credit
• 36 CPU Credit earned per hour
• Base Performance – 60%
• 8 GB RAM
• Increased IO bandwidth
• Encryption at rest support
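A simplified model of the credit bucket described above (one credit is one vCPU-minute at 100%; the linear accrual and the 24-hour cap from the earlier T2 slide are the only rules modeled):

```python
# Simplified db.t2.large CPU-credit model from the slides: start with 60
# credits, earn 36 per hour while below the 60% baseline, cap at 24 hours
# of earnings (24 * 36 = 864).
def credits_after(hours_idle, initial=60, per_hour=36):
    cap = 24 * per_hour
    return min(initial + hours_idle * per_hour, cap)

print(credits_after(1))   # 96
print(credits_after(48))  # 864: the bucket is full
```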
Burst mode vs. classic vs. Provisioned IOPS
[Chart: 100% read workload on 20 GB of data, transactions per second (0 to 12,000) over 24 hours, for:
db.m1.medium + 200 GB standard – $0.58 per hour
db.m3.medium + 200 GB + 2000 IOPS – $0.40 per hour
db.m3.large + 200 GB + 2000 IOPS – $0.50 per hour
db.t2.medium + 200 GB gp2 – $0.10 per hour
db.t2.medium + 1 TB gp2 – $0.23 per hour
db.t2.large + 1 TB gp2 – $0.30 per hour]
Q & A