slony, still a usefull replication tool

Slony, still a usefull replication tool

Who's Guillaume Lelarge?

● French translator of the PostgreSQL manual● Member of pgAdmin's team● Vice-treasurer of PostgreSQL Europe● CTO of Dalibo

● Mail: [email protected]● Twitter: g_lelarge● Blog: http://blog.guillaume.lelarge.info

mailto:[email protected]

Agenda

● Intro● Installation● Configuration● Examples● Maintenance● Tools● The future?

What's Slony?

● Replication system● One master● Multiple slaves● Asynchronous● By trigger● Nodes cascade

● Slaves available● Read-only queries on replicated tables● Read/write for everything else

Developers

● Mostly Affilias staff● Jan Wieck● Christopher Browne● Steve Singer

● And also a lot of other contributors● Devrim Gunduz (RPM packages)

More detail

● BSD license● Written mostly in C

● altperl tools are obviously in Perl

Use cases

● High-availability● Pooling may be done with pgpool or pgbouncer● Load balancing may be done with pgpool

● PostgreSQL upgrade● Cascade replication

● Mostly to reduce load on the master

Concepts

● Trigger based replication● Slon daemon● Slonik language

Concepts - Triggers

● Catches data changes on the master● Prevents data changes on the slaves● Allows finer grain in the replication

● Table per table

● Issue 1: doesn't catch schema change● Issue 2: doesn't catch TRUNCATE

● At least until PostgreSQL 8.4, and Slony 2.1

Concepts - Cluster

● Set of PostgreSQL instances available in the replication

● Has a name

Concepts - Node

● Each PostgreSQL instance is a node● Each node is declared in the cluster● A node has a specific schema containing Slony

catalog tables● Its name is _<clustername>

Concepts – Replication set

● A replication set is a set of tables that will be replicated

● A set has one master node, and one or more slaves node

● You can have several sets in the same cluster● They can have different master node, and

different slaves● Tables of the set will atuomatically acquire

triggers

Concepts – Master/Slave

● Master node● Origin● Provider

● Slave node● Receiver● Subscriber

● It's just vocabulary!

Concepts – slon daemon

● One slon per node● Handles

● Configuration events● SYNC events

●

Concepts – slonik language

● Little language● Handles

● adding and removing nodes● adding and removing sets● merging sets● adding and removing subscriptions● switchover / failover● etc.

Limitations

● Large Objects● DDL commands

● Logical structure (schema, table, ...)● Physical structure (tablespace)● Users and roles

Two branches

● 1.2● Works with PostgreSQL 7.3 to 9.1

● 2.0● Works with PostgreSQL 8.3 to 9.1

Installation

● Prerequisites● Sources

● Download● Configure● Compile● Install

● Binary● RPM/DEB● Windows

Sources - Prerequisites

● You need pg_config

Sources - Download

● http://slony.info/downloads● Choose the branch● Choose “source”● Grab one of the tar files●

● For example:

wget http://slony.info/downloads/2.0/source/slony1-2.0.7.tar.bz2

Sources - Extract

● Quite simple:tar xvfj slony1-2.0.7.tar.bz2

Sources - Configure

● ./configure options● --prefix to specify the installation directory● --with-pgconfigdir to specify the location of the

pg_config tool● --with-perltools to add perl tools● And a few more less interesting

● Example./configure \ --with-perltools \ --with-pgconfigdir=/opt/postgresql-9.1/bin \ --prefix=/opt/slony-2.0

Sources – Compile

● Quite simplemake

● If everything went fine:All of Slony-I is successfully made. Ready to install

Sources - Install

● Quite simple tooSudo make install

● If everything went fine:All of Slony-I is successfully installed

Binary - RPM

● Available on http://yum.postgresql.org/● For example:# wget http://yum.postgresql.org/9.1/redhat/rhel-5-x86_64/pgdg-sl91-9.1-6.noarch.rpm# dpkg -i pgdg-sl91-9.1-6.noarch.rpm# yum update# yum install slony1-91-2.0.7

http://yum.postgresql.org/

Binary - DEB

● Two packages to install:● slony1-2-bin (slon daemon)● postgresql-8.4-slony1 (two PostgreSQL

librairies)

● Example:# aptitude install slony1-2-bin postgresql-8.4-slony1

Binary - Windows

● With StackBuilder

Prerequisites

● Same encoding on the master and the slaves● Time synchronization

Some examples

● Three examples● Different number of slaves

● All with the pgbench schema● Packages already installed:

● PostgreSQL 9.1● Slony 2.0

Add a PostgreSQL user

● Slon connects to the database● It's better to have a specific PostgreSQL user● Must be a superuser● Should have a password● Simple command:

CREATE USER slony PASSWORD 'supersecret';

Add a .pgpass file

● Contents:10.0.0.1:5432:b1:slony:supersecret10.0.0.2:5432:b1:slony:supersecret

● Don't forget the “chmod 600 .pgpass”!

Create the master database

● We use pgbench● to add tables to the database● and to populate them

# createdb b1# pgbench -i b1

Example 1:One master, and one slave

● Master is on 10.0.0.1● Slave is on 10.0.0.2

All tables should have a primary key

● Only way to select safely a row● Let's see the tables with no PKEY:

SELECT relname FROM pg_classWHERE relkind='r' AND NOT relhaspkey;

relname----------------- pgbench_history

● Add a PKEY to the pgbench_history tableALTER TABLE pgbench_history ADD COLUMN id serial;ALTER TABLE pgbench_history ADD PRIMARY KEY (id);

Install PL/pgsql

● Only for PostgreSQL less than 9.0CREATE LANGUAGE plpgsql;

We need PostgreSQL to listen to incoming connections

● So we have to configure postgresql.conf● On the master● And on the slave

listen_addresses = '*'

● And then restart both PostgreSQL servers/etc/init.d/postgresql restart

... and a way to connect both ways

● So we have to configure pg_hba.conf● On the master● And on the slave

host b1 slony 10.0.0.1/32 md5host b1 slony 10.0.0.2/32 md5

● And then reload both PostgreSQL servers/etc/init.d/postgresql reload

Now, create the schema on the slave

● We'll use pg_dump to dump the schema● Option -s

● And psql to restore it on the slave● Option -h to target the slave

pg_dump -s b1 | psql -h 10.0.0.2 b1

Write the slonik script

● We need to write a little script● It helps Slony know what to do● Compose of a few steps

● Declare cluster, and nodes● Initialize cluster● Create set and add tables to set● Add node, and paths● Subscribe slave node to set

Slonik script - declaration

● Declare clustercluster name = 'repli1';

● Declare nodesnode 1 admin conninfo='dbname=b1 host=10.0.0.1 user=slony';node 2 admin conninfo='dbname=b1 host=10.0.0.2 user=slony';

Slonik script – initialize cluster

● Tells Slony which node is master● Creates a schema in the database

init cluster (id=1, comment='Paris');

Slonik script – create set

● Create a setcreate set (id=1, origin=1, comment='All tables and sequences from b1');

Slonik script – fill set

● Add all tables to the setset add table (set id=1, origin=1, id=1, fully qualified name = 'public.pgbench_accounts', comment = 'Accounts table');set add table (set id=1, origin=1, id=2, fully qualified name = 'public.pgbench_branches', comment = 'Branches table');set add table (set id=1, origin=1, id=3, fully qualified name = 'public.pgbench_tellers', comment = 'Tellers table');set add table (set id=1, origin=1, id=4, fully qualified name = 'public.pgbench_history', comment = 'History table');

● Add all sequences to the setset add sequence (set id=1, origin=1, id=1, fully qualified name = 'public.pgbench_history_id_seq', comment = 'History sequence');

Slonik script – add node

● Add a slavestore node (id=2, comment='Amsterdam', event node=1);

● Add paths between master and slavestore path (server=1, client=2, conninfo='dbname=b1 host=10.0.0.1 user=slony');store path (server=2, client=1, conninfo='dbname=b1 host=10.0.0.2 user=slony');

Slonik script – subscribe node

● Subscribe slave node (id 2) to set (1, all tables)subscribe set (id = 1, provider = 1, receiver = 2, forward = no);

Execute the script

● Put the script in a single filevim script_repli_1

● And execute it with slonikslonik script_repli_1

What happened?

● Let's look into the database● New schema _repli1● Lots of tables in that schema● Triggers on replicated tables

The new schema

● Its name depends on the cluster name● _repli1 for this example

● Dictionnary tables● sl_node, sl_path, sl_tables, sl_sequences, ...

● Work tables● sl_log_1, sl_log_2, sl_event, ...

The new triggers

b1=# \d pgbench_branches Table "public.pgbench_branches" Column | Type | Modifiers ----------+---------------+----------- bid | integer | not null bbalance | integer | filler | character(88) | Indexes: "pgbench_branches_pkey" PRIMARY KEY, btree (bid)

Triggers: _repli1_logtrigger AFTER INSERT OR DELETE OR UPDATE ON pgbench_branches FOR EACH ROW EXECUTE PROCEDURE _repli1.logtrigger('_repli1', '2', 'k')

Disabled triggers: _repli1_denyaccess BEFORE INSERT OR DELETE OR UPDATE ON pgbench_branches FOR EACH ROW EXECUTE PROCEDURE _repli1.denyaccess('_repli1')

Configuration of the daemon

● File slon.conf● Sample one in $srcdir/share● You need at least

● cluster_name● conn_info

● You can also use mkslonconf.sh

Execute the daemon

● On the master/opt/slony-2.0/bin/slon repli1 “host=10.0.0.1 user=slony dbname=b1”

● On the slave/opt/slony-2.0/bin/slon repli1 “host=10.0.0.2 user=slony dbname=b1”

● You can also use start_slon.sh

And it works!

● Each (data) change on the master will be recorded by the trigger in the log tables

● And the slon daemon will push them to the slave

● Each (data) change on the slave will be denied by the trigger

Use, and maintenance

● How to add a table● How to remove a table● How to add another PostgreSQL object● How to add a new slave● How to do a switchover● How to do a failover● How to monitor

Add a table

● Add the new table to every slave● Create a new set for the new table● Add the new table to the new set● Subscribe the slave to the new set● Wait for every events to be confirmed● Merge the new set with the old one

Add a table – step 0

● On the master:CREATE TABLE t1( c1 serial PRIMARY KEY, c2 text);

● Maybe add some data to it:INSERT INTO t1 SELECT i, 'Row '||i FROM generate_series(1, 1000) AS i;


● Add the new table to the slavepg_dump -s b1 -t t1 | psql -h 10.0.0.2 b1


● We need a new slonik script with the preambulecluster name = 'repli1';node 1 admin conninfo='dbname=b1 host=10.0.0.1 user=slony';node 2 admin conninfo='dbname=b1 host=10.0.0.2 user=slony';

● Now, create a new set for the new table and sequence

create set (id=2, origin=1, comment='Temporary set to add the new table from b1');


● We add the table and sequence to the new setset add table (set id=2, origin=1, id=5, fully qualified name = 'public.t1', comment = 't1 table');set add sequence (set id=2, origin=1, id=2, fully qualified name = 'public.t1_c1_seq', comment = 't1 sequence');

● Be careful● Set's id must be max of all sets' id● The same for table's id and sequence's id● Quick way to find out

SELECT max(set_id)+1 AS new_id FROM _repli1.sl_set;


● Subscribe slave node (id 2) to the new set (2)subscribe set (id = 2, provider = 1, receiver = 2, forward = no);


● Now, we have to force a pause until all replication events are confirmed by the slave

wait for event (origin = 1, confirmed = all, wait on = 1);


● Now, we can merge new set into old onemerge set (id = 1, add id = 2, origin = 1);

Add a table – Execute the script

● Put the script in a single filevim script_repli_2

● And execute it with slonikslonik script_repli_2

● Now, the new table is replicated

Drop a table

● Drop the table from the replication● Then drop the table

Drop a table – step 1

● Another slonik script, but much simplercluster name = 'repli1';node 1 admin conninfo='dbname=b1 host=10.0.0.1 user=slony';node 2 admin conninfo='dbname=b1 host=10.0.0.2 user=slony';set drop table (id = 5, origin = 1);

● And execute the scriptslonik script_repli_3

● The table is removed from the replication set

Drop a table – step 2

● We now need to drop it from the database● We can do it manually on each node● Or via the EXECUTE SCRIPT slonik command

cluster name = 'repli1';node 1 admin conninfo='dbname=b1 host=10.0.0.1 user=slony';node 2 admin conninfo='dbname=b1 host=10.0.0.2 user=slony';execute script (set id = 1, filename = 'script.sql', event node = 1);

● In script.sql, we put the DROP TABLEDROP TABLE t1;

Add a column to a table

● Adding a column to a replicated table● Each node must have that row at the same time

● So you can't do it manually● You must use slonik● Also works for renaming a column, or dropping

it● Don't drop the primary key!

Add a column – step 1

● Create a SQL script file that drops the rowALTER TABLE t1 DROP COLUMN c2;

Add a column to a table – step 2

● Create a slonik script file that executes the script on all the nodes

cluster name = 'repli1';node 1 admin conninfo='dbname=b1 host=10.0.0.1 user=slony';node 2 admin conninfo='dbname=b1 host=10.0.0.2 user=slony';execute script (set id = 1, filename = 'drop_column_script.sql', event node = 1);

● And execute the scriptslonik script_repli_4

Add another PostgreSQL object

● Add them manually● Or, much better, with the EXECUTE SCRIPT● And that's all

Add a new slave

● Not fundamentally different than what we already did

● Few steps● Restore schema in new slave● Create a slonik script to add the new node, and

subscribe to the set● And we're done!

Add a new slave – step 1

● Reconfigure pg_hba.conf on all existing nodeshost b1 slony 10.0.0.1/32 md5host b1 slony 10.0.0.2/32 md5host b1 slony 10.0.0.3/32 md5

● Reload PostgreSQL configuration on all existing nodes

/etc/init.d/postgresql reload

● Add the slony PostgreSQL user● Restore the schema (from master to the new

node)pg_dump -s b1 | psql -h 10.0.0.3 b1

Add a new slave – step 2

● Create a slonik scriptcluster name = 'repli1';node 1 admin conninfo='dbname=b1 host=10.0.0.1 user=slony';node 2 admin conninfo='dbname=b1 host=10.0.0.2 user=slony';store node (id=3, comment='Stuttgart', event node=1);store path (server=1, client=3, conninfo='dbname=b1 host=10.0.0.3 user=slony');store path (server=2, client=3, conninfo='dbname=b1 host=10.0.0.3 user=slony');store path (server=3, client=1, conninfo='dbname=b1 host=10.0.0.1 user=slony');store path (server=3, client=2, conninfo='dbname=b1 host=10.0.0.2 user=slony');subscribe set (id = 1, provider = 1, receiver = 2, forward = no);

Switchover

● We want the slave to become the master node● But the master node is still alive, and we want it

to become a slave node● This is switchover● More exactly, we want to move the origin of a

set

switchover – slonik script

● So, we need yet another slonik scriptcluster name = 'repli1';node 1 admin conninfo='dbname=b1 host=10.0.0.1 user=slony';node 2 admin conninfo='dbname=b1 host=10.0.0.2 user=slony';lock set (id = 1, origin = 1);move set (id = 1, old origin = 1, new origin = 2);wait for event (origin = 1, confirmed = 2, wait on = 1);

● Right after, node 2 is master● And node 1 is slave

failover

● Master node died● We want the slave to become the master node● This is failover● More exactly, we want to move the origin of a

set when the current origin is dead

failover – slonik script

● So, we need yet another slonik scriptcluster name = 'repli1';node 1 admin conninfo='dbname=b1 host=10.0.0.1 user=slony';node 2 admin conninfo='dbname=b1 host=10.0.0.2 user=slony';failover (id = 1, backup node = 2);drop node (id = 1, event node = 2);

● Right after, node 2 is master● And node 1 is gone

Monitoring

● With Slony, two things to check● Lag● Schema changes

● One great tool● check_postgres.pl

Monitoring lag – alerts

● Checking lag./check_postgres.pl --action slony_status --host=192.168.56.101,192.168.56.102 --dbname=b1 --dbuser=slony --warning 60 --critical 120

POSTGRES_SLONY_STATUS OK: DB "b1" schema:_repli1 (host:192.168.56.101) Slony lag time: 10 (10 seconds) DB "b1" schema:_repli1 (host:192.168.56.102) Slony lag time: 11 (11 seconds) | time=0.04s 'b1._repli1 Node 1(Paris) -> Node 2(Amsterdam)'=10;60;300 time=0.07s 'b1._repli1 Node 2(Amsterdam) -> Node 1(Paris)'=11;60;300

Monitoring lag – graph

Monitoring schema changes

● Checking schema changes./check_postgres.pl --action same_schema --host=192.168.56.101,192.168.56.102 --dbname=b1 --dbuser=slony --filter ...

<todo>

Tools

● Two main tools to ease your life● altperl● slony1-ctl

● Allow you to skip writing slonik script

Tools - altperl

● Perl scripts● Available with slony

● Follow closely each Slony release

● --with-perltools configure option● Installed by default in $installdir/bin

Tools – altperl example

● Create slon_tools.conf fileif ($ENV{"SLONYNODES"}) { require $ENV{"SLONYNODES"};} else { $CLUSTER_NAME = 'repli1'; add_node(node => 1, host => '192.168.56.101', dbname => 'b1', port => 5432, user => 'slony'); add_node(node => 2, host => '192.168.56.102', dbname => 'b1', port => 5432, user => 'slony');}$SLONY_SETS = { "set1" => { "set_id" => 1, "table_id" => 1, "sequence_id" => 1, "pkeyedtables" => [ 'public.pgbench_accounts', 'public.pgbench_branches', 'public.pgbench_tellers', 'public.pgbench_history'], "sequences" => ['public.pgbench_history_id_seq1'], },};if ($ENV{"SLONYSET"}) { require $ENV{"SLONYSET"}; }

Tools – altperl example

● Use the slonik perl tools, and slonikslonik_init_cluster | slonikslonik_create_set 1 | slonikslonik_subscribe_set 1 2 | slonik

● And that's all

Tools - slony1-ctl

● Bash scripts● Available on pgfoundry

● http://pgfoundry.org/projects/slony1-ctl

● Simple scripts that makes it easier to replicate a complete database

Tools – slony1-ctl config files

● slony_include.h● Paths for directories, and files

● bases.h● List of nodes with cluster name, database, host,

port

● relations.h● List of sets, masters, and slaves

● slon.cfg● Config file for the slon daemon

Tools – slony1-ctl scripts

● Add a new slony replication● create_struct.sh● 01_create_init.sh● 02_exec_init.sh

● Switchover● 82_exec_switch.sh

● Failover● 92_exec_fail.sh

● And a lot more (more than 30 scripts!)

What's next?

● Slony 2.1 (in RC since August, 30th)● Add TRUNCATE trigger support● Add column replication support● Add application name support● Add monitoring thread● New features for altperl tools (bulk adding of

tables and sequences)● Etc...

Conclusion

● Slony is not dead● It offers features that PostgreSQL doesn't have

● Fine-grain replication● Slaves can have other objects● No limitation on PostgreSQL version, or

architecture

● It also has limitations● No DDL replication● No large object replication

slony, still a usefull replication tool

Documents