When clients have high availability requirements for their Plone site, we recommend using the RelStorage implementation for the ZODB, combined with the PostgreSQL database platform. When we deploy this solution, we use ZFS on FreeBSD. Until recently, this solution relied on rsync to synchronize the PostgreSQL data directory from the primary server to the secondary server. The problem with this approach is that rsync has to scan the entire data directory to find changed files. We decided to take advantage of the ZFS snapshot replication features to make this step more efficient.
Since an incremental stream between two ZFS snapshots contains exactly the changes made to the PostgreSQL data directory, we wrote a script that will:
- Stop PostgreSQL on the local secondary server
- Notify the primary PostgreSQL server that a backup is commencing
- Take a new ZFS snapshot
- Initiate an incremental ZFS replication stream
- Munge the
- Notify the primary PostgreSQL server that the backup is finished
- Re-start PostgreSQL on the local secondary server
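The steps above can be sketched roughly as follows. This is a hypothetical outline, not the actual script: the variable names, the placeholder address, the snapshot labels, and the `run` dry-run helper are all assumptions made for illustration. The munge step is left out because the details are not given here. The function names `pg_start_backup` and `pg_stop_backup` are the ones used in PostgreSQL 9.x.

```shell
#!/bin/sh
# Hypothetical sketch of the listed steps; NOT the author's reset_secondary.sh.
# Every default below is an assumption chosen for illustration.

PRIMARY="${PRIMARY:-192.0.2.10}"                 # placeholder address of the primary
DATASET="${DATASET:-data/pgsql}"                 # ZFS dataset holding the data directory
PG_USER="${PG_USER:-replicator}"                 # role allowed to call the backup functions
SSH_KEY="${SSH_KEY:-/home/zfssync/.ssh/id_rsa}"  # password-less key for zfssync
DRY_RUN="${DRY_RUN:-1}"                          # 1 = only print the commands

# Print the command in dry-run mode, otherwise execute it through sh.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "+ $*"
    else
        sh -c "$*"
    fi
}

# $1 = last snapshot both servers already share, $2 = snapshot to create now
reset_secondary() {
    prev="$1"
    new="$2"
    remote="ssh -i $SSH_KEY zfssync@$PRIMARY"

    # Stop PostgreSQL on the local secondary
    run "service postgresql stop"
    # Tell the primary a backup is commencing (second argument requests a fast checkpoint)
    run "psql -h $PRIMARY -U $PG_USER -d postgres -c \"SELECT pg_start_backup('$new', true)\""
    # Take a new snapshot on the primary
    run "$remote zfs snapshot $DATASET@$new"
    # Stream only the changes since the previous snapshot into the local dataset
    run "$remote zfs send -i $DATASET@$prev $DATASET@$new | zfs recv -F $DATASET"
    # Tell the primary the backup is finished
    run "psql -h $PRIMARY -U $PG_USER -d postgres -c \"SELECT pg_stop_backup()\""
    # Restart PostgreSQL on the local secondary
    run "service postgresql start"
}
```

With `DRY_RUN` left at its default of 1, calling `reset_secondary init-secondary sync-20140106` only prints the six commands, which is a cheap way to review them before running anything against real servers.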
There are a few moving parts to this solution, so you'll need to do some legwork before you can use the script.
Operating System User
In order to take a ZFS snapshot on the remote primary server and initiate replication back to the secondary, you need an operating system user set up with a password-less SSH key (on the primary database server):
$ sudo pw groupadd -n zfssync -g 6000
$ sudo pw useradd -n zfssync -u 6000 -g 6000 -m
$ sudo -H -u zfssync ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/home/zfssync/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/zfssync/.ssh/id_rsa.
Your public key has been saved in /home/zfssync/.ssh/id_rsa.pub.
The key fingerprint is:
48:da:0d:68:2e:66:96:ee:d8:ba:fc:6d:a3:b6:dd:8d email@example.com
The key's randomart image is:
+--[ RSA 2048]----+
| |
| . |
| o o |
| + + + |
| `* o o S |
| = . |
| . |
|.+ .oo. o |
|++=++o.E . |
+-----------------+
# Copy /home/zfssync/.ssh to your secondary database server
$ sudo zfs allow -u zfssync create,mount,snapshot,send,receive,hold data/pgsql
The calls to pg_start_backup and pg_stop_backup require elevated privileges in PostgreSQL. Here's how to set up a PostgreSQL database user with the replication privilege:
$ psql -U pgsql postgres
psql (9.3.2)
Type "help" for help.

postgres=# create user replicator with replication;
Then allow that user to connect by adding an entry like this to pg_hba.conf on the primary:
host all replicator 10.12.2.0/24 trust
You can use md5 for the METHOD, but you must then set up the .pgpass file for your operating system user.
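If you go the md5 route, a .pgpass entry could be created along these lines. This is a minimal sketch: the `write_pgpass` helper, the port 5432, and whatever host and password you pass in are assumptions, not values from our setup. Each line of the file has the form hostname:port:database:username:password, and libpq ignores the file unless group and others have no access to it.

```shell
#!/bin/sh
# Sketch: append a .pgpass entry for the replicator role.
# write_pgpass is a hypothetical helper; host, port and password are placeholders.
# Note: a password containing ':' or '\' must be backslash-escaped in the file.

# $1 = path to the .pgpass file, $2 = primary host, $3 = password
write_pgpass() (
    # Subshell body, so the umask change does not leak into the caller
    umask 077
    # "*" in the database field matches any database
    printf '%s:5432:*:replicator:%s\n' "$2" "$3" >> "$1"
    # Make sure an already existing file is also restricted to its owner
    chmod 600 "$1"
)
```

For example, `write_pgpass "$HOME/.pgpass" <IP of primary> <password>`, run as the operating system user that will execute the script.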
Initial Snapshot Replication
In order to take advantage of efficient incremental replication, the secondary database server must first transfer an initial snapshot of the ZFS filesystem holding the current PostgreSQL data directory (this builds on the first requisite above, to be run on the secondary server):
$ sudo ssh -i /home/zfssync/.ssh/id_rsa zfssync@<IP of primary> zfs snapshot data/pgsql@init-secondary
$ sudo ssh -i /home/zfssync/.ssh/id_rsa zfssync@<IP of primary> zfs send -Rv data/pgsql@init-secondary | sudo zfs recv -Fv data/pgsql
Coup de grâce
The reset_secondary.sh script (and ancillary config file info) is on GitHub: https://gist.github.com/davidblewett/8282108 . Its usage is pretty simple (on the secondary):
$ sudo /path/to/reset_secondary.sh <IP of primary>
After the script has run, the secondary server will be running off of an up-to-date ZFS snapshot. If you use the config file info in the gist, it will then continue to use PostgreSQL's built-in streaming replication to keep itself up to date.
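One quick way to confirm that the secondary came back up as a standby is to ask PostgreSQL whether it is in recovery. The `is_standby` helper below is a hypothetical sketch, and it assumes the local superuser is `pgsql`, as in the psql session earlier. `pg_is_in_recovery()` returns true on a server that is replaying WAL from a primary.

```shell
#!/bin/sh
# Sketch: succeed only if the local PostgreSQL server is running as a standby.
# is_standby is a hypothetical helper; the pgsql superuser name is an assumption.

is_standby() {
    # -A (unaligned) and -t (tuples only) make psql print just "t" or "f"
    [ "$(psql -U pgsql -d postgres -At -c 'SELECT pg_is_in_recovery()')" = "t" ]
}
```

It can be used as `is_standby && echo "secondary is replicating"` right after the reset script finishes.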