This article describes how to setup PostgreSQL Replication and Hot Standby on two Amazon EC2 instances running Ubuntu. I spent good chunk of time learning these tricks in past two weeks, in order to implement High Availability solution for 3DTin database server. The standby node can take over when main node goes down, the way it did last month during electric outage of Amazon EC2′s East coast data centers. 3DTin was down for 6 hours during the outage and since then we have been implementing measures to avoid it from happening again in future. This post may help other web developers to build robust cloud backends as well.
The replication technique used here has become available in PostgreSQL only since version 9.1. There are several blog posts and few books that describe it, but they don’t explain everything that is necessary. I found the explanation of these setups in “PostgreSQL 9 Administration Cookbook” very useful. I tried all the replication recipes described in that book on two VMs on my laptop. After going through 7 different database clusters, each used for testing different recipe, I got a good handle on the things. Today we’ve successfully set up this replication solution on 3DTin production database. With the details fresh in mind I would like to document all the steps for future reference and in the hope that it may also help others, given that the documentation on this topic is not yet abundant.
Setup
The instructions here use Ubuntu 11.10 32-bit server, with PostgreSQL 9.1.4 (Ubuntu 12.04 will do as well with any PostgreSQL version 9.1.x).
We are going to prepare 2 hosts, one master and another standby. If you are using Amazon EC2 like us, you want to run these two hosts in two different Availability zones. That is important if your goal is High Availability. Any accidental downtime is not supposed to affect more than one Amazon EC2 availability zones simultaneously. Therefore if one of the host goes down, you can bring up your application servers in another zone where your standby DB server is running. You can also put them in two different regions, but that will cost you more in networking bandwidth.
Preparation
Install following packages on Ubuntu 11.10/12.04
postgresql-9.1
postgresql-client-9.1
postgresql-contrib-9.1
postgres user account
Installing these packages will automatically create postgres user account.
All following commands are supposed to be executed as postgres user. For security reasons this account doesn’t have any password (like the root account on Ubuntu). If you are working in test environment you can set a password for postgres with
sudo passwd postgres
Then perform all following commands by logging in as postgres
su - postgres
If you are on production server though, you may want to leave postgres passwordless. Instead run all following commands with sudo -u postgres prepended. I am going to omit it from every command for convenience.
Password-less ssh logins
MASTER server should be able to access STANDBY over ssh without password. More accurately – postgres account on MASTER should be able to access postgres account on STANDBY without password. This is needed for the rsync operation in base-backup step (explained below)
You can do it using ssh-copy-id utility.
On master run
ssh-copy-id <IP_OF_STANDBY_SERVER>
You can also specify user in above command, but I deliberately omitted it because it is recommended that on both machines you do these operations under same user account postgres. Not mentioning the user will make the command login to remote machine with same username as you are currently logged in on this machine (or mentioned in sudo -u).
In case of failover, MASTER server and STANDBY server will be switching roles, therefore run ssh-copy-id from current STANDBY server, so that it can access current MASTER server without password too.
Ubuntu specific PostgreSQL conventions
On ubuntu a PostgreSQL database instance is called cluster. Don’t confuse it with multi-node configuration of servers or even a certain SQL keyword. It’s simply an instance of PostgreSQL server that runs on a particular port and saves its data in its own directory (default being /var/lib/postgresql/9.1/<cluster-name> - also set to PGDATA environment variable in various scripts).
Ubuntu comes with pg_createcluster, pg_ctlcluster, pg_lscluster CLI commands to help you administer these clusters. You will use pg_ctlcluster to start, stop, reload the database instance.
Each cluster also has its own set of configuration files stored in /etc/postgresql/9.1/<cluster-name>. We will need to modify only two of them: pg_hba.conf and postgresql.conf.
For keeping things simple, it’s better if you ensure that $PGDATA is same on both MASTER and STANDBY servers. (Side note: On production server $PGDATA is expected to be mounted on a separate volume)
Replication with Streaming Log Shipping and Hot Standby
There are many different configurations that you can follow to achieve replication across two PostgreSQL instances. Each has its own benefits and pitfalls. The technique we are going to use is known as ‘Streaming Log Shipping’ and ‘Hot Standby’ in the Admin cookbook I referred to earlier. This setup seems most optimal in terms of immediate replication (i.e. smallest window of data loss) and minimum networking traffic (hence cheaper).
This configuration is of Master-Slave kind. The Slave or STANDBY server can be used to query the database, but you cannot make any writes to it. This configuration is NOT what is known as “Multi Master” configuration. The Master will accept all read and write queries from your application servers. The Standby server will copy the new modifications from Master with minimal delay. It will only be available for read-only queries while in Standby mode. When the Master goes down or we bring it down for other reasons, we can tell the Standby server to become the Master by creating a touch file (explained later). After this point the ex-Standby server will stop following the original Master and will now be ready to accept read-write queries, it will then be the new Master.
Let’s start
If you already have a PostgreSQL database running on production server, that will be your current MASTER server. But note that we are going to configure both MASTER and STANDBY machines as identically as possible, because you will want to make any one of them MASTER in case another goes down and switch back when it comes back up.
Let’s start with current MASTER running and STANDBY stopped.
On the MASTER create replicator user. The STANDBY server will login to the MASTER using this account to read latest changes that it has to replicate. You should not do this step on current STANDBY server, even if it may sometime become a new Master. That is because during the base backup step the replicator user role will get copied to STANDBY server automatically.
psql -c "CREATE USER replicator SUPERUSER LOGIN CONNECTION LIMIT 1 ENCRYPTED PASSWORD 'changeme';"
Modify pg_hba.conf on both MASTER and STANDBY server by adding this line. This tells respective PostgreSQL instances to accept connection from other node for replication purposes.
host replication replicator <IP_OF_OTHER_HOST>/32 md5
Modify postgresql.conf on both MASTER and STANDBY servers and add following lines to it. (Check if these options are already set to different values elsewhere in the file)
hot_standby = on
max_wal_senders = 1
wal_level = 'hot_standby'
archive_mode = on
archive_command = 'cd .'
listen_addresses = 'localhost,<IP_ADDRESS_OF_THIS_NODE>'
At this point restart MASTER. Do NOT start STANDBY yet
Next we are going to perform base backup from MASTER to STANDBY
BACKUP_LABEL="base-backup"
psql -p $PORT -c "select pg_start_backup('$BACKUP_LABEL');"
rsync -cva --inplace --exclude=*pg_xlog* $PGDATA/ <IP_OF_OTHER_HOST>:$PGDATA/
psql -p $PORT -c "select pg_stop_backup();"
It’s advised that you put this in a bash script so that it can be run quickly and repeatedly without any errors.
On STANDBY create a recovery.conf file in $PGDATA directory and add following lines to it.
standby_mode = 'on'
primary_conninfo = 'host=<IP_OF_OTHER_HOST> port=$PORT user=replicator password=changeme'
trigger_file = '/tmp/postgresql.trigger.$PORT'
Now start the STANDBY server.
At this point you can check the log file (/var/log/postgresql/postgresql-9.1-<cluster-name>.log) to verify things
- On STANDBY look for a message “streaming replication successfully connected to primary”.
- Also you should see “wal receiver process” on STANDBY and “wal sender process” on MASTER.
- Finally run some queries against the STANDBY server to verify it contains same data as MASTER.
If it’s not working, check the firewall rules in Amazon EC2 Security Groups.
How to do Failover/Switchover
Setting up the system for high availability is of no use, if you are going to wait for actual disaster to strike before trying the recovery. Therefore you should try above steps in test environment and simulate disasters.
If MASTER is not down, make sure you stop it first, before you tell STANDBY to take up that role. This is to avoid the MASTER from processing further queries leading to a split-brain problem.
You can turn the STANDBY into a MASTER by simply touching a trigger file that was mentioned in the recovery.conf, /tmp/postgresql.trigger.$PORT.
Now that STANDBY has turned into MASTER, point your application servers to it. Even if your old MASTER is running at this point, the new MASTER is not going to replicate any changes from it. Therefore it is necessary that you stop the old MASTER before you ask the STANDBY to become the new MASTER.
You can tell that STANDBY has become MASTER from the messages in the log that read “archive recovery complete. database system is ready to accept connections.”
How to do Switchback
After some downtime or maintenance period, your master node is back up again and you want to do switchback. You are going to first turn this node into a standby. In this mode it will catch up with the current MASTER replicating the changes that took place while it was down. So we refer to it as current STANDBY now.
Peform Base backup from current MASTER to current STANDBY
BACKUP_LABEL="base-backup"
psql -p $PORT -c "select pg_start_backup('$BACKUP_LABEL');"
rsync -cva --inplace --exclude=*pg_xlog* $PGDATA/ <IP_OF_OTHER_HOST>:$PGDATA/
psql -p $PORT -c "select pg_stop_backup();"
Create recovery.conf in $PGDATA on current STANDBY
standby_mode = 'on'
primary_conninfo = 'host=<IP_OF_OTHER_HOST> port=$PORT user=replicator password=changeme'
trigger_file = '/tmp/postgresql.trigger.$PORT'
After the catch up is over, you can turn the current STANDBY into MASTER by following above switchover procedure – touch the trigger file.
In addition to being an insurance against disasters, the STANDBY server can also be used for load balancing purposes. You can configure the app servers to spread their read queries across the Master and Standby servers. Tools like pgpool provide that kind of facility.
Once you get comfortable setting up a basic two node replication and hot standby scheme, you can move on to advanced configurations too.
If you find any mistakes or suggestions for improvements in this article please leave a comment.