How To Configure PostgreSQL 12 Streaming Replication in CentOS 8?


Introduction

PostgreSQL is one of the most popular open-source relational database management systems (RDBMS). It provides a wide range of features such as data integrity, fault-tolerance, and scalability. One of the essential features in PostgreSQL is Streaming Replication.

It allows you to create multiple copies of a PostgreSQL database cluster in near-real-time by continuously streaming the changes from the primary node to the standby nodes. Streaming replication works by streaming the write-ahead log (WAL) data generated by the primary node to one or more standby nodes over a network connection.

Configuration of Master Node for Streaming Replication

Enabling WAL archiving on the Master node

WAL (Write-Ahead Log) is a critical component in PostgreSQL that allows for crash recovery and point-in-time recovery. In order to set up streaming replication, we need to enable WAL archiving on the master node. This will ensure that all transactions committed on the master are written to a designated archive location in addition to being written into the WAL files.

To enable WAL archiving, you'll need to update `postgresql.conf` file with following parameters−

# Enable Archiving 
archive_mode = on archive_command = 'cp %p /var/lib/pgsql/12/archive/%f'  

In this example, we have enabled archiving and specified an archive command that will copy WAL segments from their default directory (`pg_xlog`) to `/var/lib/pgsql/12/archive`. You can adjust this path as necessary.

Configuring pg_hba.conf file to allow replication connections from Slave node(s)

In order for a slave node to connect and replicate data from the master node, we need to configure `pg_hba.conf` file accordingly. You'll need to add an entry in this file specifying that replication connections are allowed from specific IP address(es) or subnet(s) −

# TYPE DATABASE USER ADDRESS METHOD host replication repluser slave_ip/32 md5    

In this example, we have added an entry allowing connections from a specific IP address (`slave_ip`) using `md5` authentication method. Additionally, we have created a new database called `replication`, which is where our replication user will have access.

Creating a replication user on the Master node with appropriate privileges

Now that our master server is configured for streaming replication and our slave is authorized to connect, we need to create a new user on the master server with appropriate privileges. This user will be responsible for making the necessary changes to the master database that will be replicated to slave(s). To create this user, you can use the following SQL command as PostgreSQL superuser −

CREATE ROLE repluser WITH REPLICATION LOGIN PASSWORD 'password';      

In this example, we have created a new role called `repluser` with replication privileges.

We've also specified login capabilities and assigned it a password. Now that our master node is properly configured for streaming replication, we can move on to configuring our slave node(s).

Configuration of Slave Node for Streaming Replication

Setting up recovery.conf file on the Slave node to connect to the Master node

Once you have configured the Master Node for Streaming Replication, it is time to set up your Slave Node(s). The first step in setting up your Slave Node(s) is creating a configuration file named recovery.conf.

This file contains the necessary information for establishing a connection with the Master Node and initiating streaming replication. To create this configuration file, you first need to determine what values need to be set.

In general, there are three required values: standby_mode, primary_conninfo, and restore_command. Standby_mode tells PostgreSQL that this server will function as a standby server and should always stay in replication mode.

Primary_conninfo provides information about how to connect to the Master Node. Restore_command tells PostgreSQL how to retrieve any missing data from archived WAL files.

After determining these values, open a new file named "recovery.conf" in your PostgreSQL data directory (usually /var/lib/pgsql/12/data/). Add these values as key/value pairs in the following format−

standby_mode = 'on' primary_conninfo = 'host= port=5432 user= password=' 
restore_command = 'cp /path/to/archive/%f "%p"' 

Note that %f and %p are special characters representing the name of each archived WAL segment and destination path respectively.

Starting PostgreSQL service on the Slave node to initiate replication process

Once you’ve created your recovery.conf file with necessary parameters filled out correctly it’s time start Postgres service on slave node.

sudo systemctl start postgresql-12.service 
sudo systemctl enable postgresql-12 

At this point, your slave should successfully establish a connection to your master and begin copying the write-ahead log data.

By default, Postgres will run the slave in a continuous loop, checking for new WAL files to apply as they become available on the master. You can check whether replication is working properly by running following query on slave node−

SELECT * FROM pg_stat_replication;  

This should provide you with confirmation that replication is online and working as expected.

Testing and Verifying Streaming Replication Setup

After the configuration of the Master and Slave nodes for streaming replication, it is essential to test and verify that replication is properly set up. To do this, we will use a tool called pg_stat_replication that displays information about the status of streaming replication.

First, let's connect to the Master node using psql−

bash 
$ sudo -i -u postgres psql 

Now execute the following command to display information about streaming replication −

sql postgres=# SELECT client_addr, state, sync_state FROM pg_stat_replication;  

The query above returns information about each connected stream replication client instance. The column 'client_addr' shows the IP address of each Slave node connected to Master node; 'state' shows whether or not a connection is currently being used for streaming replication; and 'sync_state' shows whether or not data is currently being streamed from the Master node to a Slave node.

Checking Replication Status Using pg_stat_replication View

pg_stat_replication view provides valuable insight into monitoring replicas in PostgreSQL 12. When it comes to monitoring PostgreSQL 12 Streaming Replication, it is one of the most critical system views.

Let's find out how we can check if our replica (Slave) server(s) are in sync with our primary (Master) server by making use of pg_stat_replication view. To check if all replicas are in sync with master run−

sql SELECT client_addr AS slave_ip_address, 
state AS connection_state, sync_state AS synchronization_status 
FROM pg_stat_replication; 

This query will show you every replica IP address along with its corresponding connection state and synchronization status.

Testing Failover by Promoting a Slave to Become New Master

A failover process allows switching over when there is something wrong with the current master node. In PostgreSQL 12 Streaming Replication, the failover process must be done manually. This means that the administrator needs to intervene and promote one of the Slave nodes to become a new Master node.

To perform a failover, you need to do the following steps−

  • Disconnect all replication connections from the current Master node.

  • Promote one of the Slave nodes by setting "recovery_target_timeline" in recovery.conf to "latest".

  • Start PostgreSQL service on this promoted Slave node.

After promoting a Slave node as new Master, it is essential to check if all other Slaves are connected to this new Master and receiving data correctly. You can use pg_stat_replication view again to check whether each Slave node is successfully connected or not−

sql SELECT client_addr AS slave_ip_address, 
state AS connection_state, sync_state AS synchronization_status 
FROM pg_stat_replication

Once you are sure that everything is working as expected, you can switch your application's connection settings from old (failed) primary to new primary (promoted Slave).

Conclusion

In this article, we have discussed how to configure PostgreSQL 12 Streaming Replication in CentOS 8 with detailed instructions on setting up both Master and Slave nodes. We have also learned how to test replication setup and monitor replica servers using pg_stat_replication view. We have covered how to perform a failover by promoting one of the slave nodes as a new master manually.

Updated on: 11-Jul-2023

555 Views

Kickstart Your Career

Get certified by completing the course

Get Started
Advertisements