In this laboratory we will implement replication, load balancing, and failover for a PostgreSQL 14 database using the Patroni project.
We will configure three PostgreSQL servers: a primary node and two standby replicas. We will also configure an HAProxy load balancer in front of the PostgreSQL cluster. As part of the Patroni cluster we will set up a three-node etcd cluster. We will perform some basic operations on the cluster to verify the correctness of the replication and high-availability features.
Roles:
- etcd - stores all metadata about the PostgreSQL cluster and, based on quorum, decides who is the master.
- PostgreSQL - three nodes with streaming replication: one master and two slaves.
- HAProxy - load balancer and an additional routing layer that forwards SQL queries to the proper node.
- container - a Docker instance, for example: demo-patroniX, demo-etcdX.
- baseOS - a virtual machine, for example: st10Xvm10X.rtb-lab.pl.
The laboratory is based on a Docker environment. You just need to log into one of the st10Xvm1[01-10].rtb-lab.pl VMs and follow the commands in bash.
To have easy access to the prepared files, clone the labs repository and install the required packages.
git clone https://github.com/RTBHOUSE/mimuw-lab2024L.git
sudo apt install docker docker-compose postgresql-client
Execute the steps below. Go to the lab files directory and start the environment:
cd lab02/files
sudo docker-compose up -d
Please confirm that all containers are up and running.
sudo docker-compose ps
The output should be similar to:
Name Command State Ports
-------------------------------------------------------------------------------------------------------------------------------------------
demo-etcd1 /bin/sh /entrypoint.sh etcd Up
demo-etcd2 /bin/sh /entrypoint.sh etcd Up
demo-etcd3 /bin/sh /entrypoint.sh etcd Up
demo-haproxy /bin/sh /entrypoint.sh haproxy Up 0.0.0.0:5000->5000/tcp,:::5000->5000/tcp, 0.0.0.0:5001->5001/tcp,:::5001->5001/tcp
demo-patroni1 /bin/sh /entrypoint.sh Up
demo-patroni2 /bin/sh /entrypoint.sh Up
demo-patroni3 /bin/sh /entrypoint.sh Up
If the output is different, please raise your hand.
On one of the Patroni containers, demo-patroniX, check the cluster status.
sudo docker exec -ti demo-patroni1 bash
postgres@patroni1:~$ patronictl list
Who is the leader/master?
Now let's check the Patroni configuration:
postgres@patroni1:~$ patronictl show-config
Before you run the next command, please once again check who is the leader.
Imagine that the current master needs to be shut down for some maintenance reason. We will promote one of the replicas to be the new master; the lab uses the failover command for this (a planned change of leader can also be done with the switchover command). Keep in mind that we are still in the same container, demo-patroniX.
patronictl failover
Patroni will ask you for the new master. Example:
postgres@patroni1:~$ patronictl failover
Candidate ['patroni1', 'patroni3'] []: patroni1
Current cluster topology
+---------+----------+------------+--------+---------+----+-----------+
| Cluster | Member | Host | Role | State | TL | Lag in MB |
+---------+----------+------------+--------+---------+----+-----------+
| demo | patroni1 | 172.21.0.4 | | running | 2 | 0 |
| demo | patroni2 | 172.21.0.6 | Leader | running | 2 | 0 |
| demo | patroni3 | 172.21.0.2 | | running | 2 | 0 |
+---------+----------+------------+--------+---------+----+-----------+
Are you sure you want to failover cluster demo, demoting current master patroni2? [y/N]: y
2022-10-13 09:21:50.18813 Successfully failed over to "patroni1"
+---------+----------+------------+--------+---------+----+-----------+
| Cluster | Member | Host | Role | State | TL | Lag in MB |
+---------+----------+------------+--------+---------+----+-----------+
| demo | patroni1 | 172.21.0.4 | Leader | running | 2 | |
| demo | patroni2 | 172.21.0.6 | | stopped | | unknown |
| demo | patroni3 | 172.21.0.2 | | running | 2 | 0 |
+---------+----------+------------+--------+---------+----+-----------+
What is the state of the old master? Who is the new master? What is the lag?
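For scripted maintenance windows, the change of leader can also be done non-interactively with the switchover command. A minimal sketch, assuming the cluster is named demo and patroni3 is a healthy replica (option names vary slightly between Patroni releases, see `patronictl switchover --help`):

```bash
# Planned, prompt-free change of leader (member name is an example);
# run inside a demo-patroniX container, like the failover above
patronictl switchover demo --candidate patroni3 --force
```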
Your next task is to change the PostgreSQL configuration. It's a little bit tricky, because the PostgreSQL config is managed by Patroni. Change max_connections from 100 to 150. Patroni will open the config in vim: press "i" to switch to INSERT mode, use the arrow keys to move to the proper line and change the value. When you finish, press ESC and type ":x" to save the change.
postgres@patroni1:~$ patronictl edit-config
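If you prefer to avoid the interactive editor, the same change can be applied from the command line. A sketch, assuming a Patroni 2.x patronictl and the cluster name demo (check `patronictl edit-config --help` for the exact options in your version):

```bash
# Apply the parameter change without opening vim
patronictl edit-config demo -s postgresql.parameters.max_connections=150 --force
```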
Wait a few seconds and run the below command a few times.
patronictl list
What did you notice?
The change was accepted by the cluster but not applied yet. As max_connections is a critical parameter, it can be changed only during a restart. The below command will restart a single PostgreSQL instance; repeat it for all cluster members.
postgres@patroni1:~$ patronictl restart demo patroniX
It is good practice to restart the master last; see the sketch below.
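A minimal sketch of that sequence, run from inside the container as above and assuming patroni1 is the current leader (verify with `patronictl list` first):

```bash
# Restart the replicas first, then the leader (member names are examples)
patronictl restart demo patroni2 --force
patronictl restart demo patroni3 --force
patronictl restart demo patroni1 --force   # assumed leader, restarted last
```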
Before we move forward, make sure that the cluster is healthy and all nodes have applied the configuration change.
Log off from container.
This part will help you understand the role of the etcd service in a Patroni cluster and what type of data it stores.
Log into the demo-etcdX container:
sudo docker exec -ti demo-etcdX bash
postgres@etcd1:~$ etcdctl member list
How many nodes do you see? Who is the leader?
In the same container, run
postgres@etcd1:~$ curl http://etcd1:2380/members | jq .
Now it's important to run the below commands one by one. At the end you should find the PATH where all metadata about the nodes is stored.
postgres@etcd1:~$ etcdctl ls
postgres@etcd1:~$ etcdctl ls /service
postgres@etcd1:~$ etcdctl ls /service/PATH
When you have found the path to the members, please run:
postgres@etcd1:~$ etcdctl get PATH_TO_NODE/patroniX
What do you see? What type of data is stored in etcd?
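To see these keys in action, you can also watch one of them while the cluster changes state; the leader key is stored next to the members directory you just found. A sketch using the same etcd v2 CLI and the placeholder path from above:

```bash
# Watch the leader key; a failover/switchover shows up as a change here
# (substitute PATH with the directory you discovered with etcdctl ls)
etcdctl watch /service/PATH/leader
```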
That was the last task about etcd; now leave the container and go back to the baseOS.
In this part we will prove that the cluster can survive without one node.
sudo docker-compose ps
It's up to you which one you kill.
sudo docker kill demo-patroniX
Confirm that the node is down:
sudo docker exec -ti demo-patroniX patronictl list
How many nodes do you see? Who is the leader?
Now let's check what happens if we kill two nodes at the same time.
sudo docker kill demo-patroniX
Make sure that you run the below command on a running node. Repeat the command a few times and observe the output, or use the loop sketched below.
sudo docker exec -ti demo-patroniX patronictl list
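Instead of re-running the command by hand, a simple loop on the baseOS does the job; a sketch (replace X with a node that is still up):

```bash
# Print the cluster state every 2 seconds; stop with Ctrl+C
while true; do
  sudo docker exec demo-patroniX patronictl list
  sleep 2
done
```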
What did you notice? How long did it take for the cluster to notice that the second node is down? Who is the leader?
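Note that killing Patroni containers does not touch the etcd cluster, which keeps its quorum; that is why the surviving Patroni node can still hold (or take over) the leader lock. You can confirm etcd's view from the baseOS; a sketch using the etcd v2 CLI:

```bash
# All three etcd members should still report as healthy
sudo docker exec -ti demo-etcd1 etcdctl cluster-health
```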
Start all nodes
sudo docker-compose up -d
This part is dedicated to PostgreSQL replication. We will create a table and confirm that it was replicated.
Find out who is the master:
sudo docker exec -ti demo-patroniX patronictl list
On the master:
sudo docker exec -ti demo-patroniX psql
Now you are in PostgreSQL. Confirm the replication status:
postgres=# select * from pg_stat_replication;
Create a test table:
postgres=# create table mimuw (id int);
postgres=# \dt
Leave psql and return to the baseOS:
postgres=# \q
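As a side note, the same check can be run non-interactively from the baseOS with `psql -c`. A sketch, assuming demo-patroni1 is currently the master (the replay_lag column is available since PostgreSQL 10):

```bash
# One-shot replication overview from the master (container name is an example)
sudo docker exec -ti demo-patroni1 psql -c \
  "select client_addr, state, sync_state, replay_lag from pg_stat_replication;"
```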
Run the below command on one of the slave nodes:
sudo docker exec -ti demo-patroniX psql
postgres=# select pg_is_in_recovery();
The output should look like below:
pg_is_in_recovery
-------------------
t
(1 row)
Confirm that the table exists on all nodes:
sudo docker exec -ti demo-patroniX psql
postgres=# \dt
"
Log off from container.
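As an end-to-end check you can also insert a row on the master and read it back on a slave, all from the baseOS. A sketch; the container names are examples, so check `patronictl list` for the current master first:

```bash
# Write on the master...
sudo docker exec -ti demo-patroni1 psql -c "insert into mimuw values (1);"
# ...and read on a slave; the row should appear almost immediately
sudo docker exec -ti demo-patroni2 psql -c "select * from mimuw;"
```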
## 6. Optional tasks
Use postgres as the password for the postgres user.
You need two bash consoles.
Console 1
```bash
psql -U postgres -p 5000 -h localhost
```
Do not kill this connection.
At the same time, run the commands in console 2.
sudo docker exec -ti demo-patroniX patronictl list
Find out who is the master and kill that container:
sudo docker kill demo-patroniX
Now look at console 1 and check whether your connection is still alive.
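HAProxy also exposes port 5001 (visible in the `docker-compose ps` output earlier). In the typical Patroni demo configuration that port is routed to the replicas while 5000 goes to the master; assuming that setup, you can verify it like this:

```bash
# Should return 't' if port 5001 indeed lands on a standby (password: postgres)
psql -U postgres -p 5001 -h localhost -c "select pg_is_in_recovery();"
```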
As the laboratory uses Docker, you will not find log files on disk. It's much easier to use:
sudo docker logs CONTAINER-NAME
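For live troubleshooting, the standard docker logs flags are handy; a quick sketch:

```bash
# Follow the last 100 log lines of a container (name is an example)
sudo docker logs -f --tail 100 demo-patroni1
```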
In any corner case you can stop/start a container:
sudo docker stop CONTAINER-NAME
sudo docker start CONTAINER-NAME