Overview
Patroni and ETCD make a great combination for a PostgreSQL cluster solution. In a nutshell: Patroni manages the cluster, while ETCD stores the configuration data. You can see a diagram of how they interact with PostgreSQL below:
So let's get started with the configuration of ETCD first. To configure ETCD, we need to:
- Install the package
- Configure the cluster
Both tasks are fairly simple:
Setup ETCD
Let's assume we have three nodes (as in the picture above):
- node00 (10.100.10.200)
- node01 (10.100.10.201)
- node02 (10.100.10.202)
I like starting from “0”, I know. To install the package, simply use the package provider: yum, apt-get, etc.:
Install RPM
Install ETCD package
node02:$ yum -y install etcd
node02: --> Running transaction check
node02: ---> Package etcd.x86_64 0:3.3.11-2.el7.centos will be installed
node02: --> Finished Dependency Resolution
node02:
node02: Dependencies Resolved
node02:
node02: ================================================================================
node02:  Package       Arch           Version                    Repository      Size
node02: ================================================================================
node02: Installing:
node02:  etcd          x86_64         3.3.11-2.el7.centos        extras          10 M
node02:
node02: Transaction Summary
node02: ================================================================================
node02: Install  1 Package
node02:
node02: Total download size: 10 M
node02: Installed size: 45 M
node02: Downloading packages:
node02: Running transaction check
node02: Running transaction test
node02: Transaction test succeeded
node02: Running transaction
node02:   Installing : etcd-3.3.11-2.el7.centos.x86_64                          1/1
node02:   Verifying  : etcd-3.3.11-2.el7.centos.x86_64                          1/1
node02:
node02: Installed:
node02:   etcd.x86_64 0:3.3.11-2.el7.centos
node02:
node02: Complete!
You need to do that on all nodes.
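If you use the Vagrant setup from the appendix, it also installs Ansible with an inventory group called patroni, so instead of logging into each node you could push the package to all three in one shot; a sketch, assuming that inventory is in place:

# Install etcd on every host in the "patroni" inventory group
ansible patroni -m yum -a "name=etcd state=present"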
Configure the cluster
To configure the cluster, use the following script:
Configure ETCD
mkdir -p /db/etcd
host=`hostname`
cat > /etc/etcd/etcd.conf <<EOF
name: ${host}
data-dir: /db/etcd
initial-advertise-peer-urls: http://${host}:2380
listen-peer-urls: http://0.0.0.0:2380
listen-client-urls: http://0.0.0.0:2379
advertise-client-urls: http://${host}:2379
initial-cluster: "node00=http://node00:2380,node01=http://node01:2380,node02=http://node02:2380"
EOF
mkdir -p /var/log/postgresql
Again, you need to run that on all three nodes. After that, we can start ETCD.
Start the ETCD
Start ETCD
[root@node00 ~]# etcd --config-file /etc/etcd/etcd.conf &>/var/log/postgresql/etcd.log &
[1] 5536
[root@node00 ~]#
Again, on all nodes. After that, we can check the status of the cluster:
Check the status
Check status
[root@node00 postgresql]# etcdctl cluster-health
member 4fd4358d80b0e45e is healthy: got healthy result from http://node00:2379
member 825e916a3f48ca3f is healthy: got healthy result from http://node01:2379
member ce1dafd3ad7d9971 is healthy: got healthy result from http://node02:2379
cluster is healthy
[root@node00 postgresql]#
If you use API v3, you have to use the new command:
Check status
[root@lpara etcd]# etcdctl --cluster=true endpoint health
http://lpara:2379 is healthy: successfully committed proposal: took = 5.287583ms
http://lparb:2379 is healthy: successfully committed proposal: took = 6.793576ms
[root@lpara etcd]#
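Note that the etcdctl shipped with etcd 3.3 still talks to the v2 API by default, so you may have to select v3 explicitly before the v3 commands will work:

# Switch etcdctl to the v3 API for this shell session
export ETCDCTL_API=3
etcdctl endpoint health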
You can also “TEST” the cluster by publishing messages (key-value pairs):
Test ETCD
#On 2nd node:
[root@node01 ~]# etcdctl put ha-cookbook-2 "Hello World"
OK
[root@node01 ~]#

#On 1st node:
[root@node00 etcd]# etcdctl get ha-cookbook-2
ha-cookbook-2
Hello World
[root@node00 etcd]#
So we have our key-value pair propagated across the cluster, nice…
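If you want to watch the propagation live, etcdctl (v3 API) can also watch a key for changes; run the watch on one node and a put on another:

# On the 1st node, block and print every change to the key:
etcdctl watch ha-cookbook-2
# On the 2nd node, update it and watch the event appear on the 1st:
etcdctl put ha-cookbook-2 "Hello again"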
Setup Patroni
Patroni is really the brain behind the cluster: it manages the nodes and coordinates with the distributed key store (in our case ETCD, although other solutions are possible as well: Consul, K8s, etc.). Patroni is a fork of Governor, and its job is to manage the cluster. With Patroni you can also reinitialize a node, as we will see below. To sum it up, you can use the following script to configure Patroni:
Setup Patroni
yum -y install centos-release-scl-rh
yum -y install llvm-toolset-7-clang gcc python3-devel postgresql15-devel.x86_64
pip3 install patroni
pip3 install python-etcd
mkdir /etc/patroni
mkdir -p /db/pgdata
mkdir -p /var/log/postgresql/
chown -R postgres:postgres /etc/patroni
chown postgres:postgres /db/pgdata
chown postgres:postgres /var/log/postgresql/
chown postgres:postgres /db
host=`hostname`
binexec=`find / -name "pg_config"`
bindir=`${binexec} --bindir`
ipaddr=`cat /etc/hosts | grep "${host}" | cut -d' ' -f1`
subnet=`echo ${ipaddr} | cut -d'.' -f1-3`
cat > /etc/patroni/stampede.yml <<EOF
scope: stampede
name: ${host}

restapi:
  listen: ${host}:8008
  connect_address: ${host}:8008

etcd:
  host: ${host}:2379

bootstrap:
  dcs:
    ttl: 30
    loop_wait: 10
    retry_timeout: 10
    maximum_lag_on_failover: 1048576
    postgresql:
      use_pg_rewind: true
      use_slots: true
      parameters:
        wal_level: logical
        wal_log_hints: "on"
  initdb:
  - encoding: UTF8
  - data-checksums
  pg_hba:
  - host replication rep_user ${ipaddr}/24 md5
  - host all all ${subnet}.0/24 md5
  users:
    admin:
      password: adminpass
      options:
        - createrole
        - createdb

postgresql:
  listen: ${host}:5432
  connect_address: ${host}:5432
  data_dir: /db/pgdata
  bin_dir: ${bindir}
  pgpass: /tmp/pgpass0
  authentication:
    replication:
      username: rep_user
      password: newpass
    superuser:
      username: postgres
      password: newpass
  parameters:
    unix_socket_directories: '/var/run/postgresql'
    external_pid_file: '/var/run/postgresql/15-main.pid'
    logging_collector: "on"
    log_directory: "/var/log/postgresql"
    log_filename: "postgresql-15-main.log"
    archive_command: 'test ! -f /db/wal_archive/%f && cp %p /db/wal_archive/%f'
    archive_mode: 'on'
EOF
chown -R postgres:postgres /etc/patroni
chown postgres:postgres /db/pgdata
chown -R postgres:postgres /var/log/postgresql/
chown postgres:postgres /db
chmod 600 /etc/patroni/stampede.yml
chmod 750 /db/pgdata
# Create the log file first, so the chown below has something to work on
touch /var/log/postgresql/patroni.log
chown postgres:postgres /var/log/postgresql/patroni.log
patroni /etc/patroni/stampede.yml &> /var/log/postgresql/patroni.log &
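As a side note, instead of backgrounding the patroni process like the script does, you would normally run it under systemd in a longer-lived setup. A minimal, untested unit sketch; the ExecStart path is an assumption, so check where pip actually put the binary with `which patroni`:

[Unit]
Description=Patroni PostgreSQL HA
After=network.target etcd.service

[Service]
Type=simple
User=postgres
# Assumed install path; adjust to the output of `which patroni`
ExecStart=/usr/local/bin/patroni /etc/patroni/stampede.yml
KillMode=process
Restart=on-failure

[Install]
WantedBy=multi-user.target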
Execute the setup script above on all servers in the cluster. You don't have to substitute anything; everything is self-generated (I like tweaking with bash). On the first server, you just need to create the replication user:
Create replication User
postgres=# create role rep_user with replication login password 'newpass';
CREATE ROLE
postgres=#
After that, you should have all 3 nodes working together:
Check up
-bash-4.2$ patronictl -c /etc/patroni/stampede.yml list stampede
+ Cluster: stampede --------+---------+----+-----------+
| Member | Host   | Role    | State   | TL | Lag in MB |
+--------+--------+---------+---------+----+-----------+
| node00 | node00 | Leader  | running |  2 |           |
| node01 | node01 | Replica | running |  2 |         0 |
| node02 | node02 | Replica | running |  2 |         0 |
+--------+--------+---------+---------+----+-----------+
-bash-4.2$
Now we can see that node “node00” is the Leader and the rest are Replicas.
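Besides patronictl, every node also reports its state over Patroni's REST API on port 8008, which is what load balancers such as HAProxy typically poll. A quick sketch (the exact JSON fields vary a bit between Patroni versions):

# Full node status as JSON:
curl -s http://node00:8008 | python3 -m json.tool
# The /leader endpoint returns HTTP 200 only on the current Leader, 503 elsewhere:
curl -s -o /dev/null -w "%{http_code}\n" http://node00:8008/leader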
Test failover
Let's test a failover. We will halt the first node:
Halt a node
[root@node00 wal]# halt
Don't forget that Patroni runs its HA loop every loop_wait seconds (10 in our configuration), and the Leader key in ETCD expires after the ttl (30 seconds here). So shortly after the halt it should see that we are Leaderless and promote a replica:
Auto-Promoting Replica
-bash-4.2$ patronictl -c /etc/patroni/stampede.yml list stampede
+ Cluster: stampede --------+---------+----+-----------+
| Member | Host   | Role    | State   | TL | Lag in MB |
+--------+--------+---------+---------+----+-----------+
| node00 | node00 | Leader  | stopped |    |           |
| node01 | node01 | Replica | running |  2 |         0 |
| node02 | node02 | Replica | running |  2 |         0 |
+--------+--------+---------+---------+----+-----------+
We see that we have a stopped server, so let's look again:
Auto-Promoting Replica P2
-bash-4.2$ patronictl -c /etc/patroni/stampede.yml list stampede
+ Cluster: stampede --------+---------+----+-----------+
| Member | Host   | Role    | State   | TL | Lag in MB |
+--------+--------+---------+---------+----+-----------+
| node00 | node00 | Replica | stopped |    |   unknown |
| node01 | node01 | Replica | running |  2 |         0 |
| node02 | node02 | Leader  | running |  2 |           |
+--------+--------+---------+---------+----+-----------+
Yeyyy, after some time, we have a new Leader. Off we go.
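Under the hood, this works through a leader key with a TTL in ETCD: the Leader keeps refreshing the key, and once it expires the replicas race to take it over. You can peek at Patroni's keys directly; a sketch, assuming the default /service namespace and the v2 API that python-etcd speaks:

# List all keys Patroni keeps for our cluster scope "stampede"
etcdctl ls /service/stampede
# Show which node currently holds the leader key
etcdctl get /service/stampede/leader

And if you ever want to move the Leader deliberately instead of pulling the plug, patronictl also offers a controlled switchover: patronictl -c /etc/patroni/stampede.yml switchover stampede.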
Management
Of course, as with every piece of software, we have to manage it. So let's see what issues I had. Well, I had many, but one of the most important ones was that node02 was ahead of node00 for some reason, so I had to reinitialize node02.
IT IS IMPORTANT TO NOTE THAT IF YOU UPDATE THE PATRONI YML FILE, YOU NEED TO RELOAD THE CLUSTER AND RESTART IT!!!
Reinitialize a node
Re-initializing is quite simple. It will pretty much do a pg_basebackup from the Leader and restore it on the node:
Reinitialize a node
-bash-4.2$ patronictl -c /etc/patroni/stampede.yml reinit stampede node02
+ Cluster: stampede --------+---------+----+-----------+
| Member | Host   | Role    | State   | TL | Lag in MB |
+--------+--------+---------+---------+----+-----------+
| node00 | node00 | Leader  | running |  2 |           |
| node01 | node01 | Replica | running |  2 |         0 |
| node02 | node02 | Replica | running |  2 |         0 |
+--------+--------+---------+---------+----+-----------+
Are you sure you want to reinitialize members node02? [y/N]: y
Success: reinitialize for member node02
-bash-4.2$
Restart Cluster
Restart the cluster
-bash-4.2$ patronictl -c /etc/patroni/stampede.yml restart stampede
+ Cluster: stampede (7337687292879936167) ---+-----------+
| Member | Host   | Role    | State     | TL | Lag in MB |
+--------+--------+---------+-----------+----+-----------+
| node00 | node00 | Leader  | running   |  1 |           |
| node01 | node01 | Replica | streaming |  1 |         0 |
| node02 | node02 | Replica | streaming |  1 |         0 |
+--------+--------+---------+-----------+----+-----------+
When should the restart take place (e.g. 2024-02-20T20:19)  [now]:
Are you sure you want to restart members node01, node02, node00? [y/N]: y
Restart if the PostgreSQL version is less than provided (e.g. 9.5.2)  []:
Success: restart on member node01
Success: restart on member node02
Success: restart on member node00
-bash-4.2$ patronictl -c /etc/patroni/stampede.yml list stampede
+ Cluster: stampede (7337687292879936167) ---+-----------+
| Member | Host   | Role    | State     | TL | Lag in MB |
+--------+--------+---------+-----------+----+-----------+
| node00 | node00 | Leader  | running   |  1 |           |
| node01 | node01 | Replica | streaming |  1 |         0 |
| node02 | node02 | Replica | streaming |  1 |         0 |
+--------+--------+---------+-----------+----+-----------+
Reload cluster
The reload is needed if you have changed anything in the configuration file of the resources, EXCEPT the DCS section:
Reload cluster
-bash-4.2$ patronictl -c /etc/patroni/stampede.yml reload stampede
+ Cluster: stampede (7337687292879936167) ---+-----------+
| Member | Host   | Role    | State     | TL | Lag in MB |
+--------+--------+---------+-----------+----+-----------+
| node00 | node00 | Leader  | running   |  1 |           |
| node01 | node01 | Replica | streaming |  1 |         0 |
| node02 | node02 | Replica | streaming |  1 |         0 |
+--------+--------+---------+-----------+----+-----------+
Are you sure you want to reload members node01, node02, node00? [y/N]: y
Reload request received for member node01 and will be processed within 10 seconds
Reload request received for member node02 and will be processed within 10 seconds
Reload request received for member node00 and will be processed within 10 seconds
-bash-4.2$
Update DCS
If you want to edit the DCS section, you have to EITHER re-initialize the server OR use the REST API:
Update DCS part
$ curl -s -XPATCH -d \
    '{"postgresql":{"parameters":{"archive_command":"test ! -f /db/wal_archive/%f && cp %p /db/wal_archive/%f"}}}' \
    http://localhost:8008/config
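Alternatively, patronictl can open the DCS configuration in your editor and apply the change for you:

-bash-4.2$ patronictl -c /etc/patroni/stampede.yml edit-config stampede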
Appendix
You can see the whole code to spin up that cluster with Vagrant below.
$allow_shell = <<SCRIPT
sed -i 's/PasswordAuthentication no/PasswordAuthentication yes/g' /etc/ssh/sshd_config
systemctl restart sshd.service
SCRIPT

$edit_hosts_file = <<SCRIPT
cat > /etc/hosts <<EOF
127.0.0.1      localhost
10.100.10.200  node00
10.100.10.201  node01
10.100.10.202  node02
EOF
SCRIPT

$install_etcd = <<SCRIPT
yum -y update
yum -y install python3-psycopg2 python3-pip PyYAML
yum -y install etcd
mkdir -p /db/etcd
# Install PostgreSQL:
sudo yum install -y https://download.postgresql.org/pub/repos/yum/reporpms/EL-7-x86_64/pgdg-redhat-repo-latest.noarch.rpm
sudo yum install -y postgresql15-server
chown postgres:postgres /db/etcd
# Modify etcd
host=`hostname`
cat > /etc/etcd/etcd.conf <<EOF
name: ${host}
data-dir: /db/etcd
initial-advertise-peer-urls: http://${host}:2380
listen-peer-urls: http://0.0.0.0:2380
listen-client-urls: http://0.0.0.0:2379
advertise-client-urls: http://${host}:2379
initial-cluster: "node00=http://node00:2380,node01=http://node01:2380,node02=http://node02:2380"
EOF
mkdir -p /var/log/postgresql
etcd --config-file /etc/etcd/etcd.conf &>/var/log/postgresql/etcd.log &
SCRIPT

$install_patroni = <<SCRIPT
yum -y install centos-release-scl-rh
yum -y install llvm-toolset-7-clang gcc python3-devel postgresql15-devel.x86_64
pip3 install patroni
pip3 install python-etcd
mkdir /etc/patroni
mkdir -p /db/pgdata
mkdir -p /var/log/postgresql/
chown -R postgres:postgres /etc/patroni
chown postgres:postgres /db/pgdata
chown postgres:postgres /var/log/postgresql/
chown postgres:postgres /db
host=`hostname`
binexec=`find / -name "pg_config"`
bindir=`${binexec} --bindir`
ipaddr=`cat /etc/hosts | grep "${host}" | cut -d' ' -f1`
subnet=`echo ${ipaddr} | cut -d'.' -f1-3`
cat > /etc/patroni/stampede.yml <<EOF
scope: stampede
name: ${host}

restapi:
  listen: ${host}:8008
  connect_address: ${host}:8008

etcd:
  host: ${host}:2379

bootstrap:
  dcs:
    ttl: 30
    loop_wait: 10
    retry_timeout: 10
    maximum_lag_on_failover: 1048576
    postgresql:
      use_pg_rewind: true
      use_slots: true
      parameters:
        wal_level: logical
        wal_log_hints: "on"
  initdb:
  - encoding: UTF8
  - data-checksums
  pg_hba:
  - host replication rep_user ${ipaddr}/24 md5
  - host all all ${subnet}.0/24 md5
  users:
    admin:
      password: adminpass
      options:
        - createrole
        - createdb

postgresql:
  listen: ${host}:5432
  connect_address: ${host}:5432
  data_dir: /db/pgdata
  bin_dir: ${bindir}
  pgpass: /tmp/pgpass0
  authentication:
    replication:
      username: rep_user
      password: newpass
    superuser:
      username: postgres
      password: newpass
  parameters:
    unix_socket_directories: '/var/run/postgresql'
    external_pid_file: '/var/run/postgresql/15-main.pid'
    logging_collector: "on"
    log_directory: "/var/log/postgresql"
    log_filename: "postgresql-15-main.log"
EOF
# Permissions on the config file can only be set once it exists
chmod 600 /etc/patroni/stampede.yml
chmod 750 /db/pgdata
touch /var/log/postgresql/patroni.log
chown postgres:postgres /var/log/postgresql/patroni.log
patroni /etc/patroni/stampede.yml &> /var/log/postgresql/patroni.log &
SCRIPT

$install_ansible = <<SCRIPT
sudo yum -y install epel-release
sudo yum -y install ansible
cat > /etc/ansible/hosts <<EOF
[patroni]
10.100.10.200
10.100.10.201
10.100.10.202

[patroni:vars]
ansible_user=root
ansible_password=vagrant
ansible_connection=ssh
EOF
sed -i 's/#host_key_checking = False/host_key_checking = False/g' /etc/ansible/ansible.cfg
SCRIPT

$disable_selinux = <<SCRIPT
setenforce 0
SCRIPT

# All Vagrant configuration is done below. The "2" in Vagrant.configure
# configures the configuration version (we support older styles for
# backwards compatibility). Please don't change it unless you know what
# you're doing.
Vagrant.configure("2") do |config|
  (0..2).each do |i|
    config.vm.define "node0#{i}" do |node|
      node.vm.box = "centos/7"
      node.vm.box_check_update = true
      node.vm.network :private_network, ip: "10.100.10.20#{i}"
      node.vm.hostname = "node0#{i}"
      node.vm.provision "shell", inline: $allow_shell, privileged: true
      node.vm.provision "shell", inline: $disable_selinux, privileged: true
      node.vm.provision "shell", inline: $install_ansible, privileged: true
      node.vm.provision "shell", inline: $edit_hosts_file, privileged: true
      node.vm.provision "shell", inline: $install_etcd, privileged: true
      node.vm.provision "shell", inline: $install_patroni, privileged: true
      node.vm.provider "virtualbox" do |vb|
        vb.name = "node0#{i}"
        vb.memory = "1024"
      end
    end
  end
end
That code will create a three-node cluster with everything set up.
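To bring it up, save the code above as a Vagrantfile and run:

vagrant up
# once provisioning finishes, log into any of the nodes:
vagrant ssh node00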