=====Overview===== Patroni and ETCD provide a great combination for a cluster solution for a PostgreSQL. In a nutshell: Patroni managed the cluster while the ETCD allows configuration data to be stored. The diagram of how they interact with PostgreSQL you can see below: {{ :patroni_etcd_overview.png?400 |}} So let's get started with the configuration of ETCD first. To configure ETCD, we need to: - Install the package - Configure the cluster Both tasks are fairly simple: =====Setup ETCD===== Let's assume we have three nodes (as the picture above) * node00 (10.100.10.200) * node01 (10.100.10.201) * node02 (10.100.10.202) I like starting from "0" I know :-) To install the package, simply use the package provider: yum, apt-get, etc: ====Install RPM==== node02:$ yum -y install etcd node02: --> Running transaction check node02: ---> Package etcd.x86_64 0:3.3.11-2.el7.centos will be installed node02: --> Finished Dependency Resolution node02: node02: Dependencies Resolved node02: node02: ================================================================================ node02: Package Arch Version Repository Size node02: ================================================================================ node02: Installing: node02: etcd x86_64 3.3.11-2.el7.centos extras 10 M node02: node02: Transaction Summary node02: ================================================================================ node02: Install 1 Package node02: node02: Total download size: 10 M node02: Installed size: 45 M node02: Downloading packages: node02: Running transaction check node02: Running transaction test node02: Transaction test succeeded node02: Running transaction node02: Installing : etcd-3.3.11-2.el7.centos.x86_64 1/1 node02: Verifying : etcd-3.3.11-2.el7.centos.x86_64 1/1 node02: node02: Installed: node02: etcd.x86_64 0:3.3.11-2.el7.centos node02: node02: Complete! You need to do that all nodes ====Configure the cluster==== To configure the cluster, use the following script: mkdir -p /db/etcd host=`hostname` cat > /etc/etcd/etcd.conf < Again, you need to run that on all 3 nodes, after that we can start the ETCD ====Start the ETCD=== [root@node00 ~]# etcd --config-file /etc/etcd/etcd.conf &>/var/log/postgresql/etcd.log & [1] 5536 [root@node00 ~]# Again, on all nodes. We can after that check the status of the cluster: ===Check the status=== [root@node00 postgresql]# etcdctl cluster-health member 4fd4358d80b0e45e is healthy: got healthy result from http://node00:2379 member 825e916a3f48ca3f is healthy: got healthy result from http://node01:2379 member ce1dafd3ad7d9971 is healthy: got healthy result from http://node02:2379 cluster is healthy [root@node00 postgresql]# If you uses API v3, you have to use the new command: [root@lpara etcd]# etcdctl --cluster=true endpoint health http://lpara:2379 is healthy: successfully committed proposal: took = 5.287583ms http://lparb:2379 is healthy: successfully committed proposal: took = 6.793576ms [root@lpara etcd]# You can also "TEST" the server as publishing messages (key-values) #On 2nd node: [root@node01 ~]# etcdctl put ha-cookbook-2 "Hello World" OK [root@node01 ~]# #On 1st node: [root@node00 etcd]# etcdctl get ha-cookbook-2 ha-cookbook-2 Hello World [root@node00 etcd]# So we have our key-value propagated among the cluster, nice.... We can also check the health of the cluster endpoints as follow: [root@etcd02 ~]# etcdctl endpoint status --write-out=table --endpoints=http://etcd01:2379,http://etcd00:2379,http://etcd02:2379 +--------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+ | ENDPOINT | ID | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS | +--------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+ | http://etcd01:2379 | 32552f25bf46ac89 | 3.5.21 | 20 kB | false | false | 2 | 12 | 12 | | | http://etcd00:2379 | 5dd4414f709cd700 | 3.5.21 | 20 kB | true | false | 2 | 12 | 12 | | | http://etcd02:2379 | e1e9e5d83c377f01 | 3.5.21 | 20 kB | false | false | 2 | 12 | 12 | | +--------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+ [root@etcd02 ~]# =====Setup Patroni===== The Patroni is really the brain behind the cluster management and the coordination with the distributed key store (in our case ETCD, although other solutions are possible as well: Consul, K8S, etc) Patroni is a fork of Governor and its job is to manage the cluster. With patroni you can also reinitialize a node, as we will see below. To sum it up, the configuration of Patroni, you can use the following script: yum -y install centos-release-scl-rh yum -y install llvm-toolset-7-clang gcc python3-devel postgresql15-devel.x86_64 pip3 install patroni pip3 install python-etcd mkdir /etc/patroni mkdir -p /db/pgdata mkdir -p /var/log/postgresql/ chown -R postgres:postgres /etc/patroni chown postgres:postgres /db/pgdata chown postgres:postgres /var/log/postgresql/ chown postgres:postgres /db host=`hostname` binexec=`find / -name "pg_config"` bindir=`${binexec} --bindir` ipaddr=`cat /etc/hosts | grep "${host}" | cut -d' ' -f1` subnet=`echo ${ipaddr} | cut -d'.' -f1-3` cat > /etc/patroni/stampede.yml < /var/log/postgresql/patroni.log & Execute that command on all servers in the cluster. You don't have to substitute anything. Everything is self generated (I like twerking with bash). On the first server, you just need to create the replication user: postgres=# create role rep_user with replication login password 'newpass'; CREATE ROLE postgres=# After that, you should have all 3 nodes working together: -bash-4.2$ patronictl -c /etc/patroni/stampede.yml list stampede + Cluster: stampede --------+---------+----+-----------+ | Member | Host | Role | State | TL | Lag in MB | +--------+--------+---------+---------+----+-----------+ | node00 | node00 | Leader | running | 2 | | | node01 | node01 | Replica | running | 2 | 0 | | node02 | node02 | Replica | running | 2 | 0 | +--------+--------+---------+---------+----+-----------+ -bash-4.2$ Now, we can see that the node: "node00" is the leader and the rest are Replicas. =====Test failover===== Let's test a failover. We will halt the first node: [root@node00 wal]# halt Don't forget, that Patroni is checking if there is a Leader node every 5 seconds or so. So it should see that we are Leaderless and promote a replica: -bash-4.2$ patronictl -c /etc/patroni/stampede.yml list stampede + Cluster: stampede --------+---------+----+-----------+ | Member | Host | Role | State | TL | Lag in MB | +--------+--------+---------+---------+----+-----------+ | node00 | node00 | Leader | stopped | | | | node01 | node01 | Replica | running | 2 | 0 | | node02 | node02 | Replica | running | 2 | 0 | +--------+--------+---------+---------+----+-----------+ We see that we have stopped server, so let's see again: -bash-4.2$ patronictl -c /etc/patroni/stampede.yml list stampede + Cluster: stampede --------+---------+----+-----------+ | Member | Host | Role | State | TL | Lag in MB | +--------+--------+---------+---------+----+-----------+ | node00 | node00 | Replica | stopped | | unknown | | node01 | node01 | Replica | running | 2 | 0 | | node02 | node02 | Leader | running | 2 | | +--------+--------+---------+---------+----+-----------+ Yeyyy, after some time, we have a new Leader. Off we go. =====Management===== Of course, with every software, we have to manage it. So let's what issues I had. Well, I had many issues, but one of the most important ones was that Node02 was ahead of the Node00 for some reason, so I had to reinitialize Node02: **IT IS IMPORTANT TO NOTE THAT, IF YOU UPDATE THE PATRONI YML FILE, YOU NEED TO RELOAD THE CLUSTER AND RESTART IT!!!** ====Reinitialize a node==== Re-initializing is quite simple. It will pretty much do a pg_basebackup and restore it on the node: -bash-4.2$ patronictl -c /etc/patroni/stampede.yml reinit stampede node02 + Cluster: stampede --------+---------+----+-----------+ | Member | Host | Role | State | TL | Lag in MB | +--------+--------+---------+---------+----+-----------+ | node00 | node00 | Leader | running | 2 | | | node01 | node01 | Replica | running | 2 | 0 | | node02 | node02 | Replica | running | 2 | 0 | +--------+--------+---------+---------+----+-----------+ Are you sure you want to reinitialize members node02? [y/N]: y Success: reinitialize for member node02 -bash-4.2$ ====Restart Cluster==== -bash-4.2$ patronictl -c /etc/patroni/stampede.yml restart stampede + Cluster: stampede (7337687292879936167) ---+-----------+ | Member | Host | Role | State | TL | Lag in MB | +--------+--------+---------+-----------+----+-----------+ | node00 | node00 | Leader | running | 1 | | | node01 | node01 | Replica | streaming | 1 | 0 | | node02 | node02 | Replica | streaming | 1 | 0 | +--------+--------+---------+-----------+----+-----------+ When should the restart take place (e.g. 2024-02-20T20:19) [now]: Are you sure you want to restart members node01, node02, node00? [y/N]: y Restart if the PostgreSQL version is less than provided (e.g. 9.5.2) []: Success: restart on member node01 Success: restart on member node02 Success: restart on member node00 -bash-4.2$ patronictl -c /etc/patroni/stampede.yml list stampede + Cluster: stampede (7337687292879936167) ---+-----------+ | Member | Host | Role | State | TL | Lag in MB | +--------+--------+---------+-----------+----+-----------+ | node00 | node00 | Leader | running | 1 | | | node01 | node01 | Replica | streaming | 1 | 0 | | node02 | node02 | Replica | streaming | 1 | 0 | +--------+--------+---------+-----------+----+-----------+ ====Reload cluster==== The reload is needed if you have changed anything in the configuration file of the resources, EXCEPT the DCS -bash-4.2$ patronictl -c /etc/patroni/stampede.yml reload stampede + Cluster: stampede (7337687292879936167) ---+-----------+ | Member | Host | Role | State | TL | Lag in MB | +--------+--------+---------+-----------+----+-----------+ | node00 | node00 | Leader | running | 1 | | | node01 | node01 | Replica | streaming | 1 | 0 | | node02 | node02 | Replica | streaming | 1 | 0 | +--------+--------+---------+-----------+----+-----------+ Are you sure you want to reload members node01, node02, node00? [y/N]: y Reload request received for member node01 and will be processed within 10 seconds Reload request received for member node02 and will be processed within 10 seconds Reload request received for member node00 and will be processed within 10 seconds -bash-4.2$ ====Update DCS==== If you want to edit the DCS, you have to EITHER re-initialize the server OR use the rest API: $ curl -s -XPATCH -d \ '{"postgresql":{"parameters":{"archive_command":"test ! -f /db/wal_archive/%f && cp %p /db/wal_archive/%f"}}}' \ http://localhost:8008/config =====Appendix===== The whole code to run you that cluster using vagrant, you can see below. $allow_shell = <