Differences

This shows you the differences between two versions of the page.

--- postgresql_patroni [2023/08/31 19:45] – andonovj
+++ postgresql_patroni [2025/05/13 05:14] (current) – andonovj
@@ Line 100: / Line 100: @@
 cluster is healthy
 [root@node00 postgresql]#
+</Code>
+If you uses API v3, you have to use the new command:
+<Code:bash| Check status>
+[root@lpara etcd]# etcdctl --cluster=true endpoint health
+http://lpara:2379 is healthy: successfully committed proposal: took = 5.287583ms
+http://lparb:2379 is healthy: successfully committed proposal: took = 6.793576ms
+[root@lpara etcd]#
 </Code>
@@ Line 119: / Line 128: @@
 So we have our key-value propagated among the cluster, nice....
+We can also check the health of the cluster endpoints as follow:
+<Code:Bash|Check cluster endpoints>
+[root@etcd02 ~]# etcdctl endpoint status --write-out=table --endpoints=http://etcd01:2379,http://etcd00:2379,http://etcd02:2379
++--------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
+|      ENDPOINT      |        ID        | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
++--------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
+| http://etcd01:2379 | 32552f25bf46ac89 |  3.5.21 |   20 kB |     false |      false |         2 |         12 |                 12 |        |
+| http://etcd00:2379 | 5dd4414f709cd700 |  3.5.21 |   20 kB |      true |      false |         2 |         12 |                 12 |        |
+| http://etcd02:2379 | e1e9e5d83c377f01 |  3.5.21 |   20 kB |     false |      false |         2 |         12 |                 12 |        |
++--------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
+[root@etcd02 ~]#
+</Code>
 =====Setup Patroni=====
@@ Line 138: / Line 164: @@
 chown -R postgres:postgres /etc/patroni
 chown postgres:postgres /db/pgdata
-chmod 600 /etc/patroni/stampede.yml
 chown postgres:postgres /var/log/postgresql/
 chown postgres:postgres /db
@@ Line 206: / Line 231: @@
     log_directory: "/var/log/postgresql"
     log_filename: "postgresql-15-main.log"
+    archive_command: 'test ! -f /db/wal_archive/%f && cp %p /db/wal_archive/%f'
+    archive_mode: 'on'
 EOF
+chown -R postgres:postgres /etc/patroni
+chown postgres:postgres /db/pgdata
+chown -R postgres:postgres /var/log/postgresql/
+chown postgres:postgres /db
+chmod 600 /etc/patroni/stampede.yml
+chmod 750 /db/pgdata
 chown postgres:postgres /var/log/postgresql/patroni.log
 patroni /etc/patroni/stampede.yml &> /var/log/postgresql/patroni.log &
 </Code>
@@ Line 276: / Line 310: @@
 Well, I had many issues, but one of the most important ones was that Node02 was ahead of the Node00 for some reason, so I had to reinitialize Node02:
+**IT IS IMPORTANT TO NOTE THAT, IF YOU UPDATE THE PATRONI YML FILE, YOU NEED TO RELOAD THE CLUSTER AND RESTART IT!!!**
 ====Reinitialize a node====
@@ Line 295: / Line 330: @@
 </Code>
-That was necessary because, even though, node02 was "running" it wasn't in sync with the leader. After the re-initialization, all is fixed.
+====Restart Cluster====
+<Code:bash|Restart the cluster>
+-bash-4.2$ patronictl -c /etc/patroni/stampede.yml restart stampede
++ Cluster: stampede (7337687292879936167) ---+-----------+
+| Member | Host   | Role    | State     | TL | Lag in MB |
++--------+--------+---------+-----------+----+-----------+
+| node00 | node00 | Leader  | running   |  1 |           |
+| node01 | node01 | Replica | streaming |  1 |         0 |
+| node02 | node02 | Replica | streaming |  1 |         0 |
++--------+--------+---------+-----------+----+-----------+
+When should the restart take place (e.g. 2024-02-20T20:19)  [now]:
+Are you sure you want to restart members node01, node02, node00? [y/N]: y
+Restart if the PostgreSQL version is less than provided (e.g. 9.5.2)  []:
+Success: restart on member node01
+Success: restart on member node02
+Success: restart on member node00
+-bash-4.2$ patronictl -c /etc/patroni/stampede.yml list stampede
++ Cluster: stampede (7337687292879936167) ---+-----------+
+| Member | Host   | Role    | State     | TL | Lag in MB |
++--------+--------+---------+-----------+----+-----------+
+| node00 | node00 | Leader  | running   |  1 |           |
+| node01 | node01 | Replica | streaming |  1 |         0 |
+| node02 | node02 | Replica | streaming |  1 |         0 |
++--------+--------+---------+-----------+----+-----------+
+</Code>
+====Reload cluster====
+The reload is needed if you have changed anything in the configuration file of the resources, EXCEPT the DCS
+<Code:bash|Reload cluster>
+-bash-4.2$ patronictl -c /etc/patroni/stampede.yml reload stampede
++ Cluster: stampede (7337687292879936167) ---+-----------+
+| Member | Host   | Role    | State     | TL | Lag in MB |
++--------+--------+---------+-----------+----+-----------+
+| node00 | node00 | Leader  | running   |  1 |           |
+| node01 | node01 | Replica | streaming |  1 |         0 |
+| node02 | node02 | Replica | streaming |  1 |         0 |
++--------+--------+---------+-----------+----+-----------+
+Are you sure you want to reload members node01, node02, node00? [y/N]: y
+Reload request received for member node01 and will be processed within 10 seconds
+Reload request received for member node02 and will be processed within 10 seconds
+Reload request received for member node00 and will be processed within 10 seconds
+-bash-4.2$
+</Code>
+====Update DCS====
+If you want to edit the DCS, you have to EITHER re-initialize the server OR use the rest API:
+<Code:bash|Update DCS part>
+$ curl -s -XPATCH -d \
+        '{"postgresql":{"parameters":{"archive_command":"test ! -f /db/wal_archive/%f && cp %p /db/wal_archive/%f"}}}' \
+        http://localhost:8008/config
+</Code>
+=====Appendix=====
+The whole code to run you that cluster using vagrant, you can see below.
+<Code:Bash>
+$allow_shell = <<SCRIPT
+sed -i 's/PasswordAuthentication no/PasswordAuthentication yes/g' /etc/ssh/sshd_config
+systemctl restart sshd.service
+SCRIPT
+$edit_hosts_file = <<SCRIPT
+cat > /etc/hosts <<EOF
+.0.0.1 localhost
+.100.10.200 node00
+.100.10.201 node01
+.100.10.202 node02
+EOF
+SCRIPT
+$install_etcd = <<SCRIPT
+yum -y update
+yum -y install python3-psycopg2 python3-pip PyYAML
+yum -y install etcd
+mkdir -p /db/etcd
+# Install PostgreSQL:
+sudo yum install -y https://download.postgresql.org/pub/repos/yum/reporpms/EL-7-x86_64/pgdg-redhat-repo-latest.noarch.rpm
+sudo yum install -y postgresql15-server
+chown postgres:postgres /db/etcd
+#Modify etcd
+host=`hostname`
+cat > /etc/etcd/etcd.conf <<EOF
+name: ${host}
+data-dir: /db/etcd
+initial-advertise-peer-urls: http://${host}:2380
+listen-peer-urls: http://0.0.0.0:2380
+listen-client-urls: http://0.0.0.0:2379
+advertise-client-urls: http://${host}:2379
+initial-cluster: "node00=http://node00:2380,node01=http://node01:2380,node02=http://node02:2380"
+EOF
+mkdir -p /var/log/postgresql
+etcd --config-file /etc/etcd/etcd.conf &>/var/log/postgresql/etcd.log &
+SCRIPT
+$install_patroni = <<SCRIPT
+yum -y install centos-release-scl-rh
+yum -y install llvm-toolset-7-clang gcc python3-devel postgresql15-devel.x86_64
+pip3 install patroni
+pip3 install python-etcd
+mkdir /etc/patroni
+mkdir -p /db/pgdata
+mkdir -p /var/log/postgresql/
+chown -R postgres:postgres /etc/patroni
+chown postgres:postgres /db/pgdata
+chmod 600 /etc/patroni/stampede.yml
+chown postgres:postgres /var/log/postgresql/
+chown postgres:postgres /db
+host=`hostname`
+binexec=`find / -name "pg_config"`
+bindir=`${binexec} --bindir`
+ipaddr=`cat /etc/hosts | grep "${host}" | cut -d' ' -f1`
+subnet=`echo ${ipaddr} | cut -d'.' -f1-3`
+cat > /etc/patroni/stampede.yml <<EOF
+scope: stampede
+name: ${host}
+restapi:
+  listen: ${host}:8008
+  connect_address: ${host}:8008
+etcd:
+  host: ${host}:2379
+bootstrap:
+  dcs:
+    ttl: 30
+    loop_wait: 10
+    retry_timeout: 10
+    maximum_lag_on_failover: 1048576
+    postgresql:
+      use_pg_rewind: true
+      use_slots: true
+      parameters:
+        wal_level: logical
+        wal_log_hints: "on"
+  initdb:
+  - encoding: UTF8
+  - data-checksums
+  pg_hba:
+  - host replication rep_user ${ipaddr}/24 md5
+  - host all all ${subnet}.0/24 md5
+  users:
+    admin:
+      password: adminpass
+      options:
+        - createrole
+        - createdb
+postgresql:
+  listen: ${host}:5432
+  connect_address: ${host}:5432
+  data_dir: /db/pgdata
+  bin_dir: ${bindir}
+  pgpass: /tmp/pgpass0
+  authentication:
+    replication:
+      username: rep_user
+      password: newpass
+    superuser:
+      username: postgres
+      password: newpass
+  parameters:
+    unix_socket_directories: '/var/run/postgresql'
+    external_pid_file: '/var/run/postgresql/15-main.pid'
+    logging_collector: "on"
+    log_directory: "/var/log/postgresql"
+    log_filename: "postgresql-15-main.log"
+EOF
+chmod 750 /db/pgdata
+chown postgres:postgres /var/log/postgresql/patroni.log
+patroni /etc/patroni/stampede.yml &> /var/log/postgresql/patroni.log &
+SCRIPT
+$install_ansible = <<SCRIPT
+sudo yum -y install epel-release
+sudo yum -y install ansible
+cat > /etc/ansible/hosts <<EOF
+[patroni]
+.100.10.200
+.100.10.201
+.100.10.202
+[patroni:vars]
+ansible_user=root
+ansible_password=vagrant
+ansible_connection=ssh
+EOF
+sed -i 's/#host_key_checking = False/host_key_checking = False/g' /etc/ansible/ansible.cfg
+SCRIPT
+$disable_selinux = <<SCRIPT
+setenforce 0
+SCRIPT
+# All Vagrant configuration is done below. The "2" in Vagrant.configure
+# configures the configuration version (we support older styles for
+# backwards compatibility). Please don't change it unless you know what
+# you're doing.
+Vagrant.configure("2") do |config|
+	(0..2).each do |i|
+    config.vm.define "node0#{i}" do |node|
+      node.vm.box = "centos/7"
+      node.vm.box_check_update = true
+      node.vm.network :private_network, ip: "10.100.10.20#{i}"
+      node.vm.hostname = "node0#{i}"
+	  node.vm.provision "shell", inline: $allow_shell, privileged: true
+	  node.vm.provision "shell", inline: $disable_selinux, privileged: true
+	  node.vm.provision "shell", inline: $install_ansible, privileged: true
+	  node.vm.provision "shell", inline: $edit_hosts_file, privileged: true
+	  node.vm.provision "shell", inline: $install_etcd, privileged: true
+    node.vm.provision "shell", inline: $install_patroni, privileged: true
+      node.vm.provider "virtualbox" do |vb|
+        vb.name = "node0#{i}"
+        vb.memory = "1024"
+      end
+    end
+  end
+end
+</Code>
+That code will create 3 node cluster with everything.
+=====Backup=====
+Below you can see a method to backup, encrypt and upload a backup to OCI. The backup uses the replica for a backup source:
+<Code:bash | Backup >
+#!/bin/bash
+switch_logfiles(){
+    v_leader=$1
+    while :
+    do
+        echo "Switch logfile"
+        psql -Ubarman -h ${v_leader} -c "select pg_switch_wal()" -d datamine
+        sleep 10
+    done
+}
+backup() {
+environment=$1
+echo "Backup start"
+echo "Obtaining the first replica in the cluster"
+REPLICA_HOST=postgres-${environment}
+v_first_replica=$(curl -sk https://${REPLICA_HOST}:8008/cluster | jq -r '.members[] | select(.role == "replica") | .host' | head -n 1)
+v_leader=$(curl -sk https://${REPLICA_HOST}:8008/cluster | jq -r '.members[] | select(.role == "leader") | .host')
+echo "Replacing placeholder"
+sed -i "s/REPLICA_IP_PLACEHOLDER/${v_first_replica}/g" /etc/barman.d/${environment}/streaming-server.conf
+echo "Creating replication slot"
+barman -c /etc/barman_${environment}.conf receive-wal --create-slot ${environment}
+echo "Starting cron and receive-wal"
+barman -c /etc/barman_${environment}.conf cron
+echo "Backup process start"
+switch_logfiles ${v_leader} &
+barman -c /etc/barman_${environment}.conf backup ${environment} --wait
+kill $!
+echo "Return placeholder"
+sed -i "s/${v_first_replica}/REPLICA_IP_PLACEHOLDER/g" /etc/barman.d/${environment}/streaming-server.conf
+}
+clean_old_backups(){
+    find /var/log/mbackups -depth -mtime +7 -type f -print
+}
+encrypt_backup(){
+        date
+        environment=$1
+        backup_dir=$(ls -rtd /data/backup/${environment}/base/* | tail -1)
+        cd $(dirname ${backup_dir})
+        backup_file=$(echo ${backup_dir} | cut -d '/' -f 6)
+        enc_key=$(cat /var/lib/barman/enckey.enc | base64 -d)
+        tar zcvf - ${backup_file}/ | openssl enc -e -pbkdf2 -aes-256-ctr -k ${enc_key} -out ${backup_file}.tar.gz.enc
+        date
+}
+copy_files_to_os(){
+        date
+        environment=$1
+        echo "Set BUCKET_NAME and OCI config file variable"
+        case ${environment} in
+            "dev")
+                BUCKET_NAME=postgres-dev-backup
+                OCI_CONFIG_FILE=/var/lib/barman/ocivault/dev.conf
+            ;;
+            "stg")
+                BUCKET_NAME=postgres-stg-backup
+                OCI_CONFIG_FILE=/var/lib/barman/ocivault/stg.conf
+            ;;
+            "prod")
+                BUCKET_NAME=postgres-prod-backup
+                OCI_CONFIG_FILE=/var/lib/barman/ocivault/prod.conf
+            ;;
+            *)
+        esac
+        export BUCKET_NAME
+        export OCI_CONFIG_FILE
+        encrypted_file=$(ls -rtd /data/backup/${environment}/base/* | grep "tar.gz.enc" | tail -1)
+        oci os object put --bucket-name ${BUCKET_NAME} --file ${encrypted_file}
+        rm -f ${encrypted_file}
+        date
+}
+backup $1
+encrypt_backup $1
+copy_files_to_os $1
+</Code>