Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Both sides previous revision Previous revision
Next revision
Previous revision
postgresql_patroni [2023/08/31 20:36] andonovjpostgresql_patroni [2025/05/13 05:14] (current) andonovj
Line 100: Line 100:
 cluster is healthy cluster is healthy
 [root@node00 postgresql]# [root@node00 postgresql]#
 +</Code>
 +
 +If you uses API v3, you have to use the new command:
 +
 +<Code:bash| Check status>
 +[root@lpara etcd]# etcdctl --cluster=true endpoint health
 +http://lpara:2379 is healthy: successfully committed proposal: took = 5.287583ms
 +http://lparb:2379 is healthy: successfully committed proposal: took = 6.793576ms
 +[root@lpara etcd]#
 </Code> </Code>
  
Line 119: Line 128:
  
 So we have our key-value propagated among the cluster, nice.... So we have our key-value propagated among the cluster, nice....
 +
 +
 +We can also check the health of the cluster endpoints as follow:
 +
 +<Code:Bash|Check cluster endpoints>
 +[root@etcd02 ~]# etcdctl endpoint status --write-out=table --endpoints=http://etcd01:2379,http://etcd00:2379,http://etcd02:2379
 ++--------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
 +|      ENDPOINT      |        ID        | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
 ++--------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
 +| http://etcd01:2379 | 32552f25bf46ac89 |  3.5.21 |   20 kB |     false |      false |         2 |         12 |                 12 |        |
 +| http://etcd00:2379 | 5dd4414f709cd700 |  3.5.21 |   20 kB |      true |      false |         2 |         12 |                 12 |        |
 +| http://etcd02:2379 | e1e9e5d83c377f01 |  3.5.21 |   20 kB |     false |      false |         2 |         12 |                 12 |        |
 ++--------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
 +[root@etcd02 ~]# 
 +
 +</Code>
 +
  
 =====Setup Patroni===== =====Setup Patroni=====
Line 138: Line 164:
 chown -R postgres:postgres /etc/patroni chown -R postgres:postgres /etc/patroni
 chown postgres:postgres /db/pgdata chown postgres:postgres /db/pgdata
-chmod 600 /etc/patroni/stampede.yml 
 chown postgres:postgres /var/log/postgresql/ chown postgres:postgres /var/log/postgresql/
 chown postgres:postgres /db chown postgres:postgres /db
Line 206: Line 231:
     log_directory: "/var/log/postgresql"      log_directory: "/var/log/postgresql" 
     log_filename: "postgresql-15-main.log"      log_filename: "postgresql-15-main.log" 
 +    archive_command: 'test ! -f /db/wal_archive/%f && cp %p /db/wal_archive/%f'
 +    archive_mode: 'on'
 EOF EOF
  
 +chown -R postgres:postgres /etc/patroni
 +chown postgres:postgres /db/pgdata
 +chown -R postgres:postgres /var/log/postgresql/
 +chown postgres:postgres /db
 +chmod 600 /etc/patroni/stampede.yml
 chmod 750 /db/pgdata chmod 750 /db/pgdata
 chown postgres:postgres /var/log/postgresql/patroni.log chown postgres:postgres /var/log/postgresql/patroni.log
 +
 patroni /etc/patroni/stampede.yml &> /var/log/postgresql/patroni.log & patroni /etc/patroni/stampede.yml &> /var/log/postgresql/patroni.log &
 </Code> </Code>
Line 277: Line 310:
 Well, I had many issues, but one of the most important ones was that Node02 was ahead of the Node00 for some reason, so I had to reinitialize Node02: Well, I had many issues, but one of the most important ones was that Node02 was ahead of the Node00 for some reason, so I had to reinitialize Node02:
  
 +**IT IS IMPORTANT TO NOTE THAT, IF YOU UPDATE THE PATRONI YML FILE, YOU NEED TO RELOAD THE CLUSTER AND RESTART IT!!!**
  
 ====Reinitialize a node==== ====Reinitialize a node====
Line 296: Line 330:
 </Code> </Code>
  
-That was necessary because, even though, node02 was "running" it wasn't in sync with the leader. After the re-initialization, all is fixed. 
  
 +
 +====Restart Cluster====
 +<Code:bash|Restart the cluster>
 +-bash-4.2$ patronictl -c /etc/patroni/stampede.yml restart stampede
 ++ Cluster: stampede (7337687292879936167) ---+-----------+
 +| Member | Host   | Role    | State     | TL | Lag in MB |
 ++--------+--------+---------+-----------+----+-----------+
 +| node00 | node00 | Leader  | running    1 |           |
 +| node01 | node01 | Replica | streaming |  1 |         0 |
 +| node02 | node02 | Replica | streaming |  1 |         0 |
 ++--------+--------+---------+-----------+----+-----------+
 +When should the restart take place (e.g. 2024-02-20T20:19)  [now]:
 +Are you sure you want to restart members node01, node02, node00? [y/N]: y
 +Restart if the PostgreSQL version is less than provided (e.g. 9.5.2)  []:
 +Success: restart on member node01
 +Success: restart on member node02
 +Success: restart on member node00
 +-bash-4.2$ patronictl -c /etc/patroni/stampede.yml list stampede
 ++ Cluster: stampede (7337687292879936167) ---+-----------+
 +| Member | Host   | Role    | State     | TL | Lag in MB |
 ++--------+--------+---------+-----------+----+-----------+
 +| node00 | node00 | Leader  | running    1 |           |
 +| node01 | node01 | Replica | streaming |  1 |         0 |
 +| node02 | node02 | Replica | streaming |  1 |         0 |
 ++--------+--------+---------+-----------+----+-----------+
 +</Code>
 +
 +====Reload cluster====
 +The reload is needed if you have changed anything in the configuration file of the resources, EXCEPT the DCS
 +
 +<Code:bash|Reload cluster>
 +-bash-4.2$ patronictl -c /etc/patroni/stampede.yml reload stampede
 ++ Cluster: stampede (7337687292879936167) ---+-----------+
 +| Member | Host   | Role    | State     | TL | Lag in MB |
 ++--------+--------+---------+-----------+----+-----------+
 +| node00 | node00 | Leader  | running    1 |           |
 +| node01 | node01 | Replica | streaming |  1 |         0 |
 +| node02 | node02 | Replica | streaming |  1 |         0 |
 ++--------+--------+---------+-----------+----+-----------+
 +Are you sure you want to reload members node01, node02, node00? [y/N]: y
 +Reload request received for member node01 and will be processed within 10 seconds
 +Reload request received for member node02 and will be processed within 10 seconds
 +Reload request received for member node00 and will be processed within 10 seconds
 +-bash-4.2$
 +</Code>
 +
 +====Update DCS====
 +If you want to edit the DCS, you have to EITHER re-initialize the server OR use the rest API:
 +
 +
 +<Code:bash|Update DCS part>
 +$ curl -s -XPATCH -d \
 +        '{"postgresql":{"parameters":{"archive_command":"test ! -f /db/wal_archive/%f && cp %p /db/wal_archive/%f"}}}' \
 +        http://localhost:8008/config
 +</Code>
 =====Appendix===== =====Appendix=====
 The whole code to run you that cluster using vagrant, you can see below. The whole code to run you that cluster using vagrant, you can see below.
Line 485: Line 573:
  
 That code will create 3 node cluster with everything. That code will create 3 node cluster with everything.
 +
 +
 +=====Backup=====
 +Below you can see a method to backup, encrypt and upload a backup to OCI. The backup uses the replica for a backup source:
 +
 +<Code:bash | Backup >
 +#!/bin/bash
 +
 +switch_logfiles(){
 +    v_leader=$1
 +
 +    while :
 +    do
 +        echo "Switch logfile"
 +        psql -Ubarman -h ${v_leader} -c "select pg_switch_wal()" -d datamine
 +        sleep 10
 +    done
 +}
 +
 +backup() {
 +
 +environment=$1
 +echo "Backup start"
 +
 +echo "Obtaining the first replica in the cluster"
 +REPLICA_HOST=postgres-${environment}
 +
 +v_first_replica=$(curl -sk https://${REPLICA_HOST}:8008/cluster | jq -r '.members[] | select(.role == "replica") | .host' | head -n 1)
 +v_leader=$(curl -sk https://${REPLICA_HOST}:8008/cluster | jq -r '.members[] | select(.role == "leader") | .host')
 +
 +echo "Replacing placeholder"
 +sed -i "s/REPLICA_IP_PLACEHOLDER/${v_first_replica}/g" /etc/barman.d/${environment}/streaming-server.conf
 +
 +echo "Creating replication slot"
 +barman -c /etc/barman_${environment}.conf receive-wal --create-slot ${environment}
 +
 +echo "Starting cron and receive-wal"
 +barman -c /etc/barman_${environment}.conf cron
 +
 +echo "Backup process start"
 +
 +switch_logfiles ${v_leader} &
 +
 +barman -c /etc/barman_${environment}.conf backup ${environment} --wait
 +
 +kill $!
 +
 +echo "Return placeholder"
 +sed -i "s/${v_first_replica}/REPLICA_IP_PLACEHOLDER/g" /etc/barman.d/${environment}/streaming-server.conf
 +}
 +
 +clean_old_backups(){
 +    find /var/log/mbackups -depth -mtime +7 -type f -print
 +}
 +
 +encrypt_backup(){
 +        date
 +        environment=$1
 +        backup_dir=$(ls -rtd /data/backup/${environment}/base/* | tail -1)
 +        cd $(dirname ${backup_dir})
 +        backup_file=$(echo ${backup_dir} | cut -d '/' -f 6)
 +        enc_key=$(cat /var/lib/barman/enckey.enc | base64 -d)
 +        tar zcvf - ${backup_file}/ | openssl enc -e -pbkdf2 -aes-256-ctr -k ${enc_key} -out ${backup_file}.tar.gz.enc
 +        date
 +}
 +
 +copy_files_to_os(){
 +        date
 +        environment=$1
 +        echo "Set BUCKET_NAME and OCI config file variable"
 +        case ${environment} in
 +            "dev")
 +                BUCKET_NAME=postgres-dev-backup
 +                OCI_CONFIG_FILE=/var/lib/barman/ocivault/dev.conf
 +            ;;
 +            "stg")
 +                BUCKET_NAME=postgres-stg-backup
 +                OCI_CONFIG_FILE=/var/lib/barman/ocivault/stg.conf
 +            ;;
 +            "prod")
 +                BUCKET_NAME=postgres-prod-backup
 +                OCI_CONFIG_FILE=/var/lib/barman/ocivault/prod.conf
 +            ;;
 +            *)
 +        esac
 +
 +        export BUCKET_NAME
 +        export OCI_CONFIG_FILE
 +
 +        encrypted_file=$(ls -rtd /data/backup/${environment}/base/* | grep "tar.gz.enc" | tail -1)
 +        oci os object put --bucket-name ${BUCKET_NAME} --file ${encrypted_file}
 +        rm -f ${encrypted_file}
 +        date
 +}
 +
 +backup $1
 +encrypt_backup $1
 +copy_files_to_os $1
 +
 +</Code>
  • postgresql_patroni.1693514198.txt.gz
  • Last modified: 2023/08/31 20:36
  • by andonovj