postgresql_patroni

cluster is healthy
[root@node00 postgresql]#
</Code>

If you use API v3, you have to use the new command:

<Code:bash| Check status>
[root@lpara etcd]# etcdctl --cluster=true endpoint health
http://lpara:2379 is healthy: successfully committed proposal: took = 5.287583ms
http://lparb:2379 is healthy: successfully committed proposal: took = 6.793576ms
[root@lpara etcd]#
</Code>
  
  
So we have our key-value pair propagated across the cluster, nice.
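As a quick sanity check, the key can be read back from a different member; this is a minimal sketch assuming the v3 API and a hypothetical key name (use whatever key you wrote in the previous step):

<Code:bash| Read the key back from another member (example)>
# /patroni/demo is a hypothetical key name, only for illustration;
# etcdctl prints the key followed by its value
etcdctl --endpoints=http://etcd02:2379 get /patroni/demo
</Code>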

We can also check the health of the cluster endpoints as follows:

<Code:Bash|Check cluster endpoints>
[root@etcd02 ~]# etcdctl endpoint status --write-out=table --endpoints=http://etcd01:2379,http://etcd00:2379,http://etcd02:2379
+--------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
|      ENDPOINT      |        ID        | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+--------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| http://etcd01:2379 | 32552f25bf46ac89 |  3.5.21 |   20 kB |     false |      false |         2 |         12 |                 12 |        |
| http://etcd00:2379 | 5dd4414f709cd700 |  3.5.21 |   20 kB |      true |      false |         2 |         12 |                 12 |        |
| http://etcd02:2379 | e1e9e5d83c377f01 |  3.5.21 |   20 kB |     false |      false |         2 |         12 |                 12 |        |
+--------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
[root@etcd02 ~]#
</Code>
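If you prefer something scriptable, the same status is available as JSON; a sketch assuming jq is installed and the field layout of etcdctl 3.5:

<Code:bash| Find the current leader endpoint (example)>
# The leader is the member whose ID matches the "leader" field in the status
etcdctl endpoint status --write-out=json --endpoints=http://etcd01:2379,http://etcd00:2379,http://etcd02:2379 \
  | jq -r '.[] | select(.Status.leader == .Status.header.member_id) | .Endpoint'
</Code>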
  
=====Setup Patroni=====
chown -R postgres:postgres /etc/patroni
chown postgres:postgres /db/pgdata
chown postgres:postgres /var/log/postgresql/
chown postgres:postgres /db
    log_directory: "/var/log/postgresql"
    log_filename: "postgresql-15-main.log"
    archive_command: 'test ! -f /db/wal_archive/%f && cp %p /db/wal_archive/%f'
    archive_mode: 'on'
EOF

chown -R postgres:postgres /etc/patroni
chown postgres:postgres /db/pgdata
chown -R postgres:postgres /var/log/postgresql/
chown postgres:postgres /db
chmod 600 /etc/patroni/stampede.yml
chmod 750 /db/pgdata
chown postgres:postgres /var/log/postgresql/patroni.log

patroni /etc/patroni/stampede.yml &> /var/log/postgresql/patroni.log &
</Code>
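Note that the archive_command above copies completed WAL segments into /db/wal_archive, so that directory has to exist and be writable by the postgres user before the first archive attempt; a minimal sketch:

<Code:bash| Create the WAL archive directory>
# Location taken from the archive_command above
mkdir -p /db/wal_archive
chown postgres:postgres /db/wal_archive
chmod 700 /db/wal_archive
</Code>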
Well, I had many issues, but one of the most important ones was that node02 was ahead of node00 for some reason, so I had to reinitialize node02:
  
**IT IS IMPORTANT TO NOTE THAT, IF YOU UPDATE THE PATRONI YML FILE, YOU NEED TO RELOAD THE CLUSTER AND RESTART IT!!!**
  
====Reinitialize a node====
</Code>
  
That was necessary because, even though node02 was "running", it wasn't in sync with the leader. After the re-initialization, everything is fixed.
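To confirm that the replica caught up after the re-initialization, the member list (the same command used in the sections below) should show node02 back in the streaming state with no lag:

<Code:bash| Verify the member state>
-bash-4.2$ patronictl -c /etc/patroni/stampede.yml list stampede
</Code>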
  

====Restart Cluster====
<Code:bash|Restart the cluster>
-bash-4.2$ patronictl -c /etc/patroni/stampede.yml restart stampede
+ Cluster: stampede (7337687292879936167) ---+-----------+
| Member | Host   | Role    | State     | TL | Lag in MB |
+--------+--------+---------+-----------+----+-----------+
| node00 | node00 | Leader  | running   |  1 |           |
| node01 | node01 | Replica | streaming |  1 |         0 |
| node02 | node02 | Replica | streaming |  1 |         0 |
+--------+--------+---------+-----------+----+-----------+
When should the restart take place (e.g. 2024-02-20T20:19)  [now]:
Are you sure you want to restart members node01, node02, node00? [y/N]: y
Restart if the PostgreSQL version is less than provided (e.g. 9.5.2)  []:
Success: restart on member node01
Success: restart on member node02
Success: restart on member node00
-bash-4.2$ patronictl -c /etc/patroni/stampede.yml list stampede
+ Cluster: stampede (7337687292879936167) ---+-----------+
| Member | Host   | Role    | State     | TL | Lag in MB |
+--------+--------+---------+-----------+----+-----------+
| node00 | node00 | Leader  | running   |  1 |           |
| node01 | node01 | Replica | streaming |  1 |         0 |
| node02 | node02 | Replica | streaming |  1 |         0 |
+--------+--------+---------+-----------+----+-----------+
</Code>
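You can also restart a single member instead of the whole cluster, which is handy for rolling changes; a minimal sketch:

<Code:bash| Restart a single member>
-bash-4.2$ patronictl -c /etc/patroni/stampede.yml restart stampede node01
</Code>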

====Reload cluster====
A reload is needed if you have changed anything in the node's configuration file (stampede.yml), EXCEPT for the DCS section.

<Code:bash|Reload cluster>
-bash-4.2$ patronictl -c /etc/patroni/stampede.yml reload stampede
+ Cluster: stampede (7337687292879936167) ---+-----------+
| Member | Host   | Role    | State     | TL | Lag in MB |
+--------+--------+---------+-----------+----+-----------+
| node00 | node00 | Leader  | running   |  1 |           |
| node01 | node01 | Replica | streaming |  1 |         0 |
| node02 | node02 | Replica | streaming |  1 |         0 |
+--------+--------+---------+-----------+----+-----------+
Are you sure you want to reload members node01, node02, node00? [y/N]: y
Reload request received for member node01 and will be processed within 10 seconds
Reload request received for member node02 and will be processed within 10 seconds
Reload request received for member node00 and will be processed within 10 seconds
-bash-4.2$
</Code>
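After a reload, parameters that still require a full restart are flagged in the extended list output (a sketch, assuming a patronictl version that supports the --extended flag):

<Code:bash| Check for pending restarts>
-bash-4.2$ patronictl -c /etc/patroni/stampede.yml list --extended stampede
</Code>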

====Update DCS====
If you want to edit the DCS, you have to EITHER re-initialize the server OR use the REST API:

<Code:bash|Update DCS part>
$ curl -s -XPATCH -d \
        '{"postgresql":{"parameters":{"archive_command":"test ! -f /db/wal_archive/%f && cp %p /db/wal_archive/%f"}}}' \
        http://localhost:8008/config
</Code>
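The change can be read back from the same endpoint, and patronictl edit-config is an interactive alternative to the raw PATCH (a sketch; jq is only used for pretty-printing):

<Code:bash| Verify the DCS configuration>
# Read the effective dynamic configuration back from the REST API
$ curl -s http://localhost:8008/config | jq .

# Or edit the DCS through patronictl instead of curl
-bash-4.2$ patronictl -c /etc/patroni/stampede.yml edit-config stampede
</Code>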

=====Appendix=====
The whole code to run that cluster with Vagrant can be seen below.

<Code:Bash|Vagrantfile>
$allow_shell = <<SCRIPT
sed -i 's/PasswordAuthentication no/PasswordAuthentication yes/g' /etc/ssh/sshd_config
systemctl restart sshd.service
SCRIPT

$edit_hosts_file = <<SCRIPT
cat > /etc/hosts <<EOF
127.0.0.1 localhost

10.100.10.200 node00
10.100.10.201 node01
10.100.10.202 node02
EOF
SCRIPT


$install_etcd = <<SCRIPT
yum -y update
yum -y install python3-psycopg2 python3-pip PyYAML
yum -y install etcd
mkdir -p /db/etcd

# Install PostgreSQL:
sudo yum install -y https://download.postgresql.org/pub/repos/yum/reporpms/EL-7-x86_64/pgdg-redhat-repo-latest.noarch.rpm
sudo yum install -y postgresql15-server
chown postgres:postgres /db/etcd

# Modify etcd
host=`hostname`
cat > /etc/etcd/etcd.conf <<EOF
name: ${host}
data-dir: /db/etcd
initial-advertise-peer-urls: http://${host}:2380
listen-peer-urls: http://0.0.0.0:2380
listen-client-urls: http://0.0.0.0:2379
advertise-client-urls: http://${host}:2379
initial-cluster: "node00=http://node00:2380,node01=http://node01:2380,node02=http://node02:2380"
EOF

mkdir -p /var/log/postgresql
etcd --config-file /etc/etcd/etcd.conf &>/var/log/postgresql/etcd.log &
SCRIPT


$install_patroni = <<SCRIPT
yum -y install centos-release-scl-rh
yum -y install llvm-toolset-7-clang gcc python3-devel postgresql15-devel.x86_64
pip3 install patroni
pip3 install python-etcd
mkdir /etc/patroni
mkdir -p /db/pgdata
mkdir -p /var/log/postgresql/

chown -R postgres:postgres /etc/patroni
chown postgres:postgres /db/pgdata
chown postgres:postgres /var/log/postgresql/
chown postgres:postgres /db

host=`hostname`
binexec=`find / -name "pg_config"`
bindir=`${binexec} --bindir`
ipaddr=`cat /etc/hosts | grep "${host}" | cut -d' ' -f1`
subnet=`echo ${ipaddr} | cut -d'.' -f1-3`

cat > /etc/patroni/stampede.yml <<EOF
scope: stampede
name: ${host}

restapi:
  listen: ${host}:8008
  connect_address: ${host}:8008

etcd:
  host: ${host}:2379

bootstrap:
  dcs:
    ttl: 30
    loop_wait: 10
    retry_timeout: 10
    maximum_lag_on_failover: 1048576
    postgresql:
      use_pg_rewind: true
      use_slots: true
      parameters:
        wal_level: logical
        wal_log_hints: "on"

  initdb:
  - encoding: UTF8
  - data-checksums

  pg_hba:
  - host replication rep_user ${ipaddr}/24 md5
  - host all all ${subnet}.0/24 md5

  users:
    admin:
      password: adminpass
      options:
        - createrole
        - createdb

postgresql:
  listen: ${host}:5432
  connect_address: ${host}:5432
  data_dir: /db/pgdata
  bin_dir: ${bindir}
  pgpass: /tmp/pgpass0
  authentication:
    replication:
      username: rep_user
      password: newpass
    superuser:
      username: postgres
      password: newpass
  parameters:
    unix_socket_directories: '/var/run/postgresql'
    external_pid_file: '/var/run/postgresql/15-main.pid'
    logging_collector: "on"
    log_directory: "/var/log/postgresql"
    log_filename: "postgresql-15-main.log"
EOF

# Tighten permissions only after the config file has been created
chmod 600 /etc/patroni/stampede.yml
chmod 750 /db/pgdata
chown postgres:postgres /var/log/postgresql/patroni.log
patroni /etc/patroni/stampede.yml &> /var/log/postgresql/patroni.log &
SCRIPT

$install_ansible = <<SCRIPT
sudo yum -y install epel-release
sudo yum -y install ansible

cat > /etc/ansible/hosts <<EOF
[patroni]
10.100.10.200
10.100.10.201
10.100.10.202

[patroni:vars]
ansible_user=root
ansible_password=vagrant
ansible_connection=ssh
EOF

sed -i 's/#host_key_checking = False/host_key_checking = False/g' /etc/ansible/ansible.cfg
SCRIPT

$disable_selinux = <<SCRIPT
setenforce 0
SCRIPT


# All Vagrant configuration is done below. The "2" in Vagrant.configure
# configures the configuration version (we support older styles for
# backwards compatibility). Please don't change it unless you know what
# you're doing.
Vagrant.configure("2") do |config|
  (0..2).each do |i|
    config.vm.define "node0#{i}" do |node|
      node.vm.box = "centos/7"
      node.vm.box_check_update = true
      node.vm.network :private_network, ip: "10.100.10.20#{i}"
      node.vm.hostname = "node0#{i}"
      node.vm.provision "shell", inline: $allow_shell, privileged: true
      node.vm.provision "shell", inline: $disable_selinux, privileged: true
      node.vm.provision "shell", inline: $install_ansible, privileged: true
      node.vm.provision "shell", inline: $edit_hosts_file, privileged: true
      node.vm.provision "shell", inline: $install_etcd, privileged: true
      node.vm.provision "shell", inline: $install_patroni, privileged: true
      node.vm.provider "virtualbox" do |vb|
        vb.name = "node0#{i}"
        vb.memory = "1024"
      end
    end
  end
end
</Code>

That code will create a 3-node cluster with everything.
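A minimal way to bring it up and check the result, assuming Vagrant and VirtualBox are installed on the host and patronictl ended up on the PATH inside the guests:

<Code:bash| Bring the cluster up>
# On the host
vagrant up

# On any of the guests, check that the Patroni cluster formed
vagrant ssh node00
sudo patronictl -c /etc/patroni/stampede.yml list stampede
</Code>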

=====Backup=====
Below you can see a method to back up, encrypt, and upload a backup to OCI. The backup uses a replica as the backup source:

<Code:bash | Backup >
#!/bin/bash

# Keep switching WAL files on the leader every 10 seconds so the WAL stream
# keeps flowing to barman while the base backup is running
switch_logfiles(){
    v_leader=$1

    while :
    do
        echo "Switch logfile"
        psql -Ubarman -h ${v_leader} -c "select pg_switch_wal()" -d datamine
        sleep 10
    done
}

backup() {

environment=$1
echo "Backup start"

echo "Obtaining the first replica in the cluster"
REPLICA_HOST=postgres-${environment}

v_first_replica=$(curl -sk https://${REPLICA_HOST}:8008/cluster | jq -r '.members[] | select(.role == "replica") | .host' | head -n 1)
v_leader=$(curl -sk https://${REPLICA_HOST}:8008/cluster | jq -r '.members[] | select(.role == "leader") | .host')

echo "Replacing placeholder"
sed -i "s/REPLICA_IP_PLACEHOLDER/${v_first_replica}/g" /etc/barman.d/${environment}/streaming-server.conf

echo "Creating replication slot"
barman -c /etc/barman_${environment}.conf receive-wal --create-slot ${environment}

echo "Starting cron and receive-wal"
barman -c /etc/barman_${environment}.conf cron

echo "Backup process start"

switch_logfiles ${v_leader} &

barman -c /etc/barman_${environment}.conf backup ${environment} --wait

# Stop the background WAL-switch loop
kill $!

echo "Return placeholder"
sed -i "s/${v_first_replica}/REPLICA_IP_PLACEHOLDER/g" /etc/barman.d/${environment}/streaming-server.conf
}

# Note: this only lists backup files older than 7 days, it does not delete them,
# and it is not called in the main flow below
clean_old_backups(){
    find /var/log/mbackups -depth -mtime +7 -type f -print
}

encrypt_backup(){
        date
        environment=$1
        backup_dir=$(ls -rtd /data/backup/${environment}/base/* | tail -1)
        cd $(dirname ${backup_dir})
        backup_file=$(echo ${backup_dir} | cut -d '/' -f 6)
        enc_key=$(cat /var/lib/barman/enckey.enc | base64 -d)
        tar zcvf - ${backup_file}/ | openssl enc -e -pbkdf2 -aes-256-ctr -k ${enc_key} -out ${backup_file}.tar.gz.enc
        date
}

copy_files_to_os(){
        date
        environment=$1
        echo "Set BUCKET_NAME and OCI config file variable"
        case ${environment} in
            "dev")
                BUCKET_NAME=postgres-dev-backup
                OCI_CONFIG_FILE=/var/lib/barman/ocivault/dev.conf
            ;;
            "stg")
                BUCKET_NAME=postgres-stg-backup
                OCI_CONFIG_FILE=/var/lib/barman/ocivault/stg.conf
            ;;
            "prod")
                BUCKET_NAME=postgres-prod-backup
                OCI_CONFIG_FILE=/var/lib/barman/ocivault/prod.conf
            ;;
            *)
        esac

        export BUCKET_NAME
        export OCI_CONFIG_FILE

        encrypted_file=$(ls -rtd /data/backup/${environment}/base/* | grep "tar.gz.enc" | tail -1)
        oci os object put --bucket-name ${BUCKET_NAME} --file ${encrypted_file}
        rm -f ${encrypted_file}
        date
}

backup $1
encrypt_backup $1
copy_files_to_os $1

</Code>
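Invocation takes just the environment name, and the archive can be restored with the matching openssl command (a sketch; the script name backup.sh and the restore location are assumptions):

<Code:bash| Run the backup and decrypt the archive later>
# Run a full backup cycle for the dev environment
./backup.sh dev

# To restore, decrypt with the same key and unpack
enc_key=$(cat /var/lib/barman/enckey.enc | base64 -d)
openssl enc -d -pbkdf2 -aes-256-ctr -k ${enc_key} -in <backup_name>.tar.gz.enc | tar zxvf -
</Code>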