postgresql_repmgr

Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Next revision
Previous revision
postgresql_repmgr [2021/09/15 22:09] – created andonovjpostgresql_repmgr [2024/06/22 15:55] (current) – [Overview] andonovj
Line 1: Line 1:
-===Basic Administration=== +=====Overview===== 
-  - [[postgresql_repmgr_installation|Installation & Configuration]] +In this section, we will configure the virtual machines and install, and configure the environment. 
-  - [[postgresql_repmgr_management|Management]]+In the other other sections we will play with the environment. 
 + 
 +We will use: 
 + 
 +  * Vagrant For Virtualization 
 +  * Ubuntu - For Operation System 
 +  * 2nd Quandrant - For Rempgr 
 + 
 + 
 +{{:1_g-5e4zhde9lojvmbdsiptg.png?800|}} 
 + 
 +=====Virtual Environment Config===== 
 +To configure the Virtual Environment, we will use Vagrant and Ubuntu machines. 
 +You can configure the following vagrant file: 
 + 
 + 
 +<Code:bash|Vagrant Config> 
 +C:\Virtual Box\REPMGR>vagrant init 
 +A `Vagrantfile` has been placed in this directory. You are now 
 +ready to `vagrant up` your first virtual environment! Please read 
 +the comments in the Vagrantfile as well as documentation on 
 +`vagrantup.com` for more information on using Vagrant. 
 + 
 +C:\Virtual Box\REPMGR>vagrant status 
 +Current machine states: 
 + 
 +server1                   not created (virtualbox) 
 +server2                   not created (virtualbox) 
 + 
 +This environment represents multiple VMs. The VMs are all listed 
 +above with their current state. For more information about a specific 
 +VM, run `vagrant status NAME`. 
 + 
 +C:\Virtual Box\REPMGR>vagrant up 
 +Bringing machine 'server1' up with 'virtualbox' provider... 
 +Bringing machine 'server2' up with 'virtualbox' provider... 
 +==> server1: Box 'consumerlab/ubuntu-server-16-04-LTS' could not be found. Attempting to find and install... 
 +    server1: Box Provider: virtualbox 
 +    server1: Box Version: >= 0 
 +==> server1: Loading metadata for box 'consumerlab/ubuntu-server-16-04-LTS' 
 +    server1: URL: https://vagrantcloud.com/consumerlab/ubuntu-server-16-04-LTS 
 +==> server1: Adding box 'consumerlab/ubuntu-server-16-04-LTS' (v1.0.2) for provider: virtualbox 
 +    server1: Downloading: https://vagrantcloud.com/consumerlab/boxes/ubuntu-server-16-04-LTS/versions/1.0.2/providers/virtualbox.box 
 +*************************************** 
 +</Code> 
 + 
 + 
 +The vagrant file is the following: 
 + 
 +<Code:bash| Vagrant file> 
 +Vagrant.configure("2") do |config| 
 +  (1..2).each do |n| 
 +    config.vm.define "server#{n}" do |define| 
 +      define.vm.box = "bento/ubuntu-20.04" 
 +      define.ssh.insert_key = false 
 +      define.vm.hostname = "server#{n}" 
 +      define.vm.network :private_network, ip: "10.0.15.2#{n}" 
 + 
 +      define.vm.provider "virtualbox" do |node| 
 +        node.name = "server#{n}" 
 +      end 
 +    end 
 +  end 
 +end 
 +</Code> 
 + 
 +Once the machines are up, we can start with the installation. 
 +Bare in mind that this vagrant will automatically configure the IPs as well, but will not configure /etc/hosts. Something which we will do now. 
 + 
 + 
 +=====Server Config===== 
 +For the servers we have to: 
 + 
 +  - Configure /etc/hosts 
 +  - Configure 2nd Quadrant Repository 
 +  - Install 2nd Quadrant Packages 
 + 
 + 
 +So, let's get going. 
 + 
 + 
 +====Modify /etc/hosts ==== 
 +That is pretty trivial, just add the IPs and the hostnames to /etc/hosts: 
 + 
 + 
 +<Code:bash|Hosts> 
 +10.0.15.21 server1 
 +10.0.15.22 server2 
 +</Code> 
 + 
 +That will allow, the hosts to see and ping each other. 
 + 
 +Then we can configure the repositories: 
 + 
 + 
 +====Configure Repositories==== 
 +I decided to use 2nd quadrant packages, as they are the ones I use for work as well :) 
 +First thing first, we have to update our repos and pull the latest metadata from them. 
 + 
 +<Code:bash|Update repos> 
 +root@server1:~# apt-get update 
 +Hit:1 http://archive.ubuntu.com/ubuntu focal InRelease 
 +Get:2 http://archive.ubuntu.com/ubuntu focal-updates InRelease [114 kB] 
 +Get:3 http://security.ubuntu.com/ubuntu focal-security InRelease [114 kB] 
 +Get:4 http://archive.ubuntu.com/ubuntu focal-backports InRelease [101 kB] 
 +Get:5 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 Packages [1,213 kB] 
 +Get:6 http://security.ubuntu.com/ubuntu focal-security/main i386 Packages [282 kB] 
 +Get:7 http://archive.ubuntu.com/ubuntu focal-updates/main i386 Packages [535 kB] 
 +Get:8 http://security.ubuntu.com/ubuntu focal-security/main amd64 Packages [866 kB] 
 +Get:9 http://archive.ubuntu.com/ubuntu focal-updates/main Translation-en [259 kB] 
 +Get:10 http://archive.ubuntu.com/ubuntu focal-updates/restricted i386 Packages [19.3 kB] 
 +Get:11 http://archive.ubuntu.com/ubuntu focal-updates/restricted amd64 Packages [443 kB] 
 +Get:12 http://security.ubuntu.com/ubuntu focal-security/main Translation-en [167 kB] 
 +Get:13 http://security.ubuntu.com/ubuntu focal-security/restricted amd64 Packages [406 kB] 
 +Get:14 http://archive.ubuntu.com/ubuntu focal-updates/restricted Translation-en [63.5 kB] 
 +Get:15 http://archive.ubuntu.com/ubuntu focal-updates/universe i386 Packages [633 kB] 
 +Get:16 http://security.ubuntu.com/ubuntu focal-security/restricted i386 Packages [18.0 kB] 
 +Get:17 http://security.ubuntu.com/ubuntu focal-security/restricted Translation-en [58.3 kB] 
 +Get:18 http://security.ubuntu.com/ubuntu focal-security/universe i386 Packages [507 kB] 
 +Get:19 http://security.ubuntu.com/ubuntu focal-security/universe amd64 Packages [638 kB] 
 +Get:20 http://archive.ubuntu.com/ubuntu focal-updates/universe amd64 Packages [855 kB] 
 +Get:21 http://archive.ubuntu.com/ubuntu focal-updates/universe Translation-en [182 kB] 
 +Get:22 http://archive.ubuntu.com/ubuntu focal-updates/multiverse amd64 Packages [24.6 kB] 
 +Get:23 http://archive.ubuntu.com/ubuntu focal-updates/multiverse i386 Packages [8,232 B] 
 +Get:24 http://archive.ubuntu.com/ubuntu focal-updates/multiverse Translation-en [6,776 B] 
 +Get:25 http://archive.ubuntu.com/ubuntu focal-backports/universe i386 Packages [4,736 B] 
 +Get:26 http://archive.ubuntu.com/ubuntu focal-backports/universe amd64 Packages [5,800 B] 
 +Get:27 http://security.ubuntu.com/ubuntu focal-security/universe Translation-en [101 kB] 
 +Get:28 http://security.ubuntu.com/ubuntu focal-security/multiverse amd64 Packages [21.9 kB] 
 +Get:29 http://security.ubuntu.com/ubuntu focal-security/multiverse Translation-en [4,948 B] 
 +Fetched 7,652 kB in 7s (1,048 kB/s) 
 +Reading package lists... Done 
 +</Code> 
 + 
 +====Install PostgreSQL==== 
 +Then we can proceed with installation of PostgreSQL 
 +<Code:bash|Install PostgreSQL> 
 +root@server1:~# apt-get install -y postgresql 
 +Reading package lists... Done 
 +Building dependency tree 
 +Reading state information... Done 
 +The following additional packages will be installed: 
 +  libllvm10 libpq5 libsensors-config libsensors5 postgresql-12 postgresql-client-12 postgresql-client-common postgresql-common ssl-cert sysstat 
 +Suggested packages: 
 +  lm-sensors postgresql-doc postgresql-doc-12 libjson-perl openssl-blacklist isag 
 +The following NEW packages will be installed: 
 +  libllvm10 libpq5 libsensors-config libsensors5 postgresql postgresql-12 postgresql-client-12 postgresql-client-common postgresql-common ssl-cert sysstat 
 +0 upgraded, 11 newly installed, 0 to remove and 54 not upgraded. 
 +Need to get 15.1 MB/30.6 MB of archives. 
 +After this operation, 122 MB of additional disk space will be used. 
 +Get:1 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 libpq5 amd64 12.8-0ubuntu0.20.04.1 [116 kB] 
 +Get:2 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 postgresql-client-12 amd64 12.8-0ubuntu0.20.04.1 [1,046 kB
 +Get:3 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 postgresql-12 amd64 12.8-0ubuntu0.20.04.1 [13.5 MB] 
 +Get:4 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 sysstat amd64 12.2.0-2ubuntu0.1 [448 kB] 
 +Fetched 15.1 MB in 15s (1,015 kB/s) 
 +Preconfiguring packages ... 
 +*************************************************** 
 + 
 +Success. You can now start the database server using: 
 + 
 +    pg_ctlcluster 12 main start 
 + 
 +Ver Cluster Port Status Owner    Data directory              Log file 
 +12  main    5432 down   postgres /var/lib/postgresql/12/main /var/log/postgresql/postgresql-12-main.log 
 +update-alternatives: using /usr/share/postgresql/12/man/man1/postmaster.1.gz to provide /usr/share/man/man1/postmaster.1.gz (postmaster.1.gz) in auto mode 
 +Setting up sysstat (12.2.0-2ubuntu0.1) ... 
 + 
 +Creating config file /etc/default/sysstat with new version 
 +update-alternatives: using /usr/bin/sar.sysstat to provide /usr/bin/sar (sar) in auto mode 
 +Created symlink /etc/systemd/system/multi-user.target.wants/sysstat.service → /lib/systemd/system/sysstat.service. 
 +Setting up postgresql (12+214ubuntu0.1) ... 
 +Processing triggers for systemd (245.4-4ubuntu3.11) ... 
 +Processing triggers for man-db (2.9.1-1) ... 
 +Processing triggers for libc-bin (2.31-0ubuntu9.2) ... 
 +</Code> 
 + 
 +Do this on both machines: server1 and server2 
 + 
 +====Install REPMGR Repos==== 
 +Then we can proceed with adding the repo for REPMGR 
 + 
 +<Code:bash|Add REPMGR Repo> 
 +root@server1:~# sudo add-apt-repository 'deb http://packages.2ndquadrant.com/repmgr3/apt/ xenial-2ndquadrant main' 
 +Hit:1 http://archive.ubuntu.com/ubuntu focal InRelease 
 +Get:2 http://packages.2ndquadrant.com/repmgr3/apt xenial-2ndquadrant InRelease [5,044 B] 
 +Hit:3 http://archive.ubuntu.com/ubuntu focal-updates InRelease 
 +Hit:4 http://archive.ubuntu.com/ubuntu focal-backports InRelease 
 +Get:5 http://security.ubuntu.com/ubuntu focal-security InRelease [114 kB] 
 +Err:2 http://packages.2ndquadrant.com/repmgr3/apt xenial-2ndquadrant InRelease 
 +  The following signatures couldn't be verified because the public key is not available: NO_PUBKEY A47615A0D3FA41F6 
 +Reading package lists... Done 
 +W: GPG error: http://packages.2ndquadrant.com/repmgr3/apt xenial-2ndquadrant InRelease: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY A47615A0D3FA41F6 
 +E: The repository 'http://packages.2ndquadrant.com/repmgr3/apt xenial-2ndquadrant InRelease' is not signed. 
 +N: Updating from such a repository can't be done securely, and is therefore disabled by default. 
 +N: See apt-secure(8) manpage for repository creation and user configuration details. 
 +root@server1:~# sudo apt update 
 +Hit:1 http://archive.ubuntu.com/ubuntu focal InRelease 
 +Hit:2 http://archive.ubuntu.com/ubuntu focal-updates InRelease 
 +Get:3 http://packages.2ndquadrant.com/repmgr3/apt xenial-2ndquadrant InRelease [5,044 B] 
 +Hit:4 http://archive.ubuntu.com/ubuntu focal-backports InRelease 
 +Get:5 http://security.ubuntu.com/ubuntu focal-security InRelease [114 kB] 
 +Err:3 http://packages.2ndquadrant.com/repmgr3/apt xenial-2ndquadrant InRelease 
 +  The following signatures couldn't be verified because the public key is not available: NO_PUBKEY A47615A0D3FA41F6 
 +Reading package lists... Done 
 +W: GPG error: http://packages.2ndquadrant.com/repmgr3/apt xenial-2ndquadrant InRelease: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY A47615A0D3FA41F6 
 +E: The repository 'http://packages.2ndquadrant.com/repmgr3/apt xenial-2ndquadrant InRelease' is not signed. 
 +N: Updating from such a repository can't be done securely, and is therefore disabled by default. 
 +N: See apt-secure(8) manpage for repository creation and user configuration details. 
 +root@server1:~# 
 +</Code> 
 + 
 +====Configure SSH keys==== 
 +To configure passwordless connection, we have to: 
 + 
 +  - Generate public / private key pair 
 +  - Put the public key in the authorized keys on the other host (e.g. the public key of server1, in the authorized_keys file on server2) 
 + 
 +So let's get going: 
 + 
 + 
 +<Code:bash|Generate Public/Private Key pair> 
 +postgres@server2:~$ ssh-keygen 
 +Generating public/private rsa key pair. 
 +Enter file in which to save the key (/var/lib/postgresql/.ssh/id_rsa): 
 +Created directory '/var/lib/postgresql/.ssh'
 +Enter passphrase (empty for no passphrase): 
 +Enter same passphrase again: 
 +Your identification has been saved in /var/lib/postgresql/.ssh/id_rsa 
 +Your public key has been saved in /var/lib/postgresql/.ssh/id_rsa.pub 
 +The key fingerprint is: 
 +SHA256:KYxUhOKyll2anPYivR3ZrrRw/Kz7xDMVU0AtovXY4bA postgres@server2 
 +The key's randomart image is: 
 ++---[RSA 3072]----+ 
 +|     oo .oo.     | 
 +|  . .. + o..     | 
 +| . .. o Ooo      | 
 +|. .. = E =o      | 
 +| o+ * o S.       | 
 +|.o B. +..        | 
 +|. o..* *         | 
 +| . o=.B o        | 
 +|  ..o*==         | 
 ++----[SHA256]-----+ 
 +</Code> 
 + 
 +This will generate us two files: Private and Public key. Private key, NEVER SHARE :) 
 + 
 +<Code:bash|Keys> 
 +postgres@server2:~/.ssh$ ls -alrt 
 +total 20 
 +-rw-r--r-- 1 postgres postgres  570 Sep 16 08:58 id_rsa.pub 
 +-rw------- 1 postgres postgres 2602 Sep 16 08:58 id_rsa 
 +</Code> 
 + 
 +The content of that server and put it in the authorized_keys file of server1 and vica versa: 
 + 
 + 
 +<Code:bash| Add public key to server1> 
 +postgres@server1:~/.ssh$ ls -alrt 
 +total 24 
 +-rw-r--r-- 1 postgres postgres  570 Sep 16 08:57 id_rsa.pub 
 +-rw------- 1 postgres postgres 2602 Sep 16 08:57 id_rsa 
 +-rw-rw-r-- 1 postgres postgres  571 Sep 16 08:59 authorized_keys 
 +-rw-r--r-- 1 postgres postgres  444 Sep 16 08:59 known_hosts 
 +drwx------ 2 postgres postgres 4096 Sep 16 08:59 . 
 +drwxr-xr-x 5 postgres postgres 4096 Sep 16 09:12 .. 
 +postgres@server1:~/.ssh$ tail -100 authorized_keys 
 +ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQDc7WdWJWmc6ArXUkFFhArVa/42//MPue0u1HmgBK389GkpLnKZ8dBfNRcFfrvocCsQyJZu5MsfK/sRR2/JycjifnL45bavw+qqejmyiuJnMmiNEJI2sbturdE/y3vABw1QjDaCG+LEDJzGT3DzY3HvxcU+33vBZYpNx7YKSI+LQnyeT6iGNNoy5hNzI+/O8jM3pg4ReN+XC6NJ77maBmQ+EaLhLsxk3vmcnP6C7TCDxez1iTZHZpV9tNGFx/OCABAOgV6ELpzRPsS/3MBSfif8eeJ6Sc8JPVQSruOzf3lYYAouYZTLLJYtDZOw3rc0+v7fn17jVk9w981Zn4zzkacI5FbVHmtM6oAnCIkSUUJwXlQzZprF3C+Z+Z10mmtBlS9/NFdZklpK3rHh1T1a3Fu1SXWlkHZxKdLxr5ytUL89ubGAPSa4UkBXRalBTUFjcYhvqk5hPqO6a+KQO1ehcNmn7/ffdEPrattoOxnGCfKaWLbXyxFmt3WFCq5Jy/wZoq8= postgres@server2 
 + 
 +postgres@server1:~/.ssh$ 
 +</Code> 
 + 
 + 
 +After you do that on both servers, you can connect without password: 
 + 
 +<Code:bash|Test Passwordless Configuration> 
 +postgres@server1:~/.ssh$ ssh server2 
 +Welcome to Ubuntu 20.04.2 LTS (GNU/Linux 5.4.0-80-generic x86_64) 
 + 
 + * Documentation:  https://help.ubuntu.com 
 + Management:     https://landscape.canonical.com 
 + * Support:        https://ubuntu.com/advantage 
 + 
 +  System information as of Thu 16 Sep 2021 09:30:52 AM UTC 
 + 
 +  System load:  0.08              Processes:             128 
 +  Usage of /:   2.8% of 61.31GB   Users logged in:       1 
 +  Memory usage: 25%               IPv4 address for eth0: 10.0.2.15 
 +  Swap usage:   0%                IPv4 address for eth1: 10.0.15.22 
 + 
 + 
 +This system is built by the Bento project by Chef Software 
 +More information can be found at https://github.com/chef/bento 
 +Last login: Thu Sep 16 08:59:34 2021 from 10.0.15.21 
 +postgres@server2:~$ 
 +</Code> 
 + 
 +=====Cluster Environment===== 
 +After we've installed and configured all the packages, we can proceed with cluster configuration.  
 +Firstly, we have to configure the database and the users: 
 + 
 +====Configure User==== 
 +The user to be used for the replication can be configured as so: 
 + 
 +<Code:bash|Replication User Configuration> 
 +root@server1:~# su - postgres 
 +postgres@server1:~$ createuser --replication --createdb --createrole --superuser repmgr 
 +postgres@server1:~$ psql -c 'ALTER USER repmgr SET search_path TO repmgr_test, "$user", public;' 
 +ALTER ROLE 
 +postgres@server1:~$ createdb repmgr --owner=repmgr 
 +postgres@server1:~$ 
 +</Code> 
 + 
 +Do that on both servers. 
 + 
 +After, we have done that, we can configure security connection to the PostgreSQL in pg_hba.conf 
 + 
 +====Connection Configuration==== 
 +To allow repmgr user to be able to connect to PostgreSQL, we need to append to the pg_hba.conf file the following entries. 
 + 
 +<Code:bash|Configure pg_hba.conf> 
 +host    repmgr repmgr 10.0.15.21/32 trust 
 +host    replication repmgr 10.0.15.21/32 trust 
 +host    repmgr repmgr 10.0.15.22/32 trust 
 +host    replication repmgr 10.0.15.22/32 trust 
 +</Code> 
 + 
 +We can test the settings as follows: 
 + 
 + 
 +<Code:bash|Test connections> 
 +postgres@server2:/etc/postgresql/12/main$ psql 'host=server2 dbname=repmgr user=repmgr' 
 +psql (12.8 (Ubuntu 12.8-0ubuntu0.20.04.1)) 
 +SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, bits: 256, compression: off) 
 +Type "help" for help. 
 + 
 +repmgr=# \q 
 +postgres@server2:/etc/postgresql/12/main$ psql 'host=server1 dbname=repmgr user=repmgr' 
 +psql (12.8 (Ubuntu 12.8-0ubuntu0.20.04.1)) 
 +SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, bits: 256, compression: off) 
 +Type "help" for help. 
 + 
 +repmgr=# 
 +</Code> 
 + 
 +Finally we have to set pre-loaded library for repmgr in the postgresql.conf 
 + 
 +<Code:bash|Set Pre-loaded Library> 
 +shared_preload_libraries = 'repmgr' 
 +</Code> 
 + 
 +Do that on both servers, to verify the configuration. 
 +Now, finally we can configure the REPMGR 
 + 
 +=====REPMGR===== 
 +The default configuration of Repmgr is located in: 
 + 
 +  * /etc/repmgr.conf 
 +  * /etc/default/repmgrd 
 + 
 +We can do a basic configuration as follows: 
 + 
 +====Configuration==== 
 +<Code:bash|Repmgr.conf> 
 +cluster=cluster 
 +node_id=(1/2 - 1 for server1 and 2 for server2) 
 +node_name=server2 
 +conninfo='host=server(1/2, the same as before) user=repmgr dbname=repmgr connect_timeout=2' 
 +failover=automatic 
 +promote_command='repmgr standby promote -f /etc/repmgr.conf --log-to-file' 
 +follow_command='repmgr standby follow -f /etc/repmgr.conf --log-to-file' 
 +logfile='/var/log/postgresql/repmgr.log' 
 +loglevel=NOTICE 
 +reconnect_attempts=4 
 +reconnect_interval=5 
 +data_directory='/var/lib/postgresql/12/main' 
 +</Code> 
 + 
 + 
 +This files, defines the structure of the replication. Including the commands for promotion and follow-up command. 
 +The next file, will define either will be enabled or not, and which will file it will use. 
 + 
 +<Code:bash|Repmgrd> 
 +# default settings for repmgrd. This file is source by /bin/sh from 
 +# /etc/init.d/repmgrd 
 + 
 +# disable repmgrd by default so it won't get started upon installation 
 +# valid values: yes/no 
 +REPMGRD_ENABLED=yes 
 + 
 +# configuration file (required) 
 +REPMGRD_CONF="/etc/repmgr.conf" 
 + 
 +# additional options 
 +#REPMGRD_OPTS="" 
 + 
 +# user to run repmgrd as 
 +#REPMGRD_USER=postgres 
 + 
 +# repmgrd binary 
 +#REPMGRD_BIN=/usr/bin/repmgrd 
 + 
 +# pid file 
 +#REPMGRD_PIDFILE=/var/run/repmgrd.pid 
 +</Code> 
 + 
 +Furthermore, we have to make a symlink to the correct pg_ctl 
 + 
 + 
 +<Code:bash|Create symlink to pg_ctl> 
 +root@server2:~#  ln -s /usr/lib/postgresql/12/bin/pg_ctl /usr/bin/pg_ctl 
 +</Code> 
 + 
 +We have to do that on both servers. 
 +====Cluster Initialization==== 
 +To initialize a cluster, we can do it as so: 
 + 
 + 
 +<Code:bash|Initialize a cluster> 
 +postgres@server1:~$ repmgr primary register 
 +WARNING: the following problems were found in the configuration file: 
 +  parameter "cluster" is deprecated and will be ignored 
 +  parameter "logfile" has been renamed to "log_file" 
 +  parameter "loglevel" has been renamed to "log_level" 
 +INFO: connecting to primary database... 
 +NOTICE: attempting to install extension "repmgr" 
 +NOTICE: "repmgr" extension successfully installed 
 +NOTICE: primary node record (ID: 1) registered 
 +postgres@server1:~$ 
 +</Code> 
 + 
 +This will initialize the server and we can check that: 
 + 
 +<Code:bash|Check cluster status> 
 +postgres@server1:~$ repmgr cluster show 
 +WARNING: the following problems were found in the configuration file: 
 +  parameter "cluster" is deprecated and will be ignored 
 +  parameter "logfile" has been renamed to "log_file" 
 +  parameter "loglevel" has been renamed to "log_level" 
 + ID | Name    | Role    | Status    | Upstream | Location | Priority | Timeline | Connection string 
 +----+---------+---------+-----------+----------+----------+----------+----------+---------------------------------------------------------- 
 +  | server1 | primary | * running |          | default  | 100      | 1        | host=server1 user=repmgr dbname=repmgr connect_timeout=2 
 +postgres@server1:~$ 
 +</Code> 
 + 
 +After that is done, we can set-up the standby server (e.g. server2) 
 + 
 + 
 +====Setup Standby==== 
 +To setup standby, we have to delete everything from the server and clone it: 
 + 
 + 
 +<Code:bash|Setup standby server> 
 +root@server2:~# rm -rf /var/lib/postgresql/12/main/ 
 +root@server2:~# su - postgres 
 +postgres@server2:~$ repmgr -h server1 -U repmgr -d repmgr standby clone 
 +WARNING: the following problems were found in the configuration file: 
 +  parameter "cluster" is deprecated and will be ignored 
 +  parameter "logfile" has been renamed to "log_file" 
 +  parameter "loglevel" has been renamed to "log_level" 
 +NOTICE: destination directory "/var/lib/postgresql/12/main" provided 
 +INFO: connecting to source node 
 +DETAIL: connection string is: host=server1 user=repmgr dbname=repmgr 
 +DETAIL: current installation size is 31 MB 
 +NOTICE: checking for available walsenders on the source node (2 required) 
 +NOTICE: checking replication connections can be made to the source server (2 required) 
 +INFO: creating directory "/var/lib/postgresql/12/main"... 
 +NOTICE: starting backup (using pg_basebackup)... 
 +HINT: this may take some time; consider using the -c/--fast-checkpoint option 
 +INFO: executing: 
 +  pg_basebackup -l "repmgr base backup"  -D /var/lib/postgresql/12/main -h server1 -p 5432 -U repmgr -X stream 
 +NOTICE: standby clone (using pg_basebackup) complete 
 +NOTICE: you can now start your PostgreSQL server 
 +HINT: for example: pg_ctl -D /var/lib/postgresql/12/main start 
 +HINT: after starting the server, you need to register this standby with "repmgr standby register" 
 +postgres@server2:~$ 
 +</Code> 
 + 
 +After that, we should be able to see the replication working: 
 + 
 +<Code:bash|Check status of REPMGR> 
 +postgres@server2:~$ repmgr cluster show 
 +WARNING: the following problems were found in the configuration file: 
 +  parameter "cluster" is deprecated and will be ignored 
 +  parameter "logfile" has been renamed to "log_file" 
 +  parameter "loglevel" has been renamed to "log_level" 
 + ID | Name    | Role    | Status    | Upstream | Location | Priority | Timeline | Connection string 
 +----+---------+---------+-----------+----------+----------+----------+----------+---------------------------------------------------------- 
 +  | server1 | primary | * running |          | default  | 100      | 1        | host=server1 user=repmgr dbname=repmgr connect_timeout=2 
 +  | server2 | standby |   running | server1  | default  | 100      | 1        | host=server2 user=repmgr dbname=repmgr connect_timeout=2 
 +postgres@server2:~$ 
 +</Code> 
 + 
 + 
 +We can test the connection as follows: 
 + 
 +====Test Configuration==== 
 +Firstly, we will test the normal functionality and then we will test the failove. 
 + 
 + 
 +For the functionality, we can create a table on primary and see if it is present on the standby to test: 
 + 
 + 
 +==On Primary== 
 +<Code:bash|Test Functionality> 
 +repmgr=# create table test(id integer); 
 +CREATE TABLE 
 +repmgr=# insert into test values(1); 
 +INSERT 0 1 
 +repmgr=# commit; 
 +WARNING:  there is no transaction in progress 
 +COMMIT 
 +repmgr=# 
 +</Code> 
 + 
 +==Standby== 
 +Now, we can test it on the standby: 
 + 
 +<Code:bash|Test Functionality> 
 +postgres=# \c repmgr 
 +You are now connected to database "repmgr" as user "postgres"
 +repmgr=# select * from test; 
 + id 
 +---- 
 +  1 
 +(1 row) 
 + 
 +repmgr=# 
 +</Code> 
 + 
 +Now, let's test the failover: 
 + 
 +====Automatic Failover==== 
 +To test that, we can shutdown the primary and see if it is failover automatically (as per our settings) 
 +Now, I changed a little bit the config, so please check Appendix A. 
 + 
 +===Modify Postgres user=== 
 +Firstly, we need to add the postgres user to the sudo group: 
 + 
 +<Code:bash|Add user to Sudoers group> 
 +### Server 1 
 +root@server2:/var/log/postgresql# usermod -aG sudo postgres 
 +root@server2:/var/log/postgresql# 
 + 
 +### Server 2 
 +root@server2:/var/log/postgresql# usermod -aG sudo postgres 
 +root@server2:/var/log/postgresql# 
 +</Code> 
 + 
 + 
 +===Test the failover=== 
 +Now, we can finally test the failover. 
 + 
 +<Code:bash|Test Failover> 
 +postgres@server1:/var/log/postgresql$ repmgr cluster show 
 + ID | Name    | Role    | Status    | Upstream | Location | Priority | Timeline | Connection string     
 +----+---------+---------+-----------+----------+----------+----------+----------+----------------------------------------------------- 
 +  | server1 | standby | * running |          | default  | 100      | 3        | host=server1 user=repmgr dbname=repmgr  
 +  | server2 | primary |   running | server1  | default  | 100      | 3        | host=server2 user=repmgr dbname=repmgr 
 +</Code> 
 + 
 + 
 +We can simply shutdown the prod: 
 + 
 +<Code:bash|Shutdown production> 
 +root@server2:/var/log/postgresql# service postgresql restart 
 +</Code> 
 + 
 +Once, this is done, we can see the log file from server1: 
 + 
 +<Code:bash|Check the log on standby> 
 +[2021-09-28 10:46:45[INFOprimary and this node have the same location ("default"
 +[2021-09-28 10:46:45] [INFO] no other sibling nodes - we win by default 
 +[2021-09-28 10:46:45] [DEBUG] election result: WON 
 +[2021-09-28 10:46:45] [NOTICE] this node is the only available candidate and will now promote itself 
 +[2021-09-28 10:46:45] [INFO] promote_command is: 
 +  "repmgr standby promote -f /etc/repmgr.conf --log-to-file" 
 +[2021-09-28 10:46:45] [NOTICE] redirecting logging output to "/var/log/postgresql/repmgr.log" 
 + 
 +[2021-09-28 10:46:45] [DEBUG] connecting to: "user=repmgr connect_timeout=2 dbname=repmgr host=server1 fallback_application_name=repmgr" 
 +[2021-09-28 10:46:45] [DEBUG] connecting to: "user=repmgr connect_timeout=2 dbname=repmgr host=server2 fallback_application_name=repmgr" 
 +[2021-09-28 10:46:45] [DEBUG] connecting to: "user=repmgr connect_timeout=2 dbname=repmgr host=server1 fallback_application_name=repmgr" 
 +[2021-09-28 10:46:45] [NOTICE] promoting standby to primary 
 +[2021-09-28 10:46:45] [DETAIL] promoting server "server1" (ID: 1) using pg_promote() 
 +[2021-09-28 10:46:45] [NOTICE] waiting up to 60 seconds (parameter "promote_check_timeout") for promotion to complete 
 +[2021-09-28 10:46:46] [DEBUG] setting node 1 as primary and marking existing primary as failed 
 +[2021-09-28 10:46:46] [NOTICE] STANDBY PROMOTE successful 
 +[2021-09-28 10:46:46] [DETAIL] server "server1" (ID: 1) was successfully promoted to primary 
 +[2021-09-28 10:46:46] [INFO] 0 followers to notify 
 +[2021-09-28 10:46:46] [INFO] switching to primary monitoring mode 
 +[2021-09-28 10:46:46] [NOTICE] monitoring cluster primary "server1" (ID: 1) 
 +</Code> 
 + 
 +As we can see, server1 has been promoted to primary, that however leaves us to deal with the aftermath. 
 + 
 +===Dealing with the aftermath=== 
 +Now, we have one primary and 1 failed primary, we need to re-establish connection. 
 + 
 +<Code:bash|Re-establish the connection> 
 +postgres@server1:/var/log/postgresql$ repmgr cluster show 
 +DEBUG: connecting to: "user=repmgr connect_timeout=2 dbname=repmgr host=server1 fallback_application_name=repmgr" 
 +DEBUG: connecting to: "user=repmgr connect_timeout=2 dbname=repmgr host=server1 fallback_application_name=repmgr" 
 +DEBUG: connecting to: "user=repmgr connect_timeout=2 dbname=repmgr host=server2 fallback_application_name=repmgr" 
 + ID | Name    | Role    | Status    | Upstream | Location | Priority | Timeline | Connection string     
 +----+---------+---------+-----------+----------+----------+----------+----------+---------------------------------------------------------- 
 +  | server1 | primary | * running |          | default  | 100      | 3        | host=server1 user=repmgr dbname=repmgr connect_timeout=2 
 +  | server2 | primary | - failed  |          | default  | 100      | ?        | host=server2 user=repmgr dbname=repmgr connect_timeout=2 
 + 
 +WARNING: following issues were detected 
 +  - unable to connect to node "server2" (ID: 2) 
 +HINT: execute with --verbose option to see connection error messages 
 +postgres@server1: 
 +</Code> 
 + 
 +We can re-establish the connection as follows: 
 + 
 +1. Clear the old path and Resnap: 
 + 
 +<Code:bash|Resnap failed server's data> 
 +root@server2:/var/log/postgresql# rm -rf /var/lib/postgresql/12/main/ 
 +root@server2:/var/log/postgresql# su - postgres 
 +postgres@server2:~$ repmgr -h server1 -U repmgr -d repmgr standby clone 
 +WARNING: following problems with command line parameters detected: 
 +  "config_directory" set in repmgr.conf, but --copy-external-config-files not provided 
 +NOTICE: destination directory "/var/lib/postgresql/12/main" provided 
 +INFO: connecting to source node 
 +DETAIL: connection string is: host=server1 user=repmgr dbname=repmgr 
 +DETAIL: current installation size is 31 MB 
 +DEBUG: 2 node records returned by source node 
 +DEBUG: connecting to: "user=repmgr connect_timeout=2 dbname=repmgr host=server1 fallback_application_name=repmgr" 
 +DEBUG: upstream_node_id determined as 1 
 +NOTICE: checking for available walsenders on the source node (2 required) 
 +NOTICE: checking replication connections can be made to the source server (2 required) 
 +INFO: creating directory "/var/lib/postgresql/12/main"... 
 +DEBUG: create_replication_slot(): creating slot "repmgr_slot_2" on upstream 
 +NOTICE: starting backup (using pg_basebackup)... 
 +HINT: this may take some time; consider using the -c/--fast-checkpoint option 
 +INFO: executing: 
 +  /usr/lib/postgresql/12/bin/pg_basebackup -l "repmgr base backup"  -D /var/lib/postgresql/12/main -h server1 -p 5432 -U repmgr -X stream -S repmgr_slot_2 
 +DEBUG: connecting to: "user=repmgr connect_timeout=2 dbname=repmgr host=server1 fallback_application_name=repmgr" 
 +NOTICE: standby clone (using pg_basebackup) complete 
 +NOTICE: you can now start your PostgreSQL server 
 +HINT: for example: sudo systemctl start postgresql 
 +HINT: after starting the server, you need to re-register this standby with "repmgr standby register --force" to update the existing node record 
 +postgres@server2:~$ 
 +</Code> 
 + 
 +2. Startup the postgresql 
 +3. Re-register 
 + 
 +<Code:bash|Startup PostgreSQL and Re-register> 
 +postgres@server2:~$ logout 
 +root@server2:/var/log/postgresql# service postgresql start 
 +root@server2:/var/log/postgresql# su - postgres 
 +postgres@server2:~$ repmgr standby register --force 
 +INFO: connecting to local node "server2" (ID: 2) 
 +DEBUG: connecting to: "user=repmgr connect_timeout=2 dbname=repmgr host=server2 fallback_application_name=repmgr" 
 +INFO: connecting to primary database 
 +DEBUG: connecting to: "user=repmgr connect_timeout=2 dbname=repmgr host=server1 fallback_application_name=repmgr" 
 +INFO: standby registration complete 
 +postgres@server2:~$ 
 +</Code> 
 + 
 + 
 +Now we can check our status as follows: 
 + 
 + 
 +<Code:bash|Check the newly recovered cluster> 
 +postgres@server2:~$ repmgr cluster show 
 +ID | Name    | Role    | Status    | Upstream | Location | Priority | Timeline | Connection string     
 +----+---------+---------+-----------+----------+----------+----------+----------+------------------------------------------------------ 
 +  | server1 | primary | * running |          | default  | 100      | 3        | host=server1 user=repmgr dbname=repmgr  
 +  | server2 | standby |   running | server1  | default  | 100      | 3        | host=server2 user=repmgr dbname=repmgr 
 +postgres@server2:~$ 
 +</Code> 
 + 
 + 
 +For that configuration I have used the following appendix: 
 +===Appendix=== 
 +<Code:bash|repmgrd.conf> 
 +node_id='2'                    # A unique integer greater than zero (1/2 depends on the node) 
 +node_name='server2'                       # An arbitrary (but unique) string; we recommend. (unique number for the cluster) 
 +conninfo='host=server1 user=repmgr dbname=repmgr connect_timeout=2'         # Database connection information as a conninfo string. 
 +data_directory='/var/lib/postgresql/12/main'             # The node's data directory. This is needed by repmgr 
 +config_directory='/etc/postgresql/12/main'               # If configuration files are located outside the data 
 +replication_user='repmgr'        # User to make replication connections with, if not set 
 +replication_type='physical'      # Must be one of "physical" or "bdr"
 +location='default'               # An arbitrary string defining the location of the node; this 
 +use_replication_slots='yes'      # whether to use physical replication slots 
 +witness_sync_interval=15         # interval (in seconds) to synchronise node records 
 +log_level='DEBUG'                 # Log level: possible values are DEBUG, INFO, NOTICE, 
 +log_facility='STDERR'            # Logging facility: possible values are STDERR, or for 
 +log_file='/var/log/postgresql/repmgr.log' 
 +log_status_interval=300  # interval (in seconds) for repmgrd to log a status message 
 +pg_bindir='/usr/lib/postgresql/12/bin'                          # Path to PostgreSQL binary directory (location 
 +repmgr_bindir='/usr/lib/postgresql/12/bin'                      # Path to repmgr binary directory (location of the repmgr 
 +use_primary_conninfo_password=false     # explicitly set "password" in "primary_conninfo" 
 +passfile='/var/lib/postgresql/12/main/.pgpass'       # path to .pgpass file to include in "primary_conninfo" 
 +ssh_options='-o StrictHostKeyChecking=no'       # Options to append to "ssh" 
 +promote_check_timeout=60                # The length of time (in seconds) to wait 
 +promote_check_interval=1                # The interval (in seconds) to check whether 
 +primary_follow_timeout=60               # The max length of time (in seconds) to wait 
 +standby_follow_timeout=15               # The max length of time (in seconds) to wait 
 +shutdown_check_timeout=60               # The max length of time (in seconds) to wait for the demotion 
 +standby_reconnect_timeout=60            # The max length of time (in seconds) to wait 
 +node_rejoin_timeout=60          # The maximum length of time (in seconds) to wait for 
 +failover='automatic'                    # one of 'automatic', 'manual'
 +priority=100                            # indicates a preferred priority for promoting nodes; 
 +reconnect_attempts=6                    # Number of attempts which will be made to reconnect to an unreachable 
 +reconnect_interval=10                   # Interval between attempts to reconnect to an unreachable 
 +promote_command='repmgr standby promote -f /etc/repmgr.conf --log-to-file'  # command repmgrd executes when promoting a new primary; use something like: 
 +follow_command='repmgr standby follow -f /etc/repmgr.conf --log-to-file --upstream-node-id=%n'  # command repmgrd executes when instructing a standby to follow a new primary; 
 +primary_notification_timeout=60                 # Interval (in seconds) which repmgrd on a standby 
 +repmgrd_standby_startup_timeout=60              # Interval (in seconds) which repmgrd on a standby will wait 
 +monitoring_history=no           # Whether to write monitoring data to the "montoring_history" table 
 +monitor_interval_secs=2                 # Interval (in seconds) at which to write monitoring data 
 +degraded_monitoring_timeout=-1          # Interval (in seconds) after which repmgrd will terminate if the 
 +async_query_timeout=60          # Interval (in seconds) which repmgrd will wait before 
 +standby_disconnect_on_failover=false    # If "true", in a failover situation wait for all standbys to 
 +primary_visibility_consensus=false      # If "true", only continue with failover if no standbys have seen 
 +service_start_command ='sudo systemctl start postgresql' 
 +service_stop_command ='sudo systemctl stop postgresql' 
 +service_restart_command ='sudo systemctl restart postgresql' 
 +service_reload_command ='sudo systemctl reload postgresql' 
 +archive_ready_warning=16                # repmgr node check --archive-ready 
 +archive_ready_critical=128              # 
 +replication_lag_warning=300             # repmgr node check --replication-lag 
 +replication_lag_critical=600            # 
 +bdr_local_monitoring_only=false         # Only monitor the local node; no checks will be 
 +bdr_recovery_timeout=30                 # If a BDR node was offline and has become available 
 +</Code> 
 +====Manual Failover==== 
  • postgresql_repmgr.1631743762.txt.gz
  • Last modified: 2021/09/15 22:09
  • by andonovj