postgresql_repmgr_installation

Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Both sides previous revision Previous revision
Next revision
Previous revision
postgresql_repmgr_installation [2021/09/16 08:30] – [Virtual Environment Config] andonovjpostgresql_repmgr_installation [2021/09/28 11:30] (current) andonovj
Line 9: Line 9:
   * 2nd Quandrant - For Rempgr   * 2nd Quandrant - For Rempgr
  
 +
 +{{:1_g-5e4zhde9lojvmbdsiptg.png?600|}}
  
 =====Virtual Environment Config===== =====Virtual Environment Config=====
Line 69: Line 71:
  
  
-====Server Config====+=====Server Config=====
 For the servers we have to: For the servers we have to:
  
Line 80: Line 82:
  
  
-===Modify /etc/hosts ===+====Modify /etc/hosts ====
 That is pretty trivial, just add the IPs and the hostnames to /etc/hosts: That is pretty trivial, just add the IPs and the hostnames to /etc/hosts:
  
Line 94: Line 96:
  
  
-===Configure Repositories===+====Configure Repositories====
 I decided to use 2nd quadrant packages, as they are the ones I use for work as well :) I decided to use 2nd quadrant packages, as they are the ones I use for work as well :)
 +First thing first, we have to update our repos and pull the latest metadata from them.
  
 +<Code:bash|Update repos>
 +root@server1:~# apt-get update
 +Hit:1 http://archive.ubuntu.com/ubuntu focal InRelease
 +Get:2 http://archive.ubuntu.com/ubuntu focal-updates InRelease [114 kB]
 +Get:3 http://security.ubuntu.com/ubuntu focal-security InRelease [114 kB]
 +Get:4 http://archive.ubuntu.com/ubuntu focal-backports InRelease [101 kB]
 +Get:5 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 Packages [1,213 kB]
 +Get:6 http://security.ubuntu.com/ubuntu focal-security/main i386 Packages [282 kB]
 +Get:7 http://archive.ubuntu.com/ubuntu focal-updates/main i386 Packages [535 kB]
 +Get:8 http://security.ubuntu.com/ubuntu focal-security/main amd64 Packages [866 kB]
 +Get:9 http://archive.ubuntu.com/ubuntu focal-updates/main Translation-en [259 kB]
 +Get:10 http://archive.ubuntu.com/ubuntu focal-updates/restricted i386 Packages [19.3 kB]
 +Get:11 http://archive.ubuntu.com/ubuntu focal-updates/restricted amd64 Packages [443 kB]
 +Get:12 http://security.ubuntu.com/ubuntu focal-security/main Translation-en [167 kB]
 +Get:13 http://security.ubuntu.com/ubuntu focal-security/restricted amd64 Packages [406 kB]
 +Get:14 http://archive.ubuntu.com/ubuntu focal-updates/restricted Translation-en [63.5 kB]
 +Get:15 http://archive.ubuntu.com/ubuntu focal-updates/universe i386 Packages [633 kB]
 +Get:16 http://security.ubuntu.com/ubuntu focal-security/restricted i386 Packages [18.0 kB]
 +Get:17 http://security.ubuntu.com/ubuntu focal-security/restricted Translation-en [58.3 kB]
 +Get:18 http://security.ubuntu.com/ubuntu focal-security/universe i386 Packages [507 kB]
 +Get:19 http://security.ubuntu.com/ubuntu focal-security/universe amd64 Packages [638 kB]
 +Get:20 http://archive.ubuntu.com/ubuntu focal-updates/universe amd64 Packages [855 kB]
 +Get:21 http://archive.ubuntu.com/ubuntu focal-updates/universe Translation-en [182 kB]
 +Get:22 http://archive.ubuntu.com/ubuntu focal-updates/multiverse amd64 Packages [24.6 kB]
 +Get:23 http://archive.ubuntu.com/ubuntu focal-updates/multiverse i386 Packages [8,232 B]
 +Get:24 http://archive.ubuntu.com/ubuntu focal-updates/multiverse Translation-en [6,776 B]
 +Get:25 http://archive.ubuntu.com/ubuntu focal-backports/universe i386 Packages [4,736 B]
 +Get:26 http://archive.ubuntu.com/ubuntu focal-backports/universe amd64 Packages [5,800 B]
 +Get:27 http://security.ubuntu.com/ubuntu focal-security/universe Translation-en [101 kB]
 +Get:28 http://security.ubuntu.com/ubuntu focal-security/multiverse amd64 Packages [21.9 kB]
 +Get:29 http://security.ubuntu.com/ubuntu focal-security/multiverse Translation-en [4,948 B]
 +Fetched 7,652 kB in 7s (1,048 kB/s)
 +Reading package lists... Done
 +</Code>
  
-<Code:bash|Install repo>+====Install PostgreSQL==== 
 +Then we can proceed with installation of PostgreSQL 
 +<Code:bash|Install PostgreSQL> 
 +root@server1:~# apt-get install -y postgresql 
 +Reading package lists... Done 
 +Building dependency tree 
 +Reading state information... Done 
 +The following additional packages will be installed: 
 +  libllvm10 libpq5 libsensors-config libsensors5 postgresql-12 postgresql-client-12 postgresql-client-common postgresql-common ssl-cert sysstat 
 +Suggested packages: 
 +  lm-sensors postgresql-doc postgresql-doc-12 libjson-perl openssl-blacklist isag 
 +The following NEW packages will be installed: 
 +  libllvm10 libpq5 libsensors-config libsensors5 postgresql postgresql-12 postgresql-client-12 postgresql-client-common postgresql-common ssl-cert sysstat 
 +0 upgraded, 11 newly installed, 0 to remove and 54 not upgraded. 
 +Need to get 15.1 MB/30.6 MB of archives. 
 +After this operation, 122 MB of additional disk space will be used. 
 +Get:1 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 libpq5 amd64 12.8-0ubuntu0.20.04.1 [116 kB] 
 +Get:2 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 postgresql-client-12 amd64 12.8-0ubuntu0.20.04.1 [1,046 kB] 
 +Get:3 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 postgresql-12 amd64 12.8-0ubuntu0.20.04.1 [13.5 MB] 
 +Get:4 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 sysstat amd64 12.2.0-2ubuntu0.1 [448 kB] 
 +Fetched 15.1 MB in 15s (1,015 kB/s) 
 +Preconfiguring packages ... 
 +***************************************************
  
 +Success. You can now start the database server using:
  
 +    pg_ctlcluster 12 main start
 +
 +Ver Cluster Port Status Owner    Data directory              Log file
 +12  main    5432 down   postgres /var/lib/postgresql/12/main /var/log/postgresql/postgresql-12-main.log
 +update-alternatives: using /usr/share/postgresql/12/man/man1/postmaster.1.gz to provide /usr/share/man/man1/postmaster.1.gz (postmaster.1.gz) in auto mode
 +Setting up sysstat (12.2.0-2ubuntu0.1) ...
 +
 +Creating config file /etc/default/sysstat with new version
 +update-alternatives: using /usr/bin/sar.sysstat to provide /usr/bin/sar (sar) in auto mode
 +Created symlink /etc/systemd/system/multi-user.target.wants/sysstat.service → /lib/systemd/system/sysstat.service.
 +Setting up postgresql (12+214ubuntu0.1) ...
 +Processing triggers for systemd (245.4-4ubuntu3.11) ...
 +Processing triggers for man-db (2.9.1-1) ...
 +Processing triggers for libc-bin (2.31-0ubuntu9.2) ...
 </Code> </Code>
 +
 +Do this on both machines: server1 and server2
 +
 +====Install REPMGR Repos====
 +Then we can proceed with adding the repo for REPMGR
 +
 +<Code:bash|Add REPMGR Repo>
 +root@server1:~# sudo add-apt-repository 'deb http://packages.2ndquadrant.com/repmgr3/apt/ xenial-2ndquadrant main'
 +Hit:1 http://archive.ubuntu.com/ubuntu focal InRelease
 +Get:2 http://packages.2ndquadrant.com/repmgr3/apt xenial-2ndquadrant InRelease [5,044 B]
 +Hit:3 http://archive.ubuntu.com/ubuntu focal-updates InRelease
 +Hit:4 http://archive.ubuntu.com/ubuntu focal-backports InRelease
 +Get:5 http://security.ubuntu.com/ubuntu focal-security InRelease [114 kB]
 +Err:2 http://packages.2ndquadrant.com/repmgr3/apt xenial-2ndquadrant InRelease
 +  The following signatures couldn't be verified because the public key is not available: NO_PUBKEY A47615A0D3FA41F6
 +Reading package lists... Done
 +W: GPG error: http://packages.2ndquadrant.com/repmgr3/apt xenial-2ndquadrant InRelease: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY A47615A0D3FA41F6
 +E: The repository 'http://packages.2ndquadrant.com/repmgr3/apt xenial-2ndquadrant InRelease' is not signed.
 +N: Updating from such a repository can't be done securely, and is therefore disabled by default.
 +N: See apt-secure(8) manpage for repository creation and user configuration details.
 +root@server1:~# sudo apt update
 +Hit:1 http://archive.ubuntu.com/ubuntu focal InRelease
 +Hit:2 http://archive.ubuntu.com/ubuntu focal-updates InRelease
 +Get:3 http://packages.2ndquadrant.com/repmgr3/apt xenial-2ndquadrant InRelease [5,044 B]
 +Hit:4 http://archive.ubuntu.com/ubuntu focal-backports InRelease
 +Get:5 http://security.ubuntu.com/ubuntu focal-security InRelease [114 kB]
 +Err:3 http://packages.2ndquadrant.com/repmgr3/apt xenial-2ndquadrant InRelease
 +  The following signatures couldn't be verified because the public key is not available: NO_PUBKEY A47615A0D3FA41F6
 +Reading package lists... Done
 +W: GPG error: http://packages.2ndquadrant.com/repmgr3/apt xenial-2ndquadrant InRelease: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY A47615A0D3FA41F6
 +E: The repository 'http://packages.2ndquadrant.com/repmgr3/apt xenial-2ndquadrant InRelease' is not signed.
 +N: Updating from such a repository can't be done securely, and is therefore disabled by default.
 +N: See apt-secure(8) manpage for repository creation and user configuration details.
 +root@server1:~#
 +</Code>
 +
 +====Configure SSH keys====
 +To configure passwordless connection, we have to:
 +
 +  - Generate public / private key pair
 +  - Put the public key in the authorized keys on the other host (e.g. the public key of server1, in the authorized_keys file on server2)
 +
 +So let's get going:
 +
 +
 +<Code:bash|Generate Public/Private Key pair>
 +postgres@server2:~$ ssh-keygen
 +Generating public/private rsa key pair.
 +Enter file in which to save the key (/var/lib/postgresql/.ssh/id_rsa):
 +Created directory '/var/lib/postgresql/.ssh'.
 +Enter passphrase (empty for no passphrase):
 +Enter same passphrase again:
 +Your identification has been saved in /var/lib/postgresql/.ssh/id_rsa
 +Your public key has been saved in /var/lib/postgresql/.ssh/id_rsa.pub
 +The key fingerprint is:
 +SHA256:KYxUhOKyll2anPYivR3ZrrRw/Kz7xDMVU0AtovXY4bA postgres@server2
 +The key's randomart image is:
 ++---[RSA 3072]----+
 +|     oo .oo.     |
 +|  . .. + o..     |
 +| . .. o Ooo      |
 +|. .. = E =o      |
 +| o+ * o S.       |
 +|.o B. +..        |
 +|. o..* *         |
 +| . o=.B o        |
 +|  ..o*==         |
 ++----[SHA256]-----+
 +</Code>
 +
 +This will generate us two files: Private and Public key. Private key, NEVER SHARE :)
 +
 +<Code:bash|Keys>
 +postgres@server2:~/.ssh$ ls -alrt
 +total 20
 +-rw-r--r-- 1 postgres postgres  570 Sep 16 08:58 id_rsa.pub
 +-rw------- 1 postgres postgres 2602 Sep 16 08:58 id_rsa
 +</Code>
 +
 +The content of that server and put it in the authorized_keys file of server1 and vica versa:
 +
 +
 +<Code:bash| Add public key to server1>
 +postgres@server1:~/.ssh$ ls -alrt
 +total 24
 +-rw-r--r-- 1 postgres postgres  570 Sep 16 08:57 id_rsa.pub
 +-rw------- 1 postgres postgres 2602 Sep 16 08:57 id_rsa
 +-rw-rw-r-- 1 postgres postgres  571 Sep 16 08:59 authorized_keys
 +-rw-r--r-- 1 postgres postgres  444 Sep 16 08:59 known_hosts
 +drwx------ 2 postgres postgres 4096 Sep 16 08:59 .
 +drwxr-xr-x 5 postgres postgres 4096 Sep 16 09:12 ..
 +postgres@server1:~/.ssh$ tail -100 authorized_keys
 +ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQDc7WdWJWmc6ArXUkFFhArVa/42//MPue0u1HmgBK389GkpLnKZ8dBfNRcFfrvocCsQyJZu5MsfK/sRR2/JycjifnL45bavw+qqejmyiuJnMmiNEJI2sbturdE/y3vABw1QjDaCG+LEDJzGT3DzY3HvxcU+33vBZYpNx7YKSI+LQnyeT6iGNNoy5hNzI+/O8jM3pg4ReN+XC6NJ77maBmQ+EaLhLsxk3vmcnP6C7TCDxez1iTZHZpV9tNGFx/OCABAOgV6ELpzRPsS/3MBSfif8eeJ6Sc8JPVQSruOzf3lYYAouYZTLLJYtDZOw3rc0+v7fn17jVk9w981Zn4zzkacI5FbVHmtM6oAnCIkSUUJwXlQzZprF3C+Z+Z10mmtBlS9/NFdZklpK3rHh1T1a3Fu1SXWlkHZxKdLxr5ytUL89ubGAPSa4UkBXRalBTUFjcYhvqk5hPqO6a+KQO1ehcNmn7/ffdEPrattoOxnGCfKaWLbXyxFmt3WFCq5Jy/wZoq8= postgres@server2
 +
 +postgres@server1:~/.ssh$
 +</Code>
 +
 +
 +After you do that on both servers, you can connect without password:
 +
 +<Code:bash|Test Passwordless Configuration>
 +postgres@server1:~/.ssh$ ssh server2
 +Welcome to Ubuntu 20.04.2 LTS (GNU/Linux 5.4.0-80-generic x86_64)
 +
 + * Documentation:  https://help.ubuntu.com
 + * Management:     https://landscape.canonical.com
 + * Support:        https://ubuntu.com/advantage
 +
 +  System information as of Thu 16 Sep 2021 09:30:52 AM UTC
 +
 +  System load:  0.08              Processes:             128
 +  Usage of /:   2.8% of 61.31GB   Users logged in:       1
 +  Memory usage: 25%               IPv4 address for eth0: 10.0.2.15
 +  Swap usage:   0%                IPv4 address for eth1: 10.0.15.22
 +
 +
 +This system is built by the Bento project by Chef Software
 +More information can be found at https://github.com/chef/bento
 +Last login: Thu Sep 16 08:59:34 2021 from 10.0.15.21
 +postgres@server2:~$
 +</Code>
 +
 +=====Cluster Environment=====
 +After we've installed and configured all the packages, we can proceed with cluster configuration. 
 +Firstly, we have to configure the database and the users:
 +
 +====Configure User====
 +The user to be used for the replication can be configured as so:
 +
 +<Code:bash|Replication User Configuration>
 +root@server1:~# su - postgres
 +postgres@server1:~$ createuser --replication --createdb --createrole --superuser repmgr
 +postgres@server1:~$ psql -c 'ALTER USER repmgr SET search_path TO repmgr_test, "$user", public;'
 +ALTER ROLE
 +postgres@server1:~$ createdb repmgr --owner=repmgr
 +postgres@server1:~$
 +</Code>
 +
 +Do that on both servers.
 +
 +After, we have done that, we can configure security connection to the PostgreSQL in pg_hba.conf
 +
 +====Connection Configuration====
 +To allow repmgr user to be able to connect to PostgreSQL, we need to append to the pg_hba.conf file the following entries.
 +
 +<Code:bash|Configure pg_hba.conf>
 +host    repmgr repmgr 10.0.15.21/32 trust
 +host    replication repmgr 10.0.15.21/32 trust
 +host    repmgr repmgr 10.0.15.22/32 trust
 +host    replication repmgr 10.0.15.22/32 trust
 +</Code>
 +
 +We can test the settings as follows:
 +
 +
 +<Code:bash|Test connections>
 +postgres@server2:/etc/postgresql/12/main$ psql 'host=server2 dbname=repmgr user=repmgr'
 +psql (12.8 (Ubuntu 12.8-0ubuntu0.20.04.1))
 +SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, bits: 256, compression: off)
 +Type "help" for help.
 +
 +repmgr=# \q
 +postgres@server2:/etc/postgresql/12/main$ psql 'host=server1 dbname=repmgr user=repmgr'
 +psql (12.8 (Ubuntu 12.8-0ubuntu0.20.04.1))
 +SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, bits: 256, compression: off)
 +Type "help" for help.
 +
 +repmgr=#
 +</Code>
 +
 +Finally we have to set pre-loaded library for repmgr in the postgresql.conf
 +
 +<Code:bash|Set Pre-loaded Library>
 +shared_preload_libraries = 'repmgr'
 +</Code>
 +
 +Do that on both servers, to verify the configuration.
 +Now, finally we can configure the REPMGR
 +
 +=====REPMGR=====
 +The default configuration of Repmgr is located in:
 +
 +  * /etc/repmgr.conf
 +  * /etc/default/repmgrd
 +
 +We can do a basic configuration as follows:
 +
 +====Configuration====
 +<Code:bash|Repmgr.conf>
 +cluster=cluster
 +node_id=(1/2 - 1 for server1 and 2 for server2)
 +node_name=server2
 +conninfo='host=server(1/2, the same as before) user=repmgr dbname=repmgr connect_timeout=2'
 +failover=automatic
 +promote_command='repmgr standby promote -f /etc/repmgr.conf --log-to-file'
 +follow_command='repmgr standby follow -f /etc/repmgr.conf --log-to-file'
 +logfile='/var/log/postgresql/repmgr.log'
 +loglevel=NOTICE
 +reconnect_attempts=4
 +reconnect_interval=5
 +data_directory='/var/lib/postgresql/12/main'
 +</Code>
 +
 +
 +This files, defines the structure of the replication. Including the commands for promotion and follow-up command.
 +The next file, will define either will be enabled or not, and which will file it will use.
 +
 +<Code:bash|Repmgrd>
 +# default settings for repmgrd. This file is source by /bin/sh from
 +# /etc/init.d/repmgrd
 +
 +# disable repmgrd by default so it won't get started upon installation
 +# valid values: yes/no
 +REPMGRD_ENABLED=yes
 +
 +# configuration file (required)
 +REPMGRD_CONF="/etc/repmgr.conf"
 +
 +# additional options
 +#REPMGRD_OPTS=""
 +
 +# user to run repmgrd as
 +#REPMGRD_USER=postgres
 +
 +# repmgrd binary
 +#REPMGRD_BIN=/usr/bin/repmgrd
 +
 +# pid file
 +#REPMGRD_PIDFILE=/var/run/repmgrd.pid
 +</Code>
 +
 +Furthermore, we have to make a symlink to the correct pg_ctl
 +
 +
 +<Code:bash|Create symlink to pg_ctl>
 +root@server2:~#  ln -s /usr/lib/postgresql/12/bin/pg_ctl /usr/bin/pg_ctl
 +</Code>
 +
 +We have to do that on both servers.
 +====Cluster Initialization====
 +To initialize a cluster, we can do it as so:
 +
 +
 +<Code:bash|Initialize a cluster>
 +postgres@server1:~$ repmgr primary register
 +WARNING: the following problems were found in the configuration file:
 +  parameter "cluster" is deprecated and will be ignored
 +  parameter "logfile" has been renamed to "log_file"
 +  parameter "loglevel" has been renamed to "log_level"
 +INFO: connecting to primary database...
 +NOTICE: attempting to install extension "repmgr"
 +NOTICE: "repmgr" extension successfully installed
 +NOTICE: primary node record (ID: 1) registered
 +postgres@server1:~$
 +</Code>
 +
 +This will initialize the server and we can check that:
 +
 +<Code:bash|Check cluster status>
 +postgres@server1:~$ repmgr cluster show
 +WARNING: the following problems were found in the configuration file:
 +  parameter "cluster" is deprecated and will be ignored
 +  parameter "logfile" has been renamed to "log_file"
 +  parameter "loglevel" has been renamed to "log_level"
 + ID | Name    | Role    | Status    | Upstream | Location | Priority | Timeline | Connection string
 +----+---------+---------+-----------+----------+----------+----------+----------+----------------------------------------------------------
 +  | server1 | primary | * running |          | default  | 100      | 1        | host=server1 user=repmgr dbname=repmgr connect_timeout=2
 +postgres@server1:~$
 +</Code>
 +
 +After that is done, we can set-up the standby server (e.g. server2)
 +
 +
 +====Setup Standby====
 +To setup standby, we have to delete everything from the server and clone it:
 +
 +
 +<Code:bash|Setup standby server>
 +root@server2:~# rm -rf /var/lib/postgresql/12/main/
 +root@server2:~# su - postgres
 +postgres@server2:~$ repmgr -h server1 -U repmgr -d repmgr standby clone
 +WARNING: the following problems were found in the configuration file:
 +  parameter "cluster" is deprecated and will be ignored
 +  parameter "logfile" has been renamed to "log_file"
 +  parameter "loglevel" has been renamed to "log_level"
 +NOTICE: destination directory "/var/lib/postgresql/12/main" provided
 +INFO: connecting to source node
 +DETAIL: connection string is: host=server1 user=repmgr dbname=repmgr
 +DETAIL: current installation size is 31 MB
 +NOTICE: checking for available walsenders on the source node (2 required)
 +NOTICE: checking replication connections can be made to the source server (2 required)
 +INFO: creating directory "/var/lib/postgresql/12/main"...
 +NOTICE: starting backup (using pg_basebackup)...
 +HINT: this may take some time; consider using the -c/--fast-checkpoint option
 +INFO: executing:
 +  pg_basebackup -l "repmgr base backup"  -D /var/lib/postgresql/12/main -h server1 -p 5432 -U repmgr -X stream
 +NOTICE: standby clone (using pg_basebackup) complete
 +NOTICE: you can now start your PostgreSQL server
 +HINT: for example: pg_ctl -D /var/lib/postgresql/12/main start
 +HINT: after starting the server, you need to register this standby with "repmgr standby register"
 +postgres@server2:~$
 +</Code>
 +
 +After that, we should be able to see the replication working:
 +
 +<Code:bash|Check status of REPMGR>
 +postgres@server2:~$ repmgr cluster show
 +WARNING: the following problems were found in the configuration file:
 +  parameter "cluster" is deprecated and will be ignored
 +  parameter "logfile" has been renamed to "log_file"
 +  parameter "loglevel" has been renamed to "log_level"
 + ID | Name    | Role    | Status    | Upstream | Location | Priority | Timeline | Connection string
 +----+---------+---------+-----------+----------+----------+----------+----------+----------------------------------------------------------
 +  | server1 | primary | * running |          | default  | 100      | 1        | host=server1 user=repmgr dbname=repmgr connect_timeout=2
 +  | server2 | standby |   running | server1  | default  | 100      | 1        | host=server2 user=repmgr dbname=repmgr connect_timeout=2
 +postgres@server2:~$
 +</Code>
 +
 +
 +We can test the connection as follows:
 +
 +====Test Configuration====
 +Firstly, we will test the normal functionality and then we will test the failove.
 +
 +
 +For the functionality, we can create a table on primary and see if it is present on the standby to test:
 +
 +
 +==On Primary==
 +<Code:bash|Test Functionality>
 +repmgr=# create table test(id integer);
 +CREATE TABLE
 +repmgr=# insert into test values(1);
 +INSERT 0 1
 +repmgr=# commit;
 +WARNING:  there is no transaction in progress
 +COMMIT
 +repmgr=#
 +</Code>
 +
 +==Standby==
 +Now, we can test it on the standby:
 +
 +<Code:bash|Test Functionality>
 +postgres=# \c repmgr
 +You are now connected to database "repmgr" as user "postgres".
 +repmgr=# select * from test;
 + id
 +----
 +  1
 +(1 row)
 +
 +repmgr=#
 +</Code>
 +
 +Now, let's test the failover:
 +
 +====Automatic Failover====
 +To test that, we can shutdown the primary and see if it is failover automatically (as per our settings)
 +Now, I changed a little bit the config, so please check Appendix A.
 +
 +===Modify Postgres user===
 +Firstly, we need to add the postgres user to the sudo group:
 +
 +<Code:bash|Add user to Sudoers group>
 +### Server 1
 +root@server2:/var/log/postgresql# usermod -aG sudo postgres
 +root@server2:/var/log/postgresql#
 +
 +### Server 2
 +root@server2:/var/log/postgresql# usermod -aG sudo postgres
 +root@server2:/var/log/postgresql#
 +</Code>
 +
 +
 +===Test the failover===
 +Now, we can finally test the failover.
 +
 +<Code:bash|Test Failover>
 +postgres@server1:/var/log/postgresql$ repmgr cluster show
 + ID | Name    | Role    | Status    | Upstream | Location | Priority | Timeline | Connection string    
 +----+---------+---------+-----------+----------+----------+----------+----------+-----------------------------------------------------
 +  | server1 | standby | * running |          | default  | 100      | 3        | host=server1 user=repmgr dbname=repmgr 
 +  | server2 | primary |   running | server1  | default  | 100      | 3        | host=server2 user=repmgr dbname=repmgr
 +</Code>
 +
 +
 +We can simply shutdown the prod:
 +
 +<Code:bash|Shutdown production>
 +root@server2:/var/log/postgresql# service postgresql restart
 +</Code>
 +
 +Once, this is done, we can see the log file from server1:
 +
 +<Code:bash|Check the log on standby>
 +[2021-09-28 10:46:45] [INFO] primary and this node have the same location ("default")
 +[2021-09-28 10:46:45] [INFO] no other sibling nodes - we win by default
 +[2021-09-28 10:46:45] [DEBUG] election result: WON
 +[2021-09-28 10:46:45] [NOTICE] this node is the only available candidate and will now promote itself
 +[2021-09-28 10:46:45] [INFO] promote_command is:
 +  "repmgr standby promote -f /etc/repmgr.conf --log-to-file"
 +[2021-09-28 10:46:45] [NOTICE] redirecting logging output to "/var/log/postgresql/repmgr.log"
 +
 +[2021-09-28 10:46:45] [DEBUG] connecting to: "user=repmgr connect_timeout=2 dbname=repmgr host=server1 fallback_application_name=repmgr"
 +[2021-09-28 10:46:45] [DEBUG] connecting to: "user=repmgr connect_timeout=2 dbname=repmgr host=server2 fallback_application_name=repmgr"
 +[2021-09-28 10:46:45] [DEBUG] connecting to: "user=repmgr connect_timeout=2 dbname=repmgr host=server1 fallback_application_name=repmgr"
 +[2021-09-28 10:46:45] [NOTICE] promoting standby to primary
 +[2021-09-28 10:46:45] [DETAIL] promoting server "server1" (ID: 1) using pg_promote()
 +[2021-09-28 10:46:45] [NOTICE] waiting up to 60 seconds (parameter "promote_check_timeout") for promotion to complete
 +[2021-09-28 10:46:46] [DEBUG] setting node 1 as primary and marking existing primary as failed
 +[2021-09-28 10:46:46] [NOTICE] STANDBY PROMOTE successful
 +[2021-09-28 10:46:46] [DETAIL] server "server1" (ID: 1) was successfully promoted to primary
 +[2021-09-28 10:46:46] [INFO] 0 followers to notify
 +[2021-09-28 10:46:46] [INFO] switching to primary monitoring mode
 +[2021-09-28 10:46:46] [NOTICE] monitoring cluster primary "server1" (ID: 1)
 +</Code>
 +
 +As we can see, server1 has been promoted to primary, that however leaves us to deal with the aftermath.
 +
 +===Dealing with the aftermath===
 +Now, we have one primary and 1 failed primary, we need to re-establish connection.
 +
 +<Code:bash|Re-establish the connection>
 +postgres@server1:/var/log/postgresql$ repmgr cluster show
 +DEBUG: connecting to: "user=repmgr connect_timeout=2 dbname=repmgr host=server1 fallback_application_name=repmgr"
 +DEBUG: connecting to: "user=repmgr connect_timeout=2 dbname=repmgr host=server1 fallback_application_name=repmgr"
 +DEBUG: connecting to: "user=repmgr connect_timeout=2 dbname=repmgr host=server2 fallback_application_name=repmgr"
 + ID | Name    | Role    | Status    | Upstream | Location | Priority | Timeline | Connection string    
 +----+---------+---------+-----------+----------+----------+----------+----------+----------------------------------------------------------
 +  | server1 | primary | * running |          | default  | 100      | 3        | host=server1 user=repmgr dbname=repmgr connect_timeout=2
 +  | server2 | primary | - failed  |          | default  | 100      | ?        | host=server2 user=repmgr dbname=repmgr connect_timeout=2
 +
 +WARNING: following issues were detected
 +  - unable to connect to node "server2" (ID: 2)
 +HINT: execute with --verbose option to see connection error messages
 +postgres@server1:
 +</Code>
 +
 +We can re-establish the connection as follows:
 +
 +1. Clear the old path and Resnap:
 +
 +<Code:bash|Resnap failed server's data>
 +root@server2:/var/log/postgresql# rm -rf /var/lib/postgresql/12/main/
 +root@server2:/var/log/postgresql# su - postgres
 +postgres@server2:~$ repmgr -h server1 -U repmgr -d repmgr standby clone
 +WARNING: following problems with command line parameters detected:
 +  "config_directory" set in repmgr.conf, but --copy-external-config-files not provided
 +NOTICE: destination directory "/var/lib/postgresql/12/main" provided
 +INFO: connecting to source node
 +DETAIL: connection string is: host=server1 user=repmgr dbname=repmgr
 +DETAIL: current installation size is 31 MB
 +DEBUG: 2 node records returned by source node
 +DEBUG: connecting to: "user=repmgr connect_timeout=2 dbname=repmgr host=server1 fallback_application_name=repmgr"
 +DEBUG: upstream_node_id determined as 1
 +NOTICE: checking for available walsenders on the source node (2 required)
 +NOTICE: checking replication connections can be made to the source server (2 required)
 +INFO: creating directory "/var/lib/postgresql/12/main"...
 +DEBUG: create_replication_slot(): creating slot "repmgr_slot_2" on upstream
 +NOTICE: starting backup (using pg_basebackup)...
 +HINT: this may take some time; consider using the -c/--fast-checkpoint option
 +INFO: executing:
 +  /usr/lib/postgresql/12/bin/pg_basebackup -l "repmgr base backup"  -D /var/lib/postgresql/12/main -h server1 -p 5432 -U repmgr -X stream -S repmgr_slot_2
 +DEBUG: connecting to: "user=repmgr connect_timeout=2 dbname=repmgr host=server1 fallback_application_name=repmgr"
 +NOTICE: standby clone (using pg_basebackup) complete
 +NOTICE: you can now start your PostgreSQL server
 +HINT: for example: sudo systemctl start postgresql
 +HINT: after starting the server, you need to re-register this standby with "repmgr standby register --force" to update the existing node record
 +postgres@server2:~$
 +</Code>
 +
 +2. Startup the postgresql
 +3. Re-register
 +
 +<Code:bash|Startup PostgreSQL and Re-register>
 +postgres@server2:~$ logout
 +root@server2:/var/log/postgresql# service postgresql start
 +root@server2:/var/log/postgresql# su - postgres
 +postgres@server2:~$ repmgr standby register --force
 +INFO: connecting to local node "server2" (ID: 2)
 +DEBUG: connecting to: "user=repmgr connect_timeout=2 dbname=repmgr host=server2 fallback_application_name=repmgr"
 +INFO: connecting to primary database
 +DEBUG: connecting to: "user=repmgr connect_timeout=2 dbname=repmgr host=server1 fallback_application_name=repmgr"
 +INFO: standby registration complete
 +postgres@server2:~$
 +</Code>
 +
 +
 +Now we can check our status as follows:
 +
 +
 +<Code:bash|Check the newly recovered cluster>
 +postgres@server2:~$ repmgr cluster show
 +ID | Name    | Role    | Status    | Upstream | Location | Priority | Timeline | Connection string    
 +----+---------+---------+-----------+----------+----------+----------+----------+------------------------------------------------------
 +  | server1 | primary | * running |          | default  | 100      | 3        | host=server1 user=repmgr dbname=repmgr 
 +  | server2 | standby |   running | server1  | default  | 100      | 3        | host=server2 user=repmgr dbname=repmgr
 +postgres@server2:~$
 +</Code>
 +
 +
 +For that configuration I have used the following appendix:
 +===Appendix===
 +<Code:bash|repmgrd.conf>
 +node_id='2'                    # A unique integer greater than zero (1/2 depends on the node)
 +node_name='server2'                       # An arbitrary (but unique) string; we recommend. (unique number for the cluster)
 +conninfo='host=server1 user=repmgr dbname=repmgr connect_timeout=2'         # Database connection information as a conninfo string.
 +data_directory='/var/lib/postgresql/12/main'             # The node's data directory. This is needed by repmgr
 +config_directory='/etc/postgresql/12/main'               # If configuration files are located outside the data
 +replication_user='repmgr'        # User to make replication connections with, if not set
 +replication_type='physical'      # Must be one of "physical" or "bdr".
 +location='default'               # An arbitrary string defining the location of the node; this
 +use_replication_slots='yes'      # whether to use physical replication slots
 +witness_sync_interval=15         # interval (in seconds) to synchronise node records
 +log_level='DEBUG'                 # Log level: possible values are DEBUG, INFO, NOTICE,
 +log_facility='STDERR'            # Logging facility: possible values are STDERR, or for
 +log_file='/var/log/postgresql/repmgr.log'
 +log_status_interval=300  # interval (in seconds) for repmgrd to log a status message
 +pg_bindir='/usr/lib/postgresql/12/bin'                          # Path to PostgreSQL binary directory (location
 +repmgr_bindir='/usr/lib/postgresql/12/bin'                      # Path to repmgr binary directory (location of the repmgr
 +use_primary_conninfo_password=false     # explicitly set "password" in "primary_conninfo"
 +passfile='/var/lib/postgresql/12/main/.pgpass'       # path to .pgpass file to include in "primary_conninfo"
 +ssh_options='-o StrictHostKeyChecking=no'       # Options to append to "ssh"
 +promote_check_timeout=60                # The length of time (in seconds) to wait
 +promote_check_interval=1                # The interval (in seconds) to check whether
 +primary_follow_timeout=60               # The max length of time (in seconds) to wait
 +standby_follow_timeout=15               # The max length of time (in seconds) to wait
 +shutdown_check_timeout=60               # The max length of time (in seconds) to wait for the demotion
 +standby_reconnect_timeout=60            # The max length of time (in seconds) to wait
 +node_rejoin_timeout=60          # The maximum length of time (in seconds) to wait for
 +failover='automatic'                    # one of 'automatic', 'manual'.
 +priority=100                            # indicates a preferred priority for promoting nodes;
 +reconnect_attempts=6                    # Number of attempts which will be made to reconnect to an unreachable
 +reconnect_interval=10                   # Interval between attempts to reconnect to an unreachable
 +promote_command='repmgr standby promote -f /etc/repmgr.conf --log-to-file'  # command repmgrd executes when promoting a new primary; use something like:
 +follow_command='repmgr standby follow -f /etc/repmgr.conf --log-to-file --upstream-node-id=%n'  # command repmgrd executes when instructing a standby to follow a new primary;
 +primary_notification_timeout=60                 # Interval (in seconds) which repmgrd on a standby
 +repmgrd_standby_startup_timeout=60              # Interval (in seconds) which repmgrd on a standby will wait
 +monitoring_history=no           # Whether to write monitoring data to the "montoring_history" table
 +monitor_interval_secs=2                 # Interval (in seconds) at which to write monitoring data
 +degraded_monitoring_timeout=-1          # Interval (in seconds) after which repmgrd will terminate if the
 +async_query_timeout=60          # Interval (in seconds) which repmgrd will wait before
 +standby_disconnect_on_failover=false    # If "true", in a failover situation wait for all standbys to
 +primary_visibility_consensus=false      # If "true", only continue with failover if no standbys have seen
 +service_start_command ='sudo systemctl start postgresql'
 +service_stop_command ='sudo systemctl stop postgresql'
 +service_restart_command ='sudo systemctl restart postgresql'
 +service_reload_command ='sudo systemctl reload postgresql'
 +archive_ready_warning=16                # repmgr node check --archive-ready
 +archive_ready_critical=128              #
 +replication_lag_warning=300             # repmgr node check --replication-lag
 +replication_lag_critical=600            #
 +bdr_local_monitoring_only=false         # Only monitor the local node; no checks will be
 +bdr_recovery_timeout=30                 # If a BDR node was offline and has become available
 +</Code>
 +====Manual Failover====
 +
  • postgresql_repmgr_installation.1631781027.txt.gz
  • Last modified: 2021/09/16 08:30
  • by andonovj