Overview
I find pgbackrest way more “stupid” (or maybe I am the “stupid” one) than Barman. I am sorry, but I guess I just don't get it.
You have a pgbackrest executable which checks the config file and generates the postgresql.auto.conf file, and yet the server STILL restores up to the last WAL file when you try to do PITR. But I guess I will return to this section when I am smarter and know how to use it.
Either way, let's get into it. Unlike Barman, pgbackrest REQUIRES a passwordless connection to the server for backup / restore, and unlike Barman it REQUIRES you to archive the WAL files to a common or global repo. P.S. Barman doesn't require a passwordless connection for backup, as we can configure a “streaming” server…
Either way, in general my setup is: 1 central backup server, 1 NFS export on that backup server which is mounted on the data nodes, and the data nodes push the WAL files to that repo using pgbackrest. Simple as that.
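The NFS wiring itself isn't shown in these notes; a minimal sketch of what it could look like, assuming the backup server is the host called backup, the export and mount point are both /backups, and the data nodes are node00/node01/node02 (the names that show up later in the backup log):
Mount the repo on a data node (sketch)
# on the backup server, /etc/exports (export options are an assumption)
/backups    node00(rw,sync,no_subtree_check) node01(rw,sync,no_subtree_check) node02(rw,sync,no_subtree_check)

# on each data node, mount the repo at the same path
mount -t nfs backup:/backups /backups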
Setup
After you install pgbackrest on all servers (data nodes + backup server), you can set the archive and restore commands as follows:
Setup archive & restore commands
archive_command: "pgbackrest --stanza=cluster_backup archive-push %p"
restore_command: "pgbackrest --stanza=cluster_backup archive-get %f %p"
So in a nutshell, my patroni.yml looks like this:
Patroni Yaml
scope: stampede
name: ${host}

restapi:
  listen: ${host}:8008
  connect_address: ${host}:8008

etcd:
  hosts: etcd00:2379, etcd01:2379, etcd02:2379

bootstrap:
  dcs:
    ttl: 30
    loop_wait: 10
    retry_timeout: 10
    maximum_lag_on_failover: 1048576
    maximum_lag_on_syncnode: 15000000
    synchronous_mode: true
    postgresql:
      use_pg_rewind: true
      use_slots: true
  initdb:
    - encoding: UTF8
    - data-checksums
  pg_hba:
    - host replication rep_user ${subnet}.0/24 md5
    - host all all ${subnet}.0/24 md5

postgresql:
  listen: ${host}:5432
  connect_address: ${host}:5432
  data_dir: /db/pgdata
  bin_dir: ${bindir}
  pgpass: /tmp/pgpass0
  authentication:
    replication:
      username: rep_user
      password: newpass
    superuser:
      username: postgres
      password: newpass
  parameters:
    unix_socket_directories: '/var/run/postgresql'
    external_pid_file: '/var/run/postgresql/17-main.pid'
    logging_collector: "on"
    log_directory: "/var/log/postgresql"
    log_filename: "postgresql-17-main.log"
    shared_buffers: 100MB
    work_mem: 16MB
    maintenance_work_mem: 10MB
    max_worker_processes: 16
    wal_buffers: 16MB
    max_wal_size: 200MB
    min_wal_size: 100MB
    effective_cache_size: 50MB
    fsync: on
    checkpoint_completion_target: 0.9
    log_rotation_size: 100MB
    listen_addresses: '*'
    max_connections: 2000
    temp_buffers: 4MB
    archive_mode: "on"
    wal_level: "replica"
    archive_command: "pgbackrest --stanza=cluster_backup archive-push %p"
    restore_command: "pgbackrest --stanza=cluster_backup archive-get %f %p"
After that, we need to create the stanza (the configuration of a server and its backups). That is done from the backup server.
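The pgbackrest configuration file itself isn't shown in these notes; judging purely from the options visible in the full-backup log further down (pg1-host, repo1-path, retention, and so on), /etc/pgbackrest/pgbackrest.conf on the backup server could look roughly like this:
pgbackrest.conf on the backup server (sketch)
[global]
repo1-path=/backups
repo1-retention-full=14
repo1-retention-full-type=time
process-max=2
start-fast=y
log-path=/backups
log-level-file=debug

[cluster_backup]
pg1-host=node00
pg1-host-user=postgres
pg1-path=/db/pgdata
pg1-port=5432
pg2-host=node01
pg2-host-user=postgres
pg2-path=/db/pgdata
pg2-port=5432
pg3-host=node02
pg3-host-user=postgres
pg3-path=/db/pgdata
pg3-port=5432
On the data nodes, the same stanza name only needs the local pg1-path=/db/pgdata plus a repo1-path pointing at the mounted repo.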
Create Stanza
Create Stanza
pgbackrest --stanza=cluster_backup stanza-create
That command will create the stanza for you, and then the archive command from the data nodes will work. You can also manually archive a file as follows, run from a data node.
Manually archive a file
pgbackrest --stanza=cluster_backup archive-push /db/pgdata/pg_wal/000000010000000000000001 --log-level-console=debug
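pgBackRest also has a check command that validates the stanza configuration, the archive_command and the repo connectivity end to end; it is a handy sanity check before the first backup:
Check the stanza
pgbackrest --stanza=cluster_backup --log-level-console=info check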
Backup
Now, pgbackrest has the following backup types:
- Full
- Differential (What has changed since FULL)
- Incremental (What has changed since the last backup, whatever its type)
I know, it is the opposite of Oracle, but hey, we cannot live in a perfect world. If you try to make a Differential or Incremental backup without a Full, pgbackrest is “smart” and will do a FULL instead.
Create Full backup
[postgres@backup ~]$ pgbackrest --stanza=cluster_backup --type=full --log-level-console=info backup
2025-05-17 04:41:41.193 P00 INFO: backup command begin 2.55.1: --no-archive-check --delta --exec-id=17762-aead3b0a --log-level-console=info --log-level-file=debug --log-path=/backups --pg1-host=node00 --pg2-host=node01 --pg3-host=node02 --pg1-host-user=postgres --pg2-host-user=postgres --pg3-host-user=postgres --pg1-path=/db/pgdata --pg2-path=/db/pgdata --pg3-path=/db/pgdata --pg1-port=5432 --pg2-port=5432 --pg3-port=5432 --process-max=2 --repo1-path=/backups --repo1-retention-full=14 --repo1-retention-full-type=time --stanza=cluster_backup --start-fast --type=full
2025-05-17 04:41:42.409 P00 INFO: execute non-exclusive backup start: backup begins after the requested immediate checkpoint completes
2025-05-17 04:41:42.757 P00 INFO: backup start archive = 00000003000000000000000C, lsn = 0/C000028
2025-05-17 04:41:48.940 P00 INFO: execute non-exclusive backup stop and wait for all WAL segments to archive
2025-05-17 04:41:48.966 P00 INFO: backup stop archive = 00000003000000000000000C, lsn = 0/C000120
2025-05-17 04:41:49.080 P00 INFO: new backup label = 20250517-044142F
2025-05-17 04:41:49.139 P00 INFO: full backup size = 29.4MB, file total = 1272
2025-05-17 04:41:49.140 P00 INFO: backup command end: completed successfully (7950ms)
2025-05-17 04:41:49.140 P00 INFO: expire command begin 2.55.1: --exec-id=17762-aead3b0a --log-level-console=info --log-level-file=debug --log-path=/backups --repo1-path=/backups --repo1-retention-full=14 --repo1-retention-full-type=time --stanza=cluster_backup
2025-05-17 04:41:49.142 P00 INFO: repo1: time-based archive retention not met - archive logs will not be expired
2025-05-17 04:41:49.345 P00 INFO: expire command end: completed successfully (205ms)
After that we can create an incremental or differential backup:
Create Incremental & Differential backup
# Incremental
pgbackrest --stanza=cluster_backup --type=incr --log-level-console=info backup

# Differential
pgbackrest --stanza=cluster_backup --type=diff --log-level-console=info backup
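To see what the repository actually contains after these runs, the info command lists the backups and archive ranges per stanza:
List backups
pgbackrest --stanza=cluster_backup info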
With that at hand, we can even drop and delete all the databases:
Restore
Now, the restore can be:
- Complete
- PITR: time, LSN, transaction ID (xid), named restore point, etc. (an LSN example is sketched right after this list)
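As an illustration of a non-time target, a restore to a specific LSN could look like the sketch below; the LSN is simply the backup-stop LSN from the full-backup log above, passed via --type=lsn:
Restore to an LSN (sketch)
pgbackrest --stanza=cluster_backup \
    --type=lsn --target="0/C000120" \
    --target-action=pause restore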
Don't forget that you also need the WAL files in either case; the cluster has to be consistent after all. So understand the difference between:
- Restore (Moving the data files from backup to the data dir)
- Recovery (Applying the WAL files until we reach a consistent state and/or until the point we asked for, e.g. recovery_target_time)
These are important concepts. To restore a backup, we can use the following command.
Restore latest backup
pgbackrest --stanza=cluster_backup restore --type=immediate
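If you need a particular backup set rather than the latest one, or the data directory is not empty, the --set and --delta options cover that; the label here is the one printed by the full backup earlier:
Restore a specific backup set (sketch)
pgbackrest --stanza=cluster_backup --delta \
    --set=20250517-044142F --type=immediate restore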
Or we can restore only to a point in time (PITR), using a timestamp:
Restore to PITR
pgbackrest --stanza=cluster_backup \
    --type=time --target="2025-05-16 19:58:08.384753+00" \
    --target-action=pause restore
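On PostgreSQL 12 and later, pgbackrest writes the recovery settings into postgresql.auto.conf rather than a recovery.conf; after the time-based restore above it should contain something along these lines (illustrative):
postgresql.auto.conf after the restore (illustrative)
restore_command = 'pgbackrest --stanza=cluster_backup archive-get %f "%p"'
recovery_target_time = '2025-05-16 19:58:08.384753+00'
recovery_target_action = 'pause'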
Regardless of what you choose, MY BIGGEST PROBLEM was that Patroni continued to apply the WAL files past my target, despite what I was telling it. So, my HUMBLE opinion: after a restore with pgbackrest, DO NOT START THE SERVER WITH PATRONI, but start it outside of Patroni:
Starting outside of Patroni
/usr/pgsql-17/bin/pg_ctl start -D /db/pgdata -w
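The next step is to verify the state and promote; a sketch of how that could look, using the same binaries and data directory as above:
Check recovery state and promote (sketch)
# still in recovery / replaying?
psql -c "SELECT pg_is_in_recovery(), pg_last_wal_replay_lsn();"

# promote once you are happy with the state
/usr/pgsql-17/bin/pg_ctl promote -D /db/pgdata -w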
Check if that is the correct state and if you are happy. Then promote it and THEN stop it:
Stopping outside of Patroni
/usr/pgsql-17/bin/pg_ctl stop -D /db/pgdata -w
After that, you can start it from Patroni and maybe fail over:
Start within patroni
service patroni start
su - postgres
patronictl -c /etc/patroni/stampede.yml failover stampede
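To confirm the cluster looks healthy after the failover, patronictl can list the member states (same config path as above):
Check the cluster state
patronictl -c /etc/patroni/stampede.yml list stampede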
That is how I was able to make it work. I am 99.(9)8% (math nerds, hi) sure I am doing something wrong, but still: if you cannot get it to work with Patroni initially, this way works.