Overview
I find pgbackrest way more “stupid” (or maybe I am the “stupid” one) than Barman. I am sorry, but I guess I just don't get it.
You have a pgbackrest executable which checks the config file and generates the postgresql.auto.conf file, and yet the server STILL restores up to the last WAL file when you try to do PITR. But I guess I will return to this section when I am smarter and know how to use it.
Either way, let's get into it. Unlike Barman, pgbackrest REQUIRES a passwordless connection to the server for backup / restore, and unlike Barman it REQUIRES you to archive the WAL files to a common or global repo. P.S. Barman doesn't require a passwordless connection for backup, as we can configure a “streaming” server…
Either way, in general my setup is: 1 central backup server, 1 NFS export on that backup server which is mounted on the data nodes, and the data nodes push the WAL files to that repo using pgbackrest. Simple as that.
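The NFS wiring itself isn't shown in these notes; a minimal sketch of what it could look like, assuming the backup server is the host called backup, the export and mount point are both /backups, and the data nodes are node00/node01/node02 (the names that show up later in the backup log):
Mount the repo on a data node (sketch)
# on the backup server, /etc/exports (export options are an assumption)
/backups    node00(rw,sync,no_subtree_check) node01(rw,sync,no_subtree_check) node02(rw,sync,no_subtree_check)

# on each data node, mount the repo at the same path
mount -t nfs backup:/backups /backups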
Setup
After you install pgbackrest on all servers (data nodes + backup server), you can set the archive and restore commands as follows:
Setup archive & restore commands
archive_command: "pgbackrest --stanza=cluster_backup archive-push %p"
restore_command: "pgbackrest --stanza=cluster_backup archive-get %f %p"
So in a nutshell, my patroni.yml looks like this:
Patroni Yaml
scope: stampede
name: ${host}

restapi:
  listen: ${host}:8008
  connect_address: ${host}:8008

etcd:
  hosts: etcd00:2379, etcd01:2379, etcd02:2379

bootstrap:
  dcs:
    ttl: 30
    loop_wait: 10
    retry_timeout: 10
    maximum_lag_on_failover: 1048576
    maximum_lag_on_syncnode: 15000000
    synchronous_mode: true
    postgresql:
      use_pg_rewind: true
      use_slots: true
  initdb:
    - encoding: UTF8
    - data-checksums
  pg_hba:
    - host replication rep_user ${subnet}.0/24 md5
    - host all all ${subnet}.0/24 md5

postgresql:
  listen: ${host}:5432
  connect_address: ${host}:5432
  data_dir: /db/pgdata
  bin_dir: ${bindir}
  pgpass: /tmp/pgpass0
  authentication:
    replication:
      username: rep_user
      password: newpass
    superuser:
      username: postgres
      password: newpass
  parameters:
    unix_socket_directories: '/var/run/postgresql'
    external_pid_file: '/var/run/postgresql/17-main.pid'
    logging_collector: "on"
    log_directory: "/var/log/postgresql"
    log_filename: "postgresql-17-main.log"
    shared_buffers: 100MB
    work_mem: 16MB
    maintenance_work_mem: 10MB
    max_worker_processes: 16
    wal_buffers: 16MB
    max_wal_size: 200MB
    min_wal_size: 100MB
    effective_cache_size: 50MB
    fsync: on
    checkpoint_completion_target: 0.9
    log_rotation_size: 100MB
    listen_addresses: '*'
    max_connections: 2000
    temp_buffers: 4MB
    archive_mode: "on"
    wal_level: "replica"
    archive_command: "pgbackrest --stanza=cluster_backup archive-push %p"
    restore_command: "pgbackrest --stanza=cluster_backup archive-get %f %p"
After that, we need to create the stanza (the configuration of a server and its backups). That is done from the backup server.
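The pgbackrest configuration file itself isn't shown in these notes; judging purely from the options visible in the full-backup log further down (pg1-host, repo1-path, retention, and so on), /etc/pgbackrest/pgbackrest.conf on the backup server could look roughly like this:
pgbackrest.conf on the backup server (sketch)
[global]
repo1-path=/backups
repo1-retention-full=14
repo1-retention-full-type=time
process-max=2
start-fast=y
log-path=/backups
log-level-file=debug

[cluster_backup]
pg1-host=node00
pg1-host-user=postgres
pg1-path=/db/pgdata
pg1-port=5432
pg2-host=node01
pg2-host-user=postgres
pg2-path=/db/pgdata
pg2-port=5432
pg3-host=node02
pg3-host-user=postgres
pg3-path=/db/pgdata
pg3-port=5432
On the data nodes, the same stanza name only needs the local pg1-path=/db/pgdata plus a repo1-path pointing at the mounted repo.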
Create Stanza
Create Stanza
pgbackrest --stanza=cluster_backup stanza-create
That command will create the stanza for you, and then the archive command from the data nodes will work. You can also manually archive a file as follows, run from a data node.
Manually archive a file
pgbackrest --stanza=cluster_backup archive-push /db/pgdata/pg_wal/000000010000000000000001 --log-level-console=debug
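pgBackRest also has a check command that validates the stanza configuration, the archive_command and the repo connectivity end to end; it is a handy sanity check before the first backup:
Check the stanza
pgbackrest --stanza=cluster_backup --log-level-console=info check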
Backup
Now, pgbackrest has the following backup types:
- Full
- Differential (What has changed since FULL)
- Incremental (What has changed since the last backup, whatever its type)
I know, it is the opposite of Oracle, but hey, we cannot live in a perfect world. If you try to make a Differential or Incremental backup without a Full, pgbackrest is “smart” and will do a FULL instead.
Create Full backup
[postgres@backup ~]$ pgbackrest --stanza=cluster_backup --type=full --log-level-console=info backup
2025-05-17 04:41:41.193 P00 INFO: backup command begin 2.55.1: --no-archive-check --delta --exec-id=17762-aead3b0a --log-level-console=info --log-level-file=debug --log-path=/backups --pg1-host=node00 --pg2-host=node01 --pg3-host=node02 --pg1-host-user=postgres --pg2-host-user=postgres --pg3-host-user=postgres --pg1-path=/db/pgdata --pg2-path=/db/pgdata --pg3-path=/db/pgdata --pg1-port=5432 --pg2-port=5432 --pg3-port=5432 --process-max=2 --repo1-path=/backups --repo1-retention-full=14 --repo1-retention-full-type=time --stanza=cluster_backup --start-fast --type=full
2025-05-17 04:41:42.409 P00 INFO: execute non-exclusive backup start: backup begins after the requested immediate checkpoint completes
2025-05-17 04:41:42.757 P00 INFO: backup start archive = 00000003000000000000000C, lsn = 0/C000028
2025-05-17 04:41:48.940 P00 INFO: execute non-exclusive backup stop and wait for all WAL segments to archive
2025-05-17 04:41:48.966 P00 INFO: backup stop archive = 00000003000000000000000C, lsn = 0/C000120
2025-05-17 04:41:49.080 P00 INFO: new backup label = 20250517-044142F
2025-05-17 04:41:49.139 P00 INFO: full backup size = 29.4MB, file total = 1272
2025-05-17 04:41:49.140 P00 INFO: backup command end: completed successfully (7950ms)
2025-05-17 04:41:49.140 P00 INFO: expire command begin 2.55.1: --exec-id=17762-aead3b0a --log-level-console=info --log-level-file=debug --log-path=/backups --repo1-path=/backups --repo1-retention-full=14 --repo1-retention-full-type=time --stanza=cluster_backup
2025-05-17 04:41:49.142 P00 INFO: repo1: time-based archive retention not met - archive logs will not be expired
2025-05-17 04:41:49.345 P00 INFO: expire command end: completed successfully (205ms)
After that we can create an incremental or differential backup:
Create Incremental & Differential backup
# Incremental
pgbackrest --stanza=cluster_backup --type=incr --log-level-console=info backup

# Differential
pgbackrest --stanza=cluster_backup --type=diff --log-level-console=info backup
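To see what the repository actually contains after these runs, the info command lists the backups and archive ranges per stanza:
List backups
pgbackrest --stanza=cluster_backup info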
With that at hand, we can even drop and delete all the databases:
Restore
Now, the restore can be:
- Complete
- PITR: time, LSN, transaction ID (xid), named restore point, etc. (an LSN example is sketched right after this list)
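As an illustration of a non-time target, a restore to a specific LSN could look like the sketch below; the LSN is simply the backup-stop LSN from the full-backup log above, passed via --type=lsn:
Restore to an LSN (sketch)
pgbackrest --stanza=cluster_backup \
    --type=lsn --target="0/C000120" \
    --target-action=pause restore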
Don't forget that you also need the WAL files in either case; the cluster has to be consistent after all. So understand the difference between:
- Restore (Moving the data files from backup to the data dir)
- Recovery (Applying the WAL files until we reach a consistent state and/or until the point we asked for, e.g. recovery_target_time)
These are important concepts. To restore a backup, we can use the following command.
Restore latest backup
pgbackrest --stanza=cluster_backup restore --type=immediate
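If you need a particular backup set rather than the latest one, or the data directory is not empty, the --set and --delta options cover that; the label here is the one printed by the full backup earlier:
Restore a specific backup set (sketch)
pgbackrest --stanza=cluster_backup --delta \
    --set=20250517-044142F --type=immediate restore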
Or we can restore only to a point in time (PITR), using a timestamp:
Restore to PITR
pgbackrest --stanza=cluster_backup \
    --type=time --target="2025-05-16 19:58:08.384753+00" \
    --target-action=pause restore
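On PostgreSQL 12 and later, pgbackrest writes the recovery settings into postgresql.auto.conf rather than a recovery.conf; after the time-based restore above it should contain something along these lines (illustrative):
postgresql.auto.conf after the restore (illustrative)
restore_command = 'pgbackrest --stanza=cluster_backup archive-get %f "%p"'
recovery_target_time = '2025-05-16 19:58:08.384753+00'
recovery_target_action = 'pause'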
Regardless of what you choose, MY BIGGEST PROBLEM was that Patroni continued to apply the WAL files past my target, despite what I was telling it. So, my HUMBLE opinion: after a restore with pgbackrest, DO NOT START THE SERVER WITH PATRONI, but start it outside of Patroni:
Starting outside of Patroni
/usr/pgsql-17/bin/pg_ctl start -D /db/pgdata -w
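The next step is to verify the state and promote; a sketch of how that could look, using the same binaries and data directory as above:
Check recovery state and promote (sketch)
# still in recovery / replaying?
psql -c "SELECT pg_is_in_recovery(), pg_last_wal_replay_lsn();"

# promote once you are happy with the state
/usr/pgsql-17/bin/pg_ctl promote -D /db/pgdata -w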
Check if that is the correct state and if you are happy. Then promote it and THEN stop it:
Stopping outside of Patroni
/usr/pgsql-17/bin/pg_ctl stop -D /db/pgdata -w
After that, you can start it from Patroni and maybe fail over:
Start within patroni
service patroni start
su - postgres
patronictl -c /etc/patroni/stampede.yml failover stampede
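To confirm the cluster looks healthy after the failover, patronictl can list the member states (same config path as above):
Check the cluster state
patronictl -c /etc/patroni/stampede.yml list stampede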
That is how I was able to make it work. I am 99.(9)8% (math nerds, hi) sure I am doing something wrong, but still: if you cannot get it to work with Patroni initially, this way works.