=====Overview===== In order to increase the performance and reduce the delay when replicating. PostgreSQL created several ways how we can ensure that our WAL files are replicated on time, synchronously (if we need) and that the master won't recycle them. In a nutshell, PostgreSQL, created 2.5 ways to replicate: * **File based** - Uses archived WAL files defined by the archive command and restore from the restore command. That was covered in the Basic Replication Section * **Streaming based** - Uses WAL files which the master isn't obliged to keep for long time, BUT to make it more synchronous, the slave also uses a user to connect to the master and pull the WAL segments. By the default that process is async but it can be made sync. * **Slots** - Slots are used in addition of streaming replication. The goal here is to not allow the master to recycle WAL which haven't been applyied on the slave. Here we will discuss the Stream based and the replication slots. So let's get going : =====Stream based===== Configuring stream based replication is needed and usually used in conjunction with the file based. Having two replication mechanisms(PostgreSQL will use streaming with priority over file based) increases the security and robustness of the replication. In order to configure stream based replication, you should edit the recovery.conf on the slave as follows: ===On the Master=== -bash-4.2$ psql psql (10.11) Type "help" for help. postgres=# CREATE ROLE repl WITH REPLICATION LOGIN; CREATE ROLE postgres=# \password repl Enter new password: Enter it again: -bash-4.2$ Ensure that you have the following setting in the pg_hba.conf configuration file: host replication all ip_of_the_slave/32 md5 Reload the configuration using: -bash-4.2$ /usr/pgsql-10/bin/pg_ctl reload -D /var/lib/pgsql/10/data server signaled After that continue with the slave :) ===On the Slave=== [root@postegresqlslave data]# cat recovery.conf standby_mode = 'on' primary_conninfo = 'host=192.168.0.178 port=5432 user=rep password=yourpassword' <-**This line will add the streaming** restore_command = 'cp /walshared/%f %p' trigger_file = '/tmp/postgresql.trigger.5432' [root@postegresqlslave data]# Restart the slave -bash-4.2$ /usr/pgsql-10/bin/pg_ctl -D /var/lib/pgsql/10/data start waiting for server to start....2020-01-23 10:24:46.735 EST [4600] LOG: listening on IPv4 address "0.0.0.0", port 5432 2020-01-23 10:24:46.735 EST [4600] LOG: listening on IPv6 address "::", port 5432 2020-01-23 10:24:46.737 EST [4600] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432" 2020-01-23 10:24:46.753 EST [4600] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432" 2020-01-23 10:24:46.760 EST [4600] LOG: redirecting log output to logging collector process 2020-01-23 10:24:46.760 EST [4600] HINT: Future log output will appear in directory "log". done server started You can also check the log if everything is fine: -bash-4.2$ cd log -bash-4.2$ ls -lart total 584 -rw-------. 1 postgres postgres 2206 Jan 22 09:05 postgresql-Mon.log -rw-------. 1 postgres postgres 531731 Jan 22 17:46 postgresql-Wed.log drwx------. 2 postgres postgres 84 Jan 23 09:30 . drwx------. 20 postgres postgres 4096 Jan 23 10:24 .. -rw-------. 1 postgres postgres 56353 Jan 23 10:24 postgresql-Thu.log -bash-4.2$ tail -1 postgresql-Thu.log 2020-01-23 10:26:47.027 EST [4671] LOG: started streaming WAL from primary at 0/10000000 on timeline 1 Congratulation, you have streaming replication, between the slave and the master. =====Slots===== Slots are great way to make PostgreSQL aware of replication as they provide various information to the server about which WAL files can be removed and which are needed: There are also 2 kind of slots: * Physical - For physical replication * Logical - For Logical replication (Requires higher archiving method e.g. logical). Doesn't support DDL, Sequences and others. Let's configure Physical slot. ====Physical==== Physical slots are configure 99% of the time, when you do not require logical replication. Furthermore, the configuration is rather simple: ===On the master=== On the master server we have to create the slots as follows: [root@postegresqlmaster pg_log]# su - postgres -bash-4.1$ psql psql (9.6.7) Type "help" for help. postgres=# select * from pg_create_physical_replication_slot('repl_slot'); slot_name | xlog_position -----------+--------------- repl_slot | (1 row) postgres=# postgres=# \q Please be sure that you have max_replication_slots setting higher than 0. Otherwise, you will receive the following error: -bash-4.1$ psql psql (9.6.7) Type "help" for help. postgres=# select * from pg_create_physical_replication_slot('repl_slot'); ERROR: replication slots can only be used if max_replication_slots > 0 postgres=# ===On the Slave=== After the slot has been created, edit the recovery.conf file on the slave, to indicate this feature. [root@postegresqlslave data]# cat recovery.conf standby_mode = 'on' primary_conninfo = 'host=192.168.0.10 port=5432 user=rep password=yourpassword' <-This line will add the streaming replication restore_command = 'cp /walshared/%f %p' <- This command is used for File based replication (Log shipping) trigger_file = '/tmp/postgresql.trigger.5432' primary_slot_name='repl_slot' <-This line is needed for slot replication (can be made sync) [root@postegresqlslave data]# And restart the master and the slave. Since the slot has been created, the master will not delete any file which is needed by the slave until it has been applied. ====Logical==== Physical streaming replication creates a byte-by-byte read-only replica of the primary server. The replica contains all the databases, tables, roles, tablespaces, etc., of the original. With streaming replication we get all or nothing. But what if we want a replica of only a single table? This is where logical replication comes into play. Logical replication can replay DML operations happening on a subset of tables in a primary server on a standby server by: Logically decoding the WAL records Streaming them over to the standby server Applying them to the table in the standby server in the correct transactional order ===Publication and subscription=== Logical replication defines two entities: a publisher and a subscriber. A publisher is a node that defines a certain group of tables (called a publication) to which a subscriber can subscribe, by creating a subscription to receive the changes to that particular group of tables. {{:pgsql_logical_replication.jpg?500|}} ===Configuration=== Let's get going :) First thing first, we have to enabel logical replication on the primary(master) ===On the Publication server=== su - postgres cat >> /tmp/data_pub/postgresql.conf < > port = 6432 > wal_level = logical > logging_collector = on > > EOF $ After that we can start the cluster: postgres@host01:~$ /usr/lib/postgresql/12/bin/pg_ctl -D /tmp/data_pub start waiting for server to start....2021-05-29 05:42:41.624 UTC [5615] LOG: starting PostgreSQL 12.6 (Ubuntu 12.6-0ubuntu0.20.04.1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0, 64-bit 2021-05-29 05:42:41.624 UTC [5615] LOG: listening on IPv6 address "::1", port 6432 2021-05-29 05:42:41.624 UTC [5615] LOG: listening on IPv4 address "127.0.0.1", port 6432 2021-05-29 05:42:41.625 UTC [5615] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.6432" 2021-05-29 05:42:41.635 UTC [5615] LOG: redirecting log output to logging collector process 2021-05-29 05:42:41.635 UTC [5615] HINT: Future log output will appear in directory "log". done server started postgres@host01:~$ Then we connect to it and create a database with one table: postgres@host01:~$ /usr/lib/postgresql/12/bin/psql --port=6432 --host=localhost --username=postgres --dbname=postgres psql (12.6 (Ubuntu 12.6-0ubuntu0.20.04.1)) Type "help" for help. postgres=# create database src_db; CREATE DATABASE postgres=# postgres=# \c src_db src_db=# create table t1(id integer primary key, val text); CREATE TABLE src_db=# After that we have to configure the replicant and add data to our table: src_db=# create user replicant with replication; CREATE ROLE src_db=# grant select on t1 to replicant; GRANT src_db=# insert into t1 (id,val) values (10,'ten'),(20,'twenty'),(30,'thirty'); INSERT 0 3 src_db=# create publication pub1 for table t1; CREATE PUBLICATION src_db=# ===On the subscriber server=== Modify the postgresql.conf to enable subscription server: cat >> /tmp/data_sub/postgresql.conf < Start the postgreSQL Cluster> /usr/lib/postgresql/12/bin/pg_ctl -D /tmp/data_sub start waiting for server to start....2021-05-29 05:42:41.624 UTC [5615] LOG: starting PostgreSQL 12.6 (Ubuntu 12.6-0ubuntu0.20.04.1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0, 64-bit 2021-05-29 05:42:41.624 UTC [5615] LOG: listening on IPv6 address "::1", port 6432 2021-05-29 05:42:41.624 UTC [5615] LOG: listening on IPv4 address "127.0.0.1", port 6432 2021-05-29 05:42:41.625 UTC [5615] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.6432" 2021-05-29 05:42:41.635 UTC [5615] LOG: redirecting log output to logging collector process 2021-05-29 05:42:41.635 UTC [5615] HINT: Future log output will appear in directory "log". After that, we have to create a database with the same table name and create a subscription to it: /usr/lib/postgresql/12/bin/psql --port=6432--host=localhost --username=postgres --dbname=postgres psql (12.6 (Ubuntu 12.6-0ubuntu0.20.04.1)) Type "help" for help. postgres=# create database dst_db; CREATE DATABASE postgres=# postgres=# \c dst_db dst_db=# create table t1(id integer primary key, val text); CREATE TABLE dst_db=# CREATE SUBSCRIPTION sub1 CONNECTION 'host=publication_server port=6432 dbname=src_db user=replicant' publication pub1; CREATE