postgresql_management



Overview

In this section we will discuss several management tasks in PostgreSQL. Let's start with user management:

User & Role Management

Every role has a unique name and an OID value, which identifies the role numerically. This is similar to how users are represented on Unix systems (and many others), where the numerical identifier is used only internally.

User management in PostgreSQL is quite different from other databases. In PostgreSQL you should remember one thing: EVERYTHING IS A ROLE. There are roles which can connect (users) and roles which cannot connect (groups). You can nest roles within one another and thus grant privileges and group memberships to other users.
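
For illustration, a group is simply a role created without LOGIN, and membership is granted with GRANT (the role names dev and alice below are just examples):

Create a group role (sketch)

CREATE ROLE dev NOLOGIN;                     -- a "group": cannot connect
CREATE ROLE alice LOGIN PASSWORD 'secret';   -- a "user": can connect
GRANT dev TO alice;                          -- alice now inherits dev's privileges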

In order to create a user in PostgreSQL, you have to:

  1. Create the role with password
  2. Allow the role in the pg_hba.conf
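
For step 2, a pg_hba.conf entry for such a role might look like the following (the address range here is only an illustration, adjust it to your environment):

Example pg_hba.conf entry

# TYPE  DATABASE        USER            ADDRESS                 METHOD
host    all             jcaruana        10.0.0.0/24             md5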

So let's get going:

Again, a user is simply a role which can connect and has a password; that is the only difference between a user and a group in PostgreSQL. So to create a user we can use the following:

Create a User

CREATE ROLE jcaruana WITH PASSWORD 'yxKEt045ZKcWiiu2YBxJ' INHERIT LOGIN;

Grant role

GRANT dev TO jcaruana;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO dba;

To reset the password, we can use the following command:

Reset password

ALTER USER integration_playson WITH PASSWORD 'xjaR3IvMDXF9NvPHwVSe';

List Roles

forumdb=> SELECT * FROM pg_roles WHERE rolname = 'luca';
-[ RECORD 1 ]--+---------
rolname        | luca
rolsuper       | f
rolinherit     | t
rolcreaterole  | f
rolcreatedb    | f
rolcanlogin    | t
rolreplication | f
rolconnlimit   | 1
rolpassword    | ********
rolvaliduntil  |
rolbypassrls   | f
rolconfig      |
oid            | 16390

forumdb=# \x
Expanded display is on.
forumdb=# SELECT * FROM pg_authid WHERE rolname = 'luca';
-[ RECORD 1 ]--+------------------------------------
oid            | 16390
rolname        | luca
rolsuper       | f
rolinherit     | t
rolcreaterole  | f
rolcreatedb    | f
rolcanlogin    | t
rolreplication | f
rolbypassrls   | f
rolconnlimit   | 1
rolpassword    | md5bd18b4163ec8a3ad833d867a5933c8ec
rolvaliduntil  |

List Users with their roles

forumdb=> SELECT r.rolname, g.rolname AS group,
                 m.admin_option AS is_admin
          FROM pg_auth_members m
               JOIN pg_roles r ON r.oid = m.member
               JOIN pg_roles g ON g.oid = m.roleid
          ORDER BY r.rolname;
  rolname   |        group         | is_admin
------------+----------------------+----------
 enrico     | book_authors         | f
 enrico     | forum_admins         | f
 luca       | forum_stats          | f
 luca       | book_authors         | f
 pg_monitor | pg_read_all_settings | f
 pg_monitor | pg_read_all_stats    | f
 pg_monitor | pg_stat_scan_tables  | f
 test       | forum_stats          | f
(8 rows)

PostgreSQL stores permissions assigned to roles and objects as Access Control Lists (ACLs), and, when needed, it examines the ACLs for a specific role and a database object in order to understand whether the command or query can be performed. In this section, you will learn what ACLs are, how they are stored, and how to interpret them to understand what permissions an ACL provides.

An ACL is a representation of a group of permissions with the following structure:

Example

luca=arw/enrico

First of all, identify the roles involved: luca and enrico. luca is the role before the equals sign, hence it is the role the ACL refers to; in other words, this ACL describes what permissions the luca role has. The other role, enrico, appears after the slash and is therefore the role that granted luca those permissions. As for the flags, the ACL grants append (a), read (r), and write (w) permissions. The above reads as “enrico granted luca permission to perform INSERT, UPDATE, and SELECT on the table.”
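
For illustration, an ACL like this is typically produced by the owner running GRANT statements such as the following (here enrico is assumed to own the categories table shown below):

Grants behind the ACL (sketch)

-- executed as enrico, the table owner
GRANT INSERT, SELECT, UPDATE ON categories TO luca;   -- shows up as luca=arw/enrico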

Let's now see an example of ACLs from a table in the database: you can use the special \dp psql command to get information about a table:

Display ACL

forumdb=> \dp categories
                                  Access privileges
 Schema |    Name    | Type  |   Access privileges   | Column privileges  | Policies
--------+------------+-------+-----------------------+--------------------+----------
 public | categories | table | enrico=arwdDxt/enrico+|                    |
        |            |       | luca=arw/enrico      +|                    |
        |            |       | =d/enrico             |                    |

Display ACL v2

forumdb=> WITH acl AS (
             SELECT relname, 
                    (aclexplode(relacl)).grantor, 
                    (aclexplode(relacl)).grantee, 
                    (aclexplode(relacl)).privilege_type 
            FROM pg_class )
         SELECT g.rolname AS grantee, 
                acl.privilege_type AS permission, 
                gg.rolname AS grantor
         FROM acl 
         JOIN pg_roles g ON g.oid = acl.grantee 
         JOIN pg_roles gg ON gg.oid = acl.grantor
         WHERE acl.relname = 'categories';
         
   grantee   | permission | grantor 
-------------+------------+---------
 luca        | INSERT     | luca
 luca        | SELECT     | luca
 luca        | UPDATE     | luca
 luca        | DELETE     | luca
 luca        | TRUNCATE   | luca
 luca        | REFERENCES | luca
 luca        | TRIGGER    | luca
 forum_stats | INSERT     | luca
 forum_stats | SELECT     | luca
 forum_stats | UPDATE     | luca

Database Management

When it comes to database management, there are many possible tasks. Let's start by seeing how we can dump data to CSV.

We can use psql's \copy command (a client-side wrapper around COPY) from within a database session as follows:

Copy data to CSV

hunter_dev=# \copy (SQL STATEMENT) to 'out.csv' with csv header

That will execute the SQL statement and write the result to a CSV file named out.csv, with a header row at the top.
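
For example (the query here is only illustrative):

Copy query result to CSV (example)

hunter_dev=# \copy (SELECT rolname, rolcanlogin FROM pg_roles) to 'out.csv' with csv header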

Now, to clean the pg_wal directory, the WAL files need to be backed up first. Assuming we have done that, or that we do not need them, we can use the following process:

Find latest checkpoint file

[root@s24-se-db01 pg_wal]# /usr/pgsql-11/bin/pg_controldata -D /var/lib/pgsql/11/data/
pg_control version number:            1100
Catalog version number:               201809051
Database system identifier:           6742286774524545149
Database cluster state:               in production
pg_control last modified:             Fri 21 May 2021 21:37:51 CEST
Latest checkpoint location:           B6D/62D62990
Latest checkpoint's REDO location:    B6D/62941780
Latest checkpoint's REDO WAL file:    0000000100000B6D00000062                    <- This entry
Latest checkpoint's TimeLineID:       1
Latest checkpoint's PrevTimeLineID:   1
Latest checkpoint's full_page_writes: on
Latest checkpoint's NextXID:          0:774839045
Latest checkpoint's NextOID:          300868
Latest checkpoint's NextMultiXactId:  75180307
Latest checkpoint's NextMultiOffset:  30502311
Latest checkpoint's oldestXID:        600002360
Latest checkpoint's oldestXID's DB:   16406
Latest checkpoint's oldestActiveXID:  760450727
Latest checkpoint's oldestMultiXid:   62073622
Latest checkpoint's oldestMulti's DB: 16407
Latest checkpoint's oldestCommitTsXid:0
Latest checkpoint's newestCommitTsXid:0
Time of latest checkpoint:            Fri 21 May 2021 21:35:46 CEST
Fake LSN counter for unlogged rels:   0/1
Minimum recovery ending location:     0/0
Min recovery ending loc's timeline:   0
Backup start location:                0/0
Backup end location:                  0/0
End-of-backup record required:        no
wal_level setting:                    logical
wal_log_hints setting:                off
max_connections setting:              1500
max_worker_processes setting:         28
max_prepared_xacts setting:           0
max_locks_per_xact setting:           64
track_commit_timestamp setting:       off
Maximum data alignment:               8
Database block size:                  8192
Blocks per segment of large relation: 131072
WAL block size:                       8192
Bytes per WAL segment:                16777216
Maximum length of identifiers:        64
Maximum columns in an index:          32
Maximum size of a TOAST chunk:        1996

Before actually removing anything, we can do a dry run:

Dry run

[root@s24-se-db01 pg_wal]# /usr/pgsql-11/bin/pg_archivecleanup -n /var/lib/pgsql/11/data/pg_wal/ 0000000100000B6D00000061
[root@s24-se-db01 pg_wal]#

Then we can do the actual delete:

Clean WAL Files

[root@s24-se-db01 pg_wal]# /usr/pgsql-11/bin/pg_archivecleanup -d /var/lib/pgsql/11/data/pg_wal/ 0000000100000B6D00000061
pg_archivecleanup: keeping WAL file "/var/lib/pgsql/11/data/pg_wal//0000000100000B6D00000061" and later
pg_archivecleanup: removing file "/var/lib/pgsql/11/data/pg_wal//0000000100000B6A0000008C"
pg_archivecleanup: removing file "/var/lib/pgsql/11/data/pg_wal//0000000100000B6400000053"
pg_archivecleanup: removing file "/var/lib/pgsql/11/data/pg_wal//0000000100000B5400000008"
pg_archivecleanup: removing file "/var/lib/pgsql/11/data/pg_wal//0000000100000B550000007A"
pg_archivecleanup: removing file "/var/lib/pgsql/11/data/pg_wal//0000000100000B4C0000000E"
pg_archivecleanup: removing file "/var/lib/pgsql/11/data/pg_wal//0000000100000B65000000E1"
pg_archivecleanup: removing file "/var/lib/pgsql/11/data/pg_wal//0000000100000B480000008F"
pg_archivecleanup: removing file "/var/lib/pgsql/11/data/pg_wal//0000000100000B61000000BC"
[root@s24-se-db01 pg_wal]#

Table Management

Partitioning in PostgreSQL can be achieved in several ways, but in general these are:

  • Declarative
  • Using Inheritance

Each has its pros and cons. We have to start somewhere, so let's start with declarative partitioning.

PostgreSQL offers a way to specify how to divide a table into pieces called partitions. The table that is divided is referred to as a partitioned table. The specification consists of the partitioning method and a list of columns or expressions to be used as the partition key.

Pros

  • All rows inserted into a partitioned table will be routed to one of the partitions based on the value of the partition key. Each partition has a subset of the data defined by its partition bounds. Currently supported partitioning methods include range and list, where each partition is assigned a range of keys and a list of keys, respectively.
  • Partitions may themselves be defined as partitioned tables, using what is called sub-partitioning. Partitions may have their own indexes, constraints and default values, distinct from those of other partitions. Indexes must be created separately for each partition. See CREATE TABLE for more details on creating partitioned tables and partitions.
  • It is not possible to turn a regular table into a partitioned table or vice versa. However, it is possible to add a regular or partitioned table containing data as a partition of a partitioned table, or remove a partition from a partitioned table, turning it into a standalone table; see ALTER TABLE to learn more about the ATTACH PARTITION and DETACH PARTITION sub-commands (a short sketch follows this list).
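
As a brief illustration of the ATTACH PARTITION / DETACH PARTITION sub-commands mentioned above (reusing the measurement example defined further below):

Attach / detach a partition (sketch)

-- detach a partition so it becomes a standalone table
ALTER TABLE measurement DETACH PARTITION measurement_y2021m09;

-- re-attach it (the table's contents must satisfy the partition bounds)
ALTER TABLE measurement ATTACH PARTITION measurement_y2021m09
    FOR VALUES FROM ('2021-09-01') TO ('2021-10-01');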

Cons

  • There is no facility available to create the matching indexes on all partitions automatically. Indexes must be added to each partition with separate commands. This also means that there is no way to create a primary key, unique constraint, or exclusion constraint spanning all partitions; it is only possible to constrain each leaf partition individually.
  • Since primary keys are not supported on partitioned tables, foreign keys referencing partitioned tables are not supported, nor are foreign key references from a partitioned table to some other table.
  • Using the ON CONFLICT clause with partitioned tables will cause an error, because unique or exclusion constraints can only be created on individual partitions. There is no support for enforcing uniqueness (or an exclusion constraint) across an entire partitioning hierarchy.
  • An UPDATE that causes a row to move from one partition to another fails, because the new value of the row fails to satisfy the implicit partition constraint of the original partition.
  • Row triggers, if necessary, must be defined on individual partitions, not the partitioned table.
  • Mixing temporary and permanent relations in the same partition tree is not allowed. Hence, if the partitioned table is permanent, so must be its partitions and likewise if the partitioned table is temporary. When using temporary relations, all members of the partition tree have to be from the same session.

So let's see how they are done:

Implementation

We can create a table, using declarative partitioning as follows:

Create table

CREATE TABLE measurement (
    city_id         int not null,
    logdate         date not null,
    peaktemp        int,
    unitsales       int
) PARTITION BY RANGE (logdate);

Then we can define a partition as follows:

Define partition

CREATE TABLE measurement_y2021m09 PARTITION OF measurement
    FOR VALUES FROM ('2021-09-01') TO ('2021-10-01')
    WITH (parallel_workers = 4)
    TABLESPACE tbs_fast;

Let's say we want to move a partition to a different tablespace (because its data has become old). We can do that (at a cost), but first let's create a new partition and fill the old one with data.

Fill the partition

do
$$
declare
    counter integer := 0;
begin
    -- fill the September partition with 100,000,000 test rows
    while counter != 100000000
    loop
        insert into measurement values (counter, '2021-09-04', counter, counter + 1);
        counter := counter + 1;
    end loop;
end;
$$;


CREATE TABLE measurement_y2021m10 PARTITION OF measurement
    FOR VALUES FROM ('2021-10-02') TO ('2021-11-01')
    WITH (parallel_workers = 4)
    TABLESPACE tbs_fast;

That is a simple procedure, but quite effective. Now we have two partitions, one filled with data and one empty. Let's move the old (September) partition to the slower tablespace:

Move partition

repmgr=# alter table measurement_y2021m09 set tablespace tbs_slow;
ALTER TABLE
repmgr=#

During the move, PostgreSQL locks the partition with an ACCESS EXCLUSIVE lock, which means no other session can read from or write to it.

Partition Lock

repmgr=#   SELECT a.datname,
         c.relname,
         l.transactionid,
         l.mode,
         l.GRANTED,
         a.usename,
         a.query,
         a.query_start,
         age(now(), a.query_start) AS "age",
         a.pid
    FROM  pg_stat_activity a
     JOIN pg_locks         l ON l.pid = a.pid
     JOIN pg_class         c ON c.oid = l.relation
 where c.relname like 'measurement%'
    ORDER BY a.query_start;
-[ RECORD 1 ]-+----------------------------------------------------------
datname       | repmgr
relname       | measurement_y2021m09
transactionid |
mode          | AccessExclusiveLock
granted       | t
usename       | postgres
query         | alter table measurement_y2021m09 set tablespace tbs_slow;
query_start   | 2021-09-17 07:16:55.676676+00
age           | 00:00:44.439482
pid           | 20994

*We can avoid holding the heavy (read/write blocking) lock for the validation if we add the check constraint as NOT VALID first and validate it separately before attaching, as in the following example (TODO):

BEGIN;
create schema partman;
create extension pg_partman schema partman;
-- keep the existing data aside and recreate the table as a partitioned table
ALTER TABLE auditlog RENAME TO auditlog_old;
CREATE TABLE auditlog ( LIKE auditlog_old INCLUDING ALL ) PARTITION BY RANGE (inserted_at);
CREATE TABLE auditlog_template ( LIKE auditlog_old INCLUDING ALL );
-- add the check constraint as NOT VALID, then validate it separately,
-- so that ATTACH PARTITION can skip scanning the old table
alter table auditlog_old add constraint old_data_constraint check ( inserted_at >= '1900-01-01' and inserted_at <= '2024-04-30' ) NOT VALID;
alter table auditlog_old validate constraint old_data_constraint;
ALTER TABLE auditlog ATTACH PARTITION auditlog_old FOR VALUES FROM ('1900-01-01') TO ('2024-05-01');
-- let pg_partman manage the monthly partitions going forward
SELECT partman.create_parent('public.auditlog',
    p_control := 'inserted_at',
    p_type := 'native',
    p_template_table := 'public.auditlog_template',
    p_interval := 'monthly',
    p_premake := 4,
    p_start_partition := '2024-05-01');
COMMIT;
-- run_maintenance_proc manages its own transactions, so it is called outside the block
CALL partman.run_maintenance_proc(0, false, false);

Of course, operations on other partitions can continue:

Other partitions

repmgr=# insert into measurement values(2,'2021-10-05',1,2);
INSERT 0 1
repmgr=# insert into measurement values(2,'2021-10-05',1,2);
INSERT 0 1
repmgr=# commit
repmgr-# ;
WARNING:  there is no transaction in progress
COMMIT
repmgr=# select * from measurement_y2021m10;
 city_id |  logdate   | peaktemp | unitsales
---------+------------+----------+-----------
       2 | 2021-10-05 |        1 |         2
       2 | 2021-10-05 |        1 |         2

If you try to even select data from the partition being moved, you will be blocked:

Try select on moved partition

repmgr=# select * from measurement_y2021m09 limit 1;

Blocked Select

postgres=# \c repmgr
You are now connected to database "repmgr" as user "postgres".
repmgr=#   SELECT blocked_locks.pid     AS blocked_pid,
repmgr-#          blocked_activity.usename  AS blocked_user,
repmgr-#          blocking_locks.pid     AS blocking_pid,
repmgr-#          blocking_activity.usename AS blocking_user,
repmgr-#          blocked_activity.query    AS blocked_statement,
repmgr-#          blocking_activity.query   AS current_statement_in_blocking_process
repmgr-#    FROM  pg_catalog.pg_locks         blocked_locks
repmgr-#     JOIN pg_catalog.pg_stat_activity blocked_activity  ON blocked_activity.pid = blocked_locks.pid
repmgr-#     JOIN pg_catalog.pg_locks         blocking_locks
repmgr-#         ON blocking_locks.locktype = blocked_locks.locktype
repmgr-#         AND blocking_locks.DATABASE IS NOT DISTINCT FROM blocked_locks.DATABASE
repmgr-#         AND blocking_locks.relation IS NOT DISTINCT FROM blocked_locks.relation
repmgr-#         AND blocking_locks.page IS NOT DISTINCT FROM blocked_locks.page
repmgr-#         AND blocking_locks.tuple IS NOT DISTINCT FROM blocked_locks.tuple
repmgr-#         AND blocking_locks.virtualxid IS NOT DISTINCT FROM blocked_locks.virtualxid
repmgr-#         AND blocking_locks.transactionid IS NOT DISTINCT FROM blocked_locks.transactionid
repmgr-#         AND blocking_locks.classid IS NOT DISTINCT FROM blocked_locks.classid
repmgr-#         AND blocking_locks.objid IS NOT DISTINCT FROM blocked_locks.objid
repmgr-#         AND blocking_locks.objsubid IS NOT DISTINCT FROM blocked_locks.objsubid
repmgr-#         AND blocking_locks.pid != blocked_locks.pid
repmgr-#
repmgr-#     JOIN pg_catalog.pg_stat_activity blocking_activity ON blocking_activity.pid = blocking_locks.pid
repmgr-#    WHERE NOT blocked_locks.GRANTED;
 blocked_pid | blocked_user | blocking_pid | blocking_user |              blocked_statement              |           current_statement_in_blocking_process
-------------+--------------+--------------+---------------+---------------------------------------------+-----------------------------------------------------------
       21142 | postgres     |        20994 | postgres      | select * from measurement_y2021m09 limit 1; | alter table measurement_y2021m09 set tablespace tbs_fast;
(1 row)

We can check the location of the table using the following function:

Check table location

repmgr=# SELECT pg_relation_filepath('measurement_y2021m09');
            pg_relation_filepath
---------------------------------------------
 pg_tblspc/16455/PG_12_201909212/16385/16465
(1 row)
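
If you prefer the tablespace name over the file path, a query along these lines (a sketch joining pg_class to pg_tablespace) can be used:

Check table tablespace (sketch)

-- note: reltablespace is 0 for tables stored in the database's default tablespace
SELECT c.relname, t.spcname
FROM pg_class c
JOIN pg_tablespace t ON t.oid = c.reltablespace
WHERE c.relname = 'measurement_y2021m09';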

While the built-in declarative partitioning is suitable for most common use cases, there are some circumstances where a more flexible approach may be useful. Partitioning can be implemented using table inheritance, which allows for several features which are not supported by declarative partitioning, such as:

Benefits over Declarative

  • Partitioning enforces a rule that all partitions must have exactly the same set of columns as the parent, but table inheritance allows children to have extra columns not present in the parent.
  • Table inheritance allows for multiple inheritance.
  • Declarative partitioning only supports list and range partitioning, whereas table inheritance allows data to be divided in a manner of the user's choosing. (Note, however, that if constraint exclusion is unable to prune partitions effectively, query performance will be very poor.)
  • Some operations require a stronger lock when using declarative partitioning than when using table inheritance. For example, adding or removing a partition to or from a partitioned table requires taking an ACCESS EXCLUSIVE lock on the parent table, whereas a SHARE UPDATE EXCLUSIVE lock is enough in the case of regular inheritance.
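
For illustration only, a minimal inheritance-based setup looks roughly like the following (table and column names mirror the earlier example, but are otherwise hypothetical); child tables carry CHECK constraints so that constraint exclusion can prune them at query time:

Inheritance partitioning (sketch)

CREATE TABLE measurement_inh (
    city_id         int not null,
    logdate         date not null,
    peaktemp        int,
    unitsales       int
);

-- child table holding one month of data
CREATE TABLE measurement_inh_y2021m09 (
    CHECK ( logdate >= DATE '2021-09-01' AND logdate < DATE '2021-10-01' )
) INHERITS (measurement_inh);

-- rows must be routed to the correct child, either by inserting into it
-- directly or via a trigger/rule on the parent
INSERT INTO measurement_inh_y2021m09 VALUES (1, '2021-09-04', 20, 3);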

The autovacuum functionality doesn't return the disk space taken up by dead tuples to the operating system. Running a VACUUM FULL command will do so, but it has performance implications: the target table is exclusively locked during the operation, preventing even reads, and the process makes a full copy of the table, which requires extra disk space while it runs. We recommend running VACUUM FULL only if there is a very high percentage of bloat and queries are suffering badly, and only during periods of lowest database activity.

PostgreSQL uses two configuration parameters to decide when to kick off an autovacuum:

  autovacuum_vacuum_threshold: this has a default value of 50
  autovacuum_vacuum_scale_factor: this has a default value of 0.2

pg_stat_user_tables.n_dead_tup > (pg_class.reltuples x autovacuum_vacuum_scale_factor) + autovacuum_vacuum_threshold

This may be sufficient for small to medium-sized tables. For example, in a table with 10,000 rows, the number of dead rows has to be over 2,050 ((10,000 x 0.2) + 50) before an autovacuum kicks off.
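
To see how close each table is to that trigger point, a query along these lines can help (a sketch; it uses the global settings and ignores any per-table overrides):

Tables vs autovacuum trigger (sketch)

SELECT s.relname,
       s.n_dead_tup,
       c.reltuples * current_setting('autovacuum_vacuum_scale_factor')::float8
         + current_setting('autovacuum_vacuum_threshold')::float8 AS autovacuum_trigger
FROM pg_stat_user_tables s
JOIN pg_class c ON c.oid = s.relid
ORDER BY s.n_dead_tup DESC;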

These parameters can also be overridden per table with ALTER TABLE ... SET, for example:

ALTER TABLE <table_name> SET (autovacuum_analyze_threshold = <threshold rows>)
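
For large tables it is common to lower the vacuum scale factor (and/or raise the threshold) per table; for example (the table name and values below are only illustrative):

Per-table autovacuum tuning (example)

ALTER TABLE big_audit_table SET (
    autovacuum_vacuum_scale_factor = 0.01,
    autovacuum_vacuum_threshold    = 1000
);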

