postgresql_management — last modified 2024/07/18 04:41 by andonovj
  | luca=arw/...
  | =d/...
</code>


<code>
forumdb=> WITH acl AS (
            SELECT relname,
                   (aclexplode(relacl)).grantor,
                   (aclexplode(relacl)).grantee,
                   (aclexplode(relacl)).privilege_type
            FROM pg_class )
          SELECT g.rolname AS grantee,
                 acl.privilege_type AS permission,
                 gg.rolname AS grantor
          FROM acl
          JOIN pg_roles g  ON g.oid  = acl.grantee
          JOIN pg_roles gg ON gg.oid = acl.grantor
          WHERE acl.relname = '...';

   grantee   | permission | grantor
-------------+------------+---------
 ...
</code>
[root@s24-se-db01 pg_wal]#
</code>
+ | |||
+ | |||
+ | ======Table Management====== | ||
+ | |||
+ | =====Creation===== | ||
+ | |||
+ | |||
+ | =====Partitioning===== | ||
+ | Partitioning in PostgreSQL, can be achieved in several ways, but in general are these: | ||
+ | |||
+ | * Declarative | ||
+ | * Using Inheritance | ||
+ | |||
+ | Each has it's pros and cons, we have to start somewhere, so let's start with the declarative partitioning. | ||
+ | |||
+ | |||
+ | ====Declarative==== | ||
+ | PostgreSQL offers a way to specify how to divide a table into pieces called partitions. The table that is divided is referred to as a partitioned table. The specification consists of the partitioning method and a list of columns or expressions to be used as the partition key. | ||
+ | |||
+ | ===Pros=== | ||
+ | * All rows inserted into a partitioned table will be routed to one of the partitions based on the value of the partition key. Each partition has a subset of the data defined by its partition bounds. Currently supported partitioning methods include range and list, where each partition is assigned a range of keys and a list of keys, respectively. | ||
+ | * Partitions may themselves be defined as partitioned tables, using what is called sub-partitioning. Partitions may have their own indexes, constraints and default values, distinct from those of other partitions. Indexes must be created separately for each partition. See CREATE TABLE for more details on creating partitioned tables and partitions. | ||
+ | * It is not possible to turn a regular table into a partitioned table or vice versa. However, it is possible to add a regular or partitioned table containing data as a partition of a partitioned table, or remove a partition from a partitioned table turning it into a standalone table; see ALTER TABLE to learn more about the ATTACH PARTITION and DETACH PARTITION sub-commands. | ||
+ | |||
+ | ===Cons=== | ||
+ | * There is no facility available to create the matching indexes on all partitions automatically. Indexes must be added to each partition with separate commands. This also means that there is no way to create a primary key, unique constraint, or exclusion constraint spanning all partitions; it is only possible to constrain each leaf partition individually. | ||
+ | * Since primary keys are not supported on partitioned tables, foreign keys referencing partitioned tables are not supported, nor are foreign key references from a partitioned table to some other table. | ||
+ | * Using the ON CONFLICT clause with partitioned tables will cause an error, because unique or exclusion constraints can only be created on individual partitions. There is no support for enforcing uniqueness (or an exclusion constraint) across an entire partitioning hierarchy. | ||
+ | * An UPDATE that causes a row to move from one partition to another fails, because the new value of the row fails to satisfy the implicit partition constraint of the original partition. | ||
+ | * Row triggers, if necessary, must be defined on individual partitions, not the partitioned table. | ||
+ | * Mixing temporary and permanent relations in the same partition tree is not allowed. Hence, if the partitioned table is permanent, so must be its partitions and likewise if the partitioned table is temporary. When using temporary relations, all members of the partition tree have to be from the same session. | ||
+ | |||
+ | |||
+ | So let's see how they are done: | ||
+ | |||
+ | |||
+ | ===Implemention=== | ||
+ | We can create a table, using declarative partitioning as follows: | ||
+ | |||
+ | < | ||
+ | CREATE TABLE measurement ( | ||
+ | city_id | ||
+ | logdate | ||
+ | peaktemp | ||
+ | unitsales | ||
+ | ) PARTITION BY RANGE (logdate); | ||
+ | </ | ||
+ | |||
+ | Then we can define a partition as folows: | ||
+ | |||
+ | < | ||
+ | CREATE TABLE measurement_y2021m09 PARTITION OF measurement | ||
+ | FOR VALUES FROM (' | ||
+ | WITH (parallel_workers = 4) | ||
+ | TABLESPACE tbs_fast; | ||
+ | </ | ||
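
To see the routing in action, we can insert a row whose logdate falls within the partition's bounds (a quick sketch; the inserted values are arbitrary):

<code sql>
-- The row's logdate falls inside measurement_y2021m09's bounds,
-- so PostgreSQL routes it into that partition automatically:
INSERT INTO measurement VALUES (1, '2021-09-15', 25, 10);

-- The row is visible both through the parent and the partition:
SELECT * FROM measurement_y2021m09;

-- A filter on the partition key lets the planner prune and scan
-- only the matching partition:
EXPLAIN SELECT * FROM measurement WHERE logdate = '2021-09-15';
</code>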
+ | |||
+ | |||
+ | Let's say, we want to move the partition to a new tablespace (because it became old). We can do that (with a cost) but first, let's create new partition and fill our old partition. | ||
+ | |||
+ | < | ||
+ | do | ||
+ | $$ | ||
+ | declare | ||
+ | counter integer := 0; | ||
+ | begin | ||
+ | |||
+ | |||
+ | while counter != 100000000 | ||
+ | LOOP | ||
+ | |||
+ | insert into measurement values (counter,' | ||
+ | counter = counter + 1; | ||
+ | |||
+ | END LOOP; | ||
+ | |||
+ | end; | ||
+ | $$; | ||
+ | |||
+ | |||
+ | CREATE TABLE measurement_y2021m10 PARTITION OF measurement | ||
+ | FOR VALUES FROM (' | ||
+ | WITH (parallel_workers = 4) | ||
+ | TABLESPACE tbs_fast; | ||
+ | </ | ||
+ | |||
+ | That is simple, procedure, but quite effective. Now we have 2 partitions, one filled with data and one who doesn' | ||
+ | Now, let's move the old partition (09) into the slower tablespace: | ||
+ | |||
+ | |||
+ | < | ||
+ | repmgr=# alter table measurement_y2021m09 set tablespace tbs_slow; | ||
+ | ALTER TABLE | ||
+ | repmgr=# | ||
+ | </ | ||
+ | |||
+ | During the move, PostgreSQL locks the PARTITION with Access Exclusive Lock. Which means, no one can do anything. | ||
+ | |||
+ | < | ||
+ | repmgr=# | ||
+ | | ||
+ | | ||
+ | | ||
+ | | ||
+ | | ||
+ | | ||
+ | | ||
+ | | ||
+ | a.pid | ||
+ | FROM pg_stat_activity a | ||
+ | JOIN pg_locks | ||
+ | JOIN pg_class | ||
+ | where c.relname like ' | ||
+ | ORDER BY a.query_start; | ||
+ | -[ RECORD 1 ]-+---------------------------------------------------------- | ||
+ | datname | ||
+ | relname | ||
+ | transactionid | | ||
+ | mode | AccessExclusiveLock | ||
+ | granted | ||
+ | usename | ||
+ | query | alter table measurement_y2021m09 set tablespace tbs_slow; | ||
+ | query_start | ||
+ | age | 00: | ||
+ | pid | 20994 | ||
+ | </ | ||
+ | |||
+ | *We can eliminate the lock (RW) if add NOT VALID constraint (TODO) | ||
+ | |||
+ | < | ||
+ | BEGIN; | ||
+ | create schema partman; | ||
+ | create extension pg_partman schema partman; | ||
+ | ALTER TABLE auditlog RENAME TO auditlog_old; | ||
+ | CREATE TABLE auditlog ( LIKE auditlog_old including all) PARTITION BY RANGE (inserted_at); | ||
+ | CREATE TABLE auditlog_template (LIKE auditlog_old including all); | ||
+ | alter table auditlog_old add constraint old_data_constraint check ( inserted_at >= ' | ||
+ | alter table auditlog_old validate constraint old_data_constraint; | ||
+ | ALTER TABLE auditlog ATTACH PARTITION auditlog_old FOR VALUES FROM (' | ||
+ | SELECT partman.create_parent(' | ||
+ | p_control := ' | ||
+ | p_type := ' | ||
+ | p_template_table: | ||
+ | p_interval := ' | ||
+ | p_premake := 4, | ||
+ | p_start_partition := ' | ||
+ | CALL partman.run_maintenance_proc(0, | ||
+ | |||
+ | </ | ||
+ | |||
+ | Of course, operations on other partitions can contonue: | ||
+ | |||
+ | < | ||
+ | repmgr=# insert into measurement values(2,' | ||
+ | INSERT 0 1 | ||
+ | repmgr=# insert into measurement values(2,' | ||
+ | INSERT 0 1 | ||
+ | repmgr=# commit | ||
+ | repmgr-# ; | ||
+ | WARNING: | ||
+ | COMMIT | ||
+ | repmgr=# select * from measurement_y2021m10; | ||
+ | | ||
+ | ---------+------------+----------+----------- | ||
+ | 2 | 2021-10-05 | 1 | 2 | ||
+ | 2 | 2021-10-05 | 1 | 2 | ||
+ | </ | ||
+ | |||
+ | |||
+ | If you try, to even select data from the partition which we move, you will be blocked: | ||
+ | |||
+ | < | ||
+ | repmgr=# select * from measurement_y2021m09 limit 1; | ||
+ | </ | ||
+ | |||
+ | |||
+ | < | ||
+ | postgres=# \c repmgr | ||
+ | You are now connected to database " | ||
+ | repmgr=# | ||
+ | repmgr-# | ||
+ | repmgr-# | ||
+ | repmgr-# | ||
+ | repmgr-# | ||
+ | repmgr-# | ||
+ | repmgr-# | ||
+ | repmgr-# | ||
+ | repmgr-# | ||
+ | repmgr-# | ||
+ | repmgr-# | ||
+ | repmgr-# | ||
+ | repmgr-# | ||
+ | repmgr-# | ||
+ | repmgr-# | ||
+ | repmgr-# | ||
+ | repmgr-# | ||
+ | repmgr-# | ||
+ | repmgr-# | ||
+ | repmgr-# | ||
+ | repmgr-# | ||
+ | repmgr-# | ||
+ | repmgr-# | ||
+ | | ||
+ | -------------+--------------+--------------+---------------+---------------------------------------------+----------------------------------------------------------- | ||
+ | 21142 | postgres | ||
+ | (1 row) | ||
+ | </ | ||
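
Since PostgreSQL 9.6 there is also a built-in helper, pg_blocking_pids(), that answers the narrower question "who is blocking this backend?" without the long join above (a sketch; 21142 is the blocked pid from the output):

<code sql>
-- Returns an array with the PIDs of the sessions blocking the given backend:
SELECT pg_blocking_pids(21142);

-- Combine it with pg_stat_activity to see what the blockers are running:
SELECT pid, state, query
FROM pg_stat_activity
WHERE pid = ANY (pg_blocking_pids(21142));
</code>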
+ | |||
+ | |||
+ | We can check the location of the table using the following function: | ||
+ | |||
+ | < | ||
+ | repmgr=# SELECT pg_relation_filepath(' | ||
+ | pg_relation_filepath | ||
+ | --------------------------------------------- | ||
+ | | ||
+ | (1 row) | ||
+ | </ | ||
====Using Inheritance====
While the built-in declarative partitioning is suitable for most common use cases, there are some circumstances where a more flexible approach may be useful. Partitioning can be implemented using table inheritance, which allows for several features not supported by declarative partitioning.

===Benefits over Declarative===
  * Declarative partitioning enforces a rule that all partitions must have exactly the same set of columns as the parent, but table inheritance allows children to have extra columns not present in the parent.
  * Table inheritance allows for multiple inheritance.
  * Declarative partitioning only supports list and range partitioning, whereas table inheritance allows data to be divided in a manner of the user's choosing.
  * Some operations require a stronger lock when using declarative partitioning than when using table inheritance. For example, adding or removing a partition to or from a partitioned table requires taking an ACCESS EXCLUSIVE lock on the parent table, whereas a SHARE UPDATE EXCLUSIVE lock is enough in the case of regular inheritance.
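
As a minimal sketch of the inheritance approach (table and function names are illustrative, not from a real setup): the parent stays empty, children carry CHECK constraints so the planner can exclude them, and a trigger routes inserts, because inheritance does not route rows automatically:

<code sql>
CREATE TABLE measurement_inh (
    city_id   int not null,
    logdate   date not null,
    peaktemp  int,
    unitsales int
);

-- Children inherit all parent columns; the CHECK constraints let the
-- planner skip non-matching children (constraint exclusion):
CREATE TABLE measurement_inh_y2021m09 (
    CHECK (logdate >= DATE '2021-09-01' AND logdate < DATE '2021-10-01')
) INHERITS (measurement_inh);

-- Unlike declarative partitioning, rows are NOT routed automatically;
-- a trigger on the parent has to do it:
CREATE OR REPLACE FUNCTION measurement_inh_insert() RETURNS trigger AS
$$
BEGIN
    IF NEW.logdate >= DATE '2021-09-01' AND NEW.logdate < DATE '2021-10-01' THEN
        INSERT INTO measurement_inh_y2021m09 VALUES (NEW.*);
    ELSE
        RAISE EXCEPTION 'no partition for logdate %', NEW.logdate;
    END IF;
    RETURN NULL;  -- prevent the row from also landing in the parent
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER measurement_inh_route
    BEFORE INSERT ON measurement_inh
    FOR EACH ROW EXECUTE FUNCTION measurement_inh_insert();
</code>

Note that EXECUTE FUNCTION requires PostgreSQL 11+; older releases use EXECUTE PROCEDURE instead.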
+ | |||
+ | |||
+ | =====VACUUMING===== | ||
+ | Vacuuming should be part of routine database maintenance, | ||
+ | Don’t run manual VACUUM or ANALYZE without reason. | ||
+ | Database administrators should refrain from running manual vacuums too often on the entire database, as the autovacuum process might already have optimally vacuumed the target database. As a result, a manual vacuum may not remove any dead tuples but cause unnecessary I/O loads or CPU spikes. | ||
+ | If necessary, manual vacuums should be run on a table-by-table basis only when necessary, like when there are low ratios of live rows to dead rows or large gaps between autovacuum operations. They should also be run when user activity is minimum. | ||
+ | Autovacuum also keeps a table’s data distribution statistics up-to-date (it doesn’t rebuild them). When manually run, the ANALYZE command rebuilds these statistics instead of updating them. Again, rebuilding statistics when they’re already optimally updated by a regular autovacuum might cause unnecessary pressure on system resources. | ||
+ | The time when you must run ANALYZE manually is immediately after bulk loading data into the target table. A large number (even a few hundred) of new rows in an existing table will significantly skew its column data distribution. The new rows will cause any existing column statistics to be out-of-date. When the query optimizer uses such statistics, query performance can be really slow. | ||
+ | In these cases, running the ANALYZE command immediately after a data load to rebuild the statistics completely is better than waiting for the autovacuum to kick in. | ||
+ | Select VACUUM FULL only when performance degrades badly | ||
+ | The autovacuum functionality doesn’t recover disk space taken up by dead tuples. Running a VACUUM FULL command will do so, but has performance implications. The target table is exclusively locked during the operation, preventing even reads on the table. The process also makes a full copy of the table, which requires extra disk space when it runs. We recommend only running VACUUM FULL if there is a very high percentage of bloat and queries are suffering badly. We also recommend using periods of lowest database activity for it. | ||
+ | |||
+ | Fine-tune Autovacuum Threshold | ||
+ | It’s essential to check or tune the autovacuum and analyze configuration parameters in the postgresql.conf file or in individual table properties to strike a balance between autovacuum and performance gain. | ||
+ | PostgreSQL uses two configuration parameters to decide when to kick off an autovacuum: | ||
+ | - autovacuum_vacuum_threshold: | ||
+ | - autovacuum_vacuum_scale_factor: | ||
+ | |||
+ | Together, these parameters tell PostgreSQL to start an autovacuum when the number of dead rows in a table exceeds the number of rows in that table multiplied by the scale factor plus the vacuum threshold. In other words, PostgreSQL will start autovacuum on a table when: | ||
+ | pg_stat_user_tables.n_dead_tup > (pg_class.reltuples x autovacuum_vacuum_scale_factor) | ||
+ | |||
+ | This may be sufficient for small to medium-sized tables. For example, in a table with 10,000 rows, the number of dead rows has to be over 2,050 ((10,000 x 0.2) + 50) before an autovacuum kicks off. | ||
+ | Not every table in a database experiences the same rate of data modification. Usually, a few large tables will experience frequent data modifications, | ||
+ | |||
+ | Therefore, the goal should be to set these thresholds to optimal values so autovacuum can happen at regular intervals and don’t take a long time (and affect user sessions) while keeping the number of dead rows relatively low. | ||
+ | |||
+ | One approach is to use one or the other parameter. So, if we set autovacuum_vacuum_scale_factor to 0 and instead set autovacuum_vacuum_threshold to, say, 5,000, a table will be autovacuumed when its number of dead rows is more than 5,000. | ||
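
For example, for a large table where the default 20% scale factor would mean millions of dead rows, the fixed threshold can be set per table (a sketch; the table name is illustrative):

<code sql>
-- Rely on a fixed dead-row count instead of a fraction of the table size:
ALTER TABLE big_history_table SET (
    autovacuum_vacuum_scale_factor = 0,
    autovacuum_vacuum_threshold    = 5000
);

-- The per-table overrides are stored in pg_class.reloptions:
SELECT reloptions FROM pg_class WHERE relname = 'big_history_table';
</code>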
+ | |||
+ | Fine-tune Autoanalyze Threshold | ||
+ | Similar to autovacuum, autoanalyze also uses two parameters that decide when autovacuum will also trigger an autoanalyze: | ||
+ | - autovacuum_analyze_threshold: | ||
+ | - autovacuum_analyze_scale_factor: | ||
+ | |||
+ | Like autovacuum, the autovacuum_analyze_threshold parameter can be set to a value that dictates the number of inserted, deleted, or updated tuples in a table before an autoanalyze starts. We recommend setting this parameter separately on large and high-transaction tables. The table configuration will override the postgresql.conf values. | ||
+ | The code snippet below shows the SQL syntax for modifying the autovacuum_analyze_threshold setting for a table. | ||
+ | ALTER TABLE < | ||
+ | |||
+ | Fine-tune Autovacuum workers | ||
+ | Another parameter often overlooked is autovacuum_max_workers, | ||
+ | A common practice by PostgreSQL DBAs is to increase the number of maximum worker threads to speed up autovacuum. This doesn’t work as all the threads share the same autovacuum_vacuum_cost_limit, | ||
+ | |||
+ | individual thread’s cost_limit = autovacuum_vacuum_cost_limit / autovacuum_max_workers | ||
+ | |||
+ | The cost of work done by an autovacuum thread is calculated using three parameters: | ||
+ | - vacuum_cost_page_hit: | ||
+ | - vacuum_cost_page_miss: | ||
+ | - vacuum_cost_page_dirty: | ||
+ | |||
+ | What these parameters mean is this: | ||
+ | When a vacuum thread finds the data page that it’s supposed to clean in the shared buffer, the cost is 1. | ||
+ | If the data page is not in the shared buffer but the OS cache, the cost will be 10. | ||
+ | If the page has to be marked dirty because the vacuum thread had to delete dead rows, the cost will be 20. | ||
+ | An increased number of worker threads will lower the cost limit for each thread. As each thread is assigned a lower cost limit, it will go to sleep more often as the cost threshold is easily reached, ultimately causing the whole vacuum process to run slow. We recommend increasing the autovacuum_vacuum_cost_limit to a higher value, like 2000, and then adjusting the maximum number of worker threads. | ||
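
A sketch of raising the global budget to a value like 2000 (ALTER SYSTEM writes to postgresql.auto.conf; this parameter only needs a reload, not a restart):

<code sql>
-- Raise the shared autovacuum I/O budget; all workers divide this value among themselves:
ALTER SYSTEM SET autovacuum_vacuum_cost_limit = 2000;

-- Apply it with a configuration reload and verify:
SELECT pg_reload_conf();
SHOW autovacuum_vacuum_cost_limit;
</code>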

A better way is to tune these parameters for individual tables, and only when necessary. For example, if the autovacuum of a large transactional table is taking too long, the table may be temporarily configured to use its own vacuum cost limit and cost delay. The cost limit and delay will override the system-wide values set in postgresql.conf.

The code snippets below show how to configure individual tables:

  * ALTER TABLE <table_name> SET (autovacuum_vacuum_cost_limit = <value>);
  * ALTER TABLE <table_name> SET (autovacuum_vacuum_cost_delay = <value>);

Using the first parameter will ensure the autovacuum thread assigned to the table performs more work before going to sleep. Lowering autovacuum_vacuum_cost_delay will also mean the thread sleeps for less time.

===Examples===