Освобождение свободного места после delete в PostgreSQL
Суть проблемы
У вас есть таблицы, либо ряд таблиц, строки которых нужно очистить и единственный способ, которым вы можете это сделать - это операция DELETE
.
delete from schema.table where condition;
Конечная цель изменения данных - это особождение места (дискового пространства), но на деле, после удаления данных - свободного места больше не становится.
Частично эту проблему решают команды vacuum
, analyze
либо vacuum full
, но у первых двух есть свои проблемы, подробности есть в документации, либо краткое пояснение:
In PostgreSQL, an UPDATE
or DELETE
of a row does not immediately remove the old version of the row. This approach is necessary to gain the benefits of multiversion concurrency control (MVCC, see Chapter 13): the row version must not be deleted while it is still potentially visible to other transactions. But eventually, an outdated or deleted row version is no longer of interest to any transaction. The space it occupies must then be reclaimed for reuse by new rows, to avoid unbounded growth of disk space requirements. This is done by running VACUUM
.
When ever you delete a row, you essentially mark it as inactive for your snapshot. Other snapshots started by prior transactions can still see it. This means it’s still on disk.
Rows aren’t stored in their own file on disk. In order to delete that row, even when you want, you have to essentially rewrite the files stored on disk without that row. DELETE does not do this and under normal circumstances that’s perfectly fine because dead rows that aren’t visible to a transaction can be reclaimed internally by a simply VACUUM. After being marked for reuse new data can be stored in the heap.
So essentially, only operations that internally cause the table’s heap to be rewritten reclaim space. Those operations are:
CLUSTER
VACUUM FULL
TRUNCATE
ALTER TABLE
(forms that rewrite the table)
All of these operations require an ACCESS EXCLUSIVE lock. And, TRUNCATE
is such a sledge hammer it even violates MVCC.
Решение
Исправлять ситуацию будем с помощью утилиты pgcompacttable
, ссылка github.
Ставим зависимости:
- on Debian-based Linux OS with apt-get install libdbi-perl libdbd-pg-perl
- RedHat/Centos with yum install perl-Time-HiRes perl-DBI perl-DBD-Pg
Переходим в интерактивный режим psql
:
sudo -u postgres psql
\c your_database
create extension if not exists pgstattuple;
grant execute on function pgstattuple(text) to <db user>;
grant execute on function pgstattuple(regclass) to <db user>;
grant execute on function pgstattuple_approx to <db user>;
alter user postgres with password 'your new admin password';
\q
Запускаем SQL запрос для проверки данных в таблицах.
select
table_name,
pg_size_pretty(relation_size + toast_relation_size) as total_size,
pg_size_pretty(toast_relation_size) as toast_size,
round(
((relation_size - (relation_size - free_space)*100/fillfactor)*100
/greatest(relation_size, 1))::numeric
, 1) table_waste_percent,
pg_size_pretty(
(relation_size - (relation_size - free_space)*100/fillfactor)::bigint
) table_waste,
round(
((toast_free_space + relation_size - (relation_size - free_space)*100/fillfactor)*100
/greatest(relation_size + toast_relation_size, 1))::numeric
, 1) total_waste_percent,
pg_size_pretty(
(toast_free_space + relation_size - (relation_size - free_space)*100/fillfactor)::bigint
) total_waste
from (
select
(case
when n.nspname = 'public'
then format('%I', c.relname)
else format('%I.%I', n.nspname, c.relname)
end) as table_name,
(select approx_free_space from pgstattuple_approx(c.oid)) as free_space,
pg_relation_size(c.oid) as relation_size,
(case
when reltoastrelid = 0
then 0
else (select free_space from pgstattuple(c.reltoastrelid))
end) as toast_free_space,
coalesce(pg_relation_size(c.reltoastrelid), 0) as toast_relation_size,
coalesce(
(SELECT (regexp_matches(reloptions::text, E'.*fillfactor=(\\d+).*'))[1])
,'100')::real AS fillfactor
from pg_class c
left join pg_namespace n on (n.oid = c.relnamespace)
where
nspname not in ('pg_catalog', 'information_schema')
and nspname !~ '^pg_toast'
and nspname !~ '^pg_temp'
and relkind in ('r', 'm')
and (relpersistence = 'p' or not pg_is_in_recovery())
--put your table name/mask here
and relname ~ :'tablename'
) t
order by (toast_free_space + relation_size - (relation_size - free_space)*100/fillfactor) desc
limit 20;
Далее забираем себе локально perl
скрипт.
git clone https://github.com/dataegret/pgcompacttable.git
И запускаем с необходимыми параметрами, полный список можно посмотреть через:
perl pgcompacttable --man
Команда:
perl pgcompacttable \
--dbname <database_name> \
-n <schema_name> \
-h localhost \
-p 5432 \
-U <user> \
-W <password> \
-t <table_name> \
-f
В дополнение можно провести:
vacuum <table_name>
analyze <table_name>
Большое спасибо всем за внимание! Мой телеграм канал artydev & Co — подписывайтесь!