
- Which postgresql rpm package installed
rpm -qa|grep postgres
- Kernel Settings Check
cat /etc/sysctl.conf
- Check Memory and HugePage Settings
free -ggrep PageTables /proc/meminfogrep Huge /proc/meminfo
- Check disk usage
df -h
- Check linux server performance and stranger processes other than postgresql
top
- check postgresql log for errors and warning
tail -1000f postgresql.log
- Check user and roles
- Application user must not have superuser priviliges
postgres=# \du
- Check public schema and the other schemas privliges.
SELECT n.nspname AS "Name", pg_catalog.pg_get_userbyid(n.nspowner) AS "Owner", pg_catalog.array_to_string(n.nspacl, E'\n') AS "Access privileges",
pg_catalog.obj_description(n.oid, 'pg_namespace') AS "Description"
FROM pg_catalog.pg_namespace n WHERE n.nspname !~ '^pg_' AND n.nspname <> 'information_schema'
- Check extensions
\dxpg_stat_statements⇒ for monitoring sql performancepg_audit⇒ fine grained auditingpgpasswordcheck⇒ for hardening password extensionspgstattuple⇒ analyzing table bloatspgcrypto=> Did you decide Mission or Security critical tables and do logging for these tables,
- also encrypt sensitive data with pg_crypto ( with application vendor)
- Check whether only allowed ip’s are in
pg_hba.conf- check
pg_hba.confif it has true password authentications rules cat pg_hba.conf
- check
- Check backups and did you do any restore recover test from backups ?
- Check Replica Stream replica , cascade replica
select * from pg_stat_replication;
- Check HA (High Availibility) BFM or other.
- What is hit ratio of the database ?
SELECT (blks_hit*100/(blks_hit+blks_read))::numeric as hit_ratio FROM pg_stat_database WHERE datname='<datname>';
-- The hit ratio should be near to 100%, if it is less than 90% then their might me some issue with shared buffers or queries. Increase shared buffers or tune the queries doing more IO.
- IDLE or IDLE in Transaction Connections
- Is there is job Killing idle or IDLE in Transaction connections after <xxx> minutes in crontab
crontab -l
- IDLE Connections
- Is there is job Killing idle connections after <xxx> minutes in crontab
https://github.com/farukcevik/for_PostgreSQL_DBA/blob/main/kill_idle_sessions.sh
-- kill_idle_sessions.sql
SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE usename in ( 'user1' ,'user2')
AND pid <> pg_backend_pid()
AND state = 'idle'
AND state_change < current_timestamp - INTERVAL '10' MINUTE;
- Monitoring Tools, notification and alert mechanism
- Is there a monitoring tools in production and alert mechanism such as if disk capacity is %90 full or cpu usage is more then %90 or active sessions in pg_stat_activitiy is bigger than 10 send an alert email to system and database admins. And notification if backup has succesfully taken etc.
- Frequency of checkpoints ?
select * from pg_stat_bgwriter;- If the
checkpoints_reqis more thancheckpoints_timedthen we need to think of increasing themax_wal_size. Also set thecheckpoint_completion_target=0.9to spread the checkpoint. (checkpoints_req > checkpoints_timedis bad).
- By below query we can find the frequency of checkpoint, if it is less than “
checkpoint_timeout” then reduce the frequency of checkpoint by increasingcheckpoint_timeout, max_wal_size.
WITH sub as (SELECT EXTRACT(EPOCH FROM (now() - stats_reset)) AS seconds_since_start,
(checkpoints_timed+checkpoints_req) AS total_checkpoints
FROM pg_stat_bgwriter
)
SELECT total_checkpoints,seconds_since_start/total_checkpoints/60 AS minutes_between_checkpoints
FROM sub;
set log_checkpoints so that we can find how much data it is writing for checkpoint. else we can use this query to find the number of blocks written during the checkpoint.
- Average number of buffers written per checkpoint ?
select buffers_checkpoint/(checkpoints_timed+checkpoints_req) from pg_stat_bgwriter;
This helps in tuning the max_wal_size/wal_keep_segments
- Get top five tables with highest sequential scans ?
SELECT schemaname, relname, seq_scan, seq_tup_read, seq_tup_read / seq_scan as avg_seq_tup_read FROM pg_stat_all_tables WHERE seq_scan > 0
And pg_total_relation_size(schemaname.relname) > 104857600
ORDER BY 5 DESC LIMIT 5;
- Get list of unused indexes ?
select * from pg_stat_all_indexes where idx_scan=0;
- Get average response time of an instance ?
select (sum(total_time)/sum(calls))::numeric(6,3) from pg_stat_statements;
- PostgreSQL Best Practice postgresql.conf for 32GB Memories
listen_addresses = '*'
max_connections = 150
shared_buffer: 8GB # (must be maximum one in four of total RAM )
shared_preload_libraries = 'pg_stat_statements'
maintenance_work_mem=1GB
bgwriter_delay = 10ms
bgwriter_lru_maxpages = 1000
bgwriter_lru_multiplier = 4.0
wal_level = replica
wal_log_hints = on # (if checksum not enabled)
max_wal_size = 12GB
min_wal_size = 6GB
checkpoint_completion_target = 0.7
checkpoint_timeout =10
archive_mode = on
archive_command = 'test ! -f /mnt/server/archivedir/%f && cp %p /mnt/server/archivedir/%f'
wal_keep_segments = 1000
log_filename = 'postgresql-%a.log'
log_min_duration_statement = 1000
log_line_prefix = '%m [%p] app=%a user=%u db=%d host=%h pid=%p '
log_lock_waits = on
log_temp_files = 0
track_io_timing = on
autovacuum = on #(It is not recommended to turn off)
log_autovacuum_min_duration = 0
autovacuum_vacuum_scale_factor = 0.01
autovacuum_analyze_scale_factor = 0.01
effective_cache_size= 24GB
Bir yanıt yazın