PostgreSQL Installation and Configuration: Enterprise-Grade Database Setup Guide

Introduction

PostgreSQL, often called "Postgres," stands as one of the most advanced open-source relational database management systems available today. With over 35 years of active development, PostgreSQL has earned its reputation as a powerful, reliable, and feature-rich database that rivals commercial solutions in capabilities while maintaining its open-source roots.

Unlike many database systems that prioritize speed over features, PostgreSQL takes a comprehensive approach, offering advanced features such as complex queries, foreign keys, triggers, updatable views, transactional integrity, and multiversion concurrency control (MVCC). This makes it an excellent choice for applications requiring data integrity, complex queries, and enterprise-level features.

This comprehensive guide provides step-by-step instructions for installing and configuring PostgreSQL on various Linux distributions, covering everything from basic installation to advanced configuration, security hardening, and performance optimization.

Why Choose PostgreSQL?

PostgreSQL is trusted by organizations worldwide, from startups to Fortune 500 companies. Major users include Apple, Fujitsu, Red Hat, Cisco, and Instagram. The database excels in several key areas:

Advanced Features:

  • Full ACID compliance
  • Complex SQL queries and subqueries
  • JSON and JSONB support for document storage
  • Full-text search capabilities
  • Geospatial data support with PostGIS
  • Custom data types and functions
  • Advanced indexing methods (B-tree, Hash, GiST, SP-GiST, GIN, BRIN)

Enterprise Capabilities:

  • Multi-version concurrency control (MVCC)
  • Point-in-time recovery
  • Tablespaces for flexible storage management
  • Asynchronous replication
  • Hot standby servers
  • Logical replication

Prerequisites

Before beginning the installation process, ensure your system meets these requirements:

System Requirements

Minimum Requirements:

  • Linux server (Ubuntu 20.04+, Debian 11+, CentOS 8+, Rocky Linux 8+)
  • Root or sudo privileges
  • 1GB RAM minimum
  • 10GB available disk space
  • Internet connectivity for package downloads

Recommended for Production:

  • 4+ CPU cores
  • 16GB+ RAM
  • SSD storage with 100GB+ space
  • Dedicated database server
  • Reliable network connectivity

Required Knowledge

  • Basic Linux command-line proficiency
  • Understanding of SQL fundamentals
  • Familiarity with text editors (nano, vim)
  • Basic networking concepts

Installation

Installing PostgreSQL on Ubuntu/Debian

PostgreSQL provides official APT repositories for Debian-based distributions:

# Update package index
sudo apt update

# Install PostgreSQL (latest version from default repository)
sudo apt install postgresql postgresql-contrib -y

# Verify installation
sudo systemctl status postgresql

# Enable PostgreSQL to start on boot
sudo systemctl enable postgresql

For the latest PostgreSQL version from official repository:

# Create the file repository configuration
sudo sh -c 'echo "deb http://apt.postgresql.org/pub/repos/apt $(lsb_release -cs)-pgdg main" > /etc/apt/sources.list.d/pgdg.list'

# Import the repository signing key
wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add -

# Update package lists
sudo apt update

# Install PostgreSQL 16 (or specify your desired version)
sudo apt install postgresql-16 postgresql-contrib-16 -y

# Verify installation
psql --version

Check the service status:

# Check PostgreSQL status
sudo systemctl status postgresql

# View all PostgreSQL processes
ps aux | grep postgres

Installing PostgreSQL on CentOS/Rocky Linux

For Red Hat-based distributions:

# Update system packages
sudo dnf update -y

# Install PostgreSQL from default repository
sudo dnf install postgresql-server postgresql-contrib -y

# Initialize database cluster
sudo postgresql-setup --initdb

# Start PostgreSQL service
sudo systemctl start postgresql

# Enable PostgreSQL to start on boot
sudo systemctl enable postgresql

# Verify installation
sudo systemctl status postgresql

For the latest PostgreSQL version:

# Install PostgreSQL repository RPM
sudo dnf install -y https://download.postgresql.org/pub/repos/yum/reporpms/EL-9-x86_64/pgdg-redhat-repo-latest.noarch.rpm

# Disable built-in PostgreSQL module (if exists)
sudo dnf -qy module disable postgresql

# Install PostgreSQL 16
sudo dnf install -y postgresql16-server postgresql16-contrib

# Initialize the database
sudo /usr/pgsql-16/bin/postgresql-16-setup initdb

# Start and enable service
sudo systemctl enable postgresql-16
sudo systemctl start postgresql-16

# Verify installation
/usr/pgsql-16/bin/psql --version

Verifying Installation

Confirm PostgreSQL is running correctly:

# Check version
psql --version

# Check service status
sudo systemctl status postgresql

# View PostgreSQL data directory
ls -la /var/lib/pgsql/data/
# or for Debian/Ubuntu
ls -la /var/lib/postgresql/16/main/

# Check listening ports
sudo ss -tulpn | grep postgres
sudo netstat -tulpn | grep postgres

# View PostgreSQL logs
sudo tail -f /var/log/postgresql/postgresql-16-main.log
# or
sudo journalctl -u postgresql -f

Initial Configuration

Understanding PostgreSQL Directory Structure

Ubuntu/Debian:

  • Configuration: /etc/postgresql/16/main/
  • Data directory: /var/lib/postgresql/16/main/
  • Binaries: /usr/lib/postgresql/16/bin/
  • Logs: /var/log/postgresql/

CentOS/Rocky Linux:

  • Configuration: /var/lib/pgsql/16/data/
  • Data directory: /var/lib/pgsql/16/data/
  • Binaries: /usr/pgsql-16/bin/
  • Logs: /var/lib/pgsql/16/data/log/

Accessing PostgreSQL

Switch to the postgres user and access the database:

# Switch to postgres user
sudo -i -u postgres

# Access PostgreSQL prompt
psql

# View current databases
\l

# View current users
\du

# Exit psql
\q

# Exit postgres user session
exit

Alternative one-liner access:

# Execute command as postgres user
sudo -u postgres psql

# Execute specific SQL command
sudo -u postgres psql -c "SELECT version();"

Setting Up Authentication

PostgreSQL uses two main configuration files for authentication:

1. postgresql.conf - Main configuration file 2. pg_hba.conf - Client authentication configuration

Edit pg_hba.conf:

# Ubuntu/Debian
sudo nano /etc/postgresql/16/main/pg_hba.conf

# CentOS/Rocky
sudo nano /var/lib/pgsql/16/data/pg_hba.conf

Understanding pg_hba.conf format:

# TYPE  DATABASE        USER            ADDRESS                 METHOD

# Local connections
local   all             postgres                                peer
local   all             all                                     md5

# IPv4 connections
host    all             all             127.0.0.1/32            md5
host    all             all             0.0.0.0/0               md5

# IPv6 connections
host    all             all             ::1/128                 md5

Common authentication methods:

  • peer: Uses operating system username (secure for local connections)
  • md5: Password-based authentication with MD5 hashing
  • scram-sha-256: Modern password authentication (recommended)
  • trust: No authentication (never use in production)
  • reject: Reject connection

Recommended production configuration:

# TYPE  DATABASE        USER            ADDRESS                 METHOD

# Local connections
local   all             postgres                                peer
local   all             all                                     scram-sha-256

# IPv4 local connections
host    all             all             127.0.0.1/32            scram-sha-256

# IPv6 local connections
host    all             all             ::1/128                 scram-sha-256

# Remote connections from specific subnet
host    all             all             192.168.1.0/24          scram-sha-256

# Reject all other connections
host    all             all             0.0.0.0/0               reject

Configuring postgres User Password

Set a password for the postgres superuser:

# Access PostgreSQL as postgres user
sudo -u postgres psql

# Set password
ALTER USER postgres WITH PASSWORD 'your_strong_password';

# Exit
\q

Creating Database Users

Create application-specific users:

-- Create new user with password
CREATE USER myapp_user WITH PASSWORD 'secure_password';

-- Create user with specific privileges
CREATE USER readonly_user WITH PASSWORD 'password';

-- Create database
CREATE DATABASE myapp_db;

-- Grant privileges to user
GRANT ALL PRIVILEGES ON DATABASE myapp_db TO myapp_user;

-- Connect to database
\c myapp_db

-- Grant schema privileges
GRANT ALL ON SCHEMA public TO myapp_user;

-- Grant table privileges (after tables are created)
GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA public TO myapp_user;
GRANT ALL PRIVILEGES ON ALL SEQUENCES IN SCHEMA public TO myapp_user;

-- Make privileges default for future objects
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT ALL ON TABLES TO myapp_user;
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT ALL ON SEQUENCES TO myapp_user;

Create read-only user:

-- Create read-only user
CREATE USER readonly_user WITH PASSWORD 'readonly_password';

-- Grant connection
GRANT CONNECT ON DATABASE myapp_db TO readonly_user;

-- Switch to database
\c myapp_db

-- Grant usage on schema
GRANT USAGE ON SCHEMA public TO readonly_user;

-- Grant SELECT on all existing tables
GRANT SELECT ON ALL TABLES IN SCHEMA public TO readonly_user;

-- Grant SELECT on future tables
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO readonly_user;

Enabling Remote Connections

Edit postgresql.conf:

# Ubuntu/Debian
sudo nano /etc/postgresql/16/main/postgresql.conf

# CentOS/Rocky
sudo nano /var/lib/pgsql/16/data/postgresql.conf

Modify the listen_addresses parameter:

# Listen on all interfaces
listen_addresses = '*'

# Or specific IP address
listen_addresses = '192.168.1.100'

# Or multiple addresses
listen_addresses = 'localhost,192.168.1.100'

# Default port (change if needed)
port = 5432

Update pg_hba.conf to allow remote connections:

# Allow connections from specific IP
host    all             all             192.168.1.50/32         scram-sha-256

# Allow from subnet
host    all             all             192.168.1.0/24          scram-sha-256

Apply changes:

# Restart PostgreSQL
sudo systemctl restart postgresql

# Or reload configuration without restart
sudo systemctl reload postgresql

# Verify listening addresses
sudo ss -tulpn | grep 5432

Security Hardening

Firewall Configuration

Configure firewall to restrict PostgreSQL access:

# UFW (Ubuntu/Debian)
# Allow from specific IP
sudo ufw allow from 192.168.1.50 to any port 5432

# Allow from subnet
sudo ufw allow from 192.168.1.0/24 to any port 5432

# Check rules
sudo ufw status numbered

# firewalld (CentOS/Rocky)
# Allow from specific IP
sudo firewall-cmd --permanent --add-rich-rule='rule family="ipv4" source address="192.168.1.50" port protocol="tcp" port="5432" accept'

# Allow from subnet
sudo firewall-cmd --permanent --add-rich-rule='rule family="ipv4" source address="192.168.1.0/24" port protocol="tcp" port="5432" accept'

# Reload firewall
sudo firewall-cmd --reload

# Verify rules
sudo firewall-cmd --list-all

SSL/TLS Configuration

Generate SSL certificates:

# Switch to postgres user
sudo -i -u postgres

# Navigate to data directory
cd /var/lib/postgresql/16/main  # Ubuntu/Debian
# or
cd /var/lib/pgsql/16/data       # CentOS/Rocky

# Generate self-signed certificate
openssl req -new -x509 -days 365 -nodes -text \
    -out server.crt \
    -keyout server.key \
    -subj "/CN=postgres.example.com"

# Set proper permissions
chmod 600 server.key
chmod 644 server.crt

# Exit postgres user
exit

Enable SSL in postgresql.conf:

# SSL Settings
ssl = on
ssl_cert_file = 'server.crt'
ssl_key_file = 'server.key'
ssl_ciphers = 'HIGH:MEDIUM:+3DES:!aNULL'
ssl_prefer_server_ciphers = on
ssl_min_protocol_version = 'TLSv1.2'

Enforce SSL in pg_hba.conf:

# Require SSL for remote connections
hostssl    all             all             192.168.1.0/24          scram-sha-256

Restart PostgreSQL:

sudo systemctl restart postgresql

# Test SSL connection
psql "sslmode=require host=localhost dbname=postgres user=postgres"

Password Policy Enforcement

Install passwordcheck extension:

# Ubuntu/Debian
sudo apt install postgresql-16-passwordcheck

# CentOS/Rocky
sudo dnf install postgresql16-contrib

Configure in postgresql.conf:

# Load passwordcheck extension
shared_preload_libraries = 'passwordcheck'

# Password requirements (customize as needed)
passwordcheck.minimum_length = 12

Restart and test:

sudo systemctl restart postgresql

# Test with weak password (should fail)
sudo -u postgres psql -c "CREATE USER testuser WITH PASSWORD '123';"

Role-Based Access Control

Implement proper role hierarchy:

-- Create roles for different access levels
CREATE ROLE readonly;
CREATE ROLE readwrite;
CREATE ROLE admin;

-- Grant privileges to roles
GRANT CONNECT ON DATABASE myapp_db TO readonly;
GRANT CONNECT ON DATABASE myapp_db TO readwrite;
GRANT CONNECT ON DATABASE myapp_db TO admin;

-- Switch to database
\c myapp_db

-- Readonly role privileges
GRANT USAGE ON SCHEMA public TO readonly;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO readonly;
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO readonly;

-- Readwrite role privileges
GRANT USAGE, CREATE ON SCHEMA public TO readwrite;
GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA public TO readwrite;
GRANT USAGE ON ALL SEQUENCES IN SCHEMA public TO readwrite;
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT, INSERT, UPDATE, DELETE ON TABLES TO readwrite;
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT USAGE ON SEQUENCES TO readwrite;

-- Admin role privileges
GRANT ALL PRIVILEGES ON DATABASE myapp_db TO admin;
GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA public TO admin;
GRANT ALL PRIVILEGES ON ALL SEQUENCES IN SCHEMA public TO admin;

-- Create users and assign roles
CREATE USER app_user WITH PASSWORD 'password';
GRANT readwrite TO app_user;

CREATE USER report_user WITH PASSWORD 'password';
GRANT readonly TO report_user;

CREATE USER admin_user WITH PASSWORD 'password';
GRANT admin TO admin_user;

Auditing and Logging

Configure comprehensive logging in postgresql.conf:

# Logging Configuration
logging_collector = on
log_directory = 'log'
log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log'
log_rotation_age = 1d
log_rotation_size = 100MB
log_truncate_on_rotation = on

# What to log
log_connections = on
log_disconnections = on
log_duration = off
log_line_prefix = '%t [%p]: [%l-1] user=%u,db=%d,app=%a,client=%h '
log_lock_waits = on
log_statement = 'ddl'
log_temp_files = 0

# Log slow queries
log_min_duration_statement = 1000  # Log queries taking > 1 second

# Error logging
log_min_error_statement = error
log_min_messages = warning

Install and configure pgAudit for detailed auditing:

# Ubuntu/Debian
sudo apt install postgresql-16-pgaudit

# CentOS/Rocky
sudo dnf install pgaudit_16

Configure pgAudit:

# Load pgaudit
shared_preload_libraries = 'pgaudit'

# Audit settings
pgaudit.log = 'all'
pgaudit.log_catalog = off
pgaudit.log_parameter = on
pgaudit.log_relation = on
pgaudit.log_statement_once = on

Performance Optimization

Memory Configuration

Edit postgresql.conf with optimized memory settings:

# Memory Settings
shared_buffers = 4GB                    # 25% of RAM for dedicated server
effective_cache_size = 12GB             # 75% of RAM
maintenance_work_mem = 1GB              # For maintenance operations
work_mem = 64MB                         # Per query operation memory
huge_pages = try                        # Use huge pages if available

# WAL Settings
wal_buffers = 16MB
min_wal_size = 1GB
max_wal_size = 4GB
checkpoint_completion_target = 0.9

Memory calculation guidelines:

For dedicated database servers:

  • shared_buffers: 25% of total RAM (max 8-16GB)
  • effective_cache_size: 75% of total RAM
  • maintenance_work_mem: 5% of RAM (max 2GB)
  • work_mem: RAM / (max_connections * 2-3)

Example for 32GB RAM server:

shared_buffers = 8GB
effective_cache_size = 24GB
maintenance_work_mem = 2GB
work_mem = 64MB
max_connections = 100

Query Planning and Execution

Optimize query planner settings:

# Query Planning
random_page_cost = 1.1                  # For SSD storage
effective_io_concurrency = 200          # For SSD storage
default_statistics_target = 100         # Statistics detail level

For traditional HDD:

random_page_cost = 4.0
effective_io_concurrency = 2

Connection and Concurrency Settings

# Connection Settings
max_connections = 100                   # Adjust based on workload
superuser_reserved_connections = 3

# Background Worker Settings
max_worker_processes = 8
max_parallel_workers_per_gather = 4
max_parallel_workers = 8
max_parallel_maintenance_workers = 4

Autovacuum Optimization

Autovacuum is critical for PostgreSQL performance:

# Autovacuum Settings
autovacuum = on
autovacuum_max_workers = 3
autovacuum_naptime = 30s
autovacuum_vacuum_threshold = 50
autovacuum_vacuum_scale_factor = 0.1
autovacuum_analyze_threshold = 50
autovacuum_analyze_scale_factor = 0.05
autovacuum_vacuum_cost_delay = 10ms
autovacuum_vacuum_cost_limit = 1000

Manual vacuum operations:

-- Analyze all databases
VACUUM ANALYZE;

-- Vacuum specific table
VACUUM ANALYZE table_name;

-- Full vacuum (reclaims space, requires exclusive lock)
VACUUM FULL table_name;

-- Check table bloat
SELECT
    schemaname,
    tablename,
    pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename)) AS size,
    n_live_tup,
    n_dead_tup,
    round(n_dead_tup * 100.0 / NULLIF(n_live_tup + n_dead_tup, 0), 2) AS dead_ratio
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;

Index Optimization

-- Find missing indexes
SELECT
    schemaname,
    tablename,
    seq_scan,
    seq_tup_read,
    idx_scan,
    seq_tup_read / seq_scan AS avg_seq_tup_read
FROM pg_stat_user_tables
WHERE seq_scan > 0
ORDER BY seq_tup_read DESC
LIMIT 10;

-- Find unused indexes
SELECT
    schemaname,
    tablename,
    indexname,
    idx_scan,
    pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE idx_scan = 0
AND schemaname NOT IN ('pg_catalog', 'information_schema')
ORDER BY pg_relation_size(indexrelid) DESC;

-- Rebuild bloated index
REINDEX INDEX index_name;

-- Rebuild all indexes on table
REINDEX TABLE table_name;

Monitoring and Statistics

Enable track activities:

# Statistics Settings
track_activities = on
track_counts = on
track_io_timing = on
track_functions = all
stats_temp_directory = '/var/run/postgresql/stats_temp'

Useful monitoring queries:

-- Current activity
SELECT
    pid,
    usename,
    application_name,
    client_addr,
    state,
    query,
    query_start
FROM pg_stat_activity
WHERE state != 'idle'
ORDER BY query_start;

-- Long-running queries
SELECT
    pid,
    now() - query_start AS duration,
    state,
    query
FROM pg_stat_activity
WHERE state != 'idle'
AND now() - query_start > interval '5 minutes'
ORDER BY duration DESC;

-- Database sizes
SELECT
    datname,
    pg_size_pretty(pg_database_size(datname)) AS size
FROM pg_database
ORDER BY pg_database_size(datname) DESC;

-- Table sizes with indexes
SELECT
    schemaname,
    tablename,
    pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename)) AS total_size,
    pg_size_pretty(pg_relation_size(schemaname||'.'||tablename)) AS table_size,
    pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename) - pg_relation_size(schemaname||'.'||tablename)) AS index_size
FROM pg_tables
WHERE schemaname NOT IN ('pg_catalog', 'information_schema')
ORDER BY pg_total_relation_size(schemaname||'.'||tablename) DESC
LIMIT 20;

-- Cache hit ratio (should be > 99%)
SELECT
    sum(heap_blks_read) as heap_read,
    sum(heap_blks_hit)  as heap_hit,
    sum(heap_blks_hit) / (sum(heap_blks_hit) + sum(heap_blks_read)) * 100 AS cache_hit_ratio
FROM pg_statio_user_tables;

Backup Strategies

Logical Backups with pg_dump

Basic backup commands:

# Backup single database
pg_dump -U postgres -d myapp_db -f backup_$(date +%Y%m%d).sql

# Backup with compression
pg_dump -U postgres -d myapp_db | gzip > backup_$(date +%Y%m%d).sql.gz

# Custom format backup (compressed, allows parallel restore)
pg_dump -U postgres -d myapp_db -Fc -f backup_$(date +%Y%m%d).dump

# Directory format (allows parallel backup and restore)
pg_dump -U postgres -d myapp_db -Fd -j 4 -f backup_$(date +%Y%m%d)

# Backup all databases
pg_dumpall -U postgres -f all_databases_$(date +%Y%m%d).sql

# Backup only schema
pg_dump -U postgres -d myapp_db -s -f schema_$(date +%Y%m%d).sql

# Backup only data
pg_dump -U postgres -d myapp_db -a -f data_$(date +%Y%m%d).sql

# Backup specific tables
pg_dump -U postgres -d myapp_db -t table1 -t table2 -f tables_$(date +%Y%m%d).sql

Physical Backups with pg_basebackup

# Create base backup
pg_basebackup -U postgres -D /backup/pgdata_$(date +%Y%m%d) -Ft -z -P

# Backup with WAL files
pg_basebackup -U postgres -D /backup/pgdata_$(date +%Y%m%d) -Ft -z -Xs -P

# Options explained:
# -D: Output directory
# -Ft: Tar format
# -z: Compress with gzip
# -Xs: Include WAL files in backup
# -P: Show progress

Automated Backup Script

Create comprehensive backup script:

sudo nano /usr/local/bin/pg-backup.sh

Add content:

#!/bin/bash

# Configuration
BACKUP_DIR="/backups/postgresql"
PG_USER="postgres"
RETENTION_DAYS=7
DATE=$(date +%Y%m%d_%H%M%S)
LOG_FILE="$BACKUP_DIR/backup.log"

# Create backup directory
mkdir -p $BACKUP_DIR

# Function to log messages
log_message() {
    echo "[$(date '+%Y-%m-%d %H:%M:%S')] $1" | tee -a $LOG_FILE
}

# Start backup
log_message "Starting PostgreSQL backup"

# Backup all databases
pg_dumpall -U $PG_USER | gzip > $BACKUP_DIR/all_databases_$DATE.sql.gz

if [ $? -eq 0 ]; then
    log_message "Backup completed successfully: all_databases_$DATE.sql.gz"
else
    log_message "ERROR: Backup failed"
    exit 1
fi

# Backup individual databases
psql -U $PG_USER -t -c "SELECT datname FROM pg_database WHERE datistemplate = false AND datname != 'postgres';" | while read dbname; do
    if [ ! -z "$dbname" ]; then
        pg_dump -U $PG_USER -Fc $dbname > $BACKUP_DIR/${dbname}_$DATE.dump
        log_message "Database backup completed: ${dbname}_$DATE.dump"
    fi
done

# Remove old backups
find $BACKUP_DIR -name "*.sql.gz" -mtime +$RETENTION_DAYS -delete
find $BACKUP_DIR -name "*.dump" -mtime +$RETENTION_DAYS -delete
log_message "Old backups removed (older than $RETENTION_DAYS days)"

# Calculate backup size
TOTAL_SIZE=$(du -sh $BACKUP_DIR | cut -f1)
log_message "Total backup size: $TOTAL_SIZE"

log_message "Backup process completed"

Make executable and schedule:

# Make executable
sudo chmod +x /usr/local/bin/pg-backup.sh

# Test the script
sudo /usr/local/bin/pg-backup.sh

# Schedule with cron (daily at 2 AM)
sudo crontab -e

# Add line:
0 2 * * * /usr/local/bin/pg-backup.sh

Point-in-Time Recovery (PITR)

Configure WAL archiving:

# WAL Archiving
wal_level = replica
archive_mode = on
archive_command = 'test ! -f /backup/wal_archive/%f && cp %p /backup/wal_archive/%f'
archive_timeout = 300

Create WAL archive directory:

sudo mkdir -p /backup/wal_archive
sudo chown postgres:postgres /backup/wal_archive
sudo chmod 700 /backup/wal_archive

Restart PostgreSQL:

sudo systemctl restart postgresql

Restore Operations

Restore from pg_dump:

# Restore SQL format
psql -U postgres -d myapp_db -f backup_20260111.sql

# Restore compressed backup
gunzip < backup_20260111.sql.gz | psql -U postgres -d myapp_db

# Restore custom format (with parallel jobs)
pg_restore -U postgres -d myapp_db -j 4 backup_20260111.dump

# Restore all databases
psql -U postgres -f all_databases_20260111.sql

# Create database before restore
createdb -U postgres myapp_db
pg_restore -U postgres -d myapp_db backup_20260111.dump

Restore from base backup:

# Stop PostgreSQL
sudo systemctl stop postgresql

# Clear data directory
sudo rm -rf /var/lib/postgresql/16/main/*

# Extract backup
sudo tar -xzf /backup/base.tar.gz -C /var/lib/postgresql/16/main/

# Create recovery.signal file
sudo touch /var/lib/postgresql/16/main/recovery.signal

# Configure recovery (if needed)
sudo nano /var/lib/postgresql/16/main/postgresql.auto.conf

# Add:
restore_command = 'cp /backup/wal_archive/%f %p'
recovery_target_time = '2026-01-11 14:30:00'

# Fix permissions
sudo chown -R postgres:postgres /var/lib/postgresql/16/main

# Start PostgreSQL
sudo systemctl start postgresql

Troubleshooting

Connection Issues

Cannot connect to PostgreSQL:

# Check if PostgreSQL is running
sudo systemctl status postgresql

# Check listening ports
sudo ss -tulpn | grep postgres

# Check logs
sudo tail -f /var/log/postgresql/postgresql-16-main.log
sudo journalctl -u postgresql -f

# Test local connection
psql -U postgres

# Test remote connection
psql -h 192.168.1.100 -U postgres -d postgres

# Check pg_hba.conf
sudo cat /etc/postgresql/16/main/pg_hba.conf | grep -v "^#" | grep -v "^$"

Authentication failures:

# Verify user exists
sudo -u postgres psql -c "\du"

# Reset user password
sudo -u postgres psql -c "ALTER USER username WITH PASSWORD 'newpassword';"

# Check authentication method in pg_hba.conf
sudo grep "^host" /etc/postgresql/16/main/pg_hba.conf

Performance Issues

Identify slow queries:

-- Enable query timing
\timing

-- Find slow queries
SELECT
    pid,
    now() - pg_stat_activity.query_start AS duration,
    query,
    state
FROM pg_stat_activity
WHERE (now() - pg_stat_activity.query_start) > interval '5 minutes';

-- Kill slow query
SELECT pg_cancel_backend(pid);  -- Graceful
SELECT pg_terminate_backend(pid);  -- Forceful

-- Analyze query performance
EXPLAIN ANALYZE
SELECT * FROM large_table WHERE column = 'value';

-- Find tables needing vacuum
SELECT
    schemaname,
    tablename,
    n_dead_tup,
    n_live_tup,
    round(n_dead_tup * 100.0 / NULLIF(n_live_tup + n_dead_tup, 0), 2) AS dead_ratio
FROM pg_stat_user_tables
WHERE n_dead_tup > 1000
ORDER BY n_dead_tup DESC;

Disk Space Issues

# Check disk usage
df -h

# Check PostgreSQL data directory
sudo du -sh /var/lib/postgresql/16/main/

# Check individual databases
sudo -u postgres psql -c "SELECT datname, pg_size_pretty(pg_database_size(datname)) FROM pg_database ORDER BY pg_database_size(datname) DESC;"

# Clean up WAL files (if not using archiving)
sudo -u postgres pg_archivecleanup /var/lib/postgresql/16/main/pg_wal/ $(ls -t /var/lib/postgresql/16/main/pg_wal/ | tail -1)

# Vacuum to reclaim space
sudo -u postgres psql -d myapp_db -c "VACUUM FULL;"

# Find large tables
sudo -u postgres psql -d myapp_db -c "
SELECT
    schemaname,
    tablename,
    pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename)) AS size
FROM pg_tables
WHERE schemaname NOT IN ('pg_catalog', 'information_schema')
ORDER BY pg_total_relation_size(schemaname||'.'||tablename) DESC
LIMIT 10;"

Replication Issues

Check replication status:

-- On primary server
SELECT * FROM pg_stat_replication;

-- On standby server
SELECT * FROM pg_stat_wal_receiver;

-- Check replication lag
SELECT
    client_addr,
    state,
    pg_wal_lsn_diff(pg_current_wal_lsn(), sent_lsn) AS send_lag,
    pg_wal_lsn_diff(pg_current_wal_lsn(), write_lsn) AS write_lag,
    pg_wal_lsn_diff(pg_current_wal_lsn(), flush_lsn) AS flush_lag,
    pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS replay_lag
FROM pg_stat_replication;

Database Corruption

# Check for corruption
sudo -u postgres pg_checksums -D /var/lib/postgresql/16/main/

# Attempt automatic repair
sudo -u postgres psql -d myapp_db -c "REINDEX DATABASE myapp_db;"

# Check specific table
sudo -u postgres psql -d myapp_db -c "REINDEX TABLE table_name;"

# If corruption persists, restore from backup

Conclusion

PostgreSQL stands as one of the most capable and reliable database systems available, combining enterprise-grade features with open-source accessibility. This comprehensive guide has covered the complete installation and configuration process, from initial setup through advanced optimization and security hardening.

Key Takeaways

  1. Proper Authentication: Use strong authentication methods like scram-sha-256 and properly configure pg_hba.conf
  2. Performance Tuning: Optimize memory settings based on your hardware and workload
  3. Regular Maintenance: Schedule autovacuum and regular VACUUM ANALYZE operations
  4. Comprehensive Backups: Implement multiple backup strategies including logical and physical backups
  5. Security First: Enable SSL, use firewalls, and implement role-based access control
  6. Monitor Actively: Track performance metrics, slow queries, and resource utilization

Best Practices Summary

  • Always use the latest stable PostgreSQL version
  • Enable SSL/TLS for all remote connections
  • Implement proper role-based access control
  • Schedule regular automated backups
  • Test backup restoration procedures regularly
  • Monitor performance and optimize queries
  • Keep shared_buffers at 25% of RAM for dedicated servers
  • Use connection pooling (PgBouncer) for high-traffic applications
  • Enable comprehensive logging for troubleshooting
  • Regular VACUUM and ANALYZE operations
  • Document your configuration and procedures
  • Plan for high availability with replication

Next Steps

After mastering the basics, explore these advanced topics:

  • Replication: Set up streaming replication for high availability
  • Connection Pooling: Implement PgBouncer or pgpool-II
  • Extensions: Explore PostGIS, pg_stat_statements, and other extensions
  • Partitioning: Implement table partitioning for large datasets
  • Monitoring: Deploy comprehensive monitoring with Prometheus and Grafana
  • High Availability: Configure automatic failover with Patroni
  • Load Balancing: Distribute read queries across replicas

Additional Resources

PostgreSQL's strength lies in its reliability, feature richness, and active community. By following this guide and continuing to learn, you'll be well-equipped to manage PostgreSQL databases in any environment, from development to enterprise production systems.