Recuperación en un Momento Específico para Bases de Datos

Point-in-time recovery (PITR) enables you to recover a database to any specific moment, protecting against data corruption and accidental deletions. This guide covers MySQL binary logs, PostgreSQL WAL archiving, PITR procedures, validation techniques, and practical recovery scenarios.

Tabla de Contenidos

  1. PITR Concepts
  2. MySQL PITR with Binary Logs
  3. PostgreSQL PITR with WAL
  4. Recovery Procedures
  5. Validation and Testing
  6. Automated PITR
  7. Troubleshooting
  8. Conclusion

Conceptos de PITR

Point-in-time recovery combines:

  • Base Backup: A consistent snapshot of the database at a specific time
  • Transaction Logs: Incremental changes recorded since the base backup
  • Recovery Mechanism: Replaying logs to reach the desired point in time
# PITR Timeline Visualization
cat > /tmp/pitr-timeline.txt << 'EOF'
Timeline of Database States:

Time     Event
----     -----
00:00 -- Base Backup taken
04:30 -- Incremental changes logged (00:00-04:30)
09:15 -- ERROR: Data corruption detected
         GOAL: Recover to 09:10 (before corruption)
12:00 -- Recovery complete from logs

Database state at different times:
- 00:00: Base backup
- 04:30: Base + logs 00:00-04:30
- 09:10: Base + logs 00:00-09:10 ← Recovery target
- 09:15: Base + logs 00:00-09:15 (includes corruption)
- 12:00: Recovered to 09:10 state

Recovery Requirements:
- Base backup from before target time: YES
- Transaction logs to target time: YES
- Transaction logs integrity: YES
EOF

cat /tmp/pitr-timeline.txt

PITR de MySQL con Logs Binarios

Habilitar Binary Logging

# Configure MySQL for binary logging and PITR
cat > /etc/mysql/mysql.conf.d/pitr.cnf << 'EOF'
[mysqld]
# Binary logging setup
log_bin = /var/log/mysql/mysql-bin.log
binlog_format = MIXED
# ROW format for better recovery, MIXED for performance
# Use STATEMENT for older versions

# Expire binary logs after 7 days
expire_logs_days = 7

# Optional: Set max_binlog_size for rotation
max_binlog_size = 1073741824  # 1GB

# Server ID (required for replication/PITR)
server_id = 1

# Ensure durability for PITR
sync_binlog = 1
innodb_flush_log_at_trx_commit = 1

# Replication settings (for backup-based PITR)
binlog_cache_size = 32768
EOF

systemctl restart mysql

# Verify binary logging is enabled
mysql -u root -e "SHOW VARIABLES LIKE 'log_bin%';"
mysql -u root -e "SHOW VARIABLES LIKE 'binlog_format';"

Crear Copia de Seguridad Base para PITR

# Create base backup with binary log position
create_pitr_base_backup() {
    local backup_dir="/backup/pitr"
    local timestamp=$(date +%Y%m%d_%H%M%S)
    local backup_file="$backup_dir/base-backup-$timestamp.sql.gz"
    
    mkdir -p "$backup_dir"
    
    echo "Creating base backup for PITR at $timestamp"
    
    # Backup with binary log position recorded
    mysqldump \
        -u root \
        --all-databases \
        --single-transaction \
        --master-data=2 \
        --quick \
        --lock-tables=false | gzip > "$backup_file"
    
    if [ $? -eq 0 ]; then
        # Extract binary log position from backup
        log_info=$(zcat "$backup_file" | grep -A 5 "CHANGE MASTER")
        echo "Backup completed: $backup_file"
        echo "Binary log information:"
        echo "$log_info"
        
        # Save log position for recovery reference
        echo "$log_info" > "$backup_dir/binlog-position-$timestamp.txt"
    else
        echo "Error: Backup failed"
        return 1
    fi
}

create_pitr_base_backup

Recopilar Logs Binarios

# Collect and archive binary logs
manage_binary_logs() {
    local binlog_dir="/var/log/mysql"
    local archive_dir="/backup/pitr/binlogs"
    
    mkdir -p "$archive_dir"
    
    echo "Archiving binary logs for PITR"
    
    # Copy binary logs to archive location
    find "$binlog_dir" -name "mysql-bin.*" -type f | while read logfile; do
        # Don't copy the current log being written to
        if ! lsof "$logfile" 2>/dev/null | grep -q mysqld; then
            gzip -c "$logfile" > "$archive_dir/$(basename $logfile).gz"
        fi
    done
    
    # Verify archived logs
    ls -lh "$archive_dir" | tail -10
}

# Export binary logs for transport to other servers
export_binary_logs() {
    local binlog_dir="/var/log/mysql"
    local export_file="/backup/binary-logs-export-$(date +%Y%m%d).tar.gz"
    
    # Tar and compress binary logs (but not current log)
    find "$binlog_dir" -name "mysql-bin.*" ! -name "mysql-bin.index" | \
        tar -czf "$export_file" -T -
    
    echo "Binary logs exported to: $export_file"
}

Realizar PITR de MySQL

# Recover MySQL to point-in-time
perform_mysql_pitr() {
    local base_backup=$1
    local target_time=$2  # Format: '2024-01-15 14:30:00'
    local recovery_db="recovered_db"
    
    echo "Performing MySQL PITR"
    echo "Base backup: $base_backup"
    echo "Target time: $target_time"
    
    # Step 1: Restore base backup
    echo "Step 1: Restoring base backup..."
    if [[ "$base_backup" == *.gz ]]; then
        gunzip < "$base_backup" | mysql -u root
    else
        mysql -u root < "$base_backup"
    fi
    
    if [ $? -ne 0 ]; then
        echo "Error: Base backup restoration failed"
        return 1
    fi
    
    # Step 2: Get binary log position from backup
    local binlog_file=$(zcat "$base_backup" 2>/dev/null | \
        grep "CHANGE MASTER TO" | \
        grep -o "MASTER_LOG_FILE='[^']*'" | cut -d"'" -f2)
    
    local binlog_pos=$(zcat "$base_backup" 2>/dev/null | \
        grep "CHANGE MASTER TO" | \
        grep -o "MASTER_LOG_POS=[0-9]*" | cut -d'=' -f2)
    
    echo "Starting from binlog: $binlog_file at position $binlog_pos"
    
    # Step 3: Replay binary logs until target time
    echo "Step 3: Replaying binary logs to target time..."
    
    # Extract binary log and filter by time
    mysqlbinlog \
        --start-datetime="$(date -d "$binlog_file" '+%Y-%m-%d %H:%M:%S' 2>/dev/null || echo '2024-01-01 00:00:00')" \
        --stop-datetime="$target_time" \
        "/var/log/mysql/$binlog_file" | \
        mysql -u root
    
    if [ $? -eq 0 ]; then
        echo "✓ PITR completed successfully"
        echo "Database recovered to: $target_time"
    else
        echo "✗ PITR failed"
        return 1
    fi
}

Avanzado: PITR Selectivo por Base de Datos

# Recover specific database using PITR
recover_specific_database_pitr() {
    local base_backup=$1
    local target_db=$2
    local target_time=$3
    
    echo "Recovering specific database: $target_db to $target_time"
    
    # Create temporary database for recovery
    local temp_db="temp_recovery_$$"
    
    # Step 1: Restore base backup to temp database
    mysql -u root -e "CREATE DATABASE $temp_db;"
    
    # Restore only target database
    if [[ "$base_backup" == *.gz ]]; then
        gunzip < "$base_backup" | grep "^-- Database: $target_db" -A 10000 | \
            mysql -u root "$temp_db"
    fi
    
    # Step 2: Find and apply binary logs for this database only
    mysqlbinlog \
        --stop-datetime="$target_time" \
        --database="$target_db" \
        /var/log/mysql/mysql-bin.* | \
        mysql -u root "$temp_db"
    
    # Step 3: Verify recovery
    echo "Verifying recovered database..."
    mysql -u root "$temp_db" -e "SHOW TABLES; SELECT COUNT(*) as row_count FROM information_schema.TABLES;"
    
    # Step 4: Swap databases
    mysql -u root << EOF
RENAME TABLE $target_db TO ${target_db}_old;
RENAME TABLE $temp_db TO $target_db;
EOF
    
    echo "✓ Database recovery complete"
    echo "Original backed up as: ${target_db}_old"
}

PITR de PostgreSQL con WAL

Habilitar Archivado de WAL

# Configure PostgreSQL for PITR
configure_postgresql_pitr() {
    local wal_archive_dir="/var/lib/postgresql/wal-archive"
    
    # Create WAL archive directory
    mkdir -p "$wal_archive_dir"
    chown postgres:postgres "$wal_archive_dir"
    chmod 700 "$wal_archive_dir"
    
    # Update PostgreSQL configuration
    cat >> /etc/postgresql/14/main/postgresql.conf << 'EOF'

# PITR Configuration
wal_level = replica
archive_mode = on
archive_command = 'test ! -f /var/lib/postgresql/wal-archive/%f && cp %p /var/lib/postgresql/wal-archive/%f'
archive_timeout = 300
restore_command = 'cp /var/lib/postgresql/wal-archive/%f %p'
recovery_target_timeline = 'latest'
EOF
    
    # Reload configuration
    systemctl reload postgresql
    
    echo "PostgreSQL PITR configured"
}

# Verify WAL archiving is working
verify_wal_archiving() {
    sudo -u postgres psql << 'EOF'
-- Check WAL level
SHOW wal_level;

-- Check archive status
SELECT * FROM pg_stat_archiver;

-- Check available WAL files
EOF
    
    # Check for archived WAL files
    ls -lh /var/lib/postgresql/wal-archive/ | head -20
}

Crear Copia de Seguridad Base de PostgreSQL

# Create base backup for PostgreSQL PITR
create_postgresql_pitr_backup() {
    local backup_dir="/backup/pitr"
    local timestamp=$(date +%Y%m%d_%H%M%S)
    
    mkdir -p "$backup_dir"
    
    echo "Creating PostgreSQL base backup for PITR"
    
    # Use pg_basebackup for PITR-capable backup
    sudo -u postgres pg_basebackup \
        -D "$backup_dir/base-backup-$timestamp" \
        -Xstream \
        -Pv \
        -l "PITR-backup-$timestamp"
    
    if [ $? -eq 0 ]; then
        echo "✓ Base backup created successfully"
        
        # Record backup start/stop LSN
        sudo -u postgres psql << EOF
SELECT pg_current_wal_lsn() as current_lsn;
EOF
        
        # Verify backup contents
        ls -la "$backup_dir/base-backup-$timestamp/backup_label"
    fi
}

Realizar PITR de PostgreSQL

# Recover PostgreSQL to point-in-time
perform_postgresql_pitr() {
    local base_backup_dir=$1
    local target_time=$2  # Format: '2024-01-15 14:30:00'
    local recovery_dir="/var/lib/postgresql/14/main"
    
    echo "Performing PostgreSQL PITR"
    echo "Base backup: $base_backup_dir"
    echo "Target time: $target_time"
    
    # Step 1: Stop PostgreSQL
    echo "Stopping PostgreSQL..."
    systemctl stop postgresql
    
    # Step 2: Backup current data directory
    echo "Backing up current data directory..."
    cp -r "$recovery_dir" "$recovery_dir.backup-$(date +%s)"
    
    # Step 3: Restore base backup
    echo "Restoring base backup..."
    sudo -u postgres rm -rf "$recovery_dir"
    sudo -u postgres cp -r "$base_backup_dir" "$recovery_dir"
    
    # Step 4: Create recovery configuration
    echo "Configuring recovery parameters..."
    cat > "$recovery_dir/recovery.conf" << EOF
restore_command = 'cp /var/lib/postgresql/wal-archive/%f %p'
recovery_target_time = '$target_time'
recovery_target_timeline = 'latest'
recovery_target_xid = ''
recovery_target_inclusive = true
pause_at_recovery_target = false
EOF
    
    # Step 5: Set correct ownership
    chown -R postgres:postgres "$recovery_dir"
    chmod 700 "$recovery_dir"
    
    # Step 6: Start PostgreSQL (will perform recovery)
    echo "Starting PostgreSQL for recovery..."
    systemctl start postgresql
    
    # Step 7: Wait for recovery to complete
    sleep 10
    
    # Verify recovery
    if sudo -u postgres psql -c "SELECT now();" > /dev/null 2>&1; then
        echo "✓ PostgreSQL PITR completed successfully"
        echo "Database recovered to: $target_time"
    else
        echo "✗ Recovery failed"
        return 1
    fi
}

Procedimientos de Recuperación

Recuperación PITR Paso a Paso

# Complete PITR recovery procedure
complete_pitr_recovery() {
    local db_type=$1  # "mysql" or "postgresql"
    local target_time=$2
    local base_backup=$3
    
    cat << EOF
====================================
Point-in-Time Recovery Procedure
====================================

Database Type: $db_type
Target Time: $target_time
Base Backup: $base_backup

SAFETY CHECKS
=============
1. Verify target_time is in the past: YES/NO
2. Verify base backup exists: YES/NO
3. Verify transaction logs exist: YES/NO
4. Backup current database: YES/NO
5. Notify stakeholders: YES/NO

RECOVERY STEPS
==============
1. Verify base backup integrity
2. Restore base backup
3. Identify binary/WAL logs to apply
4. Apply transaction logs to target time
5. Verify recovered data
6. Validate application connectivity
7. Compare data with production backup

ROLLBACK PLAN
=============
If recovery fails or produces incorrect data:
1. Stop database
2. Restore from backup-$(date +%s)
3. Contact database administrator
4. Review error logs in /var/log/

COMPLETION CHECKLIST
====================
[ ] Database online and accessible
[ ] All critical tables present
[ ] Data at target time verified
[ ] Application tests passed
[ ] Monitoring alerts cleared
[ ] Recovery documentation updated
[ ] Stakeholders notified
[ ] Lessons learned captured
EOF
}

# Test recovery to specific point
test_recovery_to_point() {
    local test_time=$1
    
    echo "Testing PITR to: $test_time"
    
    # Determine database type and call appropriate function
    if command -v mysql &> /dev/null; then
        perform_mysql_pitr "/backup/pitr/base-backup.sql.gz" "$test_time"
    elif command -v psql &> /dev/null; then
        perform_postgresql_pitr "/backup/pitr/base-backup-dir" "$test_time"
    fi
}

Validación y Pruebas

Validar Resultados de PITR

# Comprehensive PITR validation
validate_pitr_recovery() {
    local recovered_db=$1
    local reference_backup=$2
    
    echo "Validating PITR recovery"
    
    # 1. Check table count
    echo "Step 1: Verifying table structure..."
    mysql -u root "$recovered_db" -sNe \
        "SELECT COUNT(*) FROM information_schema.TABLES WHERE TABLE_SCHEMA='$recovered_db';"
    
    # 2. Check row counts for critical tables
    echo "Step 2: Verifying row counts..."
    mysql -u root "$recovered_db" -sNe \
        "SELECT TABLE_NAME, TABLE_ROWS FROM information_schema.TABLES WHERE TABLE_SCHEMA='$recovered_db' ORDER BY TABLE_ROWS DESC LIMIT 5;"
    
    # 3. Run table integrity checks
    echo "Step 3: Checking table integrity..."
    mysql -u root "$recovered_db" -sNe \
        "CHECK TABLE $(mysql -u root "$recovered_db" -sNe 'SELECT GROUP_CONCAT(TABLE_NAME) FROM information_schema.TABLES WHERE TABLE_SCHEMA=DATABASE();' | head -1);"
    
    # 4. Verify data consistency
    echo "Step 4: Comparing with reference..."
    # Calculate hash of critical data
    recovered_hash=$(mysql -u root "$recovered_db" -sNe \
        "SELECT MD5(GROUP_CONCAT(MD5(CONCAT_WS('|', *)) ORDER BY 1)) FROM users;" 2>/dev/null)
    
    reference_hash=$(mysql -u root "$reference_backup" -sNe \
        "SELECT MD5(GROUP_CONCAT(MD5(CONCAT_WS('|', *)) ORDER BY 1)) FROM users;" 2>/dev/null)
    
    if [ "$recovered_hash" = "$reference_hash" ]; then
        echo "✓ Data integrity verified"
    else
        echo "⚠ Data mismatch detected"
    fi
}

# Automated PITR testing
test_pitr_regularly() {
    local test_log="/var/log/pitr-tests.log"
    local test_interval_days=7
    
    echo "[$(date)] Testing PITR capability" >> "$test_log"
    
    # Test recovery to 1 hour ago
    local target_time=$(date -d '-1 hour' '+%Y-%m-%d %H:%M:%S')
    
    if test_recovery_to_point "$target_time"; then
        echo "[$(date)] ✓ PITR test passed" >> "$test_log"
    else
        echo "[$(date)] ✗ PITR test failed" >> "$test_log"
    fi
}

PITR Automatizado

Script de Recuperación Automatizado

# Automated PITR with minimal manual intervention
cat > /usr/local/bin/automated-pitr.sh << 'EOF'
#!/bin/bash

# Configuration
DB_TYPE=${1:-"mysql"}  # mysql or postgresql
TARGET_TIME=$2
BASE_BACKUP=$3
RECOVERY_MODE=${4:-"manual"}  # manual or automatic

validate_inputs() {
    if [ -z "$TARGET_TIME" ] || [ -z "$BASE_BACKUP" ]; then
        echo "Usage: $0 <db_type> <target_time> <base_backup> [recovery_mode]"
        echo "Example: $0 mysql '2024-01-15 14:30:00' /backup/base.sql.gz"
        return 1
    fi
}

check_prerequisites() {
    local checks_passed=0
    
    echo "Checking prerequisites..."
    
    # Check if backup exists
    [ -f "$BASE_BACKUP" ] && echo "✓ Base backup found" && ((checks_passed++))
    
    # Check target time is in past
    target_epoch=$(date -d "$TARGET_TIME" +%s)
    current_epoch=$(date +%s)
    [ $target_epoch -lt $current_epoch ] && echo "✓ Target time is in past" && ((checks_passed++))
    
    # Check database connectivity
    if [ "$DB_TYPE" = "mysql" ]; then
        mysql -u root -e "SELECT 1" > /dev/null 2>&1 && echo "✓ MySQL accessible" && ((checks_passed++))
    else
        sudo -u postgres psql -c "SELECT 1" > /dev/null 2>&1 && echo "✓ PostgreSQL accessible" && ((checks_passed++))
    fi
    
    [ $checks_passed -eq 3 ] && return 0 || return 1
}

main() {
    validate_inputs || exit 1
    check_prerequisites || exit 1
    
    echo "Starting PITR to: $TARGET_TIME"
    
    if [ "$RECOVERY_MODE" = "automatic" ]; then
        # Automated recovery
        if [ "$DB_TYPE" = "mysql" ]; then
            perform_mysql_pitr "$BASE_BACKUP" "$TARGET_TIME"
        else
            perform_postgresql_pitr "$BASE_BACKUP" "$TARGET_TIME"
        fi
    else
        # Manual recovery with prompts
        read -p "Backup current database? (y/n) " -r
        [ "$REPLY" = "y" ] && backup_current_database
        
        read -p "Proceed with PITR? (y/n) " -r
        [ "$REPLY" = "y" ] && perform_pitr
    fi
    
    echo "PITR process complete"
}

main
EOF

chmod +x /usr/local/bin/automated-pitr.sh

Resolución de Problemas

Problemas Comunes de PITR

# Troubleshoot PITR problems
troubleshoot_pitr() {
    local issue=$1
    
    case "$issue" in
        "missing_binlogs")
            echo "Solution: Binary logs were purged before PITR target"
            echo "- Check expire_logs_days setting"
            echo "- Increase retention period"
            echo "- Cannot recover further than oldest available binlog"
            ;;
        "transaction_log_corruption")
            echo "Solution: Transaction log file is corrupted"
            echo "- Check disk health for I/O errors"
            echo "- Verify checksums: mysqlbinlog --verify-binlog-checksum"
            echo "- May need to recover to earlier point"
            ;;
        "insufficient_space")
            echo "Solution: Not enough disk space for recovery"
            echo "- Free up disk space on recovery destination"
            echo "- Compress backup files to save space"
            echo "- Use external storage for recovery"
            ;;
        "application_incompatible")
            echo "Solution: Application cannot connect to recovered database"
            echo "- Verify database version compatibility"
            echo "- Check character set and collation"
            echo "- Update application connection strings"
            ;;
    esac
}

# Verify PITR capability
verify_pitr_readiness() {
    echo "PITR Readiness Check"
    echo "===================="
    
    # Check MySQL
    if command -v mysql &> /dev/null; then
        echo ""
        echo "MySQL Status:"
        mysql -u root -e "SHOW VARIABLES LIKE 'log_bin';"
        mysql -u root -e "SHOW VARIABLES LIKE 'expire_logs_days';"
        mysql -u root -e "SHOW BINARY LOGS;" | head -5
    fi
    
    # Check PostgreSQL
    if command -v psql &> /dev/null; then
        echo ""
        echo "PostgreSQL Status:"
        sudo -u postgres psql -c "SHOW wal_level;"
        sudo -u postgres psql -c "SHOW archive_mode;"
        
        echo ""
        echo "Available WAL files:"
        ls -lh /var/lib/postgresql/wal-archive/ | wc -l
        ls -lh /var/lib/postgresql/wal-archive/ | tail -5
    fi
}

Conclusión

Point-in-time recovery requires:

  1. Proper Configuration: Enable binary logging (MySQL) or WAL archiving (PostgreSQL)
  2. Base Backups: Regular full backups at known positions
  3. Log Retention: Keep transaction logs for the desired recovery window
  4. Testing: Regular recovery drills to verify procedures work
  5. Monitoring: Track backup and log file availability

PITR is most useful for:

  • Accidental data deletion recovery
  • Correction of bad schema migrations
  • Investigation of when data changed
  • Recovering from logical errors (not physical corruption)

Always maintain multiple backup strategies, as PITR is not a substitute for regular backups but rather a complement to them.