Automatic Backup with rsync: Complete Guide for Linux Systems
Introduction
Rsync (Remote Sync) is one of the most powerful and versatile file synchronization and backup tools available for Linux systems. First released in 1996, rsync has become the de facto standard for efficient file transfers and incremental backups across the Linux ecosystem. Its ability to transfer only the differences between source and destination files makes it exceptionally efficient for both local and remote backups.
In production environments, rsync's reliability and efficiency have made it the backbone of countless backup strategies. From small single-server deployments to large-scale enterprise infrastructures, rsync provides a flexible foundation for implementing the 3-2-1 backup rule discussed in comprehensive backup strategies.
This guide explores rsync from fundamentals to advanced automation, covering local and remote backups, incremental synchronization, scheduling, monitoring, and real-world implementation scenarios. Whether you're backing up a small web server or orchestrating complex multi-server backup operations, mastering rsync is essential for any Linux system administrator.
Understanding rsync Fundamentals
How rsync Works
Rsync uses a sophisticated algorithm to minimize data transfer by sending only the differences between source and destination files. The process works as follows:
- File comparison: Rsync compares files on both source and destination using file size and modification times
- Checksum calculation: For modified files, rsync divides files into blocks and calculates checksums
- Delta transfer: Only changed blocks are transmitted, dramatically reducing bandwidth usage
- Reconstruction: The destination reassembles files using unchanged blocks and received deltas
This approach makes rsync extraordinarily efficient compared to traditional file copy methods, especially for large datasets with small changes.
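You can watch the delta algorithm at work with --stats. The following sketch uses throwaway paths under /tmp (purely illustrative); note that rsync defaults to --whole-file for local copies, so --no-whole-file is added to force delta transfer:
# Create a 100 MB test file and do the initial sync
mkdir -p /tmp/rsync-src /tmp/rsync-dst
dd if=/dev/urandom of=/tmp/rsync-src/data.bin bs=1M count=100
rsync -a --stats /tmp/rsync-src/ /tmp/rsync-dst/
# Append a small change and sync again, forcing the delta algorithm
echo "small change" >> /tmp/rsync-src/data.bin
rsync -a --no-whole-file --stats /tmp/rsync-src/ /tmp/rsync-dst/
# In the second run, "Matched data" covers almost the whole file and
# "Literal data" (bytes actually sent) is only the appended tail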
Key Features and Benefits
Incremental transfers: Only changed portions of files are transferred, saving bandwidth and time.
Preserve attributes: Maintains file permissions, ownership, timestamps, symbolic links, and other metadata.
Compression: Built-in compression reduces network transfer sizes.
Partial transfer resume: Interrupted transfers can resume from the point of failure.
Delete synchronization: Can remove files from destination that no longer exist at source.
Flexible filtering: Powerful include/exclude patterns for selective backups.
Local and remote: Works seamlessly for local copies or transfers over SSH.
rsync vs Traditional Copy Methods
Comparison with common alternatives:
rsync vs cp (copy):
- cp copies entire files every time
- rsync transfers only changes
- rsync preserves attributes more reliably
- rsync supports remote transfers natively
rsync vs scp (secure copy):
- scp copies entire files
- rsync resumes interrupted transfers
- rsync is significantly faster for updates
- Both use SSH for security
rsync vs tar + transfer:
- tar creates archives, requiring extraction
- rsync maintains live filesystem structure
- rsync enables incremental updates
- tar may be better for full snapshots
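The practical difference is easy to demonstrate by repeating a copy after a small change: cp re-reads and rewrites every file, while a second rsync pass only touches what changed. The paths and the notes.txt file below are illustrative:
mkdir -p /backup/project-cp /backup/project-rsync
# First pass: both tools copy everything
time cp -a /data/project/. /backup/project-cp/
time rsync -a /data/project/ /backup/project-rsync/
# Change one file, then repeat
echo "minor edit" >> /data/project/notes.txt
time cp -a /data/project/. /backup/project-cp/       # re-reads and rewrites every file
time rsync -av /data/project/ /backup/project-rsync/ # lists and transfers only notes.txt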
Installation and Basic Configuration
Installing rsync
Rsync is pre-installed on most Linux distributions, but you can install or update it:
Ubuntu/Debian:
# Check if installed
rsync --version
# Install or update
sudo apt update
sudo apt install rsync
CentOS/Rocky Linux/RHEL:
# Check if installed
rsync --version
# Install or update
sudo dnf install rsync
# or on older systems
sudo yum install rsync
Verify installation:
rsync --version
# Expected output:
# rsync version 3.2.x protocol version 31
Basic rsync Syntax
The general syntax structure:
rsync [OPTIONS] SOURCE DESTINATION
Critical syntax considerations:
Trailing slashes matter significantly:
# Copy directory itself to destination
rsync -av /source/directory /destination/
# Result: /destination/directory/
# Copy directory contents to destination
rsync -av /source/directory/ /destination/
# Result: /destination/[contents]
Essential rsync Options
Understanding key options is crucial for effective usage:
-a (archive mode): Combines most common options, equivalent to -rlptgoD
- -r: Recursive
- -l: Copy symlinks as symlinks
- -p: Preserve permissions
- -t: Preserve modification times
- -g: Preserve group
- -o: Preserve owner
- -D: Preserve device files and special files
-v (verbose): Display detailed progress information
-z (compress): Compress data during transfer (useful for remote backups)
-h (human-readable): Display numbers in human-readable format
-P (progress + partial): Show transfer progress and keep partially transferred files
--delete: Remove files from destination that don't exist in source
-n (dry-run): Simulate operation without making changes
Basic Usage Examples
Local directory backup:
# Copy /home/user to /backup
rsync -av /home/user/ /backup/user-backup/
# With progress indicator
rsync -avhP /home/user/ /backup/user-backup/
Remote backup over SSH:
# Push local to remote
rsync -avz /local/data/ user@remote-server:/backup/data/
# Pull remote to local
rsync -avz user@remote-server:/var/www/ /local/backup/www/
Dry-run test before actual backup:
# Test what would be transferred
rsync -avhn --delete /source/ /destination/
# Review output, then run actual sync
rsync -avh --delete /source/ /destination/
Implementing Local Backups with rsync
Simple Local Backup Script
Create a basic backup script for local data protection:
#!/bin/bash
# /usr/local/bin/rsync-local-backup.sh
# Configuration
SOURCE_DIR="/home"
BACKUP_DIR="/backup/home"
LOG_FILE="/var/log/backup/rsync-local.log"
DATE=$(date +"%Y-%m-%d %H:%M:%S")
# Create backup directory if it doesn't exist
mkdir -p "$BACKUP_DIR"
mkdir -p "$(dirname "$LOG_FILE")"
# Log backup start
echo "[$DATE] Starting local backup" >> "$LOG_FILE"
# Perform backup
rsync -av \
--delete \
--exclude='*.tmp' \
--exclude='.cache' \
--exclude='Downloads/*' \
"$SOURCE_DIR/" \
"$BACKUP_DIR/" \
>> "$LOG_FILE" 2>&1
# Check exit status
if [ $? -eq 0 ]; then
echo "[$DATE] Backup completed successfully" >> "$LOG_FILE"
exit 0
else
echo "[$DATE] Backup failed with errors" >> "$LOG_FILE"
exit 1
fi
Make script executable:
sudo chmod +x /usr/local/bin/rsync-local-backup.sh
Advanced Local Backup with Rotation
Implement backup rotation to maintain multiple historical versions:
#!/bin/bash
# /usr/local/bin/rsync-rotating-backup.sh
# Configuration
SOURCE_DIR="/var/www"
BACKUP_ROOT="/backup/www"
CURRENT="$BACKUP_ROOT/current"
DAILY_DIR="$BACKUP_ROOT/daily"
WEEKLY_DIR="$BACKUP_ROOT/weekly"
MONTHLY_DIR="$BACKUP_ROOT/monthly"
DATE=$(date +%Y%m%d)
DAY_OF_WEEK=$(date +%u) # 1-7, Monday=1
DAY_OF_MONTH=$(date +%d)
# Create directory structure
mkdir -p "$CURRENT" "$DAILY_DIR" "$WEEKLY_DIR" "$MONTHLY_DIR"
# Sync source into the current mirror (the snapshots below hard-link against it)
rsync -av \
--delete \
"$SOURCE_DIR/" \
"$CURRENT/"
# Create daily snapshot
if [ ! -d "$DAILY_DIR/$DATE" ]; then
cp -al "$CURRENT" "$DAILY_DIR/$DATE"
fi
# Create weekly snapshot (Sunday)
if [ "$DAY_OF_WEEK" -eq 7 ]; then
cp -al "$CURRENT" "$WEEKLY_DIR/$DATE"
fi
# Create monthly snapshot (1st of month)
if [ "$DAY_OF_MONTH" -eq "01" ]; then
cp -al "$CURRENT" "$MONTHLY_DIR/$DATE"
fi
# Cleanup old backups (-mindepth 1 so the parent directories themselves are never removed)
# Keep 7 daily backups
find "$DAILY_DIR" -mindepth 1 -maxdepth 1 -type d -mtime +7 -exec rm -rf {} \;
# Keep 4 weekly backups
find "$WEEKLY_DIR" -mindepth 1 -maxdepth 1 -type d -mtime +28 -exec rm -rf {} \;
# Keep 12 monthly backups
find "$MONTHLY_DIR" -mindepth 1 -maxdepth 1 -type d -mtime +365 -exec rm -rf {} \;
echo "Backup completed: $DATE"
This script uses hard links (cp -al) to create space-efficient snapshots where unchanged files share the same disk space.
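You can confirm the hard-link behavior once a couple of daily snapshots exist. A minimal check, assuming the directory layout from the script above (the index.html path is illustrative):
# The same unchanged file in two snapshots shares one inode (same inode number,
# link count greater than 1 in the second column)
ls -li /backup/www/daily/*/index.html
# Within a single du invocation, hard-linked data is counted only once, so the
# grand total is close to one full copy plus whatever changed between snapshots
du -shc /backup/www/daily/*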
Excluding Files and Directories
Optimize backups by excluding unnecessary data:
Using exclude options:
rsync -av \
--exclude='*.log' \
--exclude='*.tmp' \
--exclude='.cache/' \
--exclude='node_modules/' \
--exclude='*.bak' \
/source/ /backup/
Using exclude-from file:
Create /etc/rsync/exclude-list.txt:
*.log
*.tmp
.cache/
node_modules/
vendor/
*.swp
.git/
tmp/
cache/
sessions/
Use in rsync command:
rsync -av \
--exclude-from=/etc/rsync/exclude-list.txt \
/source/ /backup/
Pattern matching rules:
# Exclude all .log files recursively
--exclude='*.log'
# Exclude specific directory
--exclude='/path/to/directory'
# Exclude pattern in any directory
--exclude='**/cache/'
# Include specific file types, exclude the rest ('*/' keeps rsync descending into subdirectories)
--include='*/' --include='*.php' --include='*.html' --exclude='*' --prune-empty-dirs
Remote Backup Implementation
SSH-Based Remote Backups
Setting up secure remote backups using SSH:
1. Configure SSH key-based authentication:
# Generate SSH key pair (on backup client)
ssh-keygen -t ed25519 -C "backup-automation"
# Copy public key to remote server
ssh-copy-id -i ~/.ssh/id_ed25519.pub user@backup-server
# Test passwordless connection
ssh user@backup-server "echo SSH connection successful"
2. Create remote backup script:
#!/bin/bash
# /usr/local/bin/rsync-remote-backup.sh
# Configuration
SOURCE_DIR="/var/www"
REMOTE_USER="backup"
REMOTE_HOST="backup-server.example.com"
REMOTE_PATH="/backups/web-server"
LOG_FILE="/var/log/backup/remote-backup.log"
DATE=$(date +"%Y-%m-%d %H:%M:%S")
# Log start
echo "[$DATE] Starting remote backup to $REMOTE_HOST" >> "$LOG_FILE"
# Perform remote backup
rsync -avz \
--delete \
--compress-level=9 \
--exclude-from=/etc/rsync/exclude-list.txt \
-e "ssh -i /root/.ssh/backup_key -p 22" \
"$SOURCE_DIR/" \
"$REMOTE_USER@$REMOTE_HOST:$REMOTE_PATH/" \
>> "$LOG_FILE" 2>&1
RSYNC_EXIT=$?
# Check result
if [ $RSYNC_EXIT -eq 0 ]; then
echo "[$DATE] Remote backup completed successfully" >> "$LOG_FILE"
exit 0
else
echo "[$DATE] Remote backup failed with exit code $RSYNC_EXIT" >> "$LOG_FILE"
# Send alert email
echo "Remote backup failed on $(hostname)" | \
mail -s "BACKUP FAILURE" [email protected]
exit 1
fi
Optimizing Remote Transfers
Bandwidth limiting (prevent network saturation):
# Limit to 5000 KB/s (5 MB/s)
rsync -avz \
--bwlimit=5000 \
/source/ user@remote:/backup/
Custom SSH options:
# Use custom port and cipher
rsync -av \
-e "ssh -p 2222 -c [email protected] -o Compression=no" \
/source/ user@remote:/backup/
Compression tuning:
# Maximum compression (slower, best for slow links)
rsync -avz --compress-level=9 /source/ user@remote:/backup/
# Light compression (faster, for fast links)
rsync -avz --compress-level=1 /source/ user@remote:/backup/
# Skip compression for already compressed files
rsync -av \
--skip-compress=gz/zip/z/rpm/deb/iso/bz2/jpg/jpeg/png/gif/mp3/mp4 \
/source/ user@remote:/backup/
Pull vs Push Backup Strategies
Push strategy (backup client initiates):
# From web server to backup server
rsync -avz /var/www/ backup-server:/backups/web/
Advantages: simple to set up; the backup job runs on the production server. Disadvantages: the production server needs outbound access to the backup destination.
Pull strategy (backup server initiates):
# From backup server, pull from web server
rsync -avz web-server:/var/www/ /backups/web/
Advantages: production servers carry no backup logic, and security is better because they hold no credentials for the backup store. Disadvantages: the backup server needs SSH access to all production servers.
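Whichever direction you choose, it is worth restricting what the dedicated backup key may do on the remote end. One common approach, sketched below, uses the rrsync wrapper shipped with rsync (its installed path varies by distribution, e.g. /usr/bin/rrsync or under the rsync documentation scripts directory) so the key can only run rsync read-only against a single directory:
# authorized_keys entry on the server being backed up (pull strategy)
# The key may only invoke rsync, limited to read-only access under /var/www
command="/usr/bin/rrsync -ro /var/www",restrict ssh-ed25519 AAAA... backup-automation
With rrsync in place, the remote path given to the pull command is interpreted relative to the restricted directory.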
Automation and Scheduling
Cron-Based Automation
Schedule automatic backups using cron:
Edit crontab:
sudo crontab -e
Example cron schedules:
# Daily backup at 2 AM
0 2 * * * /usr/local/bin/rsync-local-backup.sh
# Every 6 hours
0 */6 * * * /usr/local/bin/rsync-remote-backup.sh
# Weekdays at 11 PM
0 23 * * 1-5 /usr/local/bin/rsync-remote-backup.sh
# Sundays at 1 AM (weekly full backup)
0 1 * * 0 /usr/local/bin/rsync-weekly-backup.sh
Enhanced cron entry with logging:
# Daily backup with output to log
0 2 * * * /usr/local/bin/rsync-local-backup.sh >> /var/log/backup/cron.log 2>&1
# Email on failure only
0 2 * * * /usr/local/bin/rsync-local-backup.sh || echo "Backup failed" | mail -s "Backup Alert" admin@example.com
Systemd Timer Implementation
Modern alternative to cron with better dependency management:
Create service file (/etc/systemd/system/rsync-backup.service):
[Unit]
Description=Rsync Backup Service
After=network-online.target
Wants=network-online.target
[Service]
Type=oneshot
ExecStart=/usr/local/bin/rsync-remote-backup.sh
User=root
StandardOutput=journal
StandardError=journal
# Timeout after 2 hours
TimeoutSec=7200
[Install]
WantedBy=multi-user.target
Create timer file (/etc/systemd/system/rsync-backup.timer):
[Unit]
Description=Daily Rsync Backup Timer
Requires=rsync-backup.service
[Timer]
# Run daily at 2 AM
OnCalendar=*-*-* 02:00:00
# If system was off, run after boot
Persistent=true
# Randomize start time by up to 10 minutes
RandomizedDelaySec=10min
[Install]
WantedBy=timers.target
Enable and manage timer:
# Reload systemd
sudo systemctl daemon-reload
# Enable timer (start on boot)
sudo systemctl enable rsync-backup.timer
# Start timer immediately
sudo systemctl start rsync-backup.timer
# Check timer status
sudo systemctl status rsync-backup.timer
# List all timers
sudo systemctl list-timers
# View logs
sudo journalctl -u rsync-backup.service
Comprehensive Backup Script with Error Handling
Production-ready script with robust error handling and notifications:
#!/bin/bash
# /usr/local/bin/rsync-production-backup.sh
set -euo pipefail # Exit on error, undefined variables, pipe failures
# Configuration
SOURCE_DIR="/var/www"
BACKUP_DIR="/backup/www"
LOG_DIR="/var/log/backup"
LOG_FILE="$LOG_DIR/rsync-backup-$(date +%Y%m%d).log"
EXCLUDE_FILE="/etc/rsync/exclude-list.txt"
ADMIN_EMAIL="admin@example.com"
MAX_RUNTIME=7200 # 2 hours in seconds
# Lock file to prevent concurrent runs
LOCK_FILE="/var/run/rsync-backup.lock"
# Create log directory
mkdir -p "$LOG_DIR"
# Function: Send email notification
send_notification() {
local subject="$1"
local message="$2"
echo "$message" | mail -s "$subject - $(hostname)" "$ADMIN_EMAIL"
}
# Function: Cleanup on exit
cleanup() {
rm -f "$LOCK_FILE"
}
trap cleanup EXIT
# Check for existing backup process
if [ -f "$LOCK_FILE" ]; then
echo "Backup already running (lock file exists)"
exit 1
fi
# Create lock file
echo $$ > "$LOCK_FILE"
# Start logging
exec 1>>"$LOG_FILE" 2>&1
echo "================================"
echo "Backup started: $(date)"
echo "================================"
# Pre-backup checks
echo "Performing pre-backup checks..."
# Check source directory exists
if [ ! -d "$SOURCE_DIR" ]; then
echo "ERROR: Source directory does not exist: $SOURCE_DIR"
send_notification "BACKUP FAILED" "Source directory missing"
exit 1
fi
# Create backup directory (needed before checking free space on its filesystem)
mkdir -p "$BACKUP_DIR"
# Check available disk space (require at least 10GB)
AVAILABLE_SPACE=$(df "$BACKUP_DIR" | awk 'NR==2 {print $4}')
REQUIRED_SPACE=10485760 # 10GB in KB
if [ "$AVAILABLE_SPACE" -lt "$REQUIRED_SPACE" ]; then
echo "ERROR: Insufficient disk space"
send_notification "BACKUP FAILED" "Insufficient disk space"
exit 1
fi
# Start timing
START_TIME=$(date +%s)
# Perform backup
echo "Starting rsync backup..."
# Capture the exit code explicitly: with "set -e", a bare failing rsync would abort the
# script before the error handling below could run
RSYNC_EXIT=0
rsync -av \
--delete \
--delete-excluded \
--exclude-from="$EXCLUDE_FILE" \
--stats \
--log-file="$LOG_FILE" \
"$SOURCE_DIR/" \
"$BACKUP_DIR/" || RSYNC_EXIT=$?
# Calculate duration
END_TIME=$(date +%s)
DURATION=$((END_TIME - START_TIME))
echo "================================"
echo "Backup completed: $(date)"
echo "Duration: $DURATION seconds"
echo "Exit code: $RSYNC_EXIT"
echo "================================"
# Check result
if [ $RSYNC_EXIT -eq 0 ]; then
echo "Backup completed successfully"
# Create success marker file
echo "$(date)" > "$BACKUP_DIR/.last-backup-success"
# Optional: Send success notification
# send_notification "BACKUP SUCCESS" "Backup completed in $DURATION seconds"
exit 0
else
echo "Backup failed with exit code $RSYNC_EXIT"
send_notification "BACKUP FAILED" "Rsync exit code: $RSYNC_EXIT"
exit 1
fi
Monitoring and Verification
Backup Success Verification
Create a monitoring script to verify backup completion:
#!/bin/bash
# /usr/local/bin/verify-backup-status.sh
BACKUP_DIR="/backup/www"
MARKER_FILE="$BACKUP_DIR/.last-backup-success"
MAX_AGE_HOURS=26 # Alert if backup older than 26 hours
if [ ! -f "$MARKER_FILE" ]; then
echo "WARNING: Backup marker file not found"
exit 1
fi
LAST_BACKUP=$(stat -c %Y "$MARKER_FILE")
CURRENT_TIME=$(date +%s)
AGE_HOURS=$(( (CURRENT_TIME - LAST_BACKUP) / 3600 ))
if [ $AGE_HOURS -gt $MAX_AGE_HOURS ]; then
echo "WARNING: Last backup is $AGE_HOURS hours old"
exit 1
else
echo "OK: Last backup is $AGE_HOURS hours old"
exit 0
fi
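Run the verification script on its own schedule so a backup job that silently stops still raises an alert. A sketch assuming a working local mail setup and admin@example.com as the recipient:
# Check backup freshness hourly; mail only when the check fails
0 * * * * /usr/local/bin/verify-backup-status.sh || echo "Backup verification failed on $(hostname)" | mail -s "Backup Alert" admin@example.com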
Log Analysis and Reporting
Generate backup reports from rsync logs:
#!/bin/bash
# /usr/local/bin/backup-report.sh
LOG_FILE="/var/log/backup/rsync-backup-$(date +%Y%m%d).log"
if [ ! -f "$LOG_FILE" ]; then
echo "Log file not found: $LOG_FILE"
exit 1
fi
echo "Backup Report for $(date +%Y-%m-%d)"
echo "===================================="
echo ""
# Extract statistics
echo "Transfer Statistics:"
grep -A 20 "Number of files:" "$LOG_FILE" | head -20
echo ""
echo "Files with errors:"
grep "error\|failed\|ERROR" "$LOG_FILE" || echo "None"
echo ""
echo "Top 10 largest files transferred:"
grep ">" "$LOG_FILE" | awk '{print $5, $6}' | sort -rn | head -10
Nagios/Icinga Monitoring Check
Create a monitoring check for integration with monitoring systems:
#!/bin/bash
# /usr/local/lib/nagios/plugins/check_backup_age
BACKUP_DIR="/backup/www"
WARNING_HOURS=25
CRITICAL_HOURS=49
MARKER_FILE="$BACKUP_DIR/.last-backup-success"
if [ ! -f "$MARKER_FILE" ]; then
echo "CRITICAL: Backup marker file not found"
exit 2
fi
LAST_BACKUP=$(stat -c %Y "$MARKER_FILE")
CURRENT_TIME=$(date +%s)
AGE_HOURS=$(( (CURRENT_TIME - LAST_BACKUP) / 3600 ))
if [ $AGE_HOURS -ge $CRITICAL_HOURS ]; then
echo "CRITICAL: Backup is $AGE_HOURS hours old"
exit 2
elif [ $AGE_HOURS -ge $WARNING_HOURS ]; then
echo "WARNING: Backup is $AGE_HOURS hours old"
exit 1
else
echo "OK: Backup is $AGE_HOURS hours old"
exit 0
fi
Real-World Implementation Scenarios
Scenario 1: Web Server Backup to Remote NAS
Requirements:
- Daily backup of web files
- Remote storage on NAS
- Minimal impact on production
- 30-day retention
Implementation:
#!/bin/bash
# /usr/local/bin/web-to-nas-backup.sh
SOURCE="/var/www"
REMOTE_USER="backup"
REMOTE_HOST="nas.local"
REMOTE_PATH="/volume1/backups/web-server"
DATE=$(date +%Y%m%d)
# Create daily snapshot on NAS
rsync -avz \
--delete \
--backup \
--backup-dir="../archive/$DATE" \
--exclude='*.log' \
--exclude='cache/' \
--bwlimit=10000 \
-e "ssh -i /root/.ssh/backup_key" \
"$SOURCE/" \
"$REMOTE_USER@$REMOTE_HOST:$REMOTE_PATH/current/"
# Cleanup old archives on NAS (keep 30 days)
ssh -i /root/.ssh/backup_key "$REMOTE_USER@$REMOTE_HOST" \
"find $REMOTE_PATH/archive/ -maxdepth 1 -type d -mtime +30 -exec rm -rf {} \;"
Cron schedule (3 AM daily):
0 3 * * * /usr/local/bin/web-to-nas-backup.sh
Scenario 2: Database Server Incremental Backup
Requirements:
- Incremental backups of database files
- Multiple historical versions
- Fast restoration capability
- Space-efficient storage
Implementation:
#!/bin/bash
# /usr/local/bin/database-incremental-backup.sh
SOURCE="/var/lib/mysql"
BACKUP_ROOT="/backup/mysql"
DATE=$(date +%Y%m%d-%H%M%S)
LATEST="$BACKUP_ROOT/latest"
SNAPSHOT_DIR="$BACKUP_ROOT/snapshots/$DATE"
mkdir -p "$BACKUP_ROOT/snapshots"
# Stop MySQL for consistent backup (optional, use with caution)
# systemctl stop mysql
# Create snapshot using hard links
if [ -d "$LATEST" ]; then
rsync -av \
--delete \
--link-dest="$LATEST" \
"$SOURCE/" \
"$SNAPSHOT_DIR/"
else
rsync -av "$SOURCE/" "$SNAPSHOT_DIR/"
fi
# Update latest link
rm -f "$LATEST"
ln -s "$SNAPSHOT_DIR" "$LATEST"
# Restart MySQL
# systemctl start mysql
# Keep only last 14 snapshots
ls -1dt "$BACKUP_ROOT/snapshots/"* | tail -n +15 | xargs rm -rf
echo "Backup completed: $DATE"
Scenario 3: Multi-Server Centralized Backup
Requirements:
- Multiple application servers
- Centralized backup storage
- Parallel backup execution
- Consolidated reporting
Implementation (on backup server):
#!/bin/bash
# /usr/local/bin/multi-server-backup.sh
SERVERS=(
"web1.example.com:/var/www web1"
"web2.example.com:/var/www web2"
"app1.example.com:/opt/application app1"
"db1.example.com:/var/lib/postgresql db1"
)
BACKUP_ROOT="/backups"
LOG_DIR="/var/log/multi-backup"
DATE=$(date +%Y%m%d)
mkdir -p "$LOG_DIR"
# Function to backup single server
backup_server() {
local server_path="$1"
local server_name="$2"
local log_file="$LOG_DIR/$server_name-$DATE.log"
echo "Starting backup: $server_name" | tee -a "$log_file"
rsync -avz \
--delete \
--timeout=3600 \
-e "ssh -i /root/.ssh/backup_key" \
"$server_path/" \
"$BACKUP_ROOT/$server_name/" \
>> "$log_file" 2>&1
if [ $? -eq 0 ]; then
echo "SUCCESS: $server_name" | tee -a "$log_file"
return 0
else
echo "FAILED: $server_name" | tee -a "$log_file"
return 1
fi
}
# Backup all servers in parallel
SUCCESS_COUNT=0
FAIL_COUNT=0
for server_info in "${SERVERS[@]}"; do
server_path=$(echo "$server_info" | cut -d' ' -f1)
server_name=$(echo "$server_info" | cut -d' ' -f2)
# Run in background for parallel execution
backup_server "$server_path" "$server_name" &
done
# Wait for all backups to complete
wait
# Generate summary report
echo "Multi-Server Backup Summary - $DATE"
echo "====================================="
for server_info in "${SERVERS[@]}"; do
server_name=$(echo "$server_info" | cut -d' ' -f2)
log_file="$LOG_DIR/$server_name-$DATE.log"
if grep -q "SUCCESS" "$log_file"; then
echo "$server_name: SUCCESS"
((SUCCESS_COUNT++))
else
echo "$server_name: FAILED"
((FAIL_COUNT++))
fi
done
echo ""
echo "Total: $SUCCESS_COUNT successful, $FAIL_COUNT failed"
# Send email report
echo "Totals: $SUCCESS_COUNT successful, $FAIL_COUNT failed. Detailed logs: $LOG_DIR/" | \
mail -s "Backup Summary - $DATE" admin@example.com
Troubleshooting Common Issues
Permission Denied Errors
Symptom: rsync: send_files failed to open "file": Permission denied
Solutions:
# Run rsync as root (use with caution)
sudo rsync -av /source/ /destination/
# Fix source permissions
sudo chown -R backup-user:backup-group /source/
chmod -R u+rX /source/
# Use --fake-super for non-root backups (requires extended attributes)
rsync -av --fake-super /source/ /destination/
Connection Timeouts
Symptom: ssh: connect to host timeout or Connection timed out
Solutions:
# Increase timeout
rsync -av --timeout=600 /source/ remote:/destination/
# Test SSH connection separately
ssh -v user@remote-host
# Check firewall rules
sudo iptables -L -n | grep 22
# Use different port
rsync -av -e "ssh -p 2222" /source/ remote:/destination/
Partial Transfer Due to Vanished Files
Symptom: rsync: file has vanished or rsync warning: some files vanished
Explanation: Files deleted during backup (common with log files, temporary files)
Solutions:
# Treat exit code 24 ("some files vanished before they could be transferred") as success
rsync -av /source/ /destination/ || [ $? -eq 24 ]
# Exclude rapidly changing files
rsync -av --exclude='*.log' --exclude='tmp/' /source/ /destination/
# Use a filesystem snapshot as a stable source
lvcreate -L 10G -s -n snap /dev/vg0/data
mkdir -p /mnt/snap
mount -o ro /dev/vg0/snap /mnt/snap
rsync -av /mnt/snap/ /backup/
umount /mnt/snap
lvremove -f /dev/vg0/snap
Bandwidth Saturation
Symptom: Backup saturates network link, impacting production
Solutions:
# Limit bandwidth to 5 MB/s
rsync -av --bwlimit=5000 /source/ /destination/
# Schedule during off-hours
0 2 * * * /usr/local/bin/rsync-backup.sh
# Use ionice for I/O priority
ionice -c3 rsync -av /source/ /destination/
# Use nice for CPU priority
nice -n 19 rsync -av /source/ /destination/
Disk Space Exhaustion
Symptom: rsync: write failed: No space left on device
Solutions:
# Check available space before backup
AVAILABLE=$(df /backup | awk 'NR==2 {print $4}')
if [ $AVAILABLE -lt 10485760 ]; then
echo "Insufficient space"
exit 1
fi
# Implement retention policy
find /backup/old -mtime +30 -delete
# Use compression
rsync -avz /source/ /destination/
# Exclude unnecessary files
rsync -av --exclude='*.iso' --exclude='*.log' /source/ /destination/
Performance Optimization
Tuning for Large File Transfers
# Increase buffer sizes for large files
rsync -av \
--inplace \
--no-whole-file \
--partial \
/large-files/ /destination/
Optimizing for Many Small Files
# Disable incremental recursion for faster directory scanning
rsync -av \
--no-inc-recursive \
--delete \
/many-small-files/ /destination/
Parallel rsync for Maximum Throughput
#!/bin/bash
# Parallel rsync using GNU parallel
find /source/ -mindepth 1 -maxdepth 1 -type d | \
parallel -j 4 'rsync -av {}/ /destination/{/}/'
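If GNU parallel is not installed, xargs -P gives a similar fan-out under the same assumptions (one rsync job per top-level source directory, four at a time):
#!/bin/bash
# Parallel rsync using xargs -P
find /source/ -mindepth 1 -maxdepth 1 -type d -print0 | \
xargs -0 -P 4 -I{} sh -c 'rsync -av "$1"/ "/destination/$(basename "$1")/"' _ {}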
Conclusion
Rsync is an indispensable tool for implementing efficient, reliable backup strategies on Linux systems. Its combination of incremental transfer capabilities, flexible filtering options, and robust error handling makes it suitable for everything from simple local backups to complex multi-server disaster recovery implementations.
Key takeaways from this guide:
- Master the fundamentals: Understanding rsync's options and syntax is crucial for effective usage.
- Implement automation: Use cron or systemd timers to ensure consistent, hands-off backup execution.
- Monitor and verify: Always implement monitoring and verification to ensure backups are completing successfully.
- Test restoration: Regular restoration testing is critical; backups are only valuable if they can be restored.
- Optimize for your use case: Tune compression, bandwidth, and scheduling based on your specific requirements.
- Implement the 3-2-1 rule: Use rsync as part of a comprehensive strategy with multiple copies, different media, and offsite storage.
- Handle errors gracefully: Implement robust error handling, logging, and alerting in production scripts.
Whether you're protecting a single web server or orchestrating backups across a large infrastructure, rsync provides the flexibility and efficiency needed for production-grade backup solutions. Combined with proper planning, automation, and monitoring, rsync forms the foundation of a reliable disaster recovery strategy that protects your critical data against loss.
Start with simple implementations, test thoroughly, and gradually expand your backup strategy to meet your organization's specific recovery time objectives and recovery point objectives. Remember: the best backup strategy is one that's actually implemented, tested, and maintained.


