Disk Full: How to Find and Clean Space

Introduction

A full disk is one of the most common critical issues affecting Linux servers. When a filesystem reaches 100% capacity, applications can no longer write data, logging stops, databases risk corruption, and services crash. Knowing how to quickly identify what is consuming disk space and safely reclaim it is essential for every system administrator.

This comprehensive guide provides practical command-line tools and systematic methodologies for diagnosing and resolving disk space issues. You'll learn how to find large files, identify space-consuming directories, clean up unnecessary data, and implement preventive measures to avoid future disk space problems.

Disk space issues often escalate rapidly, especially on servers with heavy logging or data processing. This guide will teach you to recognize early warning signs, perform detailed analysis, and implement automated cleanup procedures to maintain healthy disk utilization across your infrastructure.

Understanding Disk Space Issues

Common Causes of Disk Full

  • Log Files: Rapidly growing application, system, or web server logs
  • Temporary Files: Accumulated files in /tmp, /var/tmp, and cache directories
  • Database Growth: Expanding database files and transaction logs
  • Backups: Old backup files not being rotated
  • Core Dumps: Large core dump files from crashed applications
  • Package Cache: Downloaded packages not cleaned up
  • User Data: Uploaded files, email attachments, user home directories
  • Orphaned Files: Files left behind by uninstalled applications
  • Inode Exhaustion: Too many small files consuming all inodes

Critical Thresholds

Understanding disk usage thresholds:

  • 0-70%: Normal healthy range
  • 70-85%: Monitor closely
  • 85-95%: Warning - take action soon
  • 95-98%: Critical - immediate action needed
  • 98-100%: Emergency - service degradation likely

Important: Some filesystems (like ext4) reserve 5% of blocks for root by default, so regular users can hit "No space left on device" while df still reports around 95% usage.
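
To see how large that reservation actually is on a given ext4 filesystem, tune2fs can report it; the device name below is an example:

# Reserved block count for an ext4 filesystem
tune2fs -l /dev/sda1 | grep -i "reserved block count"

# Block size, to convert the count into bytes
tune2fs -l /dev/sda1 | grep -i "^block size"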

Initial Disk Space Assessment

Quick Disk Status Check

Start with these rapid assessment commands:

# Overview of all filesystems
df -h

# Human-readable with filesystem type
df -hT

# Inode usage
df -i

# Specific filesystem
df -h /

# Exclude specific filesystem types
df -h -x tmpfs -x devtmpfs

# Sort by usage (skip the header line)
df -h | tail -n +2 | sort -k5 -rh

# Alert on high usage
df -h | awk '$5+0 > 85 {print $0}'

Quick interpretation:

# If Use% > 95%
# THEN immediate action required

# If IUse% > 90%
# THEN too many small files (inode exhaustion)

# If /var, /var/log full
# THEN likely log file issue

# If /home full
# THEN user data accumulation
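
The space and inode checks above can be rolled into a quick triage snippet; a minimal sketch assuming GNU df and awk, using the thresholds from this guide:

# Flag filesystems over 85% space or 90% inode usage (-P for stable columns)
df -P | awk 'NR > 1 && $5+0 > 85 {print $6 ": " $5 " space used - investigate"}'
df -iP | awk 'NR > 1 && $5+0 > 90 {print $6 ": " $5 " inodes used - possible inode exhaustion"}'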

Understanding df Output

df -hT

Filesystem     Type      Size  Used Avail Use% Mounted on
/dev/sda1      ext4       50G   47G  1.5G  97% /
/dev/sda2      ext4      100G   15G   80G  16% /home
tmpfs          tmpfs     7.8G     0  7.8G   0% /dev/shm

Key columns:

  • Filesystem: Device or partition
  • Type: Filesystem type (ext4, xfs, etc.)
  • Size: Total capacity
  • Used: Currently used space
  • Avail: Available space
  • Use%: Percentage used
  • Mounted on: Mount point

Step 1: Identifying Large Directories

Using du Command

The du (disk usage) command shows directory sizes:

# Current directory size
du -sh .

# All subdirectories
du -sh *

# Top-level directories from root
du -sh /* 2>/dev/null

# Sort by size
du -sh /* 2>/dev/null | sort -rh

# Top 10 largest directories
du -h / 2>/dev/null | sort -rh | head -10

# Specific directory analysis
du -h /var/log | sort -rh | head -20

# Show only directories over 1GB
du -h / 2>/dev/null | grep -E "^[0-9.]+G"

# Exclude certain paths
du -h --exclude=/proc --exclude=/sys / 2>/dev/null | sort -rh | head -20

# Maximum depth
du -h --max-depth=1 /var | sort -rh

Analyzing Common Directories

# Log directory
du -sh /var/log/*  | sort -rh | head -10

# Home directories
du -sh /home/* | sort -rh

# Temporary directories
du -sh /tmp /var/tmp /dev/shm

# Package cache
du -sh /var/cache/apt/archives  # Debian/Ubuntu
du -sh /var/cache/yum           # CentOS/RHEL

# Docker
du -sh /var/lib/docker

# Database
du -sh /var/lib/mysql
du -sh /var/lib/postgresql

# Web directories
du -sh /var/www/*

# Mail directories
du -sh /var/mail/*

Automated Directory Analysis Script

cat > /tmp/disk-analysis.sh << 'EOF'
#!/bin/bash

echo "Disk Space Analysis - $(date)"
echo "========================================"
echo ""

echo "Filesystem Usage:"
df -hT | grep -v "tmpfs\|devtmpfs"
echo ""

echo "Largest Directories in /:"
du -sh /* 2>/dev/null | sort -rh | head -10
echo ""

echo "Largest Directories in /var:"
du -sh /var/* 2>/dev/null | sort -rh | head -10
echo ""

echo "Log Files:"
du -sh /var/log/* 2>/dev/null | sort -rh | head -10
echo ""

echo "Largest Users:"
du -sh /home/* 2>/dev/null | sort -rh | head -10
echo ""

echo "Temporary Files:"
du -sh /tmp /var/tmp 2>/dev/null
EOF

chmod +x /tmp/disk-analysis.sh
/tmp/disk-analysis.sh

Step 2: Finding Large Files

Using find Command

Locate specific large files:

# Files larger than 100MB
find / -type f -size +100M -exec ls -lh {} \; 2>/dev/null

# Files larger than 1GB with details
find / -type f -size +1G -exec ls -lh {} \; 2>/dev/null | sort -k5 -rh

# Top 20 largest files
find / -type f -printf '%s %p\n' 2>/dev/null | sort -nr | head -20 | awk '{print $1/1024/1024 "MB", $2}'

# Large files in specific directory
find /var -type f -size +50M -exec ls -lh {} \; 2>/dev/null

# Large files modified in last 7 days
find / -type f -size +100M -mtime -7 -exec ls -lh {} \; 2>/dev/null

# Files larger than 500MB sorted
find / -type f -size +500M -printf '%s %p\n' 2>/dev/null | sort -rn | awk '{printf "%.2f GB  %s\n", $1/1024/1024/1024, $2}'

# Large log files
find /var/log -type f -size +50M -exec ls -lh {} \; 2>/dev/null

# Core dump files
find / -name "core.*" -o -name "*.core" 2>/dev/null -exec ls -lh {} \;

Finding Files by Age

# Files older than 30 days
find / -type f -mtime +30 -size +10M 2>/dev/null

# Files not accessed in 90 days
find / -type f -atime +90 -size +50M 2>/dev/null

# Recently created large files
find / -type f -size +100M -mtime -1 -exec ls -lh {} \; 2>/dev/null

# Old temporary files
find /tmp -type f -mtime +7 -exec ls -lh {} \; 2>/dev/null
find /var/tmp -type f -mtime +30 -exec ls -lh {} \; 2>/dev/null

Quick Large File Finder Script

cat > /tmp/find-large-files.sh << 'EOF'
#!/bin/bash

SIZE_MB=${1:-100}
echo "Finding files larger than ${SIZE_MB}MB..."
echo ""

find / -type f -size +"${SIZE_MB}"M -printf '%s %p\n' 2>/dev/null |
sort -rn |
head -20 |
while read size path; do
    size_mb=$(echo "scale=2; $size/1024/1024" | bc)
    printf "%8.2f MB  %s\n" $size_mb "$path"
done
EOF

chmod +x /tmp/find-large-files.sh

# Find files > 100MB
/tmp/find-large-files.sh 100

# Find files > 500MB
/tmp/find-large-files.sh 500

Step 3: Analyzing Specific Problem Areas

Log Files Analysis

Logs are the most common cause of disk space issues:

# Log directory size
du -sh /var/log
du -sh /var/log/* | sort -rh | head -10

# Largest log files
find /var/log -type f -printf '%s %p\n' | sort -rn | head -20 | awk '{print $1/1024/1024 "MB", $2}'

# Most recently modified logs (usually the active growers)
ls -lth /var/log/*.log | head -10

# Find logs not rotated
find /var/log -name "*.log" -size +100M

# Journal logs size
du -sh /var/log/journal
journalctl --disk-usage

# Check log rotation configuration
cat /etc/logrotate.conf
ls -la /etc/logrotate.d/

# Test logrotate
logrotate -d /etc/logrotate.conf
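
To see which log is actually growing right now, sampling sizes twice and diffing beats guessing from timestamps; a rough sketch with a 10-second window, assuming GNU coreutils:

# Sample log sizes twice, 10 seconds apart, and report the fastest growers
du -b /var/log/*.log 2>/dev/null | sort -k2 > /tmp/logsize.1
sleep 10
du -b /var/log/*.log 2>/dev/null | sort -k2 > /tmp/logsize.2
join -j 2 /tmp/logsize.1 /tmp/logsize.2 | awk '$3 > $2 {print $3-$2 " bytes/10s  " $1}' | sort -rn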

Package Manager Cache

Package caches can consume significant space:

# Debian/Ubuntu APT cache
du -sh /var/cache/apt/archives
ls -lh /var/cache/apt/archives/*.deb 2>/dev/null | wc -l

# Clean APT cache
apt clean
apt autoclean

# CentOS/RHEL YUM cache
du -sh /var/cache/yum
yum clean all

# DNF cache (newer systems)
du -sh /var/cache/dnf
dnf clean all

# List cached packages
ls -lh /var/cache/apt/archives/*.deb 2>/dev/null

Docker Cleanup

Docker can accumulate significant storage:

# Docker disk usage
docker system df

# Detailed breakdown
docker system df -v

# Remove unused containers
docker container prune -f

# Remove unused images
docker image prune -a -f

# Remove unused volumes
docker volume prune -f

# Remove everything unused
docker system prune -a -f --volumes

# Check Docker root directory size
du -sh /var/lib/docker
du -sh /var/lib/docker/overlay2
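
Blanket pruning can remove images you still want; the until filter restricts pruning to objects older than a given age:

# Prune only unused objects older than 7 days (168 hours)
docker system prune -a -f --filter "until=168h"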

Database Files

Database files can grow rapidly:

# MySQL/MariaDB data directory
du -sh /var/lib/mysql
du -sh /var/lib/mysql/* | sort -rh

# MySQL binary logs (per file plus total)
du -shc /var/lib/mysql/mysql-bin.* 2>/dev/null
mysql -e "SHOW BINARY LOGS;"

# Purge old binary logs (keep last 3 days)
mysql -e "PURGE BINARY LOGS BEFORE DATE(NOW() - INTERVAL 3 DAY);"

# PostgreSQL
du -sh /var/lib/postgresql
du -sh /var/lib/postgresql/*/main/base/* | sort -rh

# MongoDB
du -sh /var/lib/mongodb

# Check database sizes
mysql -e "SELECT table_schema AS 'Database', ROUND(SUM(data_length + index_length) / 1024 / 1024, 2) AS 'Size (MB)' FROM information_schema.tables GROUP BY table_schema ORDER BY SUM(data_length + index_length) DESC;"

Web Server Files

# Web directories
du -sh /var/www/* | sort -rh

# Upload directories
find /var/www -type d \( -name uploads -o -name files \) -exec du -sh {} + 2>/dev/null

# Old uploaded files
find /var/www -name uploads -type d -exec find {} -type f -mtime +90 \; | wc -l

# Temporary web files
find /var/www -type f \( -name "*.tmp" -o -name "*.cache" \) -exec du -ch {} + 2>/dev/null

Step 4: Inode Exhaustion

Checking Inode Usage

# Inode usage
df -i

# Find directories with most files
for dir in /*; do
    echo "$dir: $(find "$dir" -type f 2>/dev/null | wc -l) files"
done | sort -t: -k2 -rn

# Count files in subdirectories
find /var -maxdepth 2 -type d -exec sh -c 'echo "{}: $(find "{}" -type f | wc -l)"' \; | sort -t: -k2 -rn | head -20

# Find directories holding the most files (counts per parent directory)
find / -xdev -type f 2>/dev/null | sed 's|/[^/]*$||' | sort | uniq -c | sort -rn | head -20

# Session files
ls /tmp | wc -l
ls /var/tmp | wc -l
ls /var/lib/php/sessions 2>/dev/null | wc -l

Cleaning Inode Issues

# Remove old PHP sessions
find /var/lib/php/sessions -type f -mtime +7 -delete

# Remove old temporary files
find /tmp -type f -mtime +7 -delete
find /var/tmp -type f -mtime +30 -delete

# Clean up mail queue
postqueue -p | wc -l
postsuper -d ALL deferred

Step 5: Safe Cleanup Procedures

Log File Cleanup

# Truncate large logs (preserve file)
truncate -s 0 /var/log/large.log

# Alternative: empty log while keeping file
> /var/log/large.log

# Remove old rotated logs
find /var/log -name "*.log.*" -mtime +30 -delete
find /var/log -name "*.gz" -mtime +30 -delete

# Clean journal logs (keep 2 days)
journalctl --vacuum-time=2d

# Clean journal logs (keep 1GB)
journalctl --vacuum-size=1G

# Rotate logs manually
logrotate -f /etc/logrotate.conf

# Remove old kern.log files
find /var/log -name "kern.log.*" -mtime +7 -delete

Temporary Files Cleanup

# Clean /tmp (files older than 7 days)
find /tmp -type f -atime +7 -delete

# Clean /var/tmp (older than 30 days)
find /var/tmp -type f -atime +30 -delete

# Clean user cache
rm -rf ~/.cache/*

# Clean thumbnail cache
rm -rf ~/.thumbnails/*

# System temporary files
# Warning: only clear these when no active processes are using the files
rm -rf /tmp/*
rm -rf /var/tmp/*
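
On systemd systems, the same aging rules can run automatically via tmpfiles.d; a minimal drop-in whose ages mirror the find commands above:

# /etc/tmpfiles.d/tmp-cleanup.conf
# type path      mode  uid  gid  age
d     /tmp       1777  root root 7d
d     /var/tmp   1777  root root 30d

# Apply immediately
systemd-tmpfiles --clean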

Package Cleanup

# Debian/Ubuntu
apt autoremove -y
apt autoclean
apt clean

# Remove old kernels (keep current and one previous)
apt autoremove --purge -y

# List installed kernels
dpkg --list | grep linux-image

# CentOS/RHEL
yum autoremove -y
yum clean all

# Remove old kernels (package-cleanup ships with yum-utils)
package-cleanup --oldkernels --count=2

# List installed kernels
rpm -qa kernel

User Data Cleanup

# Find large user files
du -sh /home/* | sort -rh

# Old downloads
find /home/*/Downloads -type f -mtime +30

# Large email files
du -sh /var/mail/*

# Clear bash history
history -c
> ~/.bash_history

# Remove old core dumps (review matches first; "core" can be a legitimate filename)
find /home -name "core.*" -type f -delete
find / -name "core" -type f -delete 2>/dev/null

Step 6: Advanced Cleanup Techniques

Finding Deleted But Open Files

Files deleted but still held open by processes:

# List deleted files still open
lsof | grep deleted

# Detailed view
lsof +L1

# Size of deleted open files (skip the header row)
lsof +L1 | awk 'NR > 1 {sum+=$7} END {printf "Total: %.1f MB\n", sum/1024/1024}'

# Kill process holding deleted file
kill -9 $(lsof +L1 | grep filename | awk '{print $2}')

# Alternative: restart service
systemctl restart service-name
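
Instead of killing the process, the deleted file can often be truncated in place through the process's file descriptor; the PID (1234) and FD number (4) below are examples read from the lsof output:

# Truncate a deleted-but-open file via /proc
: > /proc/1234/fd/4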

Sparse Files Detection

# Find sparse files
find / -type f -printf "%S %p\n" 2>/dev/null | awk '$1 < 1.0 {print}'

# Check if file is sparse
du --apparent-size -h file.img
du -h file.img
# If different, file is sparse
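
A safe way to see the difference is to create a sparse file yourself; truncate allocates apparent size without writing any data blocks:

# Create a 1 GiB sparse file in /tmp and compare the two size views
truncate -s 1G /tmp/sparse-demo.img
du -h --apparent-size /tmp/sparse-demo.img   # ~1.0G apparent size
du -h /tmp/sparse-demo.img                   # ~0 blocks actually allocated
rm /tmp/sparse-demo.img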

Finding Duplicates

# Find duplicate files with fdupes
apt install fdupes
fdupes -r /home

# Delete duplicates interactively
fdupes -rd /home

# Find duplicate files manually
find /path -type f -exec md5sum {} + | sort | uniq -w32 -dD

Solutions and Prevention

Implementing Log Rotation

# Configure logrotate for custom application
cat > /etc/logrotate.d/myapp << 'EOF'
/var/log/myapp/*.log {
    daily
    rotate 7
    compress
    delaycompress
    missingok
    notifempty
    create 0640 www-data www-data
    sharedscripts
    postrotate
        systemctl reload myapp >/dev/null 2>&1 || true
    endscript
}
EOF

# Test configuration
logrotate -d /etc/logrotate.d/myapp

# Force rotation
logrotate -f /etc/logrotate.d/myapp

Automated Cleanup Scripts

cat > /usr/local/bin/disk-cleanup.sh << 'EOF'
#!/bin/bash

LOG_FILE="/var/log/disk-cleanup.log"

echo "$(date): Starting disk cleanup" >> "$LOG_FILE"

# Clean package cache
apt autoclean -y >> "$LOG_FILE" 2>&1
apt autoremove -y >> "$LOG_FILE" 2>&1

# Clean old logs
find /var/log -name "*.log.*" -mtime +30 -delete
find /var/log -name "*.gz" -mtime +30 -delete

# Clean temporary files
find /tmp -type f -atime +7 -delete
find /var/tmp -type f -atime +30 -delete

# Clean journal
journalctl --vacuum-time=7d >> "$LOG_FILE" 2>&1

# Clean Docker (if installed)
if command -v docker &> /dev/null; then
    docker system prune -f >> "$LOG_FILE" 2>&1
fi

# Report disk usage
echo "$(date): Disk cleanup completed" >> "$LOG_FILE"
df -h >> "$LOG_FILE"
EOF

chmod +x /usr/local/bin/disk-cleanup.sh

# Schedule weekly cleanup (append to the existing crontab instead of replacing it)
(crontab -l 2>/dev/null; echo "0 2 * * 0 /usr/local/bin/disk-cleanup.sh") | crontab -

Disk Usage Monitoring

cat > /usr/local/bin/disk-monitor.sh << 'EOF'
#!/bin/bash

THRESHOLD=85
ALERT_EMAIL="[email protected]"

df -H | grep -vE '^Filesystem|tmpfs|cdrom' | awk '{print $5 " " $1 " " $6}' | while read output;
do
    usage=$(echo $output | awk '{print $1}' | cut -d'%' -f1)
    partition=$(echo $output | awk '{print $2}')
    mount=$(echo $output | awk '{print $3}')

    if [ "$usage" -ge "$THRESHOLD" ]; then
        echo "Alert: Disk usage on $partition ($mount) is ${usage}%" | \
            mail -s "Disk Space Alert on $(hostname)" "$ALERT_EMAIL"

        # Send the top space consumers in a single follow-up mail
        {
            echo "Top consumers in $mount:"
            du -sh "$mount"/* 2>/dev/null | sort -rh | head -10
        } | mail -s "Disk Usage Details for $mount" "$ALERT_EMAIL"
    fi
done
EOF

chmod +x /usr/local/bin/disk-monitor.sh

# Run every hour (append, don't overwrite the crontab)
(crontab -l 2>/dev/null; echo "0 * * * * /usr/local/bin/disk-monitor.sh") | crontab -

Quota Implementation

# Install quota tools
apt install quota

# Edit /etc/fstab, add usrquota,grpquota
/dev/sda2  /home  ext4  defaults,usrquota,grpquota  0  2

# Remount
mount -o remount /home

# Create quota files
quotacheck -cum /home
quotacheck -cgm /home

# Enable quotas
quotaon -v /home

# Set user quota (1GB soft, 1.5GB hard)
setquota -u username 1000000 1500000 0 0 /home

# Set group quota
setquota -g groupname 5000000 6000000 0 0 /home

# Check quota
quota -u username
repquota -a
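
Soft limits only take effect once the grace period expires; it can be set explicitly (7 days = 604800 seconds in this example):

# Set block and inode grace periods for user quotas to 7 days
setquota -t -u 604800 604800 /home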

Setting Up Alerts

cat > /etc/cron.hourly/disk-alert << 'EOF'
#!/bin/bash

USAGE=$(df / | tail -1 | awk '{print $5}' | cut -d'%' -f1)

if [ $USAGE -gt 90 ]; then
    echo "Critical: Root filesystem is ${USAGE}% full on $(hostname)" | \
        mail -s "CRITICAL: Disk Space Alert" [email protected]
elif [ $USAGE -gt 80 ]; then
    echo "Warning: Root filesystem is ${USAGE}% full on $(hostname)" | \
        mail -s "WARNING: Disk Space Alert" [email protected]
fi
EOF

chmod +x /etc/cron.hourly/disk-alert

Emergency Procedures

When Disk is 100% Full

# 1. Immediately free some space
rm -f /var/log/*.log.1 /var/log/*.log.*.gz

# 2. Truncate largest log
largest_log=$(find /var/log -type f -printf '%s %p\n' | sort -rn | head -1 | awk '{print $2}')
truncate -s 0 "$largest_log"

# 3. Clear package cache
apt clean
yum clean all

# 4. Clear temporary files
rm -rf /tmp/*
rm -rf /var/tmp/*

# 5. Check and restart affected services
systemctl status
systemctl restart failing-service
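
If nothing else can be freed fast enough on an ext4 filesystem, temporarily lowering the reserved-block percentage hands part of the root reserve back to ordinary processes. This is a stopgap, not a fix, and the device name below is an example:

# 6. Emergency only: release part of the ext4 root reserve
tune2fs -m 1 /dev/sda1

# Restore the default 5% reserve once space is recovered
tune2fs -m 5 /dev/sda1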

Boot Partition Full

# List installed kernels
dpkg --list | grep linux-image

# Remove old kernels (keep current + 1)
apt autoremove --purge -y

# Manually remove old kernel
apt remove linux-image-VERSION-generic

# Clean up /boot
rm -f /boot/*.old-dkms
update-grub

Conclusion

Disk space management is crucial for system stability and requires proactive monitoring and maintenance. Key takeaways:

  1. Monitor proactively: Don't wait until 100% to take action
  2. Implement log rotation: Prevent logs from consuming all space
  3. Automate cleanup: Schedule regular maintenance tasks
  4. Set up alerts: Get notified before critical thresholds
  5. Use quotas: Prevent individual users from consuming all space
  6. Know your data: Understand what should and shouldn't be on the system
  7. Document procedures: Keep cleanup scripts and procedures ready

Regular monitoring, automated cleanup procedures, and understanding these diagnostic tools will help you prevent disk space emergencies and quickly resolve issues when they occur. Keep these commands readily available for rapid troubleshooting when disk space issues arise.