Linux System Log Analysis (/var/log/)
Introduction
System logs are the black boxes of Linux servers, containing invaluable information about system events, security incidents, application errors, and operational activities. Understanding how to effectively analyze logs in the /var/log/ directory is a fundamental skill for system administrators, security professionals, and DevOps engineers.
Most significant events on a Linux system are recorded in log files or the systemd journal, from authentication attempts and kernel messages to application-specific errors and system warnings. These logs provide the forensic trail needed to troubleshoot issues, investigate security incidents, monitor system health, and maintain compliance with regulatory requirements.
This comprehensive guide explores the /var/log/ directory structure, teaches you how to interpret various log files, demonstrates powerful log analysis techniques using command-line tools, and provides practical examples for common troubleshooting scenarios. Whether you're diagnosing a failed service, investigating a security breach, or simply monitoring system health, mastering log analysis is essential for effective Linux system administration.
Prerequisites
Before diving into log analysis, ensure you have:
- A Linux server or workstation (Ubuntu 20.04/22.04, Debian 10/11, CentOS 7/8, Rocky Linux 8/9, or similar)
- Root or sudo access to read protected log files
- Basic understanding of Linux command-line interface
- SSH access to your server (for remote log analysis)
- Familiarity with basic text processing commands (grep, awk, sed)
Recommended Tools:
- less or more for viewing logs
- grep for searching log content
- awk and sed for advanced log parsing
- tail and head for viewing recent or first log entries
- journalctl for systemd journal logs
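Before starting, it can help to confirm that the recommended tools are actually installed. The following is a minimal sketch; the helper name missing_tools is hypothetical and exists only for this illustration.

```shell
#!/bin/bash
# missing_tools: print each named command that is not found on PATH.
# (Hypothetical helper name, used only for this illustration.)
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Check the recommended tools and report any that are absent.
missing=$(missing_tools less grep awk sed tail head journalctl)
if [ -n "$missing" ]; then
    echo "Not installed:" $missing
else
    echo "All recommended tools are available."
fi
```

On minimal container images journalctl is often the one that is missing, in which case the file-based techniques in this guide still apply.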
Understanding /var/log/ Directory Structure
The /var/log/ directory is the standard location for system and application log files on Linux systems. Let's explore the key log files and their purposes.
Common Log Files Overview
# List all log files in /var/log/
ls -lh /var/log/
# View directory structure
tree /var/log/ -L 2
Essential System Log Files
1. /var/log/syslog (Debian/Ubuntu) or /var/log/messages (RHEL/CentOS)
General system activity log containing messages from the kernel, system daemons, and applications.
# View recent syslog entries (Ubuntu/Debian)
sudo tail -f /var/log/syslog
# View messages log (CentOS/Rocky Linux)
sudo tail -f /var/log/messages
2. /var/log/auth.log (Debian/Ubuntu) or /var/log/secure (RHEL/CentOS)
Authentication and authorization logs including SSH logins, sudo usage, and user authentication events.
# View authentication log (Ubuntu/Debian)
sudo tail -f /var/log/auth.log
# View secure log (CentOS/Rocky Linux)
sudo tail -f /var/log/secure
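Because the file names differ between the Debian and RHEL families, scripts that run on both usually resolve the path first. A minimal sketch, assuming a hypothetical helper named first_existing (not part of any standard tool):

```shell
#!/bin/bash
# first_existing: print the first argument that names an existing file.
# (Hypothetical helper, introduced here for illustration.)
first_existing() {
    for candidate in "$@"; do
        if [ -f "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Resolve the general and authentication logs for this distribution.
SYSTEM_LOG=$(first_existing /var/log/syslog /var/log/messages) || SYSTEM_LOG=""
AUTH_LOG=$(first_existing /var/log/auth.log /var/log/secure) || AUTH_LOG=""

echo "System log: ${SYSTEM_LOG:-not found (try journalctl)}"
echo "Auth log:   ${AUTH_LOG:-not found (try journalctl)}"
```

Later commands can then use "$SYSTEM_LOG" and "$AUTH_LOG" instead of a hard-coded path. Systems that log only to the systemd journal may have neither file, which is why the fallback message points at journalctl.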
3. /var/log/kern.log
Kernel messages including hardware detection, driver loading, and kernel-level errors.
# View kernel log
sudo tail -f /var/log/kern.log
# Alternative: Use dmesg for kernel ring buffer
dmesg | tail -50
4. /var/log/dmesg
Boot-time kernel messages captured during system startup.
# View boot messages
sudo cat /var/log/dmesg
# Recent kernel ring buffer
dmesg -T | less
5. /var/log/boot.log
System boot and startup messages from init system and services.
# View boot log
sudo less /var/log/boot.log
Application-Specific Log Files
Web Server Logs:
# Apache logs
/var/log/apache2/access.log # HTTP requests (Debian/Ubuntu)
/var/log/apache2/error.log # Apache errors
/var/log/httpd/access_log # HTTP requests (RHEL/CentOS)
/var/log/httpd/error_log # Apache errors
# Nginx logs
/var/log/nginx/access.log # HTTP requests
/var/log/nginx/error.log # Nginx errors
Database Logs:
# MySQL/MariaDB
/var/log/mysql/error.log
/var/log/mysql/mysql.log
/var/log/mariadb/mariadb.log
# PostgreSQL
/var/log/postgresql/postgresql-*.log
Mail Server Logs:
# Postfix
/var/log/mail.log # General mail log (Debian/Ubuntu)
/var/log/maillog # Mail log (RHEL/CentOS)
/var/log/mail.err # Mail errors
System Service Logs:
# Cron jobs
/var/log/cron # Scheduled task execution (RHEL/CentOS; Debian/Ubuntu write cron entries to syslog by default)
# System daemon messages
/var/log/daemon.log
# User activity
/var/log/user.log
Log File Permissions and Security
Log files contain sensitive information and should have restricted permissions:
# Check log file permissions
ls -l /var/log/auth.log
# Typical output: -rw-r----- 1 syslog adm 45678 Jan 11 10:30 auth.log
# Verify proper ownership
sudo find /var/log -type f -exec ls -lh {} \; | head -20
# Check for world-readable sensitive logs (security issue)
sudo find /var/log -type f -perm -004
Basic Log Analysis Techniques
Viewing Log Files
Using tail for recent entries:
# View last 50 lines
sudo tail -50 /var/log/syslog
# Follow log in real-time
sudo tail -f /var/log/syslog
# Follow multiple logs simultaneously
sudo tail -f /var/log/syslog /var/log/auth.log
# Show last 100 lines from multiple files
sudo tail -n 100 /var/log/syslog /var/log/auth.log
Using head for oldest entries:
# View first 50 lines
sudo head -50 /var/log/syslog
# Combine with tail to view a specific range (here, lines 901-1000)
sudo head -n 1000 /var/log/syslog | tail -n 100
Using less for interactive viewing:
# Open log with less (searchable, scrollable)
sudo less /var/log/syslog
# Less shortcuts:
# / - search forward
# ? - search backward
# n - next match
# N - previous match
# G - go to end
# g - go to beginning
# F - follow mode (like tail -f)
# q - quit
Using cat for full content:
# Display entire log file
sudo cat /var/log/syslog
# Display with line numbers
sudo cat -n /var/log/syslog | less
# Display multiple files sequentially
sudo cat /var/log/syslog /var/log/auth.log
Searching Log Content
Basic grep searches:
# Search for specific term
sudo grep "error" /var/log/syslog
# Case-insensitive search
sudo grep -i "error" /var/log/syslog
# Search multiple files
sudo grep "failed" /var/log/auth.log /var/log/syslog
# Recursive search in directory
sudo grep -r "connection refused" /var/log/
# Show line numbers
sudo grep -n "error" /var/log/syslog
# Show context (3 lines before and after)
sudo grep -C 3 "error" /var/log/syslog
# Count occurrences
sudo grep -c "error" /var/log/syslog
# Invert match (show lines NOT matching)
sudo grep -v "info" /var/log/syslog
Advanced grep patterns:
# Search for failed SSH login attempts
sudo grep "Failed password" /var/log/auth.log
# Search for successful sudo commands
sudo grep "sudo.*COMMAND" /var/log/auth.log
# Search for specific IP address (-F treats the dots literally instead of as "any character")
sudo grep -F "192.168.1.100" /var/log/syslog
# Search using regular expressions
sudo grep -E "error|warning|critical" /var/log/syslog
# Search for lines starting with specific pattern
sudo grep "^Jan 11" /var/log/syslog
# Match alternatives with a group (grep -E replaces the deprecated egrep)
sudo grep -E "fail(ed|ure)" /var/log/auth.log
Filtering by Date and Time
Extract specific time ranges:
# Find all entries from specific date
sudo grep "Jan 11" /var/log/syslog
# Find entries from specific hour
sudo grep "Jan 11 14:" /var/log/syslog
# Find entries within time range (the range only opens and closes if lines with these exact timestamps exist)
sudo awk '/Jan 11 14:00/,/Jan 11 15:00/' /var/log/syslog
# Entries stamped with the clock hour of one hour ago (%e pads single-digit days with a space, as syslog does)
sudo awk -v ts="$(date --date='1 hour ago' '+%b %e %H')" 'index($0, ts) == 1' /var/log/syslog
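Pattern ranges depend on boundary lines being present in the log. A sketch of an alternative that compares the time field directly is shown here; it assumes the traditional "Mon DD HH:MM:SS" syslog timestamp, and the function name entries_between is hypothetical.

```shell
#!/bin/bash
# entries_between MONTH DAY START END FILE
# Print traditional-syslog lines for MONTH DAY whose HH:MM:SS field
# satisfies START <= time < END. (Hypothetical helper name.)
entries_between() {
    awk -v mon="$1" -v day="$2" -v start="$3" -v end="$4" '
        # Day is compared numerically so " 1" and "1" both match;
        # times are compared as fixed-width strings.
        $1 == mon && $2 + 0 == day + 0 && $3 >= start && $3 < end
    ' "$5"
}

# Example: everything logged on Jan 11 from 14:00 up to, not including, 15:00
# entries_between Jan 11 14:00:00 15:00:00 /var/log/syslog
```

Because HH:MM:SS is zero-padded and fixed width, plain string comparison orders the times correctly. Logs configured for ISO 8601 timestamps would need different field handling.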
Using journalctl for systemd logs:
# Logs since specific time
sudo journalctl --since "2024-01-11 14:00:00"
# Logs until specific time
sudo journalctl --until "2024-01-11 15:00:00"
# Logs from last hour
sudo journalctl --since "1 hour ago"
# Logs from today
sudo journalctl --since today
# Logs from yesterday
sudo journalctl --since yesterday --until today
Counting and Statistics
Generate log statistics:
# Count total lines in log (sudo is needed when the log is not readable by your user)
sudo wc -l /var/log/syslog
# Count error occurrences
sudo grep -c "error" /var/log/syslog
# Count unique IP addresses
sudo grep -oE '\b([0-9]{1,3}\.){3}[0-9]{1,3}\b' /var/log/nginx/access.log | sort -u | wc -l
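# Top 10 most common error messages (timestamp, hostname, and PIDs are stripped so identical messages group together)
sudo grep -i "error" /var/log/syslog | awk '{$1=$2=$3=$4=""; print}' | sed -E 's/^ +//; s/\[[0-9]+\]//' | sort | uniq -c | sort -rn | head -10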
# Count logs per hour
sudo awk '{print $3}' /var/log/syslog | cut -d: -f1 | sort | uniq -c
Advanced Log Analysis with AWK
AWK is a powerful text processing tool ideal for structured log analysis.
Basic AWK Log Analysis
Print specific columns:
# Print timestamp and process tag (fields 1-3 and 5)
sudo awk '{print $1, $2, $3, $5}' /var/log/syslog
# Print only error messages
sudo awk '/error/ {print $0}' /var/log/syslog
# Print messages from specific service
sudo awk '/nginx/ {print $0}' /var/log/syslog
Field-based filtering:
# Print logs from specific hour
sudo awk '$3 ~ /^14:/ {print $0}' /var/log/syslog
# Print logs with specific process
sudo awk '$5 == "sshd[1234]:" {print $0}' /var/log/auth.log
# Sum values in column (e.g., response sizes)
sudo awk '{sum+=$10} END {print sum}' /var/log/nginx/access.log
Advanced AWK Examples
Analyze Apache/Nginx access logs:
# Count requests by IP address
sudo awk '{print $1}' /var/log/nginx/access.log | sort | uniq -c | sort -rn | head -10
# Count requests by HTTP status code
sudo awk '{print $9}' /var/log/nginx/access.log | sort | uniq -c | sort -rn
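# Calculate average response time (assumes a custom log_format whose last field is $request_time; the default combined format does not log it)
sudo awk '{sum+=$NF; count++} END {if (count) print sum/count}' /var/log/nginx/access.log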
# Find 404 errors with URLs
sudo awk '$9 == 404 {print $7}' /var/log/nginx/access.log | sort | uniq -c | sort -rn
# Requests per minute
sudo awk '{print $4}' /var/log/nginx/access.log | cut -d: -f2,3 | sort | uniq -c
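The field numbers in these one-liners assume the widely used "combined" access log format, split on whitespace. The sketch below labels those fields on a single sample line; the line itself is made up for illustration (it uses a documentation-range address), and describe_fields is a hypothetical helper name.

```shell
#!/bin/bash
# One sample line in the combined log format (made up for illustration).
SAMPLE='203.0.113.7 - - [11/Jan/2024:14:05:32 +0000] "GET /index.html HTTP/1.1" 200 512 "-" "curl/8.0"'

# describe_fields: label the whitespace-separated fields the one-liners rely on.
describe_fields() {
    awk '{
        print "client_ip=" $1
        print "timestamp=" substr($4, 2)   # drop the leading "["
        print "method=" substr($6, 2)      # drop the leading quote
        print "path=" $7
        print "status=" $9
        print "bytes=" $10
    }'
}

printf '%s\n' "$SAMPLE" | describe_fields
```

If your server uses a custom log_format, run a real line through the same function first and adjust the field numbers before trusting any counts.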
Parse authentication logs:
# Count failed login attempts by user
sudo awk '/Failed password/ {print $(NF-5)}' /var/log/auth.log | sort | uniq -c | sort -rn
# Count sudo commands by invoking user (the user name is the field right after the "sudo:" tag)
sudo awk '/sudo.*COMMAND=/ {print $6}' /var/log/auth.log | sort | uniq -c | sort -rn
# Track SSH connections by IP
sudo awk '/Accepted password/ {print $(NF-3)}' /var/log/auth.log | sort | uniq -c | sort -rn
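The sshd examples count fields from the end of the line because the front of the message varies: a failure for an unknown account inserts the words "invalid user" before the name, while the tail ("from ADDRESS port N ssh2") stays the same. A minimal sketch that does the grouping inside awk; the function name failed_logins_by_ip is hypothetical.

```shell
#!/bin/bash
# failed_logins_by_ip FILE: count "Failed password" lines per source address.
# The address is the 4th field from the end of the sshd message, e.g.
#   ... Failed password for invalid user admin from 198.51.100.9 port 4242 ssh2
failed_logins_by_ip() {
    awk '/Failed password/ {count[$(NF-3)]++}
         END {for (ip in count) print count[ip], ip}' "$1" | sort -rn
}

# Example (needs read access to the log):
# sudo bash -c "$(declare -f failed_logins_by_ip); failed_logins_by_ip /var/log/auth.log"
```

Counting in an awk array avoids the separate sort and uniq passes, which matters little on small logs but keeps the pipeline short.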
System resource log analysis:
# Parse disk usage alerts
sudo awk '/disk.*full/ {print $0}' /var/log/syslog
# Track service starts (systemd logs "Started" on every start, including restarts)
sudo awk '/systemd.*Started/ {print $0}' /var/log/syslog
# Memory-related errors
sudo awk '/Out of memory|OOM/ {print $0}' /var/log/syslog
Log Analysis with SED
SED (Stream Editor) excels at log transformation and filtering.
Basic SED Operations
Filtering log entries:
# Delete blank lines
sudo sed '/^$/d' /var/log/syslog
# Print only lines containing "error"
sudo sed -n '/error/p' /var/log/syslog
# Delete lines containing "debug"
sudo sed '/debug/d' /var/log/syslog
# Print lines 100-200
sudo sed -n '100,200p' /var/log/syslog
Text transformation:
# Replace "error" with "ERROR"
sudo sed 's/error/ERROR/g' /var/log/syslog
# Remove timestamps (first 3 columns)
sudo sed 's/^[^ ]* [^ ]* [^ ]* //' /var/log/syslog
# Extract IP addresses
sudo sed -n 's/.*\([0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\).*/\1/p' /var/log/nginx/access.log
Advanced SED Analysis
Multi-line log processing:
# Join continuation lines (lines ending in a backslash) with the line that follows
sudo sed -e :a -e '/\\$/N; s/\\\n//; ta' /var/log/application.log
# Insert a separator line before every entry dated Jan 11 (GNU sed syntax)
sudo sed '/^Jan 11/i\---' /var/log/syslog
Conditional processing:
# Add prefix to error lines
sudo sed '/error/s/^/[ERROR] /' /var/log/syslog
# Print only the lines before the first error (deletes from the first match through the end)
sudo sed '/error/,$d' /var/log/syslog
# Keep only lines between two patterns
sudo sed -n '/START/,/END/p' /var/log/application.log
Practical Log Analysis Examples
Security Analysis
Detect brute force attacks:
#!/bin/bash
# detect-bruteforce.sh - Identify SSH brute force attempts
echo "=== Failed SSH Login Attempts by IP ==="
sudo grep "Failed password" /var/log/auth.log | \
awk '{print $(NF-3)}' | \
sort | uniq -c | sort -rn | \
awk '$1 > 5 {print $1 " attempts from " $2}'
echo ""
echo "=== Failed Logins by Username ==="
sudo grep "Failed password" /var/log/auth.log | \
awk '{print $(NF-5)}' | \
sort | uniq -c | sort -rn | head -10
echo ""
echo "=== Recent Failed Attempts (Last 20) ==="
sudo grep "Failed password" /var/log/auth.log | tail -20
Audit sudo usage:
#!/bin/bash
# audit-sudo.sh - Track sudo command usage
echo "=== Sudo Commands by User ==="
# The invoking user is the field right after the "sudo:" tag
sudo grep "sudo.*COMMAND=" /var/log/auth.log | \
awk '{print $6}' | \
sort | uniq -c | sort -rn
echo ""
echo "=== Recent Sudo Commands ==="
# Keep only the text after COMMAND= on each line
sudo grep "sudo.*COMMAND=" /var/log/auth.log | \
tail -20 | \
sed 's/.*COMMAND=//'
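Each sudo entry names the invoking user and then lists KEY=value pairs separated by semicolons (TTY, PWD, USER, COMMAND), so the key and its value share one whitespace-delimited field. A small sketch that pairs each user with the command they ran; the function name sudo_commands is hypothetical.

```shell
#!/bin/bash
# sudo_commands FILE: print "user: command" for each logged sudo invocation.
# Expects lines such as:
#   Jan 11 10:00:00 web1 sudo:    alice : TTY=pts/0 ; PWD=/home/alice ; USER=root ; COMMAND=/usr/bin/apt update
sudo_commands() {
    awk '/sudo(\[[0-9]+\])?:.*COMMAND=/ {
        user = $6                    # field after the "sudo:" tag
        cmd = $0
        sub(/.*COMMAND=/, "", cmd)   # keep everything after COMMAND=
        print user ": " cmd
    }' "$1"
}

# Example (needs read access to the log):
# sudo bash -c "$(declare -f sudo_commands); sudo_commands /var/log/auth.log" | tail -20
```

Taking the remainder of the line after COMMAND= preserves arguments that contain spaces, which a field-by-field comparison would split apart.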
Monitor privilege escalation attempts:
#!/bin/bash
# privilege-escalation.sh
echo "=== Failed Sudo Attempts ==="
sudo grep "sudo.*incorrect password" /var/log/auth.log
echo ""
echo "=== Su Command Usage ==="
sudo grep "su\[" /var/log/auth.log | tail -20
echo ""
echo "=== Authentication Failures ==="
sudo grep "authentication failure" /var/log/auth.log | \
awk '{print $NF}' | sort | uniq -c | sort -rn
Application Error Analysis
Web server error analysis:
#!/bin/bash
# webserver-errors.sh - Analyze web server errors
LOG="/var/log/nginx/error.log"
echo "=== Error Distribution by Type ==="
sudo grep -oE '\[(error|warn|crit|alert|emerg)\]' "$LOG" | \
sort | uniq -c | sort -rn
echo ""
echo "=== Most Common Error Messages ==="
sudo grep "error" "$LOG" | \
awk -F'] ' '{print $2}' | \
sort | uniq -c | sort -rn | head -10
echo ""
echo "=== Errors by Hour ==="
sudo grep "error" "$LOG" | \
awk '{print $1, $2}' | cut -d: -f1 | \
sort | uniq -c
echo ""
echo "=== PHP Errors ==="
sudo grep -i "php" "$LOG" | tail -10
Database error analysis:
#!/bin/bash
# database-errors.sh - Analyze MySQL/MariaDB errors
LOG="/var/log/mysql/error.log"
echo "=== Database Error Summary ==="
sudo grep -i "error" "$LOG" | tail -20
echo ""
echo "=== Connection Issues ==="
sudo grep -i "connection\|connect" "$LOG" | tail -10
echo ""
echo "=== Crash/Restart Events ==="
sudo grep -i "shutdown\|started\|crash" "$LOG" | tail -10
echo ""
echo "=== Slow Query Warnings ==="
sudo grep -i "slow" "$LOG" | tail -10
System Performance Analysis
Disk space warnings:
#!/bin/bash
# disk-warnings.sh
echo "=== Disk Space Warnings ==="
sudo grep -i "no space left\|disk full\|quota exceeded" /var/log/syslog
echo ""
echo "=== Disk I/O Errors ==="
sudo grep -i "I/O error\|disk error" /var/log/kern.log | tail -10
Memory issues:
#!/bin/bash
# memory-issues.sh
echo "=== Out of Memory Events ==="
sudo grep -i "out of memory\|OOM\|killed process" /var/log/syslog | tail -20
echo ""
echo "=== Memory Allocation Failures ==="
sudo grep -i "allocation failed\|cannot allocate memory" /var/log/syslog | tail -10
Service failures:
#!/bin/bash
# service-failures.sh
echo "=== Failed Service Starts ==="
sudo grep -i "failed\|failure" /var/log/syslog | \
grep -i "service\|systemd" | tail -20
echo ""
echo "=== Service Restarts ==="
sudo grep "systemd.*Started\|systemd.*Stopped" /var/log/syslog | tail -20
echo ""
echo "=== Crashed Services ==="
sudo grep -i "crash\|core dump\|segfault" /var/log/syslog | tail -10
Log Analysis Scripts
Comprehensive Log Analyzer
#!/bin/bash
# comprehensive-log-analyzer.sh - Daily log analysis report
# Run as root (for example via sudo): the report is written under /var/log/analysis
REPORT_DATE=$(date +%Y-%m-%d)
REPORT_FILE="/var/log/analysis/report-$REPORT_DATE.txt"
mkdir -p /var/log/analysis
{
echo "========================================="
echo "System Log Analysis Report"
echo "Date: $REPORT_DATE"
echo "Hostname: $(hostname)"
echo "========================================="
echo ""
echo "--- SECURITY ANALYSIS ---"
echo ""
echo "Failed SSH Login Attempts by IP:"
sudo grep "Failed password" /var/log/auth.log 2>/dev/null | \
awk '{print $(NF-3)}' | sort | uniq -c | sort -rn | head -10
echo ""
echo "Sudo Command Usage:"
sudo grep "sudo.*COMMAND" /var/log/auth.log 2>/dev/null | wc -l
echo ""
echo "--- ERROR ANALYSIS ---"
echo ""
echo "System Errors:"
sudo grep -i "error" /var/log/syslog 2>/dev/null | wc -l
echo ""
echo "Critical Events:"
sudo grep -i "critical\|crit" /var/log/syslog 2>/dev/null | tail -10
echo ""
echo "--- SERVICE STATUS ---"
echo ""
echo "Service Failures:"
sudo grep -i "failed" /var/log/syslog 2>/dev/null | \
grep -i "service\|systemd" | tail -10
echo ""
echo "Service Restarts:"
sudo grep "systemd.*Started" /var/log/syslog 2>/dev/null | \
wc -l
echo ""
echo "--- RESOURCE ISSUES ---"
echo ""
echo "Disk Space Warnings:"
sudo grep -i "no space left\|disk full" /var/log/syslog 2>/dev/null | wc -l
echo ""
echo "Memory Issues:"
sudo grep -i "out of memory\|OOM" /var/log/syslog 2>/dev/null | wc -l
echo ""
echo "--- WEB SERVER ANALYSIS ---"
if [ -f /var/log/nginx/access.log ]; then
echo ""
echo "Total HTTP Requests Today:"
sudo grep "$(date +%d/%b/%Y)" /var/log/nginx/access.log 2>/dev/null | wc -l
echo ""
echo "HTTP Status Code Distribution:"
sudo grep "$(date +%d/%b/%Y)" /var/log/nginx/access.log 2>/dev/null | \
awk '{print $9}' | sort | uniq -c | sort -rn
echo ""
echo "Top 10 Requesting IPs:"
sudo grep "$(date +%d/%b/%Y)" /var/log/nginx/access.log 2>/dev/null | \
awk '{print $1}' | sort | uniq -c | sort -rn | head -10
fi
echo ""
echo "========================================="
echo "Report Generated: $(date)"
echo "========================================="
} > "$REPORT_FILE"
echo "Analysis report saved to: $REPORT_FILE"
cat "$REPORT_FILE"
Real-Time Log Monitor
#!/bin/bash
# realtime-monitor.sh - Monitor logs for critical events
# Colors for output
RED='\033[0;31m'
YELLOW='\033[1;33m'
GREEN='\033[0;32m'
NC='\033[0m' # No Color
echo "Starting real-time log monitoring..."
echo "Press Ctrl+C to stop"
echo ""
# Monitor multiple logs simultaneously
# -F keeps following the file name after logrotate replaces the file
sudo tail -F /var/log/syslog /var/log/auth.log 2>/dev/null | while IFS= read -r line; do
    # Check the most specific patterns first; otherwise "failed password"
    # is swallowed by the generic "failed" match and never tagged SECURITY
    if echo "$line" | grep -qi "failed password\|authentication failure"; then
        echo -e "${RED}[SECURITY]${NC} $line"
    elif echo "$line" | grep -qi "error\|critical\|failed\|failure"; then
        echo -e "${RED}[ERROR]${NC} $line"
    elif echo "$line" | grep -qi "warning\|warn"; then
        echo -e "${YELLOW}[WARN]${NC} $line"
    elif echo "$line" | grep -qi "started\|stopped"; then
        echo -e "${GREEN}[SERVICE]${NC} $line"
    else
        echo "$line"
    fi
done
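Starting a grep process for every incoming line becomes expensive on busy logs. One possible alternative, sketched below, classifies the whole stream inside a single awk process. The function name classify_stream and the INFO tag are assumptions made for this illustration, and colors are left out to keep it short.

```shell
#!/bin/bash
# classify_stream: read log lines on stdin and prefix each with a tag.
# Patterns are checked from most to least specific, case-insensitively.
classify_stream() {
    awk '{
        line = tolower($0)
        if (line ~ /failed password|authentication failure/) tag = "SECURITY"
        else if (line ~ /error|critical|failed|failure/)     tag = "ERROR"
        else if (line ~ /warn/)                               tag = "WARN"
        else if (line ~ /started|stopped/)                    tag = "SERVICE"
        else                                                  tag = "INFO"
        print "[" tag "] " $0
        fflush()   # emit each line immediately when reading from tail
    }'
}

# Example:
# sudo tail -F /var/log/syslog /var/log/auth.log | classify_stream
```

The order of the branches matters for the same reason as in the script above: a line containing "Failed password" also contains "failed", so the security check has to come first.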
Log Retention Checker
#!/bin/bash
# log-retention-check.sh - Check log file sizes and retention
echo "=== Log File Size Analysis ==="
echo ""
# Find largest log files
echo "Top 10 Largest Log Files:"
sudo find /var/log -type f -exec du -h {} \; 2>/dev/null | \
sort -rh | head -10
echo ""
# Find old log files
echo "Log Files Older Than 30 Days:"
sudo find /var/log -type f -mtime +30 -exec ls -lh {} \; 2>/dev/null
echo ""
# Check total log directory size
echo "Total /var/log Directory Size:"
sudo du -sh /var/log
echo ""
# Check available disk space
echo "Available Disk Space on /var:"
df -h /var | tail -1
Alerting Based on Log Analysis
Email Alerts for Critical Events
#!/bin/bash
# log-alert.sh - Send email alerts for critical log events
EMAIL="admin@example.com" # placeholder address; replace with your own
HOSTNAME=$(hostname)
TEMP_FILE=$(mktemp) # unpredictable name, safer than a fixed path in /tmp
# Check for critical errors
CRITICAL_ERRORS=$(sudo grep -i "critical\|emerg\|alert" /var/log/syslog 2>/dev/null | tail -10)
if [ -n "$CRITICAL_ERRORS" ]; then
{
echo "Critical errors detected on $HOSTNAME"
echo "Time: $(date)"
echo ""
echo "Recent Critical Events:"
echo "$CRITICAL_ERRORS"
} > "$TEMP_FILE"
mail -s "ALERT: Critical Errors on $HOSTNAME" "$EMAIL" < "$TEMP_FILE"
rm -f "$TEMP_FILE"
fi
# Check for failed login attempts
FAILED_LOGINS=$(sudo grep -c "Failed password" /var/log/auth.log 2>/dev/null)
if [ "${FAILED_LOGINS:-0}" -gt 5 ]; then
{
echo "Multiple failed login attempts detected on $HOSTNAME"
echo "Time: $(date)"
echo "Count: $FAILED_LOGINS in the current auth.log"
echo ""
sudo grep "Failed password" /var/log/auth.log 2>/dev/null | tail -10
} | mail -s "ALERT: Failed Login Attempts on $HOSTNAME" "$EMAIL"
fi
# Check for disk space issues
DISK_ERRORS=$(sudo grep -i "no space left\|disk full" /var/log/syslog 2>/dev/null | tail -5)
if [ -n "$DISK_ERRORS" ]; then
{
echo "Disk space issues detected on $HOSTNAME"
echo "Time: $(date)"
echo ""
echo "Errors:"
echo "$DISK_ERRORS"
echo ""
echo "Current Disk Usage:"
df -h
} | mail -s "ALERT: Disk Space Issue on $HOSTNAME" "$EMAIL"
fi
Syslog-Based Alerting
#!/bin/bash
# syslog-monitor.sh - Monitor syslog for specific patterns
ALERT_PATTERNS=(
"error"
"critical"
"failed"
"out of memory"
"segfault"
"authentication failure"
)
# Monitor syslog in real-time
sudo tail -F /var/log/syslog | while IFS= read -r line; do
for pattern in "${ALERT_PATTERNS[@]}"; do
if echo "$line" | grep -qi "$pattern"; then
# Log to separate alert file
echo "$(date): $line" >> /var/log/critical-alerts.log
# Send to monitoring system (example: webhook)
# curl -X POST -d "alert=$line" https://monitoring.example.com/webhook
break
fi
done
done
Troubleshooting with Logs
Common Troubleshooting Scenarios
1. Service Won't Start
# Check service-specific logs
sudo journalctl -u service-name -n 50
# Check syslog for errors
sudo grep "service-name" /var/log/syslog | tail -20
# Check for dependency issues
sudo systemctl status service-name
2. SSH Connection Issues
# Check authentication logs
sudo grep "sshd" /var/log/auth.log | tail -30
# Look for specific connection attempts
sudo grep "Connection from" /var/log/auth.log | tail -20
# Check for denied connections
sudo grep "refused\|denied" /var/log/auth.log | tail -20
3. Web Server Not Responding
# Check error log for issues
sudo tail -50 /var/log/nginx/error.log
# Look for critical errors
sudo grep -i "crit\|emerg" /var/log/nginx/error.log | tail -20
# Check recent access patterns
sudo tail -100 /var/log/nginx/access.log
4. High System Load
# Check for OOM killer events
sudo grep -i "out of memory" /var/log/syslog
# Look for CPU-intensive processes in logs
sudo grep -i "cpu\|load" /var/log/syslog | tail -20
# Check kernel messages (root is required when kernel.dmesg_restrict is enabled)
sudo dmesg | grep -i "error\|fail" | tail -20
5. Disk I/O Problems
# Check for I/O errors
sudo grep -i "I/O error" /var/log/kern.log
# Look for disk-related issues
sudo grep -i "disk\|ata\|sd[a-z]" /var/log/syslog | tail -30
# Check SMART errors (if available)
sudo grep -i "smart" /var/log/syslog
Log Management Best Practices
Log Rotation
Ensure logs are rotated to prevent disk space issues:
# Check logrotate configuration
cat /etc/logrotate.conf
# Test logrotate configuration
sudo logrotate -d /etc/logrotate.conf
# Force log rotation
sudo logrotate -f /etc/logrotate.conf
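Files created by your own scripts, such as the /var/log/critical-alerts.log file written by the syslog monitor above, are not covered by any packaged logrotate rule. The sketch below writes a candidate rule to a scratch file and dry-runs it. The retention values and the root:root ownership are illustrative assumptions, not recommendations from this guide.

```shell
#!/bin/bash
# Write a candidate logrotate rule to a scratch file so it can be dry-run
# before being copied into /etc/logrotate.d/ under a name of your choice.
RULE_FILE=$(mktemp)
cat > "$RULE_FILE" <<'EOF'
/var/log/critical-alerts.log {
    weekly
    rotate 8
    compress
    delaycompress
    missingok
    notifempty
    create 0640 root root
}
EOF

# Dry run: -d prints what would happen without touching any log file.
if command -v logrotate >/dev/null 2>&1; then
    logrotate -d "$RULE_FILE" 2>&1 | head -20
else
    echo "logrotate not installed; rule left in $RULE_FILE"
fi
```

The missingok and notifempty directives keep the rule quiet on hosts where the monitor has not produced any alerts yet.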
Log Compression
# Compress old logs manually
sudo gzip /var/log/syslog.1
# Find and compress old logs
sudo find /var/log -type f -name "*.log.1" -exec gzip {} \;
Log Archival
#!/bin/bash
# archive-logs.sh - Archive logs older than 30 days
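ARCHIVE_DIR="/var/log/archive"
ARCHIVE_DATE=$(date -d "30 days ago" +%Y%m%d)
# Stage files in a subdirectory so earlier archives are never re-packed or removed
STAGING_DIR="$ARCHIVE_DIR/staging-$ARCHIVE_DATE"
sudo mkdir -p "$STAGING_DIR"
# Find and move old rotated logs, skipping the archive directory itself
sudo find /var/log -path "$ARCHIVE_DIR" -prune -o -type f -name "*.log.*" -mtime +30 -exec mv -n {} "$STAGING_DIR/" \;
# Compress the staged logs, then remove the staging directory on success
sudo tar -czf "$ARCHIVE_DIR/logs-$ARCHIVE_DATE.tar.gz" -C "$STAGING_DIR" . && sudo rm -rf "$STAGING_DIR"
echo "Logs archived to $ARCHIVE_DIR/logs-$ARCHIVE_DATE.tar.gz"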
Conclusion
Effective log analysis is a cornerstone of successful Linux system administration, security monitoring, and troubleshooting. The /var/log/ directory contains a wealth of information that, when properly analyzed, provides deep insights into system behavior, security events, and application performance.
Key takeaways from this guide:
- Understanding log structure - Know where different types of events are logged and what each log file contains
- Command-line tools - Master grep, awk, sed, tail, and journalctl for efficient log analysis
- Pattern recognition - Identify common log patterns for security incidents, errors, and performance issues
- Automation - Create scripts to automate routine log analysis and alerting
- Proactive monitoring - Implement real-time monitoring and alerting for critical events
- Log management - Maintain proper log rotation, retention, and archival strategies
Regular log analysis should be part of your daily system administration routine. By developing a systematic approach to log review, you'll catch issues early, respond quickly to security incidents, and maintain a healthy, well-performing infrastructure.
Remember that logs are only valuable if they're reviewed and acted upon. Establish a regular log review schedule, automate common analysis tasks, and integrate log monitoring into your overall observability strategy to maximize the value of your system logs.


