dmesg and Kernel Log Analysis
dmesg is the primary tool for reading the Linux kernel ring buffer, which contains messages about hardware initialization, driver events, filesystem operations, and system errors. This guide covers dmesg usage, log levels, filtering techniques, timestamp interpretation, common error patterns, and automated monitoring of kernel messages.
Prerequisites
- Any Linux system (Ubuntu, CentOS, Rocky, Debian)
- Root or sudo access (unprivileged dmesg access is restricted on many distributions via the kernel.dmesg_restrict sysctl)
- Basic command-line knowledge
Basic dmesg Usage
# Show all kernel messages (requires sudo on most modern systems)
sudo dmesg
# Show last N lines
sudo dmesg | tail -20
# Human-readable output with relative timestamps (newer util-linux versions)
sudo dmesg -H
# Convert timestamps to wall-clock time
sudo dmesg -T
# Clear the ring buffer (clears output for next run)
sudo dmesg -C
# Show messages and clear
sudo dmesg -c
# Follow new messages in real time
sudo dmesg -w
sudo dmesg --follow
# Force colorized output (a util-linux feature; color is on by default on a terminal)
sudo dmesg -L
# View raw kernel log level indicators
sudo dmesg --raw | head -20
# Output format: <priority>message text, where priority = facility*8 + level
# For kernel-facility messages: <0> = EMERG, <1> = ALERT, ... <7> = DEBUG
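The raw `<priority>` prefix can be turned into a per-level summary with a little awk. A minimal sketch, using illustrative sample lines (on a live system, pipe in `sudo dmesg --raw` instead):

```shell
#!/bin/sh
# Count dmesg --raw messages per log level. The <N> prefix encodes
# facility*8 + level, so the level is N % 8. Sample lines are illustrative.
summary=$(printf '%s\n' \
  '<3>ata1.00: failed command: READ FPDMA QUEUED' \
  '<4>CPU1: Core temperature above threshold' \
  '<6>usb 1-1: new high-speed USB device number 3' \
  '<3>EXT4-fs error (device sda1): ext4_find_entry' |
  awk -F'[<>]' 'NF >= 3 {
      split("EMERG ALERT CRIT ERR WARNING NOTICE INFO DEBUG", names, " ")
      count[names[($2 % 8) + 1]]++
    }
    END { for (name in count) print count[name], name }' | sort -rn)
echo "$summary"
```

Sorting numerically puts the noisiest level first, which is usually the one worth investigating.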
Log Levels and Filtering
The Linux kernel uses 8 log levels (0-7):
| Level | Name | Meaning |
|---|---|---|
| 0 | EMERG | System is unusable |
| 1 | ALERT | Action must be taken immediately |
| 2 | CRIT | Critical conditions |
| 3 | ERR | Error conditions |
| 4 | WARNING | Warning conditions |
| 5 | NOTICE | Normal but significant condition |
| 6 | INFO | Informational messages |
| 7 | DEBUG | Debug-level messages |
# Filter by log level (show only errors and above)
sudo dmesg --level=err,crit,alert,emerg
# Short form: -l
sudo dmesg -l err
# Show warnings and above
sudo dmesg -l warn,err,crit,alert,emerg
# Filter by subsystem/facility
sudo dmesg --facility=kern
# Combine level and facility
sudo dmesg --level=err --facility=kern
Quick pattern filtering:
# Find all errors and warnings
sudo dmesg | grep -iE "error|fail|warn|critical|oom|killed"
# Find hardware-specific messages
sudo dmesg | grep -i "nvme\|sata\|usb\|pci\|ata"
# Find memory messages
sudo dmesg | grep -iE "memory|oom|killed process"
# Find network messages
sudo dmesg | grep -i "eth\|ens\|enp\|link\|carrier"
Timestamps and Time Correlation
By default, dmesg shows time since boot in seconds. Convert to useful formats:
# Relative timestamp (seconds since boot)
sudo dmesg | head -5
# [ 0.000000] Booting Linux on physical CPU 0x0000000000 [0x410fd040]
# [ 0.000000] Linux version 5.15.0-100-generic...
# Convert to wall clock time
sudo dmesg -T | head -5
# [Sat Apr  4 10:23:45 2026] Booting Linux...
# Find events around a specific time (note the space-padded day field)
sudo dmesg -T | grep -E "Apr +4 10:"
# Correlate with system boot time
uptime # Shows how long system has been running
who -b # Shows last boot time
# Calculate approximate real time from a dmesg offset:
# event time = boot time + offset in seconds
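The boot-time-plus-offset arithmetic can be done directly in the shell. A sketch with a hard-coded boot epoch for reproducibility; on a live system, derive it from `/proc/uptime` as shown in the comment:

```shell
#!/bin/sh
# Convert a dmesg relative offset (seconds since boot) to wall-clock time:
#   event_time = boot_epoch + offset
# boot_epoch is hard-coded here as an example; on a live system use:
#   boot_epoch=$(( $(date +%s) - $(cut -d. -f1 /proc/uptime) ))
boot_epoch=1743760800     # example boot time (epoch seconds)
offset=100                # dmesg timestamp, truncated to whole seconds
event_epoch=$((boot_epoch + offset))
event_time=$(date -u -d "@$event_epoch" '+%Y-%m-%d %H:%M:%S UTC')
echo "$event_time"
```

`date -d @EPOCH` is a GNU coreutils extension; on BSD-style date use `date -r EPOCH` instead.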
# Use journalctl for kernel messages with full timestamps
journalctl -k # All kernel messages
journalctl -k -b # Current boot kernel messages
journalctl -k -b -1 # Previous boot kernel messages
journalctl -k --since "1 hour ago"
journalctl -k -p err # Kernel errors only
Common Error Patterns
OOM Killer (Out of Memory):
# Find OOM kills
sudo dmesg | grep -E "oom|Out of memory|Killed process"
# Detailed OOM info
sudo dmesg | grep -A 15 "Out of memory"
# Shows: OOM score, memory maps, which process was killed
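The killed process's PID and name can be extracted from the OOM line with sed. A sketch using an illustrative sample line (on a live system, feed in `sudo dmesg | grep "Killed process"`):

```shell
#!/bin/sh
# Parse the PID and command name out of an OOM-kill message.
# The sample line below is illustrative, not captured from a real system.
line='[12345.678901] Out of memory: Killed process 4321 (mysqld) total-vm:2097152kB, anon-rss:1048576kB'
pid=$(echo "$line"  | sed -n 's/.*Killed process \([0-9]*\) (\([^)]*\)).*/\1/p')
name=$(echo "$line" | sed -n 's/.*Killed process \([0-9]*\) (\([^)]*\)).*/\2/p')
echo "OOM killed: $name (pid $pid)"
```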
Disk Errors:
# SCSI/SATA errors
sudo dmesg | grep -iE "ata.*error|scsi.*error|I/O error|sector"
# NVME errors
sudo dmesg | grep -i "nvme.*error\|nvme.*reset"
# Filesystem errors
sudo dmesg | grep -iE "ext4.*error|xfs.*error|btrfs.*error"
# Interpret common disk errors:
# "end_request: I/O error" / "blk_update_request: I/O error" = failed I/O to the device
# "DRDY ERR" = drive reported ready but returned an error (often a failing drive)
# "UNC" = uncorrectable error (bad sectors)
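A quick way to see whether errors cluster on one device is to tally I/O errors per device name. A sketch over illustrative sample lines (in practice, pipe in `sudo dmesg | grep "I/O error"`):

```shell
#!/bin/sh
# Count I/O errors per block device from dmesg-style lines.
# Sample input is illustrative.
errors=$(printf '%s\n' \
  '[100.1] blk_update_request: I/O error, dev sda, sector 123456' \
  '[100.2] blk_update_request: I/O error, dev sda, sector 123464' \
  '[200.3] blk_update_request: I/O error, dev sdb, sector 99' |
  sed -n 's/.*dev \([a-z]*\),.*/\1/p' | sort | uniq -c | sort -rn)
echo "$errors"
```

Many errors on one device points at failing hardware; errors scattered across devices more often mean a controller, cable, or backplane problem.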
Network errors:
# Link state changes
sudo dmesg | grep -iE "link (up|down)|carrier"
# NIC errors
sudo dmesg | grep -iE "eth[0-9]|ens[0-9]|eno[0-9]" | grep -i "error\|reset\|timeout"
# Typical output for link down:
# [12345.678901] eth0: Link is Down
# [12346.123456] eth0: Link is Up - 1Gbps/Full - flow control rx/tx
Kernel BUG and Oops:
# Find kernel bugs
sudo dmesg | grep -E "BUG:|Oops:|kernel BUG|Call Trace"
# Full Oops analysis
sudo dmesg | grep -A 40 "kernel BUG"
# Look for: RIP (instruction pointer), Call Trace (stack trace)
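When an Oops is buried in thousands of lines, awk can isolate just the Call Trace section: start at the "Call Trace:" marker and keep the frames that follow (lines carrying `+0x` offsets). A sketch over illustrative sample lines (pipe in `sudo dmesg` on a live system):

```shell
#!/bin/sh
# Extract the Call Trace section of an Oops from kernel-log text.
# The sample lines below are illustrative.
trace=$(printf '%s\n' \
  '[99.1] BUG: kernel NULL pointer dereference, address: 0000000000000000' \
  '[99.1] RIP: 0010:example_function+0x1a/0x40' \
  '[99.1] Call Trace:' \
  '[99.1]  some_caller+0x33/0x80' \
  '[99.1]  another_caller+0x12/0x50' \
  '[99.2] usb 1-1: unrelated message' |
  awk '/Call Trace:/ { in_trace = 1 }
       in_trace && !/Call Trace:/ && !/\+0x/ { in_trace = 0 }
       in_trace')
echo "$trace"
```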
Memory hardware errors (EDAC):
# Check for correctable/uncorrectable memory errors
sudo dmesg | grep -iE "edac|corrected|uncorrected|mce"
# Machine Check Exceptions (hardware errors)
sudo dmesg | grep -i "mce\|machine check"
Hardware Troubleshooting with dmesg
Boot time hardware detection:
# View all boot messages
sudo dmesg -T | less
# CPU detection
sudo dmesg | grep -i "cpu\|processor\|core"
# Memory detection
sudo dmesg | grep -iE "memory|BIOS-e820|NUMA"
# PCI devices detected
sudo dmesg | grep -i pci | head -30
# USB devices
sudo dmesg | grep -i usb | head -20
# Block devices
sudo dmesg | grep -iE "sd[a-z]|nvme|ata[0-9]" | head -20
Monitor hot-plug events:
# Watch for device connect/disconnect in real time
sudo dmesg -w | grep -i usb
# After plugging in a USB drive:
# [ 1234.567] usb 1-1: new high-speed USB device number 3
# [ 1234.789] scsi host4: usb-storage 1-1:1.0
# [ 1235.001] sd 4:0:0:0: [sdb] 61497344 512-byte sectors
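The sector count in that last line tells you the new disk's capacity. A sketch that parses it out and converts to MiB, using an illustrative sample line (on a live system, grep `sudo dmesg` for the `[sdX]` tag):

```shell
#!/bin/sh
# Derive a capacity from the sector count dmesg reports for a new disk.
# The sample line is illustrative.
line='[ 1235.001] sd 4:0:0:0: [sdb] 61497344 512-byte logical blocks'
sectors=$(echo "$line" | sed -n 's/.*\[sd[a-z]\] \([0-9]*\) 512-byte.*/\1/p')
size_mib=$((sectors * 512 / 1024 / 1024))
echo "sdb: $sectors sectors (~${size_mib} MiB)"
```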
Diagnose GPU issues:
# NVIDIA driver messages
sudo dmesg | grep -i nvidia
# AMD GPU (AMDGPU/radeon)
sudo dmesg | grep -iE "amdgpu|radeon|drm"
# Intel GPU
sudo dmesg | grep -i "i915\|intel.*drm"
# GPU reset (crash recovery)
sudo dmesg | grep -i "gpu reset\|ring.*timeout"
Kernel Log Files and journald
Kernel messages are also saved to persistent log files:
# /var/log/kern.log (Ubuntu/Debian) - persistent kernel log
sudo tail -100 /var/log/kern.log
# /var/log/messages (CentOS/Rocky) - combined system log
sudo tail -100 /var/log/messages | grep kernel
# journald is the recommended way (persistent, structured)
journalctl -k --since "today"
journalctl -k -b -1 # Previous boot kernel messages
# Export kernel log to file
journalctl -k -b > /tmp/kernel-log-$(date +%Y%m%d).txt
Configure kernel log level (console and ring buffer):
# View current log levels
cat /proc/sys/kernel/printk
# Format: console_loglevel default_message_loglevel minimum_console_loglevel default_console_loglevel
# Example: 4 4 1 7
# (messages with a level numerically below console_loglevel reach the console;
#  4 means EMERG through ERR are printed there)
# Increase verbosity (everything up to INFO on the console; use 8 to include DEBUG)
sudo sysctl -w kernel.printk="7 4 1 7"
# Reduce noise (only CRIT and more severe to the console)
sudo sysctl -w kernel.printk="3 4 1 7"
# Make permanent
echo "kernel.printk = 4 4 1 7" | sudo tee /etc/sysctl.d/99-kernel-log.conf
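The four fields can be split into named variables for use in scripts. A sketch that parses a sample value for reproducibility (on a live system, read the real file as shown in the comment):

```shell
#!/bin/sh
# Split the four kernel.printk fields into named variables.
# A sample value is used here; on a live system read the file directly:
#   read console_ll default_msg_ll min_ll boot_default_ll < /proc/sys/kernel/printk
printk_value="4 4 1 7"
set -- $printk_value      # word-split into positional parameters
console_ll=$1; default_msg_ll=$2; min_ll=$3; boot_default_ll=$4
echo "console_loglevel=$console_ll: levels 0-$((console_ll - 1)) reach the console"
```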
Automated Kernel Log Monitoring
Set up automated alerting for kernel errors:
sudo tee /usr/local/bin/kernel-log-monitor.sh << 'SCRIPT'
#!/bin/bash
# Monitor kernel log for critical events
LOG_FILE="/var/log/kernel-monitor.log"
ALERT_EMAIL="[email protected]"
THRESHOLD_MINUTES=5
# Patterns to alert on
CRITICAL_PATTERNS="Out of memory|oom-killer|I/O error|DRDY ERR|kernel BUG|Oops:|Call Trace|EDAC|Machine Check"
# Get matching kernel messages from the last N minutes
RECENT_ERRORS=$(journalctl -k \
    --since "$THRESHOLD_MINUTES minutes ago" \
    --no-pager 2>/dev/null | grep -E "$CRITICAL_PATTERNS")
if [ -n "$RECENT_ERRORS" ]; then
    HOSTNAME=$(hostname)
    TIMESTAMP=$(date -u '+%Y-%m-%d %H:%M:%S UTC')
    {
        echo "$TIMESTAMP - Kernel errors detected on $HOSTNAME:"
        echo "$RECENT_ERRORS"
        echo "---"
    } >> "$LOG_FILE"
    # Send email alert (requires a configured MTA and the mail command)
    echo "$RECENT_ERRORS" | \
        mail -s "ALERT: Kernel errors on $HOSTNAME" "$ALERT_EMAIL" 2>/dev/null
    echo "Alert sent for errors at $TIMESTAMP"
fi
SCRIPT
sudo chmod +x /usr/local/bin/kernel-log-monitor.sh
# Check every 5 minutes
(sudo crontab -l 2>/dev/null; echo "*/5 * * * * /usr/local/bin/kernel-log-monitor.sh") | sudo crontab -
Parse and summarize dmesg for reports:
sudo tee /usr/local/bin/dmesg-summary.sh << 'EOF'
#!/bin/bash
echo "=== Kernel Log Summary: $(hostname) - $(date) ==="
echo ""
echo "-- Critical/Error messages --"
sudo dmesg -T --level=err,crit,alert,emerg 2>/dev/null | tail -20
echo ""
echo "-- OOM Events --"
sudo dmesg | grep -c "Out of memory" 2>/dev/null | \
xargs -I{} echo "{} OOM events found"
echo ""
echo "-- Disk I/O Errors --"
sudo dmesg | grep -c "I/O error" 2>/dev/null | \
xargs -I{} echo "{} I/O errors found"
echo ""
echo "-- Network Link Changes --"
sudo dmesg -T | grep -iE "link (up|down)" | tail -10
echo ""
echo "-- Last 10 kernel messages --"
sudo dmesg -T | tail -10
EOF
sudo chmod +x /usr/local/bin/dmesg-summary.sh
Troubleshooting
"Operation not permitted" running dmesg without sudo:
# Modern kernels restrict unprivileged dmesg access
# Check restriction level
cat /proc/sys/kernel/dmesg_restrict
# 0 = unrestricted, 1 = restricted to root
# Allow non-root users (less secure)
sudo sysctl -w kernel.dmesg_restrict=0
dmesg ring buffer is full and losing old messages:
# The ring buffer size is fixed at boot; check whether it was raised
sudo dmesg | grep -i "log_buf_len"
# Increase ring buffer at boot (GRUB kernel parameter)
# Add to GRUB_CMDLINE_LINUX in /etc/default/grub:
# log_buf_len=32M
sudo update-grub   # Debian/Ubuntu; RHEL-family: sudo grub2-mkconfig -o /boot/grub2/grub.cfg
Timestamps show wrong time zone:
# dmesg -T converts using local timezone
# Verify timezone
timedatectl
# Force UTC output (sudo resets the environment, so pass TZ through env)
sudo env TZ=UTC dmesg -T
Too many messages to find what you need:
# Use grep with context
sudo dmesg -T | grep -B5 -A10 "error" | less
# Pipe to less for paging
sudo dmesg -H | less
# Filter by time range after converting (note the space-padded day field)
sudo dmesg -T | awk '/Apr +4 1[0-2]:/ {print}'
Conclusion
dmesg and the kernel log are the first place to look when diagnosing hardware failures, driver issues, memory problems, and filesystem errors on Linux. Use dmesg -T for human-readable timestamps and dmesg -l err to focus on errors. For production systems, combine journalctl -k for persistent log access with the monitoring script to get proactive alerts when the kernel reports critical events. Early detection of disk I/O errors, OOM kills, and hardware faults can prevent minor issues from becoming full outages.