Docker Logs and Troubleshooting: Complete Practical Guide

Effective logging and troubleshooting are critical for maintaining healthy Docker environments. This guide covers Docker logging mechanisms, debugging techniques, common issues and their fixes, and monitoring strategies for containerized applications in production.

Introduction

Docker logging captures stdout and stderr streams from containers, providing visibility into application behavior. Proper logging and troubleshooting skills are essential for diagnosing issues, monitoring performance, and maintaining reliable containerized infrastructure.

What You'll Learn

  • Container log management and viewing techniques
  • Logging driver configuration for different backends
  • Debugging running and failed containers
  • Common Docker issues and their solutions
  • Production logging strategies
  • Advanced troubleshooting techniques

Prerequisites

  • Docker Engine installed
  • Basic Docker knowledge (containers, images)
  • Understanding of Linux logging concepts
  • Terminal access with appropriate permissions

Verify setup:

docker --version
docker info | grep "Logging Driver"

Docker Logging Architecture

How Docker Logs Work

Application (stdout/stderr)
    ↓
Container Runtime
    ↓
Logging Driver
    ↓
Log Destination (json-file, syslog, etc.)

Default Logging Driver

# Check default logging driver
docker info | grep "Logging Driver"

# Default is json-file
# Logs stored in: /var/lib/docker/containers/<container-id>/<container-id>-json.log

Log Types

  • Container logs: Application stdout/stderr
  • Daemon logs: Docker daemon itself
  • Image build logs: Dockerfile build output

Container Logs

Viewing Container Logs

# View all logs
docker logs container-name

# Follow logs in real-time
docker logs -f container-name

# Show timestamps
docker logs -t container-name

# Show last N lines
docker logs --tail 100 container-name

# Show logs since timestamp
docker logs --since 2024-01-01T00:00:00 container-name

# Show logs from last hour
docker logs --since 1h container-name

# Show logs until timestamp
docker logs --until 2024-01-01T23:59:59 container-name

# Combine options
docker logs -f --tail 50 --since 30m container-name

Log Filtering

# Logs between timestamps
docker logs --since 2024-01-01T00:00:00 --until 2024-01-01T12:00:00 container-name

# Last 24 hours
docker logs --since 24h container-name

# With grep filtering
docker logs container-name 2>&1 | grep ERROR

# Search for pattern
docker logs container-name 2>&1 | grep -i "connection failed"

# Count errors
docker logs container-name 2>&1 | grep -c ERROR
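
When a single grep count is not enough, the same pipeline can tally several levels in one pass. The following is a minimal sketch, assuming the application writes the level as a plain word (ERROR, WARN, INFO) in each line; the function name count_levels is hypothetical.

```shell
# count_levels: read log text on stdin and print one "LEVEL count" line per level.
# Assumption: the application writes the level as a plain word in each line.
# A line is counted once, under the most severe level it mentions.
count_levels() {
  awk '
    {
      if (/ERROR/) e++
      else if (/WARN/) w++
      else if (/INFO/) i++
    }
    END { printf "ERROR %d\nWARN %d\nINFO %d\n", e, w, i }
  '
}
```

Typical use would be: docker logs container-name 2>&1 | count_levels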

Multiple Container Logs

# View logs from multiple containers
docker-compose logs

# Follow all service logs
docker-compose logs -f

# Specific service
docker-compose logs -f web

# Multiple services
docker-compose logs web api database

Log Output Formatting

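# Without color (a Compose option; docker logs has no --no-color flag)
docker-compose logs --no-color

# Show extra attributes (env vars and labels selected with --log-opt)
docker logs --details container-name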

# JSON format (raw)
sudo cat /var/lib/docker/containers/<container-id>/<container-id>-json.log
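
Each line of the raw file is one JSON object with log, stream, and time fields, so ordinary text tools can summarize it. The following is a rough sketch that counts entries per stream; the function name stream_counts is hypothetical, and it assumes the compact field layout the json-file driver writes.

```shell
# stream_counts: count stdout and stderr entries in a json-file log.
# Each line of the file is one JSON object such as
#   {"log":"hello\n","stream":"stdout","time":"2024-01-01T00:00:00.000000000Z"}
stream_counts() {
  # grep -c prints the number of matching lines; "|| true" keeps a zero
  # count (grep exit status 1) from being treated as a failure.
  out_count=$(grep -c '"stream":"stdout"' "$1" || true)
  err_count=$(grep -c '"stream":"stderr"' "$1" || true)
  echo "stdout=$out_count stderr=$err_count"
}
```

The file is normally readable only by root, and its path for a given container can be looked up with docker inspect --format '{{.LogPath}}' container-name.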

Logging Drivers

Docker supports multiple logging drivers for different log destinations.

Available Logging Drivers

  • json-file: Default, JSON format on disk
  • syslog: System logging daemon
  • journald: Systemd journal
  • gelf: Graylog Extended Log Format
  • fluentd: Fluentd logging
  • awslogs: Amazon CloudWatch Logs
  • splunk: Splunk logging
  • gcplogs: Google Cloud Logging
  • logentries: Logentries logging (deprecated in Docker Engine 24.0, removed in 25.0)
  • none: Disable logging

Configure Logging Driver per Container

# Use syslog driver
docker run -d \
  --log-driver syslog \
  --log-opt syslog-address=tcp://192.168.1.100:514 \
  nginx

# Use json-file with options
docker run -d \
  --log-driver json-file \
  --log-opt max-size=10m \
  --log-opt max-file=3 \
  nginx

# Use journald
docker run -d \
  --log-driver journald \
  --log-opt tag="webapp" \
  nginx

# Disable logging (docker logs shows nothing for this container)
docker run -d --log-driver none nginx

JSON File Driver (Default)

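# Configure size limits and attach metadata to each log entry.
# labels= and env= take comma-separated label keys and environment variable
# names, so "production" here is a label key (set with --label), not a value.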
docker run -d \
  --log-driver json-file \
  --log-opt max-size=10m \
  --log-opt max-file=5 \
  --log-opt labels=production \
  --log-opt env=APP_ENV \
  nginx

Syslog Driver

# Send to remote syslog
docker run -d \
  --log-driver syslog \
  --log-opt syslog-address=tcp://syslog.example.com:514 \
  --log-opt tag="{{.Name}}/{{.ID}}" \
  --log-opt syslog-facility=daemon \
  nginx

# Local syslog
docker run -d \
  --log-driver syslog \
  --log-opt syslog-address=unix:///dev/log \
  nginx

Fluentd Driver

# Send logs to Fluentd
docker run -d \
  --log-driver fluentd \
  --log-opt fluentd-address=localhost:24224 \
  --log-opt tag="docker.{{.Name}}" \
  nginx

AWS CloudWatch Logs

# Send to CloudWatch
docker run -d \
  --log-driver awslogs \
  --log-opt awslogs-region=us-east-1 \
  --log-opt awslogs-group=my-app \
  --log-opt awslogs-stream=container-logs \
  nginx

GELF Driver (Graylog)

# Send to Graylog
docker run -d \
  --log-driver gelf \
  --log-opt gelf-address=udp://graylog.example.com:12201 \
  --log-opt tag="nginx" \
  nginx

Log Configuration

Global Daemon Configuration

Configure default logging for all containers:

# Edit daemon config
sudo nano /etc/docker/daemon.json

Add:

{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3",
    "labels": "production",
    "env": "os,customer"
  }
}

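Restart Docker (the new defaults apply only to containers created afterwards; existing containers keep their logging configuration until they are recreated):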

sudo systemctl restart docker

Docker Compose Logging

version: '3.8'

services:
  web:
    image: nginx
    logging:
      driver: json-file
      options:
        max-size: "10m"
        max-file: "3"

  api:
    image: my-api
    logging:
      driver: syslog
      options:
        syslog-address: "tcp://syslog:514"
        tag: "api"

  worker:
    image: my-worker
    logging:
      driver: fluentd
      options:
        fluentd-address: "localhost:24224"
        tag: "worker.{{.Name}}"

Environment-Specific Logging

# docker-compose.yml
version: '3.8'

services:
  app:
    image: my-app
    logging:
      driver: "json-file"
      options:
        max-size: "10m"

# docker-compose.prod.yml
version: '3.8'

services:
  app:
    logging:
      driver: "syslog"
      options:
        syslog-address: "tcp://logs.example.com:514"
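
The two files above are meant to be layered: Compose merges files passed with repeated -f flags, with later files overriding earlier ones. The following is a small sketch of selecting the files per environment, assuming override files are named docker-compose.<env>.yml as in the example; compose_args is a hypothetical helper that only prints the flags.

```shell
# compose_args: print the -f flags for a given environment name.
# The base file always comes first; the override file is added only if it
# exists in the current directory.
compose_args() {
  env_name=$1
  args="-f docker-compose.yml"
  if [ -f "docker-compose.$env_name.yml" ]; then
    args="$args -f docker-compose.$env_name.yml"
  fi
  echo "$args"
}
```

It could be used as docker-compose $(compose_args prod) up -d (unquoted, so the flags split into separate words). Running docker-compose $(compose_args prod) config first prints the merged result, which is a quick way to check the effective logging section.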

Debugging Containers

Inspect Container State

# View container details
docker inspect container-name

# Get specific field
docker inspect --format '{{.State.Status}}' container-name

# Get exit code
docker inspect --format '{{.State.ExitCode}}' container-name

# Get error message
docker inspect --format '{{.State.Error}}' container-name

# View all state information
docker inspect --format '{{json .State}}' container-name | jq

Execute Commands in Running Container

# Get shell access
docker exec -it container-name bash

# Run specific command
docker exec container-name ls -la /app

# Check process list
docker exec container-name ps aux

# Check environment variables
docker exec container-name env

# Check network connectivity
docker exec container-name ping -c 3 google.com

# Check disk usage
docker exec container-name df -h

# View running processes
docker top container-name

Debugging Failed Containers

# View logs of stopped container
docker logs container-name

# Start container in foreground
docker start -a container-name

# Commit failed container to image for debugging
docker commit container-name debug-image
docker run -it debug-image bash

# Override entrypoint to debug
docker run -it --entrypoint bash my-image

Attach to Running Container

# Attach to container's main process
docker attach container-name

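# Attach without forwarding signals, so Ctrl+C detaches instead of stopping the container
docker attach --sig-proxy=false container-name

# Detach without stopping: Ctrl+P, Ctrl+Q (works when the container was started with -it)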

Copy Files for Debugging

# Copy file from container
docker cp container-name:/app/log.txt ./log.txt

# Copy file to container
docker cp config.json container-name:/etc/app/config.json

# Copy directory
docker cp container-name:/var/log ./container-logs

Network Debugging

# Use netshoot for network debugging
docker run -it --rm --network container:target-container nicolaka/netshoot

# Inside netshoot:
# - ping, traceroute, tcpdump
# - curl, wget
# - netstat, ss, lsof
# - iperf, iftop

# Check container IP (works on user-defined networks as well as the default bridge)
docker inspect --format '{{range .NetworkSettings.Networks}}{{.IPAddress}} {{end}}' container-name

# Check network
docker network inspect network-name

# Test connectivity
docker exec container-name ping other-container
docker exec container-name curl http://other-container:8080

Common Issues and Solutions

Container Won't Start

# Check logs
docker logs container-name

# Check exit code
docker inspect --format '{{.State.ExitCode}}' container-name

# Common exit codes:
# 0   - Success
# 1   - Application error
# 125 - Docker daemon error
# 126 - Command cannot be invoked
# 127 - Command not found
# 137 - Container killed (OOM or SIGKILL)
# 139 - Segmentation fault
# 143 - Graceful termination (SIGTERM)

# Try running in foreground
docker run -it my-image

# Check if image exists
docker images | grep my-image

# Pull latest image
docker pull my-image:latest
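
The exit codes listed above follow a pattern: codes above 128 conventionally mean the process was terminated by signal number (code - 128), which is why 137 maps to SIGKILL (9) and 143 to SIGTERM (15). The following is a minimal sketch of that lookup; explain_exit_code is a hypothetical helper name.

```shell
# explain_exit_code: print a short description for a container exit code.
# Codes from 129 to 255 are read as "terminated by signal (code - 128)".
explain_exit_code() {
  code=$1
  case "$code" in
    0)   echo "success" ;;
    1)   echo "application error" ;;
    125) echo "Docker daemon error" ;;
    126) echo "command cannot be invoked" ;;
    127) echo "command not found" ;;
    *)
      if [ "$code" -gt 128 ] && [ "$code" -le 255 ]; then
        echo "killed by signal $((code - 128))"
      else
        echo "application-defined exit code"
      fi
      ;;
  esac
}
```

For example: explain_exit_code "$(docker inspect --format '{{.State.ExitCode}}' container-name)"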

Port Already in Use

# Find process using port
sudo netstat -tulpn | grep :8080
sudo lsof -i :8080

# Stop the process (try a plain kill first; use kill -9 only if it does not exit)
sudo kill <PID>

# Or use different port
docker run -p 8081:80 nginx

Container Exits Immediately

# Check if command completes
docker logs container-name

# Run with interactive shell
docker run -it --entrypoint sh my-image

# Override CMD
docker run my-image your-command

# Keep container running for debugging
docker run -d my-image tail -f /dev/null

Permission Denied Errors

# Check file permissions in container
docker exec container-name ls -la /path

# Run as root to fix permissions
docker exec -u root container-name chown -R appuser:appuser /path

# Check user container runs as
docker exec container-name whoami

# Run container as specific user
docker run --user 1000:1000 my-image

Out of Memory (OOM)

# Check if container was killed by OOM
docker inspect --format '{{.State.OOMKilled}}' container-name

# Set memory limits (--memory-swap is memory plus swap combined,
# so this allows 512 MB of RAM and 512 MB of swap)
docker run -m 512m --memory-swap 1g my-image

# Monitor memory usage
docker stats container-name

# Check system memory
free -h

# Check Docker disk usage (images, containers, volumes), not memory
docker system df

Disk Full

# Check Docker disk usage
docker system df

# Detailed view
docker system df -v

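# Clean up unused resources
# (destructive: removes stopped containers, all unused images,
# unused networks, and unused volumes)
docker system prune -a --volumes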

# Remove specific items
docker container prune
docker image prune -a
docker volume prune
docker network prune

DNS Resolution Issues

# Check DNS in container
docker exec container-name cat /etc/resolv.conf

# Test DNS
docker exec container-name nslookup google.com

# Use custom DNS
docker run --dns 8.8.8.8 --dns 8.8.4.4 my-image

# Set DNS in daemon.json
sudo nano /etc/docker/daemon.json

Add the following, then restart Docker (sudo systemctl restart docker) for it to take effect:

{
  "dns": ["8.8.8.8", "8.8.4.4"]
}

Network Connectivity Problems

# Check container network
docker network inspect bridge

# Restart networking
docker network disconnect bridge container-name
docker network connect bridge container-name

# Check iptables rules
sudo iptables -t nat -L -n

# Restart Docker daemon
sudo systemctl restart docker

Monitoring and Alerting

Real-Time Monitoring

# Monitor all containers
docker stats

# Monitor specific container
docker stats container-name

# No streaming (one-time output)
docker stats --no-stream

# Format output
docker stats --format "table {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}"
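
The formatted output is easy to post-process, for example to list only containers above a memory threshold. The following is a sketch under the assumption that the input comes from docker stats --no-stream --format '{{.Name}} {{.MemPerc}}', which prints lines such as "web 12.50%"; over_mem_threshold is a hypothetical helper name.

```shell
# over_mem_threshold: read "name percent%" lines on stdin and print the names
# of containers whose memory percentage is strictly above the threshold ($1).
over_mem_threshold() {
  awk -v limit="$1" '
    {
      pct = $2
      sub(/%$/, "", pct)                 # strip the trailing percent sign
      if (pct + 0 > limit + 0) print $1  # "+ 0" forces a numeric comparison
    }
  '
}
```

For example: docker stats --no-stream --format '{{.Name}} {{.MemPerc}}' | over_mem_threshold 80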

Health Checks

# In Dockerfile
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
  CMD curl -f http://localhost/ || exit 1

# Run with health check
docker run -d \
  --name web \
  --health-cmd="curl -f http://localhost/ || exit 1" \
  --health-interval=30s \
  --health-timeout=3s \
  --health-retries=3 \
  nginx

# Check health status
docker inspect --format='{{.State.Health.Status}}' web

# View health check logs
docker inspect --format='{{json .State.Health}}' web | jq
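
In scripts, the health status is usually polled until the container reports healthy. The following is a minimal sketch, assuming the container defines a health check (so the status is one of starting, healthy, or unhealthy); wait_healthy and its arguments are hypothetical.

```shell
# wait_healthy: poll a container's health status until it is "healthy".
# $1 = container name, $2 = maximum attempts, $3 = seconds between attempts
# (defaults to 2). Returns 0 once healthy, 1 if the container reports
# "unhealthy" or the attempts run out.
wait_healthy() {
  name=$1
  attempts=$2
  delay=${3:-2}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    status=$(docker inspect --format '{{.State.Health.Status}}' "$name" 2>/dev/null)
    case "$status" in
      healthy)   return 0 ;;
      unhealthy) return 1 ;;
    esac
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}
```

For example, wait_healthy web 30 2 waits up to about a minute for the web container started above.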

Container Events

# Stream events
docker events

# Filter events
docker events --filter 'type=container'
docker events --filter 'container=my-container'
docker events --filter 'event=start'

# Since timestamp
docker events --since '2024-01-01T00:00:00'

# Format output
docker events --format '{{.Time}} {{.Action}} {{.Actor.Attributes.name}}'
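
Event streams can feed simple alerting. The following is a sketch that reports containers that died with a non-zero exit code; it assumes die events carry an exitCode attribute and that the input was produced with docker events --filter 'event=die' --format '{{.Actor.Attributes.name}} {{.Actor.Attributes.exitCode}}'. The function name report_failures is hypothetical.

```shell
# report_failures: read "name exitCode" lines on stdin and print a message
# for every container that exited with a non-zero code.
report_failures() {
  while read -r name code; do
    if [ "$code" != "0" ]; then
      echo "container $name exited with code $code"
    fi
  done
}
```

Piping the docker events command above into report_failures prints one line per failed container as the events arrive.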

Logging Stack with ELK

version: '3.8'

services:
  app:
    image: my-app
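    logging:
      driver: gelf
      options:
        # The log driver runs in the Docker daemon on the host, which cannot
        # resolve Compose service names, so target the published port instead.
        gelf-address: "udp://localhost:12201"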

  elasticsearch:
    image: elasticsearch:8.11.0
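    environment:
      - discovery.type=single-node
      # Elasticsearch 8.x enables security (TLS and authentication) by default.
      # Disabled here so Kibana can connect over plain HTTP; local testing only.
      - xpack.security.enabled=false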
    volumes:
      - es-data:/usr/share/elasticsearch/data

  logstash:
    image: logstash:8.11.0
    volumes:
      - ./logstash.conf:/usr/share/logstash/pipeline/logstash.conf
    ports:
      - "12201:12201/udp"

  kibana:
    image: kibana:8.11.0
    ports:
      - "5601:5601"
    environment:
      ELASTICSEARCH_HOSTS: http://elasticsearch:9200

volumes:
  es-data:

Production Best Practices

Configure Log Rotation

{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3",
    "compress": "true"
  }
}
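
With rotation configured, the worst-case disk use per container is roughly max-size multiplied by max-file. The following is a small sketch of that arithmetic; max_log_mb is a hypothetical helper and the sizes are given in megabytes.

```shell
# max_log_mb: upper bound on json-file log disk use, in megabytes.
# $1 = max-size in MB, $2 = max-file, $3 = number of containers (default 1).
max_log_mb() {
  size_mb=$1
  files=$2
  containers=${3:-1}
  echo $(( size_mb * files * containers ))
}
```

With the settings above, max_log_mb 10 3 gives 30 MB per container and max_log_mb 10 3 20 gives 600 MB for twenty containers. Because compress applies to rotated files, real usage is usually lower than this bound.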

Use Structured Logging

// Node.js example with structured logging
const winston = require('winston');

const logger = winston.createLogger({
  format: winston.format.json(),
  transports: [
    new winston.transports.Console()
  ]
});

logger.info('User login', {
  userId: 123,
  ip: '192.168.1.1',
  timestamp: new Date().toISOString()
});

Centralized Logging

# Send all logs to centralized system
version: '3.8'

x-logging: &default-logging
  driver: syslog
  options:
    syslog-address: "tcp://logs.example.com:514"
    tag: "{{.Name}}"

services:
  web:
    image: nginx
    logging: *default-logging

  api:
    image: my-api
    logging: *default-logging

Label Containers for Log Filtering

docker run -d \
  --label environment=production \
  --label app=web \
  --label version=1.0 \
  --log-opt labels=environment,app,version \
  nginx

Security: Don't Log Sensitive Data

// Bad - logs password
logger.info('Login attempt', { username, password });

// Good - don't log password
logger.info('Login attempt', { username });

// Good - mask sensitive data
logger.info('API request', { apiKey: '***' + apiKey.slice(-4) });

Advanced Troubleshooting

Debug Docker Daemon

# View daemon logs
sudo journalctl -fu docker.service

# Check Docker service status
sudo systemctl status docker

# Run the daemon in the foreground in debug mode (stop the service first)
sudo systemctl stop docker.service docker.socket
sudo dockerd --debug

# View daemon configuration
docker info

Analyze Image Layers

# View image history
docker history my-image

# Analyze with dive
dive my-image

# Check image size
docker images my-image

Network Packet Capture

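# Capture traffic from container (tcpdump must be installed in the image;
# -c 1000 stops after 1000 packets so the capture does not run indefinitely)
docker exec container-name tcpdump -i eth0 -c 1000 -w /tmp/capture.pcap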

# Copy capture file
docker cp container-name:/tmp/capture.pcap ./capture.pcap

# Analyze with Wireshark or tcpdump
tcpdump -r capture.pcap

System Resource Analysis

# Check system load
uptime
top
htop

# Check Docker resource usage
docker system df
docker stats --no-stream

# Check available disk space
df -h

# Check inode usage
df -i

Performance Profiling

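# CPU and memory usage (live view)
docker stats container-name

# Processes sorted by memory use
docker exec container-name ps aux --sort=-%mem | head

# strace for system call debugging
# (strace must be installed in the image, and attaching may require the
# SYS_PTRACE capability, e.g. docker run --cap-add SYS_PTRACE)
docker exec container-name strace -p 1

# ltrace for library calls (same requirements as strace)
docker exec container-name ltrace -p 1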

Conclusion

Effective logging and troubleshooting are essential for maintaining healthy Docker environments. This guide covered logging mechanisms, debugging techniques, and production best practices.

Key Takeaways

  • Log Management: Use appropriate logging drivers and rotation
  • Debugging Tools: Master docker logs, exec, inspect, and stats
  • Common Issues: Understand exit codes and typical problems
  • Monitoring: Implement health checks and real-time monitoring
  • Production: Centralize logs, use structured logging, rotate logs
  • Security: Never log sensitive data, use secure log transport

Quick Reference

# Logging
docker logs -f --tail 100 container-name    # View logs
docker logs --since 1h container-name       # Recent logs
docker stats container-name                 # Resource usage

# Debugging
docker exec -it container-name bash         # Shell access
docker inspect container-name               # Container details
docker top container-name                   # Processes

# Troubleshooting
docker events                               # Docker events
docker system df                            # Disk usage
docker system prune                         # Cleanup

# Health
docker inspect --format='{{.State.Health.Status}}' container-name

Next Steps

  1. Implement: Set up centralized logging
  2. Monitor: Deploy monitoring stack (Prometheus/Grafana)
  3. Automate: Create alerting for critical issues
  4. Document: Maintain troubleshooting runbooks
  5. Practice: Simulate failures to rehearse the runbooks
  6. Optimize: Tune log retention and rotation
  7. Secure: Implement secure log transport and storage

Master these logging and troubleshooting techniques to maintain reliable, observable containerized applications.