Docker Volumes: Data Persistence - Complete Guide

Docker volumes are the preferred mechanism for persisting data generated and used by Docker containers. Unlike a container's writable layer, a volume outlives the container that uses it, which makes data sharing, backups, and migrations possible. This guide covers Docker volumes from basic concepts to production strategies.

Introduction to Docker Volumes

Containers are ephemeral by design: when a container is removed, everything written to its writable layer is lost. Docker volumes solve this by providing persistent storage that exists independently of any container's lifecycle. Volumes are managed by Docker and stored outside the container's writable layer, so writes bypass the storage driver and the data is easier to back up and move.

Why Use Docker Volumes?

  • Data Persistence: Data survives container restarts and removals
  • Performance: Better I/O than the container's writable layer, because writes bypass the storage driver
  • Sharing: Share data between multiple containers
  • Backup: Easier to back up and restore
  • Portability: Move data between hosts
  • Security: Can be encrypted and access-controlled, depending on the volume driver

Storage Types Comparison

Feature             Volumes   Bind Mounts   tmpfs
Managed by Docker   Yes       No            No
Persistent          Yes       Yes           No
Shareable           Yes       Yes           No
Performance         High      Medium        Highest
Portability         High      Low           N/A
Host access         Limited   Direct        None

Prerequisites

Before working with Docker volumes, ensure you have:

  • Docker Engine installed (version 17.06 or higher)
  • Basic understanding of Docker containers
  • Sudo/root access for some operations
  • Understanding of filesystem concepts

Verify Docker installation:

docker --version
docker info | grep "Storage Driver"
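
The version requirement can also be checked from a script. The following is a minimal sketch, not an official check: `version_at_least` is a hypothetical helper name, and it compares only the major and minor numbers of a version string such as the one printed by `docker version --format '{{.Server.Version}}'`.

```shell
# version_at_least ACTUAL MINIMUM
# Succeeds when ACTUAL >= MINIMUM, comparing major and minor numbers only.
# Hypothetical helper for illustration; it is not part of the Docker CLI.
version_at_least() {
  actual_major=${1%%.*}
  rest=${1#*.}
  actual_minor=${rest%%.*}
  min_major=${2%%.*}
  rest=${2#*.}
  min_minor=${rest%%.*}
  # Drop one leading zero so "06" is never read as an octal number.
  actual_minor=${actual_minor#0}
  min_minor=${min_minor#0}
  [ "$actual_major" -gt "$min_major" ] ||
    { [ "$actual_major" -eq "$min_major" ] &&
      [ "${actual_minor:-0}" -ge "${min_minor:-0}" ]; }
}
```

A possible call, assuming the daemon is reachable: `version_at_least "$(docker version --format '{{.Server.Version}}')" 17.06 || echo "Docker is too old"`.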

Types of Data Persistence

1. Named Volumes (Recommended)

Managed by Docker, stored in Docker's storage directory:

docker volume create my-volume
docker run -v my-volume:/app/data nginx

2. Bind Mounts

Direct mapping to host filesystem:

docker run -v /host/path:/container/path nginx

3. tmpfs Mounts

Temporary, in-memory storage (Linux only):

docker run --tmpfs /app/temp nginx

Volume Architecture

Container Layer (ephemeral)
    ↓
Volume Mount Point
    ↓
Docker Volume (persistent)
    ↓
Host Filesystem (/var/lib/docker/volumes/)

Named Volumes

Named volumes are Docker-managed storage units with explicit names, making them easy to reference and manage.

Creating Named Volumes

# Create basic volume
docker volume create my-data

# Create with driver options
docker volume create \
  --driver local \
  --opt type=none \
  --opt device=/path/on/host \
  --opt o=bind \
  custom-volume

# Create with labels
docker volume create \
  --label environment=production \
  --label backup=daily \
  prod-data

Volume Storage Location

# Default location: /var/lib/docker/volumes/
sudo ls -la /var/lib/docker/volumes/

# Inspect volume location
docker volume inspect my-data

Using Named Volumes

# Run container with named volume
docker run -d \
  --name web \
  -v my-data:/usr/share/nginx/html \
  nginx

# Multiple volumes
docker run -d \
  --name app \
  -v app-data:/app/data \
  -v app-logs:/app/logs \
  -v app-config:/app/config \
  my-app:latest

Volume with Specific Options

# Read-only volume
docker run -d -v my-data:/app/data:ro nginx

# Volume with nocopy option (don't copy data from container)
docker run -d -v my-data:/app/data:nocopy nginx

# SELinux label (for SELinux-enabled systems)
docker run -d -v my-data:/app/data:z nginx  # shared: content can be used by multiple containers
docker run -d -v my-data:/app/data:Z nginx  # private: content is labeled for this container only

Sharing Volumes Between Containers

# Create volume
docker volume create shared-data

# First container writes data and exits before the reader starts
docker run --rm \
  -v shared-data:/data \
  alpine sh -c 'echo "Hello" > /data/message.txt'

# Second container reads data
docker run --rm \
  -v shared-data:/data \
  alpine cat /data/message.txt

Volume Lifecycle

# Create volume
docker volume create app-data

# Use in container
docker run -d --name app -v app-data:/data my-app

# Container removed, volume persists
docker rm -f app

# Volume still exists
docker volume ls

# Manually remove volume
docker volume rm app-data
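
Because a volume outlives its containers, scripts often need to ask whether a volume exists or is still referenced before acting on it. A minimal sketch, assuming the hypothetical helper names `volume_exists` and `remove_volume_if_unused`; it relies only on `docker volume inspect`, `docker ps --filter volume=...`, and `docker volume rm`.

```shell
# volume_exists NAME
# Succeeds if Docker knows a volume called NAME. `docker volume inspect`
# exits non-zero for unknown volumes, so its exit status is the answer.
volume_exists() {
  docker volume inspect "$1" >/dev/null 2>&1
}

# remove_volume_if_unused NAME
# Removes NAME only when no container (running or stopped) references it.
remove_volume_if_unused() {
  users=$(docker ps -a -q --filter "volume=$1")
  if [ -n "$users" ]; then
    echo "volume $1 is still referenced by: $users" >&2
    return 1
  fi
  docker volume rm "$1"
}
```

For example, `volume_exists app-data && remove_volume_if_unused app-data` removes the volume only if it exists and no container still references it.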

Bind Mounts

Bind mounts map a host directory or file directly into a container. They are useful for development and for cases where you need direct access to the host filesystem.

Basic Bind Mount

# Mount host directory
docker run -d \
  -v /host/path:/container/path \
  nginx

# Using --mount syntax (recommended)
docker run -d \
  --mount type=bind,source=/host/path,target=/container/path \
  nginx
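
One practical difference between the two syntaxes: with `-v`, Docker creates a missing host path as an empty directory, while `--mount` reports an error instead. A script can make the same check up front. The sketch below assumes a hypothetical helper named `bind_mount_arg`, and paths that contain no commas, since `--mount` takes comma-separated fields.

```shell
# bind_mount_arg HOST_PATH CONTAINER_PATH
# Prints a value for --mount describing a bind mount, or fails when
# HOST_PATH does not exist on the host.
bind_mount_arg() {
  if [ ! -e "$1" ]; then
    echo "host path does not exist: $1" >&2
    return 1
  fi
  printf 'type=bind,source=%s,target=%s\n' "$1" "$2"
}
```

A possible use: `arg=$(bind_mount_arg /host/path /container/path) && docker run -d --mount "$arg" nginx`.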

Development Workflow

# Mount source code for development
# -w sets the working directory so npm finds package.json;
# node_modules inside ./app is already visible through the bind mount
docker run -d \
  --name dev-container \
  -w /usr/src/app \
  -v "$(pwd)/app":/usr/src/app \
  -p 3000:3000 \
  node:18-alpine \
  npm run dev

Read-Only Bind Mount

# Mount as read-only
docker run -d \
  -v /host/config:/app/config:ro \
  nginx

# Using --mount
docker run -d \
  --mount type=bind,source=/host/config,target=/app/config,readonly \
  nginx

File Bind Mount

# Mount single file
docker run -d \
  -v /host/nginx.conf:/etc/nginx/nginx.conf:ro \
  nginx

Bind Mount Permissions

# Run as the current host user so files written to the bind mount keep that ownership
docker run -d \
  -v /host/data:/container/data \
  --user $(id -u):$(id -g) \
  nginx

# With specific user
docker run -d \
  -v /host/data:/container/data \
  --user 1000:1000 \
  nginx

Bind Mount Example: WordPress Development

# Mount WordPress source and database
docker run -d \
  --name wordpress \
  -v $(pwd)/wordpress:/var/www/html \
  -v $(pwd)/uploads:/var/www/html/wp-content/uploads \
  -p 8080:80 \
  wordpress:latest

docker run -d \
  --name mysql \
  -v $(pwd)/mysql:/var/lib/mysql \
  -e MYSQL_ROOT_PASSWORD=secret \
  mysql:8.0

tmpfs Mounts

tmpfs mounts are temporary, memory-only storage. Data is lost when the container stops.

Creating tmpfs Mount

# Basic tmpfs mount
docker run -d \
  --tmpfs /app/temp \
  nginx

# Using --mount syntax
docker run -d \
  --mount type=tmpfs,target=/app/temp \
  nginx

# With size limit
docker run -d \
  --mount type=tmpfs,target=/app/temp,tmpfs-size=100m \
  nginx

tmpfs Use Cases

# Temporary cache
docker run -d \
  --tmpfs /tmp:rw,noexec,nosuid,size=1g \
  nginx

# Session storage
docker run -d \
  --tmpfs /app/sessions:size=512m \
  my-app

# Build artifacts (don't persist)
docker run --rm \
  --tmpfs /app/build:size=2g \
  my-builder

tmpfs Performance

Perfect for:

  • Temporary files
  • Cache that doesn't need persistence
  • Sensitive data (automatically cleared)
  • High-speed storage requirements

# Database temp directory
docker run -d \
  --name postgres \
  --tmpfs /var/lib/postgresql/tmp:size=1g \
  -v postgres-data:/var/lib/postgresql/data \
  postgres:15-alpine

Volume Drivers

Docker supports various volume drivers for different storage backends.

Local Driver (Default)

# Default local driver
docker volume create my-volume

# Local with options
docker volume create \
  --driver local \
  --opt type=nfs \
  --opt o=addr=192.168.1.100,rw \
  --opt device=:/path/to/share \
  nfs-volume

NFS Volume

# Create NFS volume
docker volume create \
  --driver local \
  --opt type=nfs \
  --opt o=addr=nfs-server.example.com,rw,nfsvers=4 \
  --opt device=:/exported/path \
  nfs-data

# Use NFS volume
docker run -d \
  -v nfs-data:/app/data \
  nginx
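
The NFS options above can be wrapped in a small function when several volumes point at the same server. This is a sketch under assumptions: `create_nfs_volume` is a hypothetical helper name and the server and export path are placeholders. Note that the local driver does not contact the NFS server when the volume is created; a wrong address or export only surfaces when a container first mounts the volume.

```shell
# create_nfs_volume NAME SERVER EXPORT_PATH [NFS_VERSION]
# Creates a local-driver volume backed by an NFS export.
# NFS_VERSION defaults to 4, matching the example above.
create_nfs_volume() {
  docker volume create \
    --driver local \
    --opt type=nfs \
    --opt "o=addr=$2,rw,nfsvers=${4:-4}" \
    --opt "device=:$3" \
    "$1"
}
```

For example: `create_nfs_volume nfs-data nfs-server.example.com /exported/path`.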

Third-Party Volume Plugins

# Install plugin (example: REX-Ray)
docker plugin install rexray/s3fs

# Create volume with plugin
docker volume create \
  --driver rexray/s3fs \
  --opt bucket=my-bucket \
  s3-volume

Popular volume plugins:

  • REX-Ray: AWS EBS, S3, etc.
  • Portworx: Container-native storage
  • GlusterFS: Distributed filesystem
  • Ceph: Distributed storage

Volume Management Commands

Creating Volumes

# Create volume
docker volume create my-volume

# The name is a positional argument (the older --name flag is hidden in current CLIs)
docker volume create production-data

# With labels
docker volume create \
  --label app=myapp \
  --label env=prod \
  app-data

Listing Volumes

# List all volumes
docker volume ls

# Filter by name
docker volume ls -f name=app

# Filter by label
docker volume ls -f label=env=prod

# Show dangling volumes
docker volume ls -f dangling=true

Inspecting Volumes

# Inspect volume
docker volume inspect my-volume

# Get specific field
docker volume inspect --format '{{.Mountpoint}}' my-volume

# Inspect multiple volumes
docker volume inspect volume1 volume2

Removing Volumes

# Remove specific volume
docker volume rm my-volume

# Remove multiple volumes
docker volume rm volume1 volume2 volume3

# Remove unused volumes (since Docker 23.0 only anonymous volumes are removed by default; add --all to include named ones)
docker volume prune

# Try to remove every volume (dangerous!); volumes still referenced by a container fail with an error
docker volume rm $(docker volume ls -q)
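
A narrower alternative to removing everything is to select volumes by label and preview the result first. The sketch below assumes a hypothetical helper named `remove_volumes_by_label` and a label such as `env=staging`. It uses only `docker volume ls -q --filter label=...` and `docker volume rm`.

```shell
# remove_volumes_by_label LABEL [--yes]
# Without --yes, prints the volumes that carry LABEL and removes nothing.
# With --yes, removes each of them.
remove_volumes_by_label() {
  docker volume ls -q --filter "label=$1" | while IFS= read -r name; do
    if [ "${2:-}" = "--yes" ]; then
      docker volume rm "$name"
    else
      echo "would remove: $name"
    fi
  done
}
```

Running `remove_volumes_by_label env=staging` shows the candidates; repeating the call with `--yes` performs the removal.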

Volume Information

# Get volume size (requires sudo)
sudo du -sh /var/lib/docker/volumes/my-volume

# List volumes with size
docker system df -v

# Show the volume's mount point on the host
docker volume inspect my-volume | grep Mountpoint

Using Volumes with Containers

Multiple Volumes

# Container with multiple volumes
docker run -d \
  --name app \
  -v app-data:/app/data \
  -v app-logs:/var/log/app \
  -v app-cache:/app/cache \
  -v app-config:/etc/app:ro \
  my-app:latest

Volume from Another Container

# Create data container
docker create -v /data --name data-container alpine

# Use volumes from data container
docker run -d \
  --volumes-from data-container \
  --name app \
  nginx

Populate Volume from Container

# Container image has data in /app/data
# First run copies data to empty volume
docker run -d \
  --name app \
  -v app-data:/app/data \
  my-app:latest

# Subsequent runs use existing volume data
docker rm -f app
docker run -d \
  --name app \
  -v app-data:/app/data \
  my-app:latest

Volumes in Docker Compose

Basic Volume Configuration

version: '3.8'

services:
  web:
    image: nginx
    volumes:
      - html-data:/usr/share/nginx/html

  database:
    image: postgres:15-alpine
    volumes:
      - db-data:/var/lib/postgresql/data

volumes:
  html-data:
  db-data:

Named Volumes with Options

version: '3.8'

services:
  app:
    image: my-app
    volumes:
      - app-data:/app/data

volumes:
  app-data:
    driver: local
    driver_opts:
      type: none
      o: bind
      device: /path/on/host

Bind Mounts in Compose

version: '3.8'

services:
  web:
    image: nginx
    volumes:
      - ./html:/usr/share/nginx/html
      - ./nginx.conf:/etc/nginx/nginx.conf:ro

Volume with NFS

version: '3.8'

services:
  app:
    image: my-app
    volumes:
      - nfs-data:/app/data

volumes:
  nfs-data:
    driver: local
    driver_opts:
      type: nfs
      o: addr=nfs-server.example.com,rw
      device: ":/exported/path"

External Volumes

version: '3.8'

services:
  app:
    image: my-app
    volumes:
      - existing-volume:/app/data

volumes:
  existing-volume:
    external: true

Complete Compose Example

version: '3.8'

services:
  wordpress:
    image: wordpress:latest
    volumes:
      - wordpress-html:/var/www/html
      - wordpress-uploads:/var/www/html/wp-content/uploads
    environment:
      WORDPRESS_DB_HOST: database
      WORDPRESS_DB_NAME: wordpress
      WORDPRESS_DB_USER: wordpress
      WORDPRESS_DB_PASSWORD: ${DB_PASSWORD}

  database:
    image: mysql:8.0
    volumes:
      - db-data:/var/lib/mysql
      - ./mysql-init:/docker-entrypoint-initdb.d:ro
    environment:
      MYSQL_DATABASE: wordpress
      MYSQL_USER: wordpress
      MYSQL_PASSWORD: ${DB_PASSWORD}
      MYSQL_ROOT_PASSWORD: ${DB_ROOT_PASSWORD}

  backup:
    image: alpine
    volumes:
      - wordpress-html:/backup/html:ro
      - db-data:/backup/db:ro
      - ./backups:/backups
    command: sh -c "tar czf /backups/backup-$$(date +%Y%m%d-%H%M%S).tar.gz /backup"

volumes:
  wordpress-html:
  wordpress-uploads:
  db-data:

Backup and Restore

Backup Named Volume

# Backup volume to tar archive
docker run --rm \
  -v my-volume:/source:ro \
  -v $(pwd):/backup \
  alpine \
  tar czf /backup/my-volume-backup.tar.gz -C /source .

# Backup with timestamp
docker run --rm \
  -v my-volume:/source:ro \
  -v $(pwd):/backup \
  alpine \
  tar czf /backup/backup-$(date +%Y%m%d-%H%M%S).tar.gz -C /source .

Restore Volume from Backup

# Create new volume
docker volume create my-volume-restored

# Restore from backup
docker run --rm \
  -v my-volume-restored:/target \
  -v $(pwd):/backup \
  alpine \
  sh -c "cd /target && tar xzf /backup/my-volume-backup.tar.gz"
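
Before restoring, it is worth confirming that the archive is readable, since a truncated or corrupt file would otherwise fail halfway through extraction. A minimal sketch, assuming a hypothetical helper named `verify_backup`; it runs on the host with plain `tar` and does not need Docker.

```shell
# verify_backup ARCHIVE
# Succeeds if ARCHIVE is a non-empty, readable gzip-compressed tar file
# that lists at least one entry.
verify_backup() {
  if [ ! -s "$1" ]; then
    echo "missing or empty archive: $1" >&2
    return 1
  fi
  # The assignment takes tar's exit status, so a corrupt archive fails here.
  listing=$(tar tzf "$1" 2>/dev/null) || {
    echo "unreadable archive: $1" >&2
    return 1
  }
  [ -n "$listing" ]
}
```

For example: `verify_backup my-volume-backup.tar.gz && echo "archive looks readable"`.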

Backup Database Volume

# PostgreSQL file-level backup (stop the database container first so the copied files are consistent)
docker run --rm \
  -v postgres-data:/var/lib/postgresql/data:ro \
  -v $(pwd):/backup \
  postgres:15-alpine \
  tar czf /backup/postgres-backup.tar.gz -C /var/lib/postgresql/data .

# MySQL backup
docker exec mysql-container \
  mysqldump -u root -p"${DB_PASSWORD}" --all-databases \
  > mysql-backup.sql

Automated Backup Script

#!/bin/bash
# backup-volumes.sh
# Stop on the first failed command, unset variable, or failed pipeline stage
set -euo pipefail

BACKUP_DIR="/backup"
DATE=$(date +%Y%m%d-%H%M%S)

# Backup multiple volumes
for volume in app-data db-data logs-data; do
  docker run --rm \
    -v "${volume}":/source:ro \
    -v "${BACKUP_DIR}":/backup \
    alpine \
    tar czf "/backup/${volume}-${DATE}.tar.gz" -C /source .
done

# Clean old backups (keep last 7 days)
find ${BACKUP_DIR} -name "*.tar.gz" -mtime +7 -delete
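
The retention step can be tried safely on a scratch directory first. The sketch below assumes a hypothetical helper named `prune_old_backups`. It uses the same `find` test as the script above, restricted to regular files, and prints each file it deletes. `-mtime +7` matches files whose age in whole days is greater than 7, so a file is removed once it is at least 8 days old.

```shell
# prune_old_backups DIR [DAYS]
# Deletes *.tar.gz files in DIR whose age in whole days exceeds DAYS
# (default 7) and prints the path of each deleted file.
prune_old_backups() {
  find "$1" -type f -name '*.tar.gz' -mtime +"${2:-7}" -print -delete
}
```

For example: `prune_old_backups /backup 7`.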

Migrate Volume to Another Host

# On source host: Export volume
docker run --rm \
  -v my-volume:/source:ro \
  alpine \
  tar c -C /source . | ssh user@destination-host 'cat > /tmp/volume-export.tar'

# On destination host: Import volume
docker volume create my-volume
cat /tmp/volume-export.tar | docker run --rm -i \
  -v my-volume:/target \
  alpine \
  tar x -C /target
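
If Docker is installed on the destination and the SSH user may run it, the intermediate file can be skipped by streaming the archive straight into a container on the other side. This is a sketch under those assumptions, with a hypothetical helper name `migrate_volume`. The pipeline's exit status reflects only the receiving side, so a failure while reading the source volume may go unnoticed.

```shell
# migrate_volume VOLUME DESTINATION
# Streams the contents of VOLUME over SSH into a volume of the same name
# on DESTINATION (for example user@destination-host).
migrate_volume() {
  docker run --rm -v "$1":/source:ro alpine tar c -C /source . |
    ssh "$2" "docker volume create '$1' >/dev/null && docker run --rm -i -v '$1':/target alpine tar x -C /target"
}
```

For example: `migrate_volume my-volume user@destination-host`.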

Production Best Practices

Use Named Volumes

# Good: Named volume
docker run -d -v postgres-data:/var/lib/postgresql/data postgres

# Avoid: Anonymous volume
docker run -d -v /var/lib/postgresql/data postgres

Volume Naming Convention

# Include environment and purpose
docker volume create prod-app-data
docker volume create prod-db-data
docker volume create staging-cache

Label Volumes

docker volume create \
  --label environment=production \
  --label backup=daily \
  --label app=myapp \
  --label retention=30d \
  prod-app-data

Read-Only Mounts

# Configuration should be read-only
docker run -d \
  -v config-data:/etc/app:ro \
  -v app-data:/app/data \
  my-app

Volume Monitoring

# Check volume sizes
docker system df -v

# Monitor specific volume
watch -n 5 'sudo du -sh /var/lib/docker/volumes/my-volume/_data'

Backup Strategy

# Add backup service to compose
services:
  backup:
    image: alpine
    volumes:
      - app-data:/data:ro
      - ./backups:/backups
    command: >
      sh -c "while true; do
        tar czf /backups/backup-$$(date +%Y%m%d-%H%M%S).tar.gz /data;
        sleep 86400;
      done"

Security Considerations

# Run with specific user
docker run -d \
  --user 1000:1000 \
  -v app-data:/app/data \
  my-app

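# Use secrets for sensitive data (Swarm)
# The mysql image reads the root password from the file named in MYSQL_ROOT_PASSWORD_FILE
printf '%s' "password" | docker secret create db_password -
docker service create \
  --secret db_password \
  -e MYSQL_ROOT_PASSWORD_FILE=/run/secrets/db_password \
  --mount source=db-data,target=/var/lib/mysql \
  mysql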

Resource Limits

# Size-limited volume backed by tmpfs (contents live in memory and are not persistent)
docker volume create \
  --driver local \
  --opt type=tmpfs \
  --opt device=tmpfs \
  --opt o=size=100m,uid=1000 \
  limited-volume

Troubleshooting

Volume Not Mounting

# Check if volume exists
docker volume ls | grep my-volume

# Inspect volume
docker volume inspect my-volume

# Check container mount
docker inspect container-name | grep -A 10 Mounts

# Verify permissions
sudo ls -la /var/lib/docker/volumes/my-volume/_data

Permission Denied Errors

# Check ownership
docker exec container-name ls -la /app/data

# Fix permissions from container (chown needs root inside the container)
docker exec -u root container-name chown -R appuser:appuser /app/data

# Or use proper user
docker run -d --user $(id -u):$(id -g) -v my-volume:/data my-app

Volume Data Not Persisting

# Verify volume is actually used
docker inspect container-name | grep -A 20 Mounts

# List volumes not referenced by any container (orphaned anonymous volumes show up here)
docker volume ls -f dangling=true

# Ensure container writes to mounted path
docker exec container-name touch /data/test.txt
docker run --rm -v my-volume:/data alpine ls -la /data
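
The mount check can be made scriptable. The sketch below assumes two hypothetical helpers, `container_mounts` and `path_is_mounted`, and reads the `.Mounts` list from `docker inspect`. It only recognizes an exact mount destination, not a path nested below one, and it assumes destinations contain no spaces.

```shell
# container_mounts CONTAINER
# Prints one line per mount: TYPE, volume NAME (empty for bind mounts),
# and DESTINATION inside the container.
container_mounts() {
  docker inspect \
    --format '{{ range .Mounts }}{{ .Type }} {{ .Name }} {{ .Destination }}{{ println }}{{ end }}' \
    "$1"
}

# path_is_mounted CONTAINER PATH
# Succeeds if PATH is the destination of a volume or bind mount.
# tmpfs mounts are ignored because they do not persist data.
path_is_mounted() {
  container_mounts "$1" |
    awk -v p="$2" '$1 != "tmpfs" && $NF == p { found = 1 } END { exit !found }'
}
```

For example: `path_is_mounted container-name /data || echo "/data is not backed by a volume"`.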

Volume Full

# Check volume size
docker system df -v

# Find large files (the glob must expand inside the container, not on the host)
docker run --rm -v my-volume:/data alpine sh -c 'du -sh /data/*'

# Clean up old data (regular files only)
docker exec container-name find /data -type f -mtime +30 -delete

Cannot Remove Volume

# Find containers using volume
docker ps -a --filter volume=my-volume

# Stop and remove containers
docker rm -f $(docker ps -a -q --filter volume=my-volume)

# Remove volume
docker volume rm my-volume

Performance Issues

# Check storage driver
docker info | grep "Storage Driver"

# Use tmpfs for high-performance temporary storage
docker run -d --tmpfs /app/temp:size=1g my-app

# Consider volume driver options
docker volume create \
  --driver local \
  --opt type=none \
  --opt o=bind \
  --opt device=/fast/storage/path \
  fast-volume

Conclusion

Docker volumes provide robust, flexible data persistence for containerized applications. Understanding volumes, bind mounts, and tmpfs mounts is essential for production deployments.

Key Takeaways

  • Named Volumes: Preferred method, Docker-managed, portable
  • Bind Mounts: Development workflows, direct host access
  • tmpfs: Temporary, high-performance, memory-only storage
  • Persistence: Data survives container lifecycle
  • Sharing: Multiple containers can use same volume
  • Backup: Regular backups are essential for production

Quick Reference

# Volume Management
docker volume create my-volume              # Create volume
docker volume ls                            # List volumes
docker volume inspect my-volume             # Inspect volume
docker volume rm my-volume                  # Remove volume
docker volume prune                         # Remove unused

# Using Volumes
docker run -v my-volume:/data nginx         # Named volume
docker run -v /host:/container nginx        # Bind mount
docker run --tmpfs /tmp nginx               # tmpfs mount
docker run -v my-volume:/data:ro nginx      # Read-only

# Backup & Restore
# Backup
docker run --rm -v my-volume:/source:ro -v $(pwd):/backup alpine tar czf /backup/backup.tar.gz -C /source .
# Restore
docker run --rm -v my-volume:/target -v $(pwd):/backup alpine tar xzf /backup/backup.tar.gz -C /target

Next Steps

  1. Implement: Use volumes in your applications
  2. Backup: Set up automated backup strategy
  3. Monitor: Track volume sizes and usage
  4. Secure: Implement proper permissions and encryption
  5. Optimize: Choose appropriate storage drivers
  6. Scale: Explore distributed storage solutions
  7. Document: Maintain volume inventory and procedures

Proper volume management ensures data persistence, portability, and reliability in your containerized infrastructure.