Persistent Volumes in Kubernetes: Complete Production Guide

Persistent Volumes (PV) and Persistent Volume Claims (PVC) provide storage abstraction in Kubernetes, enabling stateful applications to persist data beyond pod lifecycles. This comprehensive guide covers PV/PVC concepts, storage classes, dynamic provisioning, and production best practices for data persistence in Kubernetes.

Introduction

Kubernetes Persistent Volumes decouple storage from pods, providing durable storage that survives pod restarts and rescheduling. Understanding PV, PVC, and StorageClasses is essential for running stateful applications like databases, message queues, and file storage systems.

Why Persistent Volumes?

  • Data Persistence: Survive pod restarts and deletions
  • Storage Abstraction: Decouple storage from pod specifications
  • Dynamic Provisioning: Automatic volume creation
  • Portability: Consistent storage API across providers
  • Lifecycle Management: Independent storage lifecycle

Volume Hierarchy

StorageClass
    ↓
PersistentVolume (PV)
    ↓
PersistentVolumeClaim (PVC)
    ↓
Pod Volume Mount

Prerequisites

  • Kubernetes cluster (1.19+)
  • kubectl configured
  • Storage backend (local, NFS, cloud provider, etc.)
  • Basic understanding of Pods and Deployments

Verify setup:

kubectl version --client
kubectl get nodes
kubectl get storageclass

Storage Concepts

Volume Types

Ephemeral Volumes:

  • emptyDir: Temporary storage, pod lifecycle
  • configMap: Configuration data
  • secret: Sensitive data

Persistent Volumes:

  • hostPath: Node's filesystem (development only)
  • nfs: Network File System
  • csi: Container Storage Interface plugins
  • Cloud (in-tree, deprecated in favor of CSI drivers): awsElasticBlockStore, gcePersistentDisk, azureDisk

Lifecycle States

PV States:

  • Available: Ready for claim
  • Bound: Claimed by PVC
  • Released: PVC deleted, volume not yet reclaimed
  • Failed: Automatic reclamation failed

PVC States:

  • Pending: Waiting for PV binding
  • Bound: Bound to PV
  • Lost: PV unavailable
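
The phases listed above can be read straight from the API. The following is a minimal sketch using standard kubectl output options; the function names pv_phases and pvc_phase are illustrative helpers, not part of kubectl.

```shell
# List every PV with its phase, reclaim policy, and the claim it is bound to.
pv_phases() {
  kubectl get pv \
    -o custom-columns='NAME:.metadata.name,PHASE:.status.phase,RECLAIM:.spec.persistentVolumeReclaimPolicy,CLAIM:.spec.claimRef.name'
}

# Print the phase (Pending, Bound, or Lost) of a single PVC.
# $1 = PVC name, $2 = namespace (assumed to be "default" when omitted)
pvc_phase() {
  kubectl get pvc "$1" -n "${2:-default}" -o jsonpath='{.status.phase}'
}
```

For example, pvc_phase basic-pvc prints Bound once that claim has been matched to a volume.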

Persistent Volumes

Basic PV Example

# pv-local.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: local-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  hostPath:
    path: /mnt/data

# Create PV
kubectl apply -f pv-local.yaml

# Check PV status
kubectl get pv
kubectl describe pv local-pv
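
Because hostPath is suited to development only, multi-node clusters typically use a local volume instead, which must declare the node it lives on through nodeAffinity. The sketch below renders such a manifest; the function name render_local_pv, the PV name, and the disk path /mnt/disks/ssd1 are assumptions for illustration.

```shell
# Render a node-pinned "local" PV manifest to stdout.
# $1 = node name (value of the kubernetes.io/hostname label)
# $2 = capacity, e.g. 10Gi
render_local_pv() {
  cat <<EOF
apiVersion: v1
kind: PersistentVolume
metadata:
  name: local-pv-$1
spec:
  capacity:
    storage: $2
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: /mnt/disks/ssd1   # hypothetical mount point on the node
  nodeAffinity:             # required for local volumes
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - $1
EOF
}
```

Usage: render_local_pv worker-1 10Gi | kubectl apply -f -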

NFS Persistent Volume

# pv-nfs.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-pv
spec:
  capacity:
    storage: 50Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: nfs
  nfs:
    server: nfs-server.example.com
    path: /exported/path
  mountOptions:
    - hard
    - nfsvers=4.1

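Cloud Provider PV Examples

The in-tree volume fields used in these examples (awsElasticBlockStore, gcePersistentDisk, azureDisk) are deprecated and have been removed from recent Kubernetes releases (1.27 and 1.28). On current clusters, reference cloud disks through the matching CSI driver, usually via dynamic provisioning with a StorageClass.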

AWS EBS

# pv-aws-ebs.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: aws-ebs-pv
spec:
  capacity:
    storage: 100Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Delete
  storageClassName: aws-ebs
  awsElasticBlockStore:
    volumeID: vol-0123456789abcdef0
    fsType: ext4

Google Persistent Disk

# pv-gce-pd.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: gce-pd-pv
spec:
  capacity:
    storage: 100Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Delete
  storageClassName: gce-pd
  gcePersistentDisk:
    pdName: my-disk-name
    fsType: ext4

Azure Disk

# pv-azure-disk.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: azure-disk-pv
spec:
  capacity:
    storage: 100Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Delete
  storageClassName: azure-disk
  azureDisk:
    diskName: myAKSDisk
    diskURI: /subscriptions/{sub-id}/resourceGroups/{rg}/providers/Microsoft.Compute/disks/myAKSDisk
    kind: Managed

Reclaim Policies

spec:
  persistentVolumeReclaimPolicy: Retain  # Keep data after PVC deletion
  # OR
  persistentVolumeReclaimPolicy: Delete  # Delete volume after PVC deletion
  # OR
  persistentVolumeReclaimPolicy: Recycle # Deprecated - basic scrub (rm -rf)
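
The reclaim policy of an existing PV can be changed in place, which is useful for protecting a dynamically provisioned volume (created with Delete) before its claim is removed. A minimal sketch; set_reclaim_policy is an illustrative wrapper around kubectl patch, not a kubectl subcommand.

```shell
# Change the reclaim policy of an existing PV.
# $1 = PV name, $2 = Retain or Delete
set_reclaim_policy() {
  case "$2" in
    Retain|Delete) ;;
    *) echo "policy must be Retain or Delete" >&2; return 1 ;;
  esac
  kubectl patch pv "$1" -p "{\"spec\":{\"persistentVolumeReclaimPolicy\":\"$2\"}}"
}
```

Usage: set_reclaim_policy local-pv Retain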

Persistent Volume Claims

Basic PVC

# pvc-basic.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: basic-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
  storageClassName: standard

# Create PVC
kubectl apply -f pvc-basic.yaml

# Check PVC status
kubectl get pvc
kubectl describe pvc basic-pvc

Using PVC in Pod

# pod-with-pvc.yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-with-storage
spec:
  containers:
  - name: app
    image: nginx:alpine
    volumeMounts:
    - name: data
      mountPath: /usr/share/nginx/html
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: basic-pvc

Using PVC in Deployment

# deployment-with-pvc.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
spec:
  replicas: 1  # RWO attaches to one node at a time, so extra replicas must share that node; use RWX storage to scale out
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: nginx
        image: nginx:alpine
        volumeMounts:
        - name: web-storage
          mountPath: /usr/share/nginx/html
      volumes:
      - name: web-storage
        persistentVolumeClaim:
          claimName: web-pvc
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: web-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: standard

Selector for Specific PV

# pvc-with-selector.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: selective-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: ""  # Empty for manual binding
  selector:
    matchLabels:
      environment: production
      tier: database
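
A selector only matches PVs that carry the same labels, and a claim with storageClassName: "" binds only to PVs that have no class set. The sketch below labels an existing PV so that the claim above can bind to it; label_pv_for_claim and show_pv_labels are illustrative helpers, and the label values mirror the selector.

```shell
# Add the labels that selective-pvc selects on to an existing PV.
# $1 = PV name
label_pv_for_claim() {
  kubectl label pv "$1" environment=production tier=database
}

# Show PVs together with their labels to confirm the match.
show_pv_labels() {
  kubectl get pv --show-labels
}
```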

Storage Classes

StorageClasses enable dynamic provisioning of PersistentVolumes.

View Storage Classes

# List storage classes
kubectl get storageclass
kubectl get sc

# Describe storage class
kubectl describe sc standard

Basic StorageClass

# storageclass-basic.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd
provisioner: kubernetes.io/gce-pd  # in-tree provisioner (deprecated); the CSI equivalent is pd.csi.storage.gke.io
parameters:
  type: pd-ssd
  replication-type: regional-pd
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
reclaimPolicy: Delete

Cloud Provider StorageClasses

AWS EBS

# sc-aws-ebs.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: aws-ebs-gp3
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
  iops: "3000"
  throughput: "125"
  encrypted: "true"
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true

Google Cloud

# sc-gce-pd.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gce-ssd
provisioner: pd.csi.storage.gke.io
parameters:
  type: pd-ssd
  replication-type: regional-pd
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true

Azure Disk

# sc-azure.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: azure-premium
provisioner: disk.csi.azure.com
parameters:
  skuName: Premium_LRS
  kind: Managed
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true

Default StorageClass

# Set default storage class
kubectl patch storageclass standard \
  -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'

# Remove default
kubectl patch storageclass standard \
  -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"false"}}}'
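
It helps to confirm that exactly one class is marked default after patching, since PVCs that omit storageClassName rely on it. The sketch below parses the "(default)" marker that kubectl prints next to the class name; default_storage_classes is an illustrative helper.

```shell
# Print the name of every StorageClass currently marked as default.
# More than one line of output means several classes claim to be the default.
default_storage_classes() {
  kubectl get storageclass --no-headers | awk '/\(default\)/ { print $1 }'
}
```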

Dynamic Provisioning

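Dynamic provisioning creates a PV automatically for a PVC that references a StorageClass, either immediately or, with volumeBindingMode: WaitForFirstConsumer, once a pod that uses the claim is scheduled.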

PVC with Dynamic Provisioning

# pvc-dynamic.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: dynamic-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi
  storageClassName: fast-ssd  # Uses StorageClass provisioner

# Create PVC
kubectl apply -f pvc-dynamic.yaml

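# The PV is created automatically; because fast-ssd uses WaitForFirstConsumer,
# the PVC stays Pending until a pod that uses it is scheduled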
kubectl get pv
kubectl get pvc dynamic-pvc

Volume Expansion

# sc-expandable.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: expandable-storage
provisioner: ebs.csi.aws.com  # CSI driver; the in-tree kubernetes.io/aws-ebs provisioner is deprecated
parameters:
  type: gp3
allowVolumeExpansion: true  # Enable expansion

# Expand PVC
kubectl patch pvc dynamic-pvc -p '{"spec":{"resources":{"requests":{"storage":"30Gi"}}}}'

# Check expansion status
kubectl get pvc dynamic-pvc
kubectl describe pvc dynamic-pvc
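
Expansion only succeeds when the claim's StorageClass sets allowVolumeExpansion: true, and a PVC can only grow, never shrink. The sketch below checks the class before patching; expand_pvc is an illustrative helper and the "default" namespace fallback is an assumption.

```shell
# Expand a PVC after verifying that its StorageClass allows expansion.
# $1 = PVC name, $2 = new size (e.g. 30Gi), $3 = namespace (default: "default")
expand_pvc() {
  exp_pvc=$1
  exp_size=$2
  exp_ns=${3:-default}
  exp_sc=$(kubectl get pvc "$exp_pvc" -n "$exp_ns" -o jsonpath='{.spec.storageClassName}') || return 1
  exp_ok=$(kubectl get storageclass "$exp_sc" -o jsonpath='{.allowVolumeExpansion}') || return 1
  if [ "$exp_ok" != "true" ]; then
    echo "StorageClass '$exp_sc' does not allow volume expansion" >&2
    return 1
  fi
  kubectl patch pvc "$exp_pvc" -n "$exp_ns" \
    -p "{\"spec\":{\"resources\":{\"requests\":{\"storage\":\"$exp_size\"}}}}"
}
```

Usage: expand_pvc dynamic-pvc 30Gi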

Volume Modes and Access

Access Modes

  • ReadWriteOnce (RWO): Read-write by a single node (several pods on that node can share it)
  • ReadOnlyMany (ROX): Read-only by many nodes
  • ReadWriteMany (RWX): Read-write by many nodes
  • ReadWriteOncePod (RWOP): Read-write by a single pod (introduced in 1.22, stable in 1.29; CSI volumes only)

# Different access modes
spec:
  accessModes:
    - ReadWriteOnce   # Block storage (EBS, Azure Disk)
    # OR
    - ReadWriteMany   # Shared storage (NFS, EFS, Azure Files)
    # OR
    - ReadOnlyMany    # Shared read-only

Volume Modes

# Block mode (raw block device)
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: block-pvc
spec:
  accessModes:
    - ReadWriteOnce
  volumeMode: Block  # Block mode
  resources:
    requests:
      storage: 10Gi
  storageClassName: fast-ssd
---
# Using block volume in pod
apiVersion: v1
kind: Pod
metadata:
  name: pod-with-block
spec:
  containers:
  - name: app
    image: myapp:latest
    volumeDevices:  # volumeDevices instead of volumeMounts
    - name: data
      devicePath: /dev/xvda
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: block-pvc

StatefulSets with Persistent Storage

StatefulSets provide stable, unique network identifiers and persistent storage.

StatefulSet with VolumeClaimTemplates

# statefulset-mysql.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql
spec:
  serviceName: mysql
  replicas: 3
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
      - name: mysql
        image: mysql:8.0
        env:
        - name: MYSQL_ROOT_PASSWORD
          valueFrom:
            secretKeyRef:
              name: mysql-secret  # hypothetical Secret; create it before applying
              key: password
        ports:
        - containerPort: 3306
        volumeMounts:
        - name: data
          mountPath: /var/lib/mysql
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 20Gi
      storageClassName: fast-ssd
---
apiVersion: v1
kind: Service
metadata:
  name: mysql
spec:
  clusterIP: None  # Headless service
  selector:
    app: mysql
  ports:
  - port: 3306

PostgreSQL StatefulSet

# statefulset-postgres.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
spec:
  serviceName: postgres
  replicas: 1
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
      - name: postgres
        image: postgres:15-alpine
        env:
        - name: POSTGRES_PASSWORD
          valueFrom:
            secretKeyRef:
              name: postgres-secret
              key: password
        - name: PGDATA
          value: /var/lib/postgresql/data/pgdata
        ports:
        - containerPort: 5432
        volumeMounts:
        - name: data
          mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 50Gi
      storageClassName: standard
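
The manifest above reads POSTGRES_PASSWORD from a Secret named postgres-secret with the key password, which has to exist before the pod starts. One way to create it is sketched below, under the assumption that the password is stored in a local file; create_postgres_secret is an illustrative helper.

```shell
# Create the Secret referenced by the StatefulSet.
# $1 = path to a file containing only the password
# Note: the file content is stored verbatim, including any trailing newline,
# so write the file with printf '%s' rather than echo.
create_postgres_secret() {
  kubectl create secret generic postgres-secret --from-file=password="$1"
}
```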

Production Patterns

Multiple Volume Mounts

# pod-multi-volumes.yaml
apiVersion: v1
kind: Pod
metadata:
  name: multi-volume-pod
spec:
  containers:
  - name: app
    image: myapp:latest
    volumeMounts:
    - name: app-data
      mountPath: /app/data
    - name: logs
      mountPath: /app/logs
    - name: config
      mountPath: /app/config
      readOnly: true
  volumes:
  - name: app-data
    persistentVolumeClaim:
      claimName: app-data-pvc
  - name: logs
    persistentVolumeClaim:
      claimName: logs-pvc
  - name: config
    configMap:
      name: app-config

Init Container for Data Setup

# pod-with-init.yaml
apiVersion: v1
kind: Pod
metadata:
  name: app-with-init
spec:
  initContainers:
  - name: setup
    image: busybox:latest
    command: ['sh', '-c', 'echo "Setting up data" > /data/setup.txt']
    volumeMounts:
    - name: data
      mountPath: /data
  containers:
  - name: app
    image: nginx:alpine
    volumeMounts:
    - name: data
      mountPath: /usr/share/nginx/html
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: app-pvc

Resource Limits

# pvc-with-limits.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: limited-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
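    # resources.limits is not enforced for PVCs and does not cap expansion;
    # cap claim sizes per namespace with a LimitRange or ResourceQuota instead.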
  storageClassName: expandable-storage
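
A namespace-level LimitRange is one way to bound the size a claim may request. The sketch below renders such an object; render_pvc_limitrange and the object name pvc-size-limits are assumptions for illustration.

```shell
# Render a LimitRange that bounds PVC sizes in one namespace.
# $1 = namespace, $2 = minimum size, $3 = maximum size
render_pvc_limitrange() {
  cat <<EOF
apiVersion: v1
kind: LimitRange
metadata:
  name: pvc-size-limits
  namespace: $1
spec:
  limits:
    - type: PersistentVolumeClaim
      min:
        storage: $2
      max:
        storage: $3
EOF
}
```

Usage: render_pvc_limitrange default 1Gi 50Gi | kubectl apply -f -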

Backup and Disaster Recovery

Volume Snapshots

# volumesnapshot.yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: data-snapshot
spec:
  volumeSnapshotClassName: csi-snapclass
  source:
    persistentVolumeClaimName: app-data-pvc

# Create snapshot
kubectl apply -f volumesnapshot.yaml

# Check snapshot
kubectl get volumesnapshot
kubectl describe volumesnapshot data-snapshot
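
Snapshots are taken asynchronously; a snapshot is usable as a restore source once its status.readyToUse field is true. The polling sketch below waits for that; wait_for_snapshot and the SNAPSHOT_POLL_SECONDS variable are illustrative names, not part of kubectl.

```shell
# Poll until the VolumeSnapshot reports readyToUse=true.
# $1 = snapshot name, $2 = maximum attempts (default 30)
# Returns 0 when the snapshot is ready, 1 if the attempts run out.
wait_for_snapshot() {
  snap_name=$1
  snap_tries=${2:-30}
  while [ "$snap_tries" -gt 0 ]; do
    snap_ready=$(kubectl get volumesnapshot "$snap_name" \
      -o jsonpath='{.status.readyToUse}' 2>/dev/null) || snap_ready=""
    if [ "$snap_ready" = "true" ]; then
      return 0
    fi
    snap_tries=$((snap_tries - 1))
    sleep "${SNAPSHOT_POLL_SECONDS:-2}"
  done
  return 1
}
```

Usage: wait_for_snapshot data-snapshot && kubectl apply -f pvc-from-snapshot.yaml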

Restore from Snapshot

# pvc-from-snapshot.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: restored-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: standard  # should use the same CSI driver that created the snapshot
  dataSource:
    name: data-snapshot
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io

Backup with Jobs

# backup-job.yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: backup-job
spec:
  schedule: "0 2 * * *"  # Daily at 2 AM
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: backup
            image: backup-tool:latest
            volumeMounts:
            - name: data
              mountPath: /data
              readOnly: true
            - name: backup
              mountPath: /backup
          restartPolicy: OnFailure
          volumes:
          - name: data
            persistentVolumeClaim:
              claimName: app-data-pvc
          - name: backup
            persistentVolumeClaim:
              claimName: backup-pvc
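
A scheduled backup is worth triggering by hand before relying on it. kubectl can create a one-off Job from the CronJob's template; in the sketch below, run_backup_now and the backup-manual- name prefix are illustrative choices.

```shell
# Start an immediate run of the backup CronJob as a one-off Job.
# The timestamp suffix keeps Job names unique across manual runs.
run_backup_now() {
  kubectl create job "backup-manual-$(date +%Y%m%d%H%M%S)" --from=cronjob/backup-job
}
```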

Troubleshooting

Common Issues

# PVC stuck in Pending
kubectl describe pvc <pvc-name>
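# Check: StorageClass exists, a matching PV or working provisioner is available, node affinity;
# with WaitForFirstConsumer, Pending is expected until a pod uses the claim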

# Pod can't mount volume
kubectl describe pod <pod-name>
kubectl get events --sort-by='.lastTimestamp'
# Check: PVC is Bound, correct claimName, access modes match

# Volume not expanding
kubectl describe pvc <pvc-name>
# Check: allowVolumeExpansion=true, underlying storage supports expansion

# Check PV/PVC binding
kubectl get pv,pvc
kubectl describe pv <pv-name>
kubectl describe pvc <pvc-name>

Debug Commands

# Check storage class
kubectl get sc
kubectl describe sc <sc-name>

# Check PV
kubectl get pv
kubectl describe pv <pv-name>

# Check PVC
kubectl get pvc -A
kubectl describe pvc <pvc-name>

# Check pod volumes
kubectl describe pod <pod-name> | grep -A 5 Volumes

# Check node storage
kubectl describe node <node-name> | grep -A 10 "Allocated resources"

# Events
kubectl get events --field-selector involvedObject.name=<pvc-name>

Clean Up Stuck Resources

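# Remove PVC finalizers (last resort if stuck deleting; the pvc-protection finalizer
# normally clears once no pod uses the claim, so delete those pods first)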
kubectl patch pvc <pvc-name> -p '{"metadata":{"finalizers":null}}'

# Force delete PV
kubectl delete pv <pv-name> --grace-period=0 --force

# Check for orphaned volumes
kubectl get pv | grep Released
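
A Released PV with the Retain policy is not offered to new claims until its old claim reference is cleared. The sketch below lists such volumes and clears spec.claimRef; the data left by the previous claim stays on the volume, so verify or wipe it first. released_pvs and make_pv_available are illustrative helpers.

```shell
# Print the names of all PVs in the Released phase.
released_pvs() {
  kubectl get pv --no-headers \
    -o custom-columns='NAME:.metadata.name,PHASE:.status.phase' |
    awk '$2 == "Released" { print $1 }'
}

# Clear the stale claim reference so the PV becomes Available again.
# $1 = PV name
make_pv_available() {
  kubectl patch pv "$1" -p '{"spec":{"claimRef":null}}'
}
```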

Conclusion

Persistent Volumes are essential for stateful applications in Kubernetes. Understanding PV, PVC, StorageClasses, and dynamic provisioning enables reliable data persistence for production workloads.

Key Takeaways

  • PV: Cluster-wide storage resource, provisioned statically or dynamically
  • PVC: Namespaced request for storage, consumed by pods
  • StorageClass: Dynamic provisioning template
  • Access Modes: Control how volumes are accessed
  • StatefulSets: Ordered deployment with stable storage
  • Backups: Regular snapshots and disaster recovery plans

Quick Reference

# PersistentVolume
kubectl get pv
kubectl describe pv <pv-name>
kubectl delete pv <pv-name>

# PersistentVolumeClaim
kubectl get pvc
kubectl describe pvc <pvc-name>
kubectl delete pvc <pvc-name>

# StorageClass
kubectl get sc
kubectl describe sc <sc-name>

# Volume Snapshots
kubectl get volumesnapshot
kubectl describe volumesnapshot <name>

# Expand PVC
kubectl patch pvc <name> -p '{"spec":{"resources":{"requests":{"storage":"50Gi"}}}}'

Production Checklist

  • Choose appropriate StorageClass for workload
  • Set correct access modes (RWO, RWX, ROX)
  • Configure volume expansion capability
  • Implement backup strategy (snapshots, jobs)
  • Set appropriate reclaim policy
  • Monitor volume usage and capacity
  • Test restore procedures
  • Document storage architecture
  • Implement resource quotas
  • Plan for disaster recovery

Next Steps

  1. Plan: Assess storage requirements
  2. Configure: Set up StorageClasses
  3. Deploy: Implement StatefulSets with storage
  4. Backup: Configure snapshot strategy
  5. Monitor: Track volume metrics
  6. Optimize: Right-size storage allocations
  7. Disaster Recovery: Test backup/restore procedures

Master Persistent Volumes to run stateful applications reliably in Kubernetes!