Kubernetes Resource Limits and Requests Optimization
Proper resource configuration ensures efficient cluster utilization, prevents pod eviction, and maintains application performance. This guide covers Quality of Service (QoS) classes, LimitRanges, ResourceQuotas, right-sizing strategies, out-of-memory handling, and eviction policies for your VPS and bare-metal Kubernetes infrastructure.
Table of Contents
- Resource Management Fundamentals
- Requests and Limits
- Quality of Service Classes
- LimitRange
- ResourceQuota
- Right-Sizing
- Troubleshooting
- Practical Examples
- Conclusion
Resource Management Fundamentals
Requests vs Limits
Request: Minimum guaranteed resources
- Used for scheduling decisions
- Pod won't be scheduled unless a node has enough unreserved capacity for the request
- Reserved for the container on the node it lands on
Limit: Maximum resources allowed
- Container is OOM-killed if it exceeds its memory limit
- CPU is throttled when usage reaches the limit
- Prevents one workload from hogging shared resources
Resource Types
CPU:
- Measured in cores
- 1000m = 1 core
- Can be fractional (500m = 0.5 cores)
Memory:
- Measured in bytes
- Suffixes: Ki, Mi, Gi, etc.
- 1Gi = 1024Mi = 1048576Ki
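To make the unit arithmetic concrete, here is a small Python sketch (an illustrative helper, not part of kubectl or any client library) that parses the common CPU and memory quantity formats:

```python
# Illustrative parser for Kubernetes resource quantities
# (hypothetical helper, not a kubectl or client-go API).

def parse_cpu(q: str) -> float:
    """Return CPU in cores: '500m' -> 0.5, '2' -> 2.0."""
    if q.endswith("m"):
        return int(q[:-1]) / 1000  # millicores: 1000m = 1 core
    return float(q)

_BINARY = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3, "Ti": 1024**4}

def parse_memory(q: str) -> int:
    """Return memory in bytes: '1Gi' -> 1073741824."""
    for suffix, factor in _BINARY.items():
        if q.endswith(suffix):
            return int(q[:-2]) * factor
    return int(q)  # plain byte count

print(parse_cpu("500m"))                              # 0.5
print(parse_memory("1Gi") == parse_memory("1024Mi"))  # True
```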
Requests and Limits
Basic Configuration
apiVersion: v1
kind: Pod
metadata:
  name: resource-pod
spec:
  containers:
  - name: app
    image: myapp:1.0
    resources:
      requests:
        cpu: 250m
        memory: 512Mi
      limits:
        cpu: 500m
        memory: 1Gi
Per-Container Resources
Multiple containers with different resource needs:
apiVersion: v1
kind: Pod
metadata:
  name: multi-container
spec:
  containers:
  - name: app
    image: myapp:1.0
    resources:
      requests:
        cpu: 500m
        memory: 512Mi
      limits:
        cpu: 1
        memory: 1Gi
  - name: sidecar
    image: sidecar:1.0
    resources:
      requests:
        cpu: 100m
        memory: 128Mi
      limits:
        cpu: 200m
        memory: 256Mi
Resource Allocation in Deployments
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web-app
  template:
    metadata:
      labels:
        app: web-app
    spec:
      containers:
      - name: web
        image: web:1.0
        resources:
          requests:
            cpu: 200m
            memory: 256Mi
          limits:
            cpu: 500m
            memory: 512Mi
Quality of Service Classes
QoS Classes
Kubernetes automatically assigns QoS classes based on requests/limits:
Guaranteed: Requests = Limits (highest priority)
resources:
  requests:
    cpu: 500m
    memory: 512Mi
  limits:
    cpu: 500m
    memory: 512Mi
Burstable: Requests < Limits (medium priority)
resources:
  requests:
    cpu: 250m
    memory: 256Mi
  limits:
    cpu: 500m
    memory: 512Mi
BestEffort: No requests/limits (lowest priority)
resources: {}
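The assignment rule can be sketched in a few lines of Python (a simplified model that ignores init and ephemeral containers; the authoritative logic lives in the kubelet):

```python
# Simplified sketch of Kubernetes QoS class assignment.
# Real assignment is done by the kubelet and also considers
# init and ephemeral containers, which are omitted here.

def qos_class(containers: list[dict]) -> str:
    if not any(c.get("requests") for c in containers) and \
       not any(c.get("limits") for c in containers):
        return "BestEffort"  # nothing set anywhere
    guaranteed = all(
        c.get("requests", {}).get(r) is not None
        and c.get("requests", {}).get(r) == c.get("limits", {}).get(r)
        for c in containers
        for r in ("cpu", "memory")
    )
    return "Guaranteed" if guaranteed else "Burstable"

print(qos_class([{"requests": {"cpu": "500m", "memory": "512Mi"},
                  "limits":   {"cpu": "500m", "memory": "512Mi"}}]))  # Guaranteed
print(qos_class([{"requests": {"cpu": "250m", "memory": "256Mi"},
                  "limits":   {"cpu": "500m", "memory": "512Mi"}}]))  # Burstable
print(qos_class([{}]))                                                # BestEffort
```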
Eviction Order
When a node runs out of resources, the kubelet evicts pods in this order:
- BestEffort pods
- Burstable pods using more than their requests
- Guaranteed pods and Burstable pods under their requests (last resort, only under severe node pressure)
Checking QoS Class
kubectl get pods -o custom-columns=NAME:.metadata.name,QOS:.status.qosClass
kubectl describe pod myapp | grep QoS
LimitRange
LimitRange Fundamentals
LimitRange enforces per-Pod and per-container resource constraints within a namespace, and can inject default requests and limits into containers that omit them.
Pod-Level LimitRange
apiVersion: v1
kind: LimitRange
metadata:
  name: pod-limits
  namespace: production
spec:
  limits:
  - type: Pod
    max:
      cpu: "2"
      memory: "2Gi"
    min:
      cpu: "100m"
      memory: "128Mi"
    maxLimitRequestRatio:
      cpu: "4"
      memory: "2"
Container-Level LimitRange
apiVersion: v1
kind: LimitRange
metadata:
  name: container-limits
  namespace: production
spec:
  limits:
  - type: Container
    max:
      cpu: "1"
      memory: "1Gi"
    min:
      cpu: "50m"
      memory: "64Mi"
    default:
      cpu: "500m"
      memory: "512Mi"
    defaultRequest:
      cpu: "100m"
      memory: "128Mi"
    maxLimitRequestRatio:
      cpu: "2"
      memory: "1.5"
Viewing LimitRange
kubectl get limitrange -n production
kubectl describe limitrange container-limits -n production
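The effect of `default` and `defaultRequest` can be sketched in Python (a simplified model of the LimitRanger admission plugin, using the default values from the container-limits manifest above):

```python
# Simplified sketch of how admission fills in LimitRange defaults for a
# container that omits resources. The real logic lives in the LimitRanger
# admission plugin; this only models the defaulting step.
DEFAULTS = {
    "default":        {"cpu": "500m", "memory": "512Mi"},  # -> limits
    "defaultRequest": {"cpu": "100m", "memory": "128Mi"},  # -> requests
}

def apply_defaults(container: dict) -> dict:
    res = container.setdefault("resources", {})
    # Only fill in what the author left out; explicit values win.
    res.setdefault("limits", dict(DEFAULTS["default"]))
    res.setdefault("requests", dict(DEFAULTS["defaultRequest"]))
    return container

pod_container = {"name": "app", "image": "myapp:1.0"}
print(apply_defaults(pod_container)["resources"])
```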
ResourceQuota
ResourceQuota Fundamentals
ResourceQuota limits total resources consumed in a namespace.
Basic ResourceQuota
apiVersion: v1
kind: ResourceQuota
metadata:
  name: production-quota
  namespace: production
spec:
  hard:
    requests.cpu: "10"
    requests.memory: "20Gi"
    limits.cpu: "20"
    limits.memory: "40Gi"
    pods: "100"
Advanced ResourceQuota
apiVersion: v1
kind: ResourceQuota
metadata:
  name: comprehensive-quota
  namespace: production
spec:
  hard:
    # Compute resources
    requests.cpu: "50"
    requests.memory: "100Gi"
    limits.cpu: "100"
    limits.memory: "200Gi"
    # Pod count
    pods: "200"
    # Object counts
    services: "10"
    services.loadbalancers: "2"
    services.nodeports: "5"
    persistentvolumeclaims: "10"
    configmaps: "50"
    secrets: "50"
    # Storage
    requests.storage: "500Gi"
Quota with Priority Classes
apiVersion: v1
kind: ResourceQuota
metadata:
  name: priority-quota
  namespace: production
spec:
  hard:
    requests.cpu: "10"
    limits.cpu: "20"
  scopeSelector:
    matchExpressions:
    - operator: In
      scopeName: PriorityClass
      values: ["production"]
Viewing ResourceQuota
kubectl get resourcequota -n production
kubectl describe resourcequota production-quota -n production
kubectl top nodes
kubectl top pods -n production
Right-Sizing
Identifying Resource Usage
Get current usage:
# Node usage
kubectl top nodes
# Pod usage
kubectl top pods -n production
# Container usage in details
kubectl get pods -n production -o json | jq '.items[] | {name: .metadata.name, containers: .spec.containers[] | {name, resources}}'
Using Metrics
Prometheus queries for usage analysis:
# CPU usage rate
sum(rate(container_cpu_usage_seconds_total[5m])) by (pod)
# Memory usage
sum(container_memory_working_set_bytes) by (pod)
# CPU request utilization
sum(rate(container_cpu_usage_seconds_total[5m])) by (pod) /
sum(kube_pod_container_resource_requests{resource="cpu"}) by (pod)
Right-Sizing Strategy
- Monitor: Collect metrics for 2-4 weeks
- Analyze: Identify P95/P99 usage
- Set Request: To P50-P75 of usage
- Set Limit: To P99 or 2x request
- Validate: Monitor and adjust
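The strategy above can be sketched numerically in Python; the nearest-rank percentile definition and the usage samples here are illustrative assumptions, not a measurement from a real workload:

```python
import math

def percentile(samples: list[float], p: float) -> float:
    """Nearest-rank percentile: smallest sample covering p% of observations."""
    s = sorted(samples)
    k = math.ceil(p / 100 * len(s)) - 1
    return s[max(0, k)]

# Hypothetical 5-minute CPU usage samples, in millicores.
cpu_mcores = [90, 100, 110, 120, 150, 200, 280, 300, 420, 500]

request = percentile(cpu_mcores, 75)                   # request at P75
limit = max(percentile(cpu_mcores, 99), 2 * request)   # limit at P99 or 2x request
print(f"request: {request}m, limit: {limit}m")         # request: 300m, limit: 600m
```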
Example Right-Sizing
Based on metrics:
- P50: 100m CPU, 256Mi memory
- P95: 300m CPU, 512Mi memory
- P99: 500m CPU, 800Mi memory
Set as:
- Request: 200m / 300Mi
- Limit: 500m / 800Mi
resources:
  requests:
    cpu: 200m
    memory: 300Mi
  limits:
    cpu: 500m
    memory: 800Mi
Troubleshooting
Out of Memory (OOM) Errors
# Check if pod was OOMKilled
kubectl describe pod myapp | grep -A 5 "Last State"
# View events
kubectl get events -A --sort-by='.lastTimestamp' | grep -i oom
# Check memory limit in pod
kubectl get pod myapp -o yaml | grep -A 5 "memory:"
Solutions:
- Increase limit
- Optimize application
- Scale horizontally
Pending Pods
# Check why pod can't be scheduled
kubectl describe pod pending-pod
# View node resources
kubectl describe nodes
# Check ResourceQuota usage
kubectl describe resourcequota -n production
CPU Throttling
# Check configured CPU limits (throttling occurs when usage hits the limit)
kubectl get pods -o custom-columns=NAME:.metadata.name,CPU-LIMITS:.spec.containers[*].resources.limits.cpu
# Prometheus query for throttling
increase(container_cpu_cfs_throttled_seconds_total[5m])
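That counter pairs with `container_cpu_cfs_periods_total`; the ratio of the two is the fraction of CFS enforcement periods in which the container was throttled. A Python sketch of the arithmetic (the counter deltas are made-up sample values):

```python
def throttle_ratio(throttled_periods: int, total_periods: int) -> float:
    """Fraction of CFS enforcement periods in which the container was throttled."""
    return throttled_periods / total_periods if total_periods else 0.0

# Hypothetical counter deltas over a 5-minute window:
ratio = throttle_ratio(throttled_periods=120, total_periods=3000)
print(f"{ratio:.1%} of periods throttled")  # 4.0% of periods throttled
```

Sustained ratios above a few percent are usually a sign the CPU limit is too tight for the workload's burst pattern.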
Practical Examples
Example: Web Application Right-Sizing
---
# LimitRange for web namespace
apiVersion: v1
kind: LimitRange
metadata:
  name: web-limits
  namespace: web
spec:
  limits:
  - type: Container
    max:
      cpu: "2"
      memory: "2Gi"
    min:
      cpu: "100m"
      memory: "128Mi"
    defaultRequest:
      cpu: "200m"
      memory: "256Mi"
    default:
      cpu: "500m"
      memory: "512Mi"
---
# ResourceQuota for web namespace
apiVersion: v1
kind: ResourceQuota
metadata:
  name: web-quota
  namespace: web
spec:
  hard:
    requests.cpu: "20"
    requests.memory: "40Gi"
    limits.cpu: "40"
    limits.memory: "80Gi"
    pods: "50"
---
# Web application with optimized resources
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
  namespace: web
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: web:1.0
        resources:
          requests:
            cpu: 200m
            memory: 256Mi
          limits:
            cpu: 500m
            memory: 512Mi
      - name: cache
        image: redis:7
        resources:
          requests:
            cpu: 100m
            memory: 128Mi
          limits:
            cpu: 200m
            memory: 256Mi
Example: Database Pod with Guaranteed QoS
apiVersion: v1
kind: Pod
metadata:
  name: database
  namespace: databases
spec:
  containers:
  - name: postgres
    image: postgres:15
    resources:
      requests:
        cpu: 2
        memory: 4Gi
      limits:
        cpu: 2
        memory: 4Gi
  - name: backup
    image: backup-tool:1.0
    resources:
      requests:
        cpu: 500m
        memory: 512Mi
      limits:
        cpu: 500m
        memory: 512Mi
Example: Multi-Tier ResourceQuota
---
# CPU-intensive workloads quota
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-intensive-quota
  namespace: batch
spec:
  hard:
    requests.cpu: "100"
    limits.cpu: "200"
    requests.memory: "100Gi"
    limits.memory: "200Gi"
  scopeSelector:
    matchExpressions:
    - operator: In
      scopeName: PriorityClass
      values: ["compute-intensive"]
---
# Standard workloads quota
apiVersion: v1
kind: ResourceQuota
metadata:
  name: standard-quota
  namespace: production
spec:
  hard:
    requests.cpu: "50"
    limits.cpu: "100"
    requests.memory: "50Gi"
    limits.memory: "100Gi"
  scopeSelector:
    matchExpressions:
    - operator: NotIn
      scopeName: PriorityClass
      values: ["compute-intensive"]
Conclusion
Proper resource management is critical for stable, efficient Kubernetes clusters. By implementing appropriate requests and limits, using QoS classes strategically, enforcing LimitRanges and ResourceQuotas, and regularly right-sizing based on actual usage, you optimize cluster utilization and prevent resource contention. Start with conservative estimates, monitor actual usage, and adjust based on metrics. Regular review of resource allocation ensures your applications run reliably on your VPS and bare-metal Kubernetes infrastructure without unnecessary waste or performance degradation.


