Kubernetes Job and CronJob Configuration
Jobs and CronJobs enable running batch workloads and scheduled tasks in Kubernetes. This guide covers the Job specification with parallelism and completion semantics, CronJob scheduling, history management, suspension, and best practices for batch processing on your VPS and bare-metal Kubernetes infrastructure.
Table of Contents
- Jobs Fundamentals
- Job Configuration
- Job Patterns
- CronJobs
- Advanced Features
- Monitoring and Troubleshooting
- Practical Examples
- Conclusion
Jobs Fundamentals
What is a Job?
A Job creates one or more Pods and ensures they complete successfully. It tracks completion and retries on failure.
Job vs Other Controllers
| Feature | Job | Deployment | StatefulSet |
|---|---|---|---|
| Purpose | Batch tasks | Long-running apps | Stateful apps |
| Completion | Completes | Runs indefinitely | Runs indefinitely |
| Restart | On failure | Always | Always |
| Storage | Shared volumes | Persistent | Persistent |
Job Lifecycle
Created → Active (pods running) → Complete (required completions succeeded) or Failed (backoffLimit or activeDeadlineSeconds exceeded)
Job Configuration
Basic Job
apiVersion: batch/v1
kind: Job
metadata:
  name: simple-job
  namespace: batch
spec:
  template:
    spec:
      containers:
      - name: task
        image: busybox:1.35
        command: ["echo", "Hello from Job"]
      restartPolicy: Never
  backoffLimit: 3
  ttlSecondsAfterFinished: 3600
Create the Job:
kubectl apply -f job.yaml
kubectl get jobs -n batch
kubectl describe job simple-job -n batch
Job Parameters
template: Pod template used for the Job's pods
backoffLimit: Number of retries before the Job is marked failed (default: 6)
parallelism: Maximum number of pods running in parallel (default: 1)
completions: Number of successful pod completions required (default: 1)
activeDeadlineSeconds: Maximum total runtime before the Job is terminated
ttlSecondsAfterFinished: Automatically delete the Job N seconds after it finishes
Simple Completion
Job completes after one successful pod:
apiVersion: batch/v1
kind: Job
metadata:
  name: single-completion-job
spec:
  template:
    spec:
      containers:
      - name: task
        image: python:3.11
        command:
        - python
        - -c
        - |
          import time
          for i in range(10):
              print(f"Progress: {i+1}/10")
              time.sleep(1)
          print("Task complete!")
      restartPolicy: Never
  backoffLimit: 3
  activeDeadlineSeconds: 300
Parallel Jobs
Multiple workers in parallel:
apiVersion: batch/v1
kind: Job
metadata:
  name: parallel-job
spec:
  parallelism: 4
  completions: 10
  completionMode: Indexed  # required for the per-pod completion index annotation below
  template:
    spec:
      containers:
      - name: worker
        image: myworker:1.0
        env:
        - name: JOB_COMPLETION_INDEX
          valueFrom:
            fieldRef:
              fieldPath: metadata.annotations['batch.kubernetes.io/job-completion-index']
      restartPolicy: Never
  backoffLimit: 2
Work Queue Pattern
Multiple workers drain a shared queue; completions is left unset, so the Job succeeds once at least one pod exits successfully and all pods have terminated:
apiVersion: batch/v1
kind: Job
metadata:
  name: work-queue-job
spec:
  parallelism: 3
  template:
    spec:
      containers:
      - name: worker
        image: worker:latest
        command:
        - /bin/sh
        - -c
        - |
          while true; do
            # Block up to 5s waiting for work; BLPOP prints "key\nvalue"
            work=$(redis-cli BLPOP job_queue 5 | tail -1)
            if [ -n "$work" ]; then
              echo "Processing: $work"
              # Process work
              sleep 5
            else
              # Queue drained: exit 0 so the pod counts as Succeeded
              exit 0
            fi
          done
      restartPolicy: Never
  activeDeadlineSeconds: 3600
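The worker loop above can be sketched in Python. This is a minimal illustration, not Kubernetes code: an in-memory deque stands in for the Redis list, and the processing step is a placeholder.

```python
from collections import deque

def run_worker(queue, process):
    """Drain the shared queue, then exit.

    Each pod in the Job runs this loop independently, so raising
    parallelism simply adds more consumers of the same queue.
    Returning normally maps to `exit 0`, which makes the pod count
    as Succeeded and lets the Job complete.
    """
    processed = 0
    while queue:
        work = queue.popleft()  # stand-in for `redis-cli BLPOP job_queue 5`
        process(work)           # the actual batch work
        processed += 1
    return processed

job_queue = deque(["item-1", "item-2", "item-3"])
done = run_worker(job_queue, lambda item: None)
print(done)  # 3
```

In the real pattern the queue lives in Redis (or another broker), so all pods consume from one source; the loop structure is the same.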
Job Patterns
Index-Based Job
Process specific indices in parallel; with completionMode: Indexed, each pod also receives its index automatically in the JOB_COMPLETION_INDEX environment variable:
apiVersion: batch/v1
kind: Job
metadata:
  name: indexed-job
spec:
  parallelism: 4
  completions: 20
  completionMode: Indexed
  template:
    spec:
      containers:
      - name: task
        image: task-processor:1.0
        env:
        - name: TASK_ID
          valueFrom:
            fieldRef:
              fieldPath: metadata.annotations['batch.kubernetes.io/job-completion-index']
      restartPolicy: Never
  backoffLimit: 2
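Inside each pod, the completion index is typically used to pick a disjoint shard of the work. A minimal Python sketch of that partitioning (the item list and helper are illustrative, not part of any Kubernetes API):

```python
def shard_for_index(items, index, completions):
    """Return the items assigned to one completion index.

    Indexed Jobs guarantee exactly one successful pod per index in
    [0, completions), so slicing by index::completions covers every
    item exactly once across the whole Job.
    """
    return items[index::completions]

items = [f"record-{i}" for i in range(100)]
shard = shard_for_index(items, index=3, completions=20)
print(len(shard))  # 5 items: record-3, record-23, ..., record-83
```

A worker would read its index from TASK_ID (or JOB_COMPLETION_INDEX) and process only its shard.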
Batch Processing Job
Process data in batches:
apiVersion: batch/v1
kind: Job
metadata:
  name: batch-processor
spec:
  parallelism: 5
  completions: 100
  template:
    metadata:
      labels:
        app: batch-processor
    spec:
      containers:
      - name: processor
        image: batch-processor:1.0
        volumeMounts:
        - name: data
          mountPath: /data
      volumes:
      - name: data
        persistentVolumeClaim:
          claimName: batch-data
      restartPolicy: Never
  backoffLimit: 3
  activeDeadlineSeconds: 86400
CronJobs
CronJob Basics
CronJobs schedule Jobs based on cron expressions.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: daily-backup
spec:
  schedule: "0 2 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: backup
            image: backup-tool:1.0
            command:
            - /bin/sh
            - -c
            - /scripts/backup.sh
          restartPolicy: OnFailure
  successfulJobsHistoryLimit: 3
  failedJobsHistoryLimit: 1
  suspend: false
Cron Schedule Format
┌───────────── minute (0 - 59)
│ ┌───────────── hour (0 - 23)
│ │ ┌───────────── day of month (1 - 31)
│ │ │ ┌───────────── month (1 - 12)
│ │ │ │ ┌───────────── day of week (0 - 6) (Sunday to Saturday)
│ │ │ │ │
│ │ │ │ │
* * * * *
Common schedules:
"0 0 * * *" # Daily at midnight
"0 */4 * * *" # Every 4 hours
"0 9 * * 1-5" # Weekdays at 9 AM
"*/15 * * * *" # Every 15 minutes
"0 0 1 * *" # Monthly on 1st
"0 0 * * 0" # Weekly on Sunday
CronJob with Environment Variables
apiVersion: batch/v1
kind: CronJob
metadata:
  name: scheduled-cleanup
spec:
  schedule: "0 3 * * *"
  concurrencyPolicy: Forbid
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: cleanup
            image: cleanup-tool:1.0
            env:
            - name: CLEANUP_DAYS
              value: "30"
            - name: NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
          restartPolicy: OnFailure
  successfulJobsHistoryLimit: 3
  failedJobsHistoryLimit: 3
Timezone Support
By default the schedule is interpreted in the time zone of the kube-controller-manager, which is typically UTC. Set spec.timeZone (Kubernetes 1.25+) to pin it to an IANA time zone:
apiVersion: batch/v1
kind: CronJob
metadata:
  name: timezone-job
spec:
  schedule: "0 9 * * 1-5"
  timeZone: "America/New_York"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: task
            image: task:1.0
          restartPolicy: OnFailure
Advanced Features
Job Suspend
Pause a Job without deleting it (suspending an active Job terminates its running pods; resuming creates them again):
kubectl patch job simple-job -p '{"spec":{"suspend":true}}'
kubectl patch job simple-job -p '{"spec":{"suspend":false}}'
Or in YAML:
spec:
  suspend: true
CronJob Suspension
Pause scheduled execution:
kubectl patch cronjob daily-backup -p '{"spec":{"suspend":true}}'
Concurrency Policy
Control simultaneous job execution:
apiVersion: batch/v1
kind: CronJob
metadata:
  name: concurrent-job
spec:
  schedule: "*/5 * * * *"
  concurrencyPolicy: Forbid  # Allow, Forbid, Replace
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: task
            image: task:1.0
          restartPolicy: OnFailure
Allow (default): Run jobs concurrently, even if a previous run is still active
Forbid: Skip the new run if the previous one hasn't finished
Replace: Delete the still-running job and start a new one
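The three policies can be modeled as a small decision function. This is a rough sketch of the controller's behavior, not its actual code:

```python
def on_schedule_fire(policy, running_jobs):
    """Decide what happens when the schedule fires while jobs from
    earlier runs are still active.

    Returns (start_new, jobs_to_delete).
    """
    if not running_jobs or policy == "Allow":
        return True, []                   # run alongside anything still active
    if policy == "Forbid":
        return False, []                  # skip this scheduled run entirely
    if policy == "Replace":
        return True, list(running_jobs)   # kill the old run, start fresh
    raise ValueError(f"unknown concurrencyPolicy: {policy}")

print(on_schedule_fire("Forbid", ["daily-backup-28500"]))  # (False, [])
```

Forbid is the safest default for jobs that must not overlap, such as backups writing to shared storage.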
History Management
Control job history retention:
spec:
  successfulJobsHistoryLimit: 5  # Keep 5 successful jobs
  failedJobsHistoryLimit: 3      # Keep 3 failed jobs
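The retention rule amounts to keeping only the most recent N finished Jobs per outcome. A Python sketch of that pruning logic (the job tuples here are illustrative):

```python
def jobs_to_prune(finished, successful_limit, failed_limit):
    """Given finished Jobs as (name, finish_time, succeeded) tuples,
    return the names the controller would delete: everything beyond
    the most recent `successful_limit` successes and `failed_limit`
    failures."""
    keep = set()
    for succeeded, limit in ((True, successful_limit), (False, failed_limit)):
        recent = sorted((j for j in finished if j[2] == succeeded),
                        key=lambda j: j[1], reverse=True)
        keep.update(name for name, _, _ in recent[:limit])
    return [name for name, _, _ in finished if name not in keep]

history = [("backup-1", 1, True), ("backup-2", 2, True),
           ("backup-3", 3, True), ("backup-4", 4, False)]
print(jobs_to_prune(history, successful_limit=2, failed_limit=1))  # ['backup-1']
```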
Clean old jobs manually:
# Delete all completed jobs
kubectl delete job -n batch --field-selector status.successful=1
# Delete failed jobs
kubectl delete job -n batch --field-selector status.failed=1
Monitoring and Troubleshooting
Viewing Job Status
# List jobs
kubectl get jobs -n batch
kubectl get jobs -n batch -o wide
# View job details
kubectl describe job simple-job -n batch
# Check pod status
kubectl get pods -n batch -l job-name=simple-job
kubectl logs -n batch -l job-name=simple-job
Common Issues
Job stuck in pending:
kubectl describe job stuck-job -n batch
kubectl get pods -l job-name=stuck-job -o yaml | grep -A 5 "events:"
Pods failing:
kubectl logs -n batch <pod-name>
kubectl describe pod -n batch <pod-name>
Jobs not completing:
# Check job status
kubectl get job simple-job -n batch -o yaml | grep -A 10 "status:"
# View events
kubectl get events -n batch --sort-by='.lastTimestamp'
Practical Examples
Example: Database Backup Job
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: backup-script
  namespace: batch
data:
  backup.sh: |
    #!/bin/bash
    set -e
    TIMESTAMP=$(date +%Y%m%d_%H%M%S)
    BACKUP_FILE="/backups/db_backup_${TIMESTAMP}.sql"
    echo "Starting database backup..."
    mysqldump -h ${DB_HOST} -u ${DB_USER} -p${DB_PASSWORD} ${DB_NAME} > ${BACKUP_FILE}
    echo "Compressing backup..."
    gzip ${BACKUP_FILE}
    echo "Backup complete: ${BACKUP_FILE}.gz"
---
apiVersion: batch/v1
kind: CronJob
metadata:
  name: database-backup
  namespace: batch
spec:
  schedule: "0 2 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: backup-operator
          containers:
          - name: backup
            image: mysql:8.0
            command: ["/scripts/backup.sh"]
            env:
            - name: DB_HOST
              value: mysql.databases.svc
            - name: DB_USER
              valueFrom:
                secretKeyRef:
                  name: db-credentials
                  key: username
            - name: DB_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: db-credentials
                  key: password
            - name: DB_NAME
              value: production
            volumeMounts:
            - name: backup-script
              mountPath: /scripts
            - name: backups
              mountPath: /backups
          volumes:
          - name: backup-script
            configMap:
              name: backup-script
              defaultMode: 0755
          - name: backups
            persistentVolumeClaim:
              claimName: backup-storage
          restartPolicy: OnFailure
  successfulJobsHistoryLimit: 7
  failedJobsHistoryLimit: 3
Example: Parallel Data Processing
---
apiVersion: batch/v1
kind: Job
metadata:
  name: data-processor
  namespace: batch
spec:
  parallelism: 8
  completions: 32
  completionMode: Indexed
  template:
    metadata:
      labels:
        app: data-processor
    spec:
      containers:
      - name: processor
        image: data-processor:1.0
        env:
        - name: TASK_INDEX
          valueFrom:
            fieldRef:
              fieldPath: metadata.annotations['batch.kubernetes.io/job-completion-index']
        - name: TOTAL_TASKS
          value: "32"
        volumeMounts:
        - name: input-data
          mountPath: /data/input
        - name: output-data
          mountPath: /data/output
      volumes:
      - name: input-data
        persistentVolumeClaim:
          claimName: input-data
      - name: output-data
        persistentVolumeClaim:
          claimName: output-data
      restartPolicy: Never
  backoffLimit: 3
  activeDeadlineSeconds: 86400
Example: Scheduled Report Generation
apiVersion: batch/v1
kind: CronJob
metadata:
  name: weekly-report
  namespace: batch
spec:
  schedule: "0 6 * * 1"
  timeZone: "America/New_York"
  concurrencyPolicy: Forbid
  jobTemplate:
    spec:
      ttlSecondsAfterFinished: 604800
      template:
        spec:
          serviceAccountName: report-generator
          containers:
          - name: report
            image: report-generator:1.0
            command:
            - /bin/sh
            - -c
            - |
              python /app/generate_report.py \
                --start-date $(date -d 'last week monday' +%Y-%m-%d) \
                --end-date $(date -d 'yesterday' +%Y-%m-%d) \
                --output /reports/report_$(date +%Y%m%d).pdf
            volumeMounts:
            - name: reports
              mountPath: /reports
          volumes:
          - name: reports
            persistentVolumeClaim:
              claimName: reports-storage
          restartPolicy: OnFailure
  successfulJobsHistoryLimit: 12
  failedJobsHistoryLimit: 2
Conclusion
Jobs and CronJobs are essential for batch processing and scheduled tasks in Kubernetes. By properly configuring parallelism, completions, and backoff policies, you can build efficient batch workflows. CronJobs provide reliable scheduled execution with history tracking and suspension capabilities. Start with simple single-completion Jobs, move to parallel Jobs for throughput, and adopt CronJobs for production automation. Regular monitoring of job history and logs ensures reliable batch processing on your VPS and bare-metal Kubernetes infrastructure.


