KEDA: Event-Driven Autoscaling for Kubernetes
KEDA (Kubernetes Event-Driven Autoscaling) extends the Kubernetes Horizontal Pod Autoscaler to scale workloads based on external event sources like message queues, databases, and custom metrics rather than just CPU and memory. With 60+ built-in scalers for Kafka, RabbitMQ, Prometheus, cloud queues, and more, KEDA enables fine-grained autoscaling that matches your workload to actual demand.
Prerequisites
- Kubernetes 1.24+
- kubectl and Helm 3.x
- Metrics Server installed (for HPA compatibility)
- Target event sources accessible from the cluster (Kafka, RabbitMQ, etc.)
Installing KEDA
# Install via Helm (recommended)
helm repo add kedacore https://kedacore.github.io/charts
helm repo update
helm install keda kedacore/keda \
  --namespace keda \
  --create-namespace \
  --set prometheus.metricServer.enabled=true \
  --set prometheus.operator.enabled=true
# Verify KEDA components
kubectl -n keda get pods
# Expected pods:
# keda-operator
# keda-operator-metrics-apiserver
# keda-admission-webhooks
# Check KEDA API is registered
kubectl get apiservices | grep keda
# Install KEDA CLI (optional)
curl -s https://api.github.com/repos/kedacore/keda/releases/latest | \
grep browser_download_url | grep linux_amd64 | cut -d '"' -f 4 | wget -qi -
tar -xvf keda-*-linux-amd64.tar.gz
sudo mv keda /usr/local/bin/
ScaledObject Configuration
A ScaledObject links a deployment to an event source for autoscaling:
# Basic ScaledObject structure
cat > basic-scaledobject.yaml <<EOF
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: my-app-scaler
  namespace: production
spec:
  scaleTargetRef:
    name: my-app-deployment      # Must match the Deployment name
    # kind: StatefulSet          # Can also target StatefulSets
  pollingInterval: 15            # Check every 15 seconds
  cooldownPeriod: 300            # Wait 300s before scaling down
  minReplicaCount: 0             # Scale to zero when idle
  maxReplicaCount: 50            # Maximum pods
  advanced:
    restoreToOriginalReplicaCount: false
    horizontalPodAutoscalerConfig:
      behavior:
        scaleDown:
          stabilizationWindowSeconds: 300   # Avoid flapping
          policies:
            - type: Percent
              value: 10
              periodSeconds: 60
  triggers:
    - type: kafka
      metadata:
        bootstrapServers: kafka:9092
        topic: my-topic
        consumerGroup: my-consumer-group
        lagThreshold: "100"
EOF
kubectl apply -f basic-scaledobject.yaml
# Check ScaledObject status
kubectl get scaledobject my-app-scaler -n production
kubectl describe scaledobject my-app-scaler -n production
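Under the hood, KEDA hands each trigger's metric to an HPA, which targets roughly ceil(metric / threshold) replicas, clamped between minReplicaCount and maxReplicaCount. A small sketch of that arithmetic with hypothetical numbers:

```shell
# Simplified model of the replica calculation (illustrative, not KEDA's code).
calc_replicas() {
  local metric=$1 threshold=$2 min=$3 max=$4
  local desired=$(( (metric + threshold - 1) / threshold ))  # integer ceil
  (( desired < min )) && desired=$min                        # clamp to floor
  (( desired > max )) && desired=$max                        # clamp to ceiling
  echo "$desired"
}

calc_replicas 1250 100 0 50   # lag 1250, lagThreshold 100 -> 13 replicas
calc_replicas 9000 100 0 50   # huge backlog -> capped at maxReplicaCount
```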
Kafka Scaler
Scale consumers based on Kafka consumer group lag:
# Create a secret for Kafka SASL authentication
kubectl create secret generic kafka-auth \
--from-literal=username=kafka-user \
--from-literal=password=kafka-password \
-n production
# Kafka ScaledObject with SASL authentication
cat > kafka-scaledobject.yaml <<EOF
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: kafka-consumer-scaler
  namespace: production
spec:
  scaleTargetRef:
    name: order-processor
  minReplicaCount: 1
  maxReplicaCount: 20
  cooldownPeriod: 60
  triggers:
    - type: kafka
      metadata:
        bootstrapServers: kafka-broker-1:9092,kafka-broker-2:9092
        consumerGroup: order-processing-group
        topic: orders
        lagThreshold: "50"            # Scale up when lag > 50 per partition
        offsetResetPolicy: latest
        allowIdleConsumers: "false"   # Never scale past the partition count
        scaleToZeroOnInvalidOffset: "false"
        excludePersistentLag: "false"
        tls: enable
        sasl: plaintext
      authenticationRef:
        name: kafka-trigger-auth
---
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: kafka-trigger-auth
  namespace: production
spec:
  secretTargetRef:
    - parameter: username
      name: kafka-auth
      key: username
    - parameter: password
      name: kafka-auth
      key: password
EOF
kubectl apply -f kafka-scaledobject.yaml
# Monitor Kafka lag and replica count
watch kubectl get scaledobject kafka-consumer-scaler -n production
# Check HPA created by KEDA
kubectl get hpa -n production
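With allowIdleConsumers set to "false", the Kafka scaler also caps replicas at the topic's partition count, since any extra consumers would sit idle. A sketch of that clamp with made-up numbers:

```shell
# Illustrative model: lag-based replicas, capped at the partition count
# (the allowIdleConsumers=false behavior). Not KEDA's actual code.
kafka_replicas() {
  local lag=$1 threshold=$2 partitions=$3
  local desired=$(( (lag + threshold - 1) / threshold ))   # ceil(lag/threshold)
  (( desired > partitions )) && desired=$partitions        # one consumer per partition max
  echo "$desired"
}

kafka_replicas 900 50 12   # ceil(900/50)=18, capped at 12 partitions -> 12
kafka_replicas 200 50 12   # -> 4
```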
RabbitMQ Scaler
Scale workers based on RabbitMQ queue depth:
# Create RabbitMQ credentials secret
kubectl create secret generic rabbitmq-auth \
--from-literal=host=amqp://user:password@rabbitmq:5672 \
-n production
# RabbitMQ queue-based scaling
cat > rabbitmq-scaledobject.yaml <<EOF
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: rabbitmq-trigger-auth
  namespace: production
spec:
  secretTargetRef:
    - parameter: host
      name: rabbitmq-auth
      key: host
---
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: rabbitmq-consumer-scaler
  namespace: production
spec:
  scaleTargetRef:
    name: email-worker
  minReplicaCount: 0
  maxReplicaCount: 30
  cooldownPeriod: 120
  triggers:
    - type: rabbitmq
      metadata:
        protocol: amqp
        queueName: email-queue
        mode: QueueLength        # Scale based on queue length
        value: "10"              # 1 replica per 10 messages
        activationValue: "1"     # Scale from 0 once the queue exceeds this
      authenticationRef:
        name: rabbitmq-trigger-auth
---
# HTTP API-based monitoring (more metrics available)
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: rabbitmq-rate-scaler
  namespace: production
spec:
  scaleTargetRef:
    name: payment-worker
  triggers:
    - type: rabbitmq
      metadata:
        protocol: http
        queueName: payments
        mode: MessageRate        # Scale based on message rate/second
        value: "100"             # 1 replica per 100 messages/sec
        host: http://rabbitmq-management:15672/api
        vhostName: "/"
      authenticationRef:
        name: rabbitmq-http-auth   # Create separately with HTTP credentials
EOF
kubectl apply -f rabbitmq-scaledobject.yaml
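The split between activationValue and value is easy to miss: activationValue only governs the 0-to-1 transition (the scaler becomes active once the metric exceeds it), while value sets the per-replica target the HPA works toward. A hedged sketch of the combined behavior, with illustrative numbers rather than KEDA's exact code path:

```shell
# Simplified model of activation vs. target for a scale-to-zero workload.
rabbit_replicas() {
  local msgs=$1 activation=$2 per_replica=$3 max=$4
  if (( msgs <= activation )); then   # not above activationValue: stay at zero
    echo 0
    return
  fi
  local desired=$(( (msgs + per_replica - 1) / per_replica ))  # ceil(msgs/value)
  (( desired > max )) && desired=$max
  echo "$desired"
}

rabbit_replicas 0 1 10 30     # empty queue -> stays scaled to zero
rabbit_replicas 137 1 10 30   # ceil(137/10) -> 14 replicas
```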
Prometheus Scaler
Scale based on any Prometheus metric:
# Scale based on custom application metrics
cat > prometheus-scaledobject.yaml <<EOF
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: prometheus-scaler
  namespace: production
spec:
  scaleTargetRef:
    name: api-gateway
  minReplicaCount: 2
  maxReplicaCount: 50
  cooldownPeriod: 120
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus.monitoring.svc:9090
        metricName: http_requests_per_second
        # Scale based on average requests per second
        query: |
          sum(rate(http_requests_total{job="api-gateway"}[2m]))
        threshold: "100"            # Scale up per 100 req/s
        activationThreshold: "10"   # Start scaling at 10 req/s
      # Optional: auth for secured Prometheus
      # authenticationRef:
      #   name: prometheus-auth
---
# Scale based on database connections
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: db-connection-scaler
  namespace: production
spec:
  scaleTargetRef:
    name: connection-pool-manager
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus.monitoring.svc:9090
        metricName: postgres_connections
        query: |
          sum(pg_stat_activity_count{state="active"})
        threshold: "50"   # Scale at 50 active connections
EOF
kubectl apply -f prometheus-scaledobject.yaml
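It pays to test a PromQL expression against the Prometheus HTTP API before wiring it into a trigger. The sketch below parses a canned instant-query response; the live curl call is shown commented out since it needs cluster access, and the sample values are illustrative:

```shell
# Live check (requires access to the Prometheus service):
# curl -sG http://prometheus.monitoring.svc:9090/api/v1/query \
#   --data-urlencode 'query=sum(rate(http_requests_total{job="api-gateway"}[2m]))'

# Canned instant-query response in the API's vector result shape:
response='{"status":"success","data":{"resultType":"vector","result":[{"metric":{},"value":[1700000000,"142.7"]}]}}'

# Pull the sample value out of the [timestamp, "value"] pair.
value=$(echo "$response" | sed -n 's/.*"value":\[[0-9.]*,"\([0-9.]*\)"\].*/\1/p')
echo "$value"
```

An empty result array here means the query matched no series, which would leave the trigger stuck at its floor.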
Cron and Time-Based Scaling
Pre-scale workloads for known traffic patterns:
# Scale up before business hours, scale down at night
cat > cron-scaledobject.yaml <<EOF
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: business-hours-scaler
  namespace: production
spec:
  scaleTargetRef:
    name: web-frontend
  minReplicaCount: 2
  maxReplicaCount: 20
  triggers:
    - type: cron
      metadata:
        timezone: America/New_York
        start: "0 8 * * 1-5"      # 8 AM weekdays
        end: "0 20 * * 1-5"       # 8 PM weekdays
        desiredReplicas: "10"     # Scale to 10 during business hours
    - type: cron
      metadata:
        timezone: America/New_York
        start: "0 20 * * 5"       # Friday 8 PM (weekend window starts)
        end: "0 8 * * 1"          # Monday 8 AM
        desiredReplicas: "5"      # Reduced weekend capacity
    # Combine with metrics-based scaling
    - type: prometheus
      metadata:
        serverAddress: http://prometheus.monitoring.svc:9090
        metricName: active_users
        query: count(user_sessions_active)
        threshold: "50"
EOF
kubectl apply -f cron-scaledobject.yaml
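When a ScaledObject carries several triggers, each one proposes a replica count and the underlying HPA applies the highest of them, so the cron windows act as a floor beneath the metric-driven scaling. A sketch of that resolution:

```shell
# Each trigger proposes a replica count; the HPA takes the maximum.
max_replicas() {
  local max=0 n
  for n in "$@"; do
    (( n > max )) && max=$n
  done
  echo "$max"
}

# 10 from the business-hours cron, 3 from the Prometheus trigger -> 10 wins.
max_replicas 10 3
```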
ScaledJob for Batch Workloads
KEDA can also scale Kubernetes Jobs for batch processing:
# Scale Jobs to process a queue
cat > scaled-job.yaml <<EOF
apiVersion: keda.sh/v1alpha1
kind: ScaledJob
metadata:
  name: image-processor
  namespace: production
spec:
  jobTargetRef:
    parallelism: 1
    completions: 1
    activeDeadlineSeconds: 600
    backoffLimit: 3
    template:
      spec:
        restartPolicy: Never
        containers:
          - name: processor
            image: your-org/image-processor:latest
            command: ["/process-image"]
            env:
              - name: QUEUE_NAME
                value: "image-processing"
              - name: RABBITMQ_HOST
                valueFrom:
                  secretKeyRef:
                    name: rabbitmq-auth
                    key: host
  pollingInterval: 10
  maxReplicaCount: 20          # Max parallel jobs
  successfulJobsHistoryLimit: 3
  failedJobsHistoryLimit: 5
  scalingStrategy:
    strategy: "accurate"       # One job per queue message
    pendingPodConditions:
      - "Ready"
      - "PodScheduled"
  triggers:
    - type: rabbitmq
      metadata:
        protocol: amqp
        queueName: image-processing
        mode: QueueLength
        value: "1"             # 1 job per message
      authenticationRef:
        name: rabbitmq-trigger-auth
EOF
kubectl apply -f scaled-job.yaml
# Monitor scaled jobs
kubectl get scaledjob image-processor -n production
kubectl get jobs -n production -w
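The "accurate" strategy aims to spawn one Job per queued message while respecting maxReplicaCount. Roughly, as a simplified model rather than KEDA's exact formula:

```shell
# Simplified model of the "accurate" ScaledJob strategy: one new Job per
# pending message, but running + new Jobs never exceed maxReplicaCount.
jobs_to_spawn() {
  local queue_len=$1 running=$2 max=$3
  local new=$queue_len
  (( running + new > max )) && new=$(( max - running ))
  (( new < 0 )) && new=0
  echo "$new"
}

jobs_to_spawn 8 0 20    # 8 messages, nothing running -> 8 new Jobs
jobs_to_spawn 30 15 20  # capped: only 5 more fit under maxReplicaCount
```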
Troubleshooting
ScaledObject not scaling:
# Check ScaledObject conditions
kubectl describe scaledobject <name> -n <namespace>
# Look for errors in KEDA operator logs
kubectl -n keda logs deploy/keda-operator -f
# Check metrics server
kubectl -n keda logs deploy/keda-operator-metrics-apiserver -f
# Look for authentication errors tied to a specific ScaledObject
kubectl -n keda logs deploy/keda-operator | grep -i <scaledobject-name>
Scale to zero not working:
# Verify minReplicaCount is 0
kubectl get scaledobject <name> -o jsonpath='{.spec.minReplicaCount}'
# Check KEDA has external trigger data
kubectl describe hpa keda-hpa-<scaledobject-name> -n <namespace>
# Ensure activationValue is set appropriately
Metrics server conflict with HPA:
# KEDA creates an HPA internally - don't create a separate HPA for the same deployment
kubectl get hpa -n production
# If conflicts exist, delete the manually created HPA
kubectl delete hpa <manual-hpa-name> -n production
TriggerAuthentication not working:
# Verify secret exists and has correct keys
kubectl get secret <secret-name> -n <namespace> -o yaml
# Check TriggerAuthentication status
kubectl describe triggerauthentication <name> -n <namespace>
Conclusion
KEDA transforms Kubernetes autoscaling from reactive CPU/memory-based scaling into proactive event-driven scaling that responds to actual workload demand. By scaling to zero during idle periods and rapidly expanding based on queue depth, message rates, or custom metrics, KEDA reduces infrastructure costs while ensuring responsiveness. Start with simple queue-depth scalers and progress to multi-trigger configurations as your workloads grow in complexity.


