KEDA: Event-Driven Autoscaling for Kubernetes

KEDA (Kubernetes Event-Driven Autoscaling) extends the Kubernetes Horizontal Pod Autoscaler to scale workloads based on external event sources like message queues, databases, and custom metrics rather than just CPU and memory. With 60+ built-in scalers for Kafka, RabbitMQ, Prometheus, cloud queues, and more, KEDA enables fine-grained autoscaling that matches your workload to actual demand.

Prerequisites

  • Kubernetes 1.24+
  • kubectl and Helm 3.x
  • Metrics Server installed (for HPA compatibility)
  • Target event sources accessible from the cluster (Kafka, RabbitMQ, etc.)
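The version floor above can be checked with a small preflight script. This is a sketch: `ver_ge` is a helper defined here for illustration, not a kubectl feature, and the live-cluster lines are left as comments since they need a running cluster.

```shell
# ver_ge succeeds when version $1 >= version $2 (dotted numeric compare).
ver_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

# Example: confirm a 1.27 cluster meets the 1.24 floor
ver_ge "1.27" "1.24" && echo "Kubernetes version OK"

# Against a live cluster you might feed in real values, e.g.:
#   kubectl top nodes >/dev/null 2>&1 || echo "Metrics Server not responding"
```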

Installing KEDA

# Install via Helm (recommended)
helm repo add kedacore https://kedacore.github.io/charts
helm repo update

helm install keda kedacore/keda \
  --namespace keda \
  --create-namespace \
  --set prometheus.metricServer.enabled=true \
  --set prometheus.operator.enabled=true

# Verify KEDA components
kubectl -n keda get pods

# Expected pods:
# keda-operator
# keda-operator-metrics-apiserver
# keda-admission-webhooks

# Check KEDA API is registered
kubectl get apiservices | grep keda

# Confirm the KEDA CRDs are installed
kubectl get crd | grep keda.sh

ScaledObject Configuration

A ScaledObject links a deployment to an event source for autoscaling:

# Basic ScaledObject structure
cat > basic-scaledobject.yaml <<EOF
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: my-app-scaler
  namespace: production
spec:
  scaleTargetRef:
    name: my-app-deployment    # Must match the Deployment name
    # kind: StatefulSet        # Can also target StatefulSets
  pollingInterval: 15          # Check every 15 seconds
  cooldownPeriod: 300          # Wait 300s before scaling down
  minReplicaCount: 0           # Scale to zero when idle
  maxReplicaCount: 50          # Maximum pods
  advanced:
    restoreToOriginalReplicaCount: false
    horizontalPodAutoscalerConfig:
      behavior:
        scaleDown:
          stabilizationWindowSeconds: 300  # Avoid flapping
          policies:
            - type: Percent
              value: 10
              periodSeconds: 60
  triggers:
    - type: kafka
      metadata:
        bootstrapServers: kafka:9092
        topic: my-topic
        consumerGroup: my-consumer-group
        lagThreshold: "100"
EOF

kubectl apply -f basic-scaledobject.yaml

# Check ScaledObject status
kubectl get scaledobject my-app-scaler -n production
kubectl describe scaledobject my-app-scaler -n production
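Behind the scenes, KEDA feeds each trigger metric to an HPA, which computes the desired replica count as roughly ceil(metricValue / threshold), clamped to the min/max bounds from the ScaledObject. A simplified sketch of that arithmetic (not KEDA's actual code):

```shell
# desired = ceil(metric / threshold), clamped to [min, max].
desired_replicas() {
  local metric=$1 threshold=$2 min=$3 max=$4
  local d=$(( (metric + threshold - 1) / threshold ))  # integer ceil
  (( d < min )) && d=$min
  (( d > max )) && d=$max
  echo "$d"
}

desired_replicas 1200 100 0 50   # 1200 units of lag at threshold 100 -> 12
desired_replicas 0    100 0 50   # no events -> scale to zero (min=0)
```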

Kafka Scaler

Scale consumers based on Kafka consumer group lag:

# Create a secret for Kafka SASL authentication
kubectl create secret generic kafka-auth \
  --from-literal=username=kafka-user \
  --from-literal=password=kafka-password \
  -n production

# Kafka ScaledObject with SASL authentication
cat > kafka-scaledobject.yaml <<EOF
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: kafka-consumer-scaler
  namespace: production
spec:
  scaleTargetRef:
    name: order-processor
  minReplicaCount: 1
  maxReplicaCount: 20
  cooldownPeriod: 60
  triggers:
    - type: kafka
      metadata:
        bootstrapServers: kafka-broker-1:9092,kafka-broker-2:9092
        consumerGroup: order-processing-group
        topic: orders
        lagThreshold: "50"          # Scale up when lag > 50 per partition
        offsetResetPolicy: latest
        allowIdleConsumers: "false"  # Cap replicas at the partition count
        scaleToZeroOnInvalidOffset: "false"
        excludePersistentLag: "false"
        tls: enable
        sasl: plaintext
      authenticationRef:
        name: kafka-trigger-auth
---
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: kafka-trigger-auth
  namespace: production
spec:
  secretTargetRef:
    - parameter: username
      name: kafka-auth
      key: username
    - parameter: password
      name: kafka-auth
      key: password
EOF

kubectl apply -f kafka-scaledobject.yaml

# Monitor Kafka lag and replica count
watch kubectl get scaledobject kafka-consumer-scaler -n production

# Check HPA created by KEDA
kubectl get hpa -n production
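With allowIdleConsumers left at "false", KEDA caps replicas at the topic's partition count, since extra consumers in a group would sit idle. A sketch of that cap under the manifest's lagThreshold of 50 (simplified; KEDA actually tracks lag per partition):

```shell
# ceil(totalLag / lagThreshold), capped at the partition count
# (the cap applies when allowIdleConsumers is "false").
kafka_replicas() {
  local lag=$1 threshold=$2 partitions=$3
  local d=$(( (lag + threshold - 1) / threshold ))
  (( d > partitions )) && d=$partitions
  echo "$d"
}

kafka_replicas 800 50 20   # ceil(800/50)=16, under 20 partitions -> 16
kafka_replicas 5000 50 20  # ceil(5000/50)=100, capped at 20 partitions -> 20
```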

RabbitMQ Scaler

Scale workers based on RabbitMQ queue depth:

# Create RabbitMQ credentials secret
kubectl create secret generic rabbitmq-auth \
  --from-literal=host=amqp://user:password@rabbitmq:5672 \
  -n production

# RabbitMQ queue-based scaling
cat > rabbitmq-scaledobject.yaml <<EOF
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: rabbitmq-trigger-auth
  namespace: production
spec:
  secretTargetRef:
    - parameter: host
      name: rabbitmq-auth
      key: host
---
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: rabbitmq-consumer-scaler
  namespace: production
spec:
  scaleTargetRef:
    name: email-worker
  minReplicaCount: 0
  maxReplicaCount: 30
  cooldownPeriod: 120
  triggers:
    - type: rabbitmq
      metadata:
        protocol: amqp
        queueName: email-queue
        mode: QueueLength    # Scale based on queue length
        value: "10"          # 1 replica per 10 messages
        activationValue: "1"  # Activate at 1 message (from 0)
      authenticationRef:
        name: rabbitmq-trigger-auth
---
# HTTP API-based monitoring (more metrics available)
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: rabbitmq-rate-scaler
  namespace: production
spec:
  scaleTargetRef:
    name: payment-worker
  triggers:
    - type: rabbitmq
      metadata:
        protocol: http
        queueName: payments
        mode: MessageRate    # Scale based on message rate/second
        value: "100"         # 1 replica per 100 messages/sec
        host: http://rabbitmq-management:15672/api
        vhostName: "/"
      authenticationRef:
        # Requires a separate TriggerAuthentication with the management
        # API credentials (not defined in this manifest)
        name: rabbitmq-http-auth
EOF

kubectl apply -f rabbitmq-scaledobject.yaml
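The activationValue and value settings interact: the workload stays at zero replicas until the queue depth exceeds activationValue, and above that scales at one replica per value messages. A sketch of the combined behavior (simplified; minReplicaCount still applies once the scaler is active):

```shell
# 0 replicas while queue <= activationValue; otherwise ceil(queue / value).
rabbitmq_replicas() {
  local queue=$1 value=$2 activation=$3
  if (( queue <= activation )); then
    echo 0
  else
    echo $(( (queue + value - 1) / value ))
  fi
}

rabbitmq_replicas 0  10 1   # empty queue -> stay at 0
rabbitmq_replicas 1  10 1   # at the activation value -> still 0
rabbitmq_replicas 45 10 1   # ceil(45/10) -> 5
```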

Prometheus Scaler

Scale based on any Prometheus metric:

# Scale based on custom application metrics
cat > prometheus-scaledobject.yaml <<EOF
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: prometheus-scaler
  namespace: production
spec:
  scaleTargetRef:
    name: api-gateway
  minReplicaCount: 2
  maxReplicaCount: 50
  cooldownPeriod: 120
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus.monitoring.svc:9090
        metricName: http_requests_per_second
        # Scale based on average requests per second
        query: |
          sum(rate(http_requests_total{job="api-gateway"}[2m]))
        threshold: "100"           # Scale up per 100 req/s
        activationThreshold: "10"  # Start scaling at 10 req/s
        # Optional: auth for secured Prometheus
      # authenticationRef:
      #   name: prometheus-auth
---
# Scale based on database connections
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: db-connection-scaler
  namespace: production
spec:
  scaleTargetRef:
    name: connection-pool-manager
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus.monitoring.svc:9090
        metricName: postgres_connections
        query: |
          sum(pg_stat_activity_count{state="active"})
        threshold: "50"  # Scale at 50 active connections
EOF

kubectl apply -f prometheus-scaledobject.yaml
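Before wiring a PromQL expression into a trigger, it helps to confirm it returns a single-sample vector, since KEDA reads the first result's value from the instant-query response. A sketch of checking that shape locally (assumes jq is installed; the sample JSON mimics the Prometheus HTTP API response format):

```shell
# A representative /api/v1/query response for a sum(rate(...)) query:
response='{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": [
      { "metric": {}, "value": [ 1700000000, "142.7" ] }
    ]
  }
}'

# The scalar KEDA compares against the threshold is the sample value:
echo "$response" | jq -r '.data.result[0].value[1]'   # -> 142.7

# Against a live server, the same check might look like:
#   kubectl -n monitoring port-forward svc/prometheus 9090:9090 &
#   curl -sG http://localhost:9090/api/v1/query \
#     --data-urlencode 'query=sum(rate(http_requests_total{job="api-gateway"}[2m]))' \
#     | jq -r '.data.result[0].value[1]'
```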

Cron and Time-Based Scaling

Pre-scale workloads for known traffic patterns:

# Scale up before business hours, scale down at night
cat > cron-scaledobject.yaml <<EOF
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: business-hours-scaler
  namespace: production
spec:
  scaleTargetRef:
    name: web-frontend
  minReplicaCount: 2
  maxReplicaCount: 20
  triggers:
    - type: cron
      metadata:
        timezone: America/New_York
        start: "0 8 * * 1-5"    # 8 AM weekdays
        end: "0 20 * * 1-5"     # 8 PM weekdays
        desiredReplicas: "10"    # Scale to 10 during business hours
    - type: cron
      metadata:
        timezone: America/New_York
        start: "0 20 * * 5"     # Friday 8 PM - reduced weekend capacity
        end: "0 8 * * 1"        # Monday 8 AM
        desiredReplicas: "5"
    # Combine with metrics-based scaling
    - type: prometheus
      metadata:
        serverAddress: http://prometheus.monitoring.svc:9090
        metricName: active_users
        query: count(user_sessions_active)
        threshold: "50"
EOF

kubectl apply -f cron-scaledobject.yaml
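With several triggers on one ScaledObject, the underlying HPA evaluates each and takes the maximum desired replica count, so the cron trigger acts as a floor while the Prometheus trigger can push higher. A sketch of that combination (assumed numbers, not live metrics):

```shell
# HPA behavior with multiple triggers: the highest demand wins.
max_of() {
  local m=0 v
  for v in "$@"; do (( v > m )) && m=$v; done
  echo "$m"
}

cron_desired=10                      # business-hours floor from the cron trigger
prom_desired=$(( (1500 + 49) / 50 )) # ceil(1500 active users / threshold 50) = 30

max_of "$cron_desired" "$prom_desired"   # -> 30
```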

ScaledJob for Batch Workloads

KEDA can also scale Kubernetes Jobs for batch processing:

# Scale Jobs to process a queue
cat > scaled-job.yaml <<EOF
apiVersion: keda.sh/v1alpha1
kind: ScaledJob
metadata:
  name: image-processor
  namespace: production
spec:
  jobTargetRef:
    parallelism: 1
    completions: 1
    activeDeadlineSeconds: 600
    backoffLimit: 3
    template:
      spec:
        restartPolicy: Never
        containers:
          - name: processor
            image: your-org/image-processor:latest
            command: ["/process-image"]
            env:
              - name: QUEUE_NAME
                value: "image-processing"
              - name: RABBITMQ_HOST
                valueFrom:
                  secretKeyRef:
                    name: rabbitmq-auth
                    key: host
  pollingInterval: 10
  maxReplicaCount: 20           # Max parallel jobs
  successfulJobsHistoryLimit: 3
  failedJobsHistoryLimit: 5
  scalingStrategy:
    strategy: "accurate"        # One job per queue message
    pendingPodConditions:
      - "Ready"
      - "PodScheduled"
  triggers:
    - type: rabbitmq
      metadata:
        protocol: amqp
        queueName: image-processing
        mode: QueueLength
        value: "1"              # 1 job per message
      authenticationRef:
        name: rabbitmq-trigger-auth
EOF

kubectl apply -f scaled-job.yaml

# Monitor scaled jobs
kubectl get scaledjob image-processor -n production
kubectl get jobs -n production -w
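The "accurate" strategy sizes each batch of new Jobs as roughly the queue length minus the Jobs already running, bounded by the headroom under maxReplicaCount. A sketch of that arithmetic (a simplification of KEDA's documented behavior, not its actual code):

```shell
# "accurate": spawn (queueLength - runningJobs), bounded by the
# remaining headroom under maxReplicaCount, never negative.
jobs_to_spawn() {
  local queue=$1 running=$2 max=$3
  local want=$(( queue - running ))
  local room=$(( max - running ))
  (( want > room )) && want=$room
  (( want < 0 )) && want=0
  echo "$want"
}

jobs_to_spawn 8 3 20    # 8 queued, 3 running -> start 5 more
jobs_to_spawn 50 15 20  # headroom is only 5 -> start 5
```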

Troubleshooting

ScaledObject not scaling:

# Check ScaledObject conditions
kubectl describe scaledobject <name> -n <namespace>

# Look for errors in KEDA operator logs
kubectl -n keda logs deploy/keda-operator -f

# Check metrics server
kubectl -n keda logs deploy/keda-operator-metrics-apiserver -f

# Verify the referenced TriggerAuthentication exists
kubectl get triggerauthentication -n <namespace>

Scale to zero not working:

# Verify minReplicaCount is 0
kubectl get scaledobject <name> -o jsonpath='{.spec.minReplicaCount}'

# Check KEDA has external trigger data
kubectl describe hpa keda-hpa-<scaledobject-name> -n <namespace>

# Ensure activationValue is set appropriately

Metrics server conflict with HPA:

# KEDA creates an HPA internally - don't create a separate HPA for the same deployment
kubectl get hpa -n production

# If conflicts exist, delete the manually created HPA
kubectl delete hpa <manual-hpa-name> -n production

TriggerAuthentication not working:

# Verify secret exists and has correct keys
kubectl get secret <secret-name> -n <namespace> -o yaml

# Check TriggerAuthentication status
kubectl describe triggerauthentication <name> -n <namespace>

Conclusion

KEDA transforms Kubernetes autoscaling from reactive CPU/memory-based scaling into proactive event-driven scaling that responds to actual workload demand. By scaling to zero during idle periods and rapidly expanding based on queue depth, message rates, or custom metrics, KEDA reduces infrastructure costs while ensuring responsiveness. Start with simple queue-depth scalers and progress to multi-trigger configurations as your workloads grow in complexity.