Open Policy Agent and Gatekeeper for Kubernetes

OPA Gatekeeper is an admission controller that enforces custom policies in Kubernetes, rejecting non-compliant resources at admission time, before they are persisted to the cluster. This guide covers installing Gatekeeper, writing Rego constraint templates, enforcing common security policies, and running in audit mode.

Prerequisites

  • Kubernetes cluster (1.25+)
  • kubectl configured with cluster-admin permissions
  • helm 3.x installed
  • Basic understanding of Kubernetes admission controllers

Install OPA Gatekeeper

# Option 1: Install with kubectl (latest release)
kubectl apply -f https://raw.githubusercontent.com/open-policy-agent/gatekeeper/release-3.16/deploy/gatekeeper.yaml

# Option 2: Install with Helm
helm repo add gatekeeper https://open-policy-agent.github.io/gatekeeper/charts
helm repo update

helm install gatekeeper/gatekeeper \
  --name-template=gatekeeper \
  --namespace gatekeeper-system \
  --create-namespace \
  --set replicas=3 \
  --set logLevel=WARNING \
  --set auditInterval=60 \
  --set constraintViolationsLimit=20

# Verify installation
kubectl -n gatekeeper-system get pods
kubectl -n gatekeeper-system get validatingwebhookconfigurations

# Wait for Gatekeeper to be ready
kubectl -n gatekeeper-system rollout status deploy/gatekeeper-controller-manager

Understanding Constraint Templates and Constraints

Gatekeeper uses a two-step model:

  1. ConstraintTemplate - defines a new Custom Resource Definition (CRD) and the Rego policy logic
  2. Constraint - an instance of the CRD that configures where and how the policy applies

ConstraintTemplate (defines the "K8sRequiredLabels" policy type)
    └── Constraint: K8sRequiredLabels "require-team-label" (applies the policy to namespaces)

Common Security Policies

1. Require specific labels on namespaces:

# constraint-template-required-labels.yaml
apiVersion: templates.gatekeeper.sh/v1
kind: ConstraintTemplate
metadata:
  name: k8srequiredlabels
  annotations:
    description: Requires resources to have specified labels.
spec:
  crd:
    spec:
      names:
        kind: K8sRequiredLabels
      validation:
        openAPIV3Schema:
          type: object
          properties:
            labels:
              type: array
              items:
                type: string
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package k8srequiredlabels

        violation[{"msg": msg, "details": {"missing_labels": missing}}] {
          provided := {label | input.review.object.metadata.labels[label]}
          required := {label | label := input.parameters.labels[_]}
          missing := required - provided
          count(missing) > 0
          msg := sprintf("Missing required labels: %v", [missing])
        }
---
# constraint-required-labels.yaml
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sRequiredLabels
metadata:
  name: require-team-label
spec:
  enforcementAction: deny   # or "warn" or "dryrun"
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Namespace"]
  parameters:
    labels: ["team", "environment"]

# Apply the template first (it creates the CRD), then the constraint
kubectl apply -f constraint-template-required-labels.yaml
kubectl apply -f constraint-required-labels.yaml
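The Rego above computes the set difference required − provided and flags any remainder. For intuition, the same check can be sketched in plain shell (the label sets here are hypothetical stand-ins):

```shell
# Hypothetical stand-ins for input.parameters.labels and the object's labels
required="team environment"
provided="team app"

# Collect every required label not present in provided (the Rego's set difference)
missing=""
for r in $required; do
  found=no
  for p in $provided; do
    [ "$r" = "$p" ] && found=yes
  done
  [ "$found" = "no" ] && missing="$missing $r"
done

echo "missing:$missing"   # → missing: environment
```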

2. Require non-root containers:

apiVersion: templates.gatekeeper.sh/v1
kind: ConstraintTemplate
metadata:
  name: k8snoroot
spec:
  crd:
    spec:
      names:
        kind: K8sNoRoot
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package k8snoroot

        violation[{"msg": msg}] {
          container := input.review.object.spec.containers[_]
          not container.securityContext.runAsNonRoot
          msg := sprintf("Container '%v' must set runAsNonRoot: true", [container.name])
        }

        violation[{"msg": msg}] {
          container := input.review.object.spec.containers[_]
          container.securityContext.runAsUser == 0
          msg := sprintf("Container '%v' must not run as root (uid 0)", [container.name])
        }

        violation[{"msg": msg}] {
          container := input.review.object.spec.initContainers[_]
          not container.securityContext.runAsNonRoot
          msg := sprintf("InitContainer '%v' must set runAsNonRoot: true", [container.name])
        }
---
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sNoRoot
metadata:
  name: no-root-containers
spec:
  enforcementAction: deny
  match:
    kinds:
      # Match Pods (whether created directly or by a controller) so the
      # Rego path spec.containers resolves correctly
      - apiGroups: [""]
        kinds: ["Pod"]
    excludedNamespaces:
      - kube-system
      - gatekeeper-system

3. Block privileged containers:

apiVersion: templates.gatekeeper.sh/v1
kind: ConstraintTemplate
metadata:
  name: k8sprivilegedcontainer
spec:
  crd:
    spec:
      names:
        kind: K8sPrivilegedContainer
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package k8sprivilegedcontainer

        violation[{"msg": msg}] {
          container := input.review.object.spec.containers[_]
          container.securityContext.privileged
          msg := sprintf("Privileged container '%v' is not allowed", [container.name])
        }
---
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sPrivilegedContainer
metadata:
  name: no-privileged-containers
spec:
  enforcementAction: deny
  match:
    kinds:
      - apiGroups: ["*"]
        kinds: ["Pod"]
    excludedNamespaces:
      - kube-system

4. Enforce image registry allowlist:

apiVersion: templates.gatekeeper.sh/v1
kind: ConstraintTemplate
metadata:
  name: k8sallowedrepos
spec:
  crd:
    spec:
      names:
        kind: K8sAllowedRepos
      validation:
        openAPIV3Schema:
          type: object
          properties:
            repos:
              type: array
              items:
                type: string
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package k8sallowedrepos

        violation[{"msg": msg}] {
          container := input.review.object.spec.containers[_]
          not starts_with_allowed(container.image)
          msg := sprintf("Container '%v' uses image '%v' from a disallowed registry", [container.name, container.image])
        }

        starts_with_allowed(image) {
          allowed := input.parameters.repos[_]
          startswith(image, allowed)
        }
---
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sAllowedRepos
metadata:
  name: allow-internal-registry
spec:
  enforcementAction: deny
  match:
    kinds:
      - apiGroups: ["*"]
        kinds: ["Pod"]
    excludedNamespaces:
      - kube-system
  parameters:
    repos:
      - "registry.example.com/"
      - "gcr.io/my-project/"
      - "docker.io/myorg/"
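One detail worth keeping: the trailing slashes in the repos list. The Rego uses a plain startswith, so an entry like "docker.io/myorg" (no slash) would also admit images from "docker.io/myorg-evil". The prefix semantics can be sanity-checked in shell (image names are hypothetical):

```shell
# Shell equivalent of the Rego startswith(image, allowed) prefix check
check() {
  for repo in "registry.example.com/" "gcr.io/my-project/" "docker.io/myorg/"; do
    case "$1" in
      "$repo"*) echo allowed; return ;;
    esac
  done
  echo denied
}

check "docker.io/myorg/app:1.0"            # → allowed
check "docker.io/myorg-evil/app:1.0"       # → denied
check "registry.example.com/team/app:2.3"  # → allowed
```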

Audit Mode

Run policies in audit mode first to discover existing violations without blocking deployments:

# Change enforcementAction to "dryrun" to audit without blocking
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sRequiredLabels
metadata:
  name: require-team-label-audit
spec:
  enforcementAction: dryrun   # "dryrun" = audit only, "warn" = warn but allow, "deny" = block
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Namespace"]
  parameters:
    labels: ["team"]

# View audit results
kubectl get k8srequiredlabels require-team-label-audit -o yaml | grep -A20 "status:"

# List all violations across all constraints
kubectl get constraints -o json | jq '.items[].status.violations[]?'

# Get violations for a specific constraint
kubectl describe k8srequiredlabels require-team-label
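Each constraint's status also carries a totalViolations count, handy for a one-line summary per constraint. The jq below is shown against a small fixture standing in for live `kubectl get constraints -o json` output:

```shell
# Fixture standing in for: kubectl get constraints -o json
cat > /tmp/constraints.json <<'EOF'
{"items":[
  {"metadata":{"name":"require-team-label"},"status":{"totalViolations":3}},
  {"metadata":{"name":"no-root-containers"},"status":{}}
]}
EOF

# One line per constraint; a missing count defaults to 0
jq -r '.items[] | "\(.metadata.name): \(.status.totalViolations // 0)"' /tmp/constraints.json
# → require-team-label: 3
# → no-root-containers: 0
```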

Testing Policies

Test policies by trying to create non-compliant resources:

# Test required labels policy
kubectl create namespace test-no-labels
# Expected: Error from server (Forbidden): admission webhook "validation.gatekeeper.sh" denied the request

# Test non-root policy
kubectl apply -f - << 'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: test-root-pod
  namespace: default
spec:
  containers:
  - name: test
    image: nginx:alpine
    securityContext:
      runAsUser: 0   # root
EOF
# Expected: admission denied

# Test allowed registries policy
kubectl apply -f - << 'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: test-bad-registry
  namespace: default
spec:
  containers:
  - name: test
    image: docker.io/malicioususer/badimage:latest
EOF
# Expected: admission denied

# Test a compliant resource (should succeed)
kubectl apply -f - << 'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: compliant-app
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: compliant-app
  template:
    metadata:
      labels:
        app: compliant-app
        team: platform
        environment: production
    spec:
      containers:
      - name: app
        image: registry.example.com/myapp:latest
        securityContext:
          runAsNonRoot: true
          runAsUser: 1001
          allowPrivilegeEscalation: false
          capabilities:
            drop: ["ALL"]
EOF

Managing Exceptions

Exclude namespaces or resources from policies using the match field:

spec:
  match:
    # Exclude system namespaces
    excludedNamespaces:
      - kube-system
      - kube-public
      - gatekeeper-system
      - cert-manager
      - monitoring

    # Only apply to specific namespaces
    namespaceSelector:
      matchLabels:
        policy-enforcement: "true"

    # Skip resources that carry an opt-out label (the key name is illustrative)
    labelSelector:
      matchExpressions:
        - key: policy-exempt
          operator: DoesNotExist

Use namespace labels to opt in/out of specific policies:

# Label a namespace to enable policy enforcement
kubectl label namespace production policy-enforcement=true

# Label a namespace to exclude it from a specific policy
# (only effective if that constraint's namespaceSelector references the label)
kubectl label namespace legacy-app skip-registry-check=true
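The opt-out label above only takes effect if the constraint's namespaceSelector actually references it. A sketch of the match block for the K8sAllowedRepos constraint, keyed to the label used above:

```yaml
spec:
  match:
    kinds:
      - apiGroups: ["*"]
        kinds: ["Pod"]
    # Enforce only in namespaces that do NOT carry the opt-out label
    namespaceSelector:
      matchExpressions:
        - key: skip-registry-check
          operator: DoesNotExist
```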

Troubleshooting

Webhook not firing (policy not enforced):

# Check if the webhook is configured
kubectl get validatingwebhookconfigurations gatekeeper-validating-webhook-configuration

# Check Gatekeeper controller logs
kubectl -n gatekeeper-system logs -l control-plane=controller-manager --tail=50

# Verify the constraint is synced
kubectl get k8srequiredlabels -o yaml | grep -A5 "byPod"

Policy failing for all resources (including system):

# Ensure excludedNamespaces includes kube-system and gatekeeper-system
# Check if Gatekeeper itself is exempt
kubectl get config config -n gatekeeper-system -o yaml | grep -A10 "match"

Rego syntax errors:

# Install opa CLI for local policy testing
curl -L -o opa https://openpolicyagent.org/downloads/latest/opa_linux_amd64_static
chmod +x opa && sudo mv opa /usr/local/bin/

# Test Rego policy locally
opa eval --data policy.rego --input test-input.json "data.k8srequiredlabels.violation"
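test-input.json should mirror the shape Gatekeeper hands to Rego: an object with a review field (the admission request) and a parameters field. A minimal sketch for the required-labels policy (object and label names are hypothetical); with this input, the eval above should report a violation for the missing environment label:

```json
{
  "review": {
    "object": {
      "kind": "Namespace",
      "metadata": {
        "name": "test-ns",
        "labels": { "team": "platform" }
      }
    }
  },
  "parameters": { "labels": ["team", "environment"] }
}
```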

Constraint not updated after template change:

# Delete and recreate the constraint
kubectl delete k8srequiredlabels require-team-label
kubectl apply -f constraint-required-labels.yaml

Conclusion

OPA Gatekeeper provides policy-as-code enforcement for Kubernetes, rejecting non-compliant resources at admission time through a validating webhook. Start by deploying constraints in dryrun mode to discover existing violations without disrupting workloads, then move to warn and finally deny as teams remediate issues. The Rego policy language is powerful but has a learning curve; use the opa CLI to test policies locally before deploying constraints to the cluster.