Server Configuration Drift Detection
Configuration drift occurs when actual server state diverges from the desired configuration because of manual changes, failed deployments, or untracked modifications. Detecting and remediating drift is critical for maintaining consistent, predictable infrastructure. This guide covers drift detection with Ansible, Terraform, AIDE, and Tripwire, along with automated detection and remediation strategies.
Table of Contents
- Configuration Drift Overview
- Ansible Check Mode
- Terraform Plan Analysis
- File Integrity Monitoring with AIDE
- Tripwire for Change Detection
- System Package Monitoring
- Automated Drift Detection
- Drift Remediation
- Monitoring and Alerting
- Conclusion
Configuration Drift Overview
Configuration drift is the deviation of actual infrastructure from the declared desired state. It occurs through ad-hoc changes, manual fixes, security patches, or failed deployments.
Common causes:
- Manual Changes: Direct SSH modifications to fix issues
- Failed Deployments: Incomplete automation updates
- Security Patches: OS updates outside of automation
- Third-party Tools: Changes from monitoring or logging systems
- Emergency Fixes: Temporary changes during incidents
- Unchecked Automation: Automation that doesn't idempotently enforce state
Drift detection benefits:
- Consistency: Ensure infrastructure matches configuration
- Security: Detect unauthorized changes
- Compliance: Maintain compliance state
- Predictability: Know actual vs desired state
- Audit Trail: Track what changed and when
Drift detection methods:
┌────────────────────────────────────────┐
│     Desired State (Configuration)      │
└────────────────────┬───────────────────┘
                     │
                  Compare
                     │
                     ▼
┌────────────────────────────────────────┐
│      Actual State (Real Servers)       │
└────────────────────┬───────────────────┘
                     │
               Drift Report
                     │
                     ▼
┌────────────────────────────────────────┐
│      Remediation (Auto or Manual)      │
└────────────────────────────────────────┘
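At its simplest, the compare step in this workflow is a diff between a file that holds the desired state and the live file on the server. The sketch below illustrates that idea only; `report_drift` is a hypothetical helper name, and real tools compare far richer state than a single file.

```shell
#!/bin/bash
# Minimal sketch of the "compare" step: diff a desired-state file against the
# live file. report_drift is a hypothetical helper name, not part of any tool.
report_drift() {
    desired="$1"
    actual="$2"
    # diff exits 0 when the files are identical and non-zero otherwise
    if diff -u "$desired" "$actual"; then
        echo "No drift: $actual matches $desired"
        return 0
    fi
    echo "Drift detected in $actual"
    return 1
}
```

For example, `report_drift /opt/desired/nginx.conf /etc/nginx/nginx.conf` (both paths are placeholders) prints a unified diff followed by a one-line verdict, and its exit status can drive the remediation step.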
Ansible Check Mode
Use Ansible's check mode to detect configuration drift without applying changes.
Check mode fundamentals:
# Run in check mode (dry-run, no changes applied)
ansible-playbook site.yml --check
# Check mode with verbose output
ansible-playbook site.yml --check -v
# Check specific hosts
ansible-playbook site.yml --check --limit webservers
# Show differences detected
ansible-playbook site.yml --check --diff
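In check mode, a task reported as "changed" is one that would change the host, so the PLAY RECAP at the end of a run doubles as a per-host drift summary. The following sketch reads `ansible-playbook` output on standard input and prints the hosts with a non-zero changed count. `hosts_with_drift` is a hypothetical helper name, and the parsing assumes the default recap layout.

```shell
#!/bin/bash
# Sketch: list hosts whose PLAY RECAP line reports changed > 0.
# hosts_with_drift is a hypothetical helper; it reads ansible-playbook output
# on stdin and assumes the default (uncolored) recap format.
hosts_with_drift() {
    awk '
        /^PLAY RECAP/ { in_recap = 1; next }
        in_recap && /changed=[0-9]+/ {
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^changed=/) {
                    split($i, kv, "=")
                    # The host name is the first field of the recap line
                    if (kv[2] + 0 > 0) print $1
                }
            }
        }
    '
}
```

A typical invocation would be `ansible-playbook site.yml --check | hosts_with_drift`.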
Playbook for drift detection:
# drift-check.yml
---
- hosts: all
  gather_facts: yes
  tasks:
    # With check_mode, apt only simulates the upgrade; "changed" then means
    # that upgrades are pending.
    - name: Check package updates
      apt:
        upgrade: dist
        update_cache: yes
        cache_valid_time: 3600
      check_mode: yes
      register: apt_check
    - name: Verify Nginx installed
      package:
        name: nginx
        state: present
      register: nginx_check
      check_mode: yes
    - name: Check Nginx configuration
      stat:
        path: /etc/nginx/nginx.conf
      register: nginx_conf_stat
    # check_mode: no lets this read-only command run even under --check;
    # failed_when: false lets the report below show an invalid config.
    - name: Validate Nginx configuration
      command: nginx -t
      register: nginx_validate
      changed_when: false
      failed_when: false
      check_mode: no
    - name: Check service status
      systemd:
        name: nginx
      register: nginx_status
    - name: Report drift
      debug:
        msg: |
          Drift Detection Report:
          - Packages available for update: {{ apt_check.changed }}
          - Nginx installed: {{ not nginx_check.changed }}
          - Config exists: {{ nginx_conf_stat.stat.exists }}
          - Nginx config valid: {{ nginx_validate.rc == 0 }}
          - Service active: {{ nginx_status.status.ActiveState == 'active' }}
    - name: Save drift report
      copy:
        content: |
          Drift Report - {{ ansible_date_time.iso8601 }}
          Host: {{ inventory_hostname }}
          Packages:
          - Updates available: {{ apt_check.changed }}
          Nginx:
          - Installed: {{ not nginx_check.changed }}
          - Config valid: {{ nginx_validate.rc == 0 }}
          - Service active: {{ nginx_status.status.ActiveState }}
        dest: /var/log/drift-report-{{ ansible_date_time.date }}.txt
      check_mode: no  # write the report even when the play runs with --check
Ansible with check and enforce:
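# deploy-with-drift-check.yml
---
- hosts: all
  gather_facts: yes
  tasks:
    # include_tasks does not return the results of the tasks it includes, so
    # this sketch assumes tasks/config-check.yml sets two facts itself:
    # drift_detected (bool) and drift_items (list of what differs).
    - name: Check configuration
      block:
        - name: Run configuration check
          include_tasks: tasks/config-check.yml
        - name: Report drift
          debug:
            msg: "Configuration drift detected: {{ drift_items | default([]) | join(', ') }}"
          when: drift_detected | default(false) | bool
      rescue:
        - name: Remediation needed
          debug:
            msg: "Manual intervention may be needed"
    # Remediation is off by default; pass -e auto_remediate=true to enable it.
    # (Defining auto_remediate in vars in terms of itself causes a recursive
    # template error, so the default is applied where the variable is used.)
    - name: Remediate drift
      block:
        - name: Apply configuration
          include_tasks: tasks/config-apply.yml
          when:
            - drift_detected | default(false) | bool
            - auto_remediate | default(false) | bool
        - name: Verify remediation
          include_tasks: tasks/config-check.yml
          when:
            - drift_detected | default(false) | bool
            - auto_remediate | default(false) | bool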
Terraform Plan Analysis
Use Terraform's plan output to detect infrastructure drift.
Terraform plan for drift detection:
# Refresh state and save the plan
terraform plan -out=tfplan
# Show the saved plan as a human-readable diff
terraform show tfplan
# JSON representation of the saved plan for analysis
terraform show -json tfplan > tfplan.json
# Check for specific resource changes
terraform plan -no-color | grep "will be created\|will be updated\|will be destroyed\|must be replaced"
Drift detection script:
#!/bin/bash
# terraform-drift-check.sh
set -e
TERRAFORM_DIR="${1:-.}"
DRIFT_REPORT="/tmp/drift-report-$(date +%s).txt"
cd "$TERRAFORM_DIR"
# Plan and capture output. terraform plan refreshes state itself, so the
# deprecated "terraform refresh" step is not needed.
echo "Running Terraform plan..."
if ! terraform plan -no-color > "$DRIFT_REPORT" 2>&1; then
    echo "Terraform plan failed:"
    cat "$DRIFT_REPORT"
    exit 2
fi
# Check for drift
if grep -q "No changes" "$DRIFT_REPORT"; then
    echo "No drift detected"
    exit 0
else
    echo "Drift detected!"
    echo ""
    echo "Changes needed:"
    grep "will be created\|will be updated\|will be destroyed\|must be replaced" "$DRIFT_REPORT" || true
    echo ""
    echo "Full report:"
    cat "$DRIFT_REPORT"
    exit 1
fi
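Matching on plan text is sensitive to wording changes between Terraform versions. A more robust signal is the `-detailed-exitcode` flag of `terraform plan`, which exits with 0 for an empty diff, 1 for an error, and 2 when changes are present. The sketch below only maps that status to a verdict; `classify_plan_exit` is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch: translate the exit status of `terraform plan -detailed-exitcode`
# into a drift verdict. classify_plan_exit is a hypothetical helper name.
classify_plan_exit() {
    case "$1" in
        0) echo "no-drift" ;;   # plan succeeded with an empty diff
        2) echo "drift" ;;      # plan succeeded and changes are present
        *) echo "error" ;;      # 1 (or anything else): the plan itself failed
    esac
}
```

One way to use it is to run `terraform plan -detailed-exitcode -no-color > report.txt 2>&1` and then call `classify_plan_exit $?` on the next line. Under `set -e` the non-zero status would end the script first, so the plan command needs to sit in an `if` or be followed by `|| status=$?`.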
Scheduled drift checks:
# Crontab entries for drift detection
# (cron does not support backslash line continuation, so each entry is one line)
# Run every 6 hours
0 */6 * * * cd /opt/terraform && terraform plan -no-color 2>&1 | mail -s "Drift Detection Report" [email protected]
# More sophisticated with alerting
# (in Terraform's JSON output the message text is in the "@message" field)
0 */6 * * * cd /opt/terraform && terraform plan -json | jq -r 'select(.type=="resource_drift") | .["@message"]' | if read -r line; then curl -X POST https://slack-webhook.example.com -d "{\"text\":\"Terraform drift detected: $line\"}"; fi
File Integrity Monitoring with AIDE
AIDE (Advanced Intrusion Detection Environment) detects file changes by comparing the filesystem against a baseline database of checksums and file attributes.
Install and configure AIDE:
# Install AIDE
sudo apt-get install -y aide aide-common
# Initialize database
sudo aideinit
# Wait for database creation (can take several minutes)
# This creates /var/lib/aide/aide.db.new
# Move to production location
sudo mv /var/lib/aide/aide.db.new /var/lib/aide/aide.db
AIDE configuration:
# /etc/aide/aide.conf.d/custom
# Monitor application directories
/opt/app R+b+sha512
/etc/app R+b+sha512
# Monitor critical system files
/etc/passwd R+b+sha512
/etc/shadow R+b+sha512
/etc/sudoers R+b+sha512
# Exclude frequently changing files
!/var/log
!/var/cache
!/tmp
Run AIDE checks:
# Check against database
sudo aide --check
# Generate report
sudo aide --check > /tmp/aide-report.txt
# Compare with baseline (if available)
sudo aide --compare
# Update database after approved changes
sudo aide --update
sudo mv /var/lib/aide/aide.db.new /var/lib/aide/aide.db
Automated AIDE monitoring:
#!/bin/bash
# aide-monitor.sh
AIDE_DB="/var/lib/aide/aide.db"
REPORT_FILE="/var/log/aide-report-$(date +%Y%m%d).txt"
CHANGED_FILE="/var/log/aide-changes-$(date +%Y%m%d).txt"
# Run check; aide --check exits non-zero when entries were added, removed, or changed
sudo aide --check > "$REPORT_FILE" 2>&1
AIDE_STATUS=$?
if [ "$AIDE_STATUS" -ne 0 ]; then
    echo "File changes detected:"
    # Extract the lines that mention changes
    grep -i "changed" "$REPORT_FILE" > "$CHANGED_FILE"
    # Send alert
    mail -s "AIDE: File Changes Detected" [email protected] < "$CHANGED_FILE"
    exit 1
else
    echo "No changes detected"
    exit 0
fi
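The exit status of `aide --check` carries more detail than a pass/fail flag. The AIDE manual defines it as a sum of flags (1 for new entries, 2 for removed entries, 4 for changed entries), with values of 14 and above reserved for errors. The sketch below decodes that status into a short label that an alert subject could carry; `describe_aide_status` is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch: decode the exit status of `aide --check`.
# Flags per the AIDE manual: 1 = new entries, 2 = removed entries,
# 4 = changed entries; 14 and above indicate an error.
# describe_aide_status is a hypothetical helper name.
describe_aide_status() {
    status="$1"
    if [ "$status" -eq 0 ]; then echo "clean"; return 0; fi
    if [ "$status" -ge 14 ]; then echo "error"; return 0; fi
    result=""
    # ${result:+$result,} adds a comma only when result is already non-empty
    [ $(( status & 1 )) -ne 0 ] && result="${result:+$result,}added"
    [ $(( status & 2 )) -ne 0 ] && result="${result:+$result,}removed"
    [ $(( status & 4 )) -ne 0 ] && result="${result:+$result,}changed"
    echo "$result"
}
```

Calling `describe_aide_status "$?"` right after the check keeps an AIDE failure, such as a missing database, from being reported as a file change.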
Cron job for AIDE:
# Run AIDE checks hourly
0 * * * * /usr/local/bin/aide-monitor.sh
# Run AIDE checks daily
0 2 * * * sudo aide --check > /var/log/aide-daily-$(date +\%Y\%m\%d).txt 2>&1
Tripwire for Change Detection
Tripwire provides file integrity monitoring driven by a signed policy file and a baseline database.
Install Tripwire:
# Ubuntu/Debian
sudo apt-get install -y tripwire
# Configure
sudo twinstall.sh
# Set the site and local passphrases when prompted
# (Tripwire has no default password; you choose both passphrases)
# Initialize database
sudo tripwire --init
# Create baseline report
sudo tripwire --check --email-report
Tripwire policy configuration:
# /etc/tripwire/twpol.txt
# Variable definitions, placed before the rules that use them.
# Tripwire property masks use single letters: p=permissions, i=inode,
# n=link count, u=owner, g=group, s=size, b=blocks, m=mtime, c=ctime,
# M=MD5, S=SHA.
NORMAL = +pinugsbmcMS;
PERMS = +pug;
# Monitor application
/opt/app -> $(NORMAL);
/opt/app/bin -> $(NORMAL);
/opt/app/conf -> $(NORMAL);
# Monitor system configuration
/etc/passwd -> $(PERMS);
/etc/shadow -> $(PERMS);
/etc/sudoers -> $(PERMS);
/etc/hosts -> $(NORMAL);
# Skip frequently changing files
!/var/log;
!/var/cache;
!/tmp;
!/var/tmp;
Run Tripwire checks:
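# Build the signed policy file from twpol.txt
sudo twadmin --create-polfile -S /etc/tripwire/site.key /etc/tripwire/twpol.txt
# Check integrity
sudo tripwire --check
# Email report
sudo tripwire --check --email-report
# Generate report
sudo tripwire --check --report-level 3 > /tmp/tripwire-report.txt
# Update database after approved changes (pass the report whose changes you accept)
sudo tripwire --update --twrfile /var/lib/tripwire/report/<report-name>.twr
# Alternatively, rebuild the whole baseline from scratch
sudo tripwire --init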
System Package Monitoring
Monitor installed packages for drift.
Package inventory:
#!/bin/bash
# package-monitor.sh
# Generate package list
dpkg -l > /var/log/packages-installed.txt
# Compare with previous
if [ -f /var/log/packages-installed.prev ]; then
    diff /var/log/packages-installed.prev /var/log/packages-installed.txt > /tmp/package-changes.txt
    if [ -s /tmp/package-changes.txt ]; then
        echo "Package changes detected:"
        cat /tmp/package-changes.txt
        # Send alert
        mail -s "Package Changes Detected" [email protected] < /tmp/package-changes.txt
    fi
fi
# Update baseline
cp /var/log/packages-installed.txt /var/log/packages-installed.prev
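A raw diff of `dpkg -l` output mixes installs, removals, and version bumps. When only package names matter, the change can be split into two lists with `comm`. The sketch below assumes its inputs are files with one package name per line, for example produced by `dpkg-query -W -f='${Package}\n'`; `package_changes` is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch: split a package diff into installed and removed names.
# Inputs are files with one package name per line.
# package_changes is a hypothetical helper name.
package_changes() {
    prev_sorted=$(mktemp)
    curr_sorted=$(mktemp)
    # comm requires sorted input
    sort -u "$1" > "$prev_sorted"
    sort -u "$2" > "$curr_sorted"
    # comm -13: lines only in the second file; comm -23: lines only in the first
    comm -13 "$prev_sorted" "$curr_sorted" | sed 's/^/installed: /'
    comm -23 "$prev_sorted" "$curr_sorted" | sed 's/^/removed: /'
    rm -f "$prev_sorted" "$curr_sorted"
}
```

Called as `package_changes previous.txt current.txt`, it prints one `installed:` or `removed:` line per package, which is easier to scan in an alert email than a unified diff.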
Check for security updates:
#!/bin/bash
# security-updates.sh
echo "Checking for available security updates..."
# Count security updates (apt lists the archive as e.g. "jammy-security",
# so the match is case-insensitive)
SECURITY_UPDATES=$(apt list --upgradable 2>/dev/null | grep -ci "security")
if [ "$SECURITY_UPDATES" -gt 0 ]; then
    echo "Security updates available: $SECURITY_UPDATES"
    # List them
    apt list --upgradable 2>/dev/null | grep -i "security"
    # Alert
    echo "Security updates available" | \
        mail -s "Security Updates Needed" [email protected]
fi
Automated Drift Detection
Set up automated monitoring systems.
Drift detection daemon:
#!/bin/bash
# drift-detection-daemon.sh
DRIFT_CHECK_INTERVAL=3600  # 1 hour
DRIFT_LOG="/var/log/drift-detection.log"
while true; do
    echo "$(date): Running drift detection..." >> "$DRIFT_LOG"
    # Run Terraform plan. The -json output is one JSON object per line, so jq
    # filters each object directly; the message text is in "@message".
    (cd /opt/terraform && terraform plan -json | \
        jq -r 'select(.type=="resource_drift") | .["@message"]' >> "$DRIFT_LOG") || true
    # Run Ansible check
    (ansible-playbook /opt/ansible/drift-check.yml --check --diff >> "$DRIFT_LOG") || true
    # Run file integrity check
    sudo aide --check >> "$DRIFT_LOG" 2>&1 || true
    # Sleep before next check
    sleep "$DRIFT_CHECK_INTERVAL"
done
Systemd service for drift detection:
# /etc/systemd/system/drift-detection.service
[Unit]
Description=Configuration Drift Detection
After=network.target
[Service]
Type=simple
ExecStart=/usr/local/bin/drift-detection-daemon.sh
Restart=always
RestartSec=10
[Install]
WantedBy=multi-user.target
Drift Remediation
Automatically fix drift when detected.
Auto-remediation workflow:
# remediate-drift.yml
---
- hosts: all
  serial: 1  # One host at a time
  tasks:
    # include_tasks does not return the results of the tasks it includes, so
    # this sketch assumes tasks/drift-check.yml sets two facts itself:
    # drift_detected (bool) and drift_items (list of what differs).
    - name: Check for drift
      include_tasks: tasks/drift-check.yml
    - name: Log drift detection
      lineinfile:
        path: /var/log/drift-remediation.log
        line: "[{{ ansible_date_time.iso8601 }}] Drift detected on {{ inventory_hostname }}: {{ drift_items | default([]) | join(', ') }}"
        create: yes
      delegate_to: localhost
      when: drift_detected | default(false) | bool
    - name: Remediate drift
      block:
        - name: Apply desired configuration
          command: terraform apply -auto-approve
          args:
            chdir: /opt/terraform
          register: remediation_result
        - name: Verify remediation
          include_tasks: tasks/drift-check.yml
        - name: Report success
          debug:
            msg: "Drift remediated successfully"
          when: not (drift_detected | default(false) | bool)
      rescue:
        - name: Remediation failed
          debug:
            msg: "Failed to remediate drift"
        - name: Alert on failure
          mail:
            host: smtp.example.com
            port: 25
            subject: "Drift Remediation Failed - {{ inventory_hostname }}"
            body: "Automatic remediation failed. Manual intervention required."
            to: [email protected]
Monitoring and Alerting
Alert on drift detection and remediation actions.
Prometheus alert rules:
# /etc/prometheus/rules/drift.yml
groups:
  - name: drift_detection
    interval: 1m
    rules:
      - alert: ConfigurationDriftDetected
        expr: drift_detection_changes_total > 0
        for: 5m
        annotations:
          summary: "Configuration drift detected on {{ $labels.instance }}"
          description: "{{ $value }} configuration changes detected"
      - alert: FileIntegrityViolation
        expr: aide_violations_total > 0
        for: 5m
        annotations:
          summary: "File integrity violation on {{ $labels.instance }}"
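These rules only fire if something exports the `drift_detection_changes_total` and `aide_violations_total` series. One common way to publish values from a shell script is the node_exporter textfile collector, which reads `*.prom` files from the directory given by its `--collector.textfile.directory` flag. The sketch below is a minimal example under that assumption; `write_drift_metric` and the directory you pass to it are hypothetical, and the value is written as a gauge because it is reset on every check.

```shell
#!/bin/bash
# Sketch: publish a drift count through the node_exporter textfile collector.
# write_drift_metric is a hypothetical helper; the directory argument must
# match node_exporter's --collector.textfile.directory setting.
write_drift_metric() {
    count="$1"
    dir="$2"
    # The temporary name does not end in .prom, so the collector ignores it
    tmp_file="$dir/drift.prom.$$"
    {
        echo "# HELP drift_detection_changes_total Configuration changes found by the last drift check."
        echo "# TYPE drift_detection_changes_total gauge"
        echo "drift_detection_changes_total $count"
    } > "$tmp_file"
    # Rename into place so node_exporter never reads a half-written file
    mv "$tmp_file" "$dir/drift.prom"
}
```

A drift check could end with a call such as `write_drift_metric "$CHANGES" /var/lib/node_exporter/textfile`, where both the variable and the path are placeholders for your own setup.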
Alert dispatch script:
#!/bin/bash
# send-alert.sh
ALERT_MESSAGE="$1"
SEVERITY="${2:-warning}"
# Send to multiple channels
case "$SEVERITY" in
    critical)
        # Send Slack alert
        curl -X POST https://hooks.slack.com/services/... \
            -H "Content-Type: application/json" \
            -d "{\"text\":\":rotating_light: CRITICAL: $ALERT_MESSAGE\"}"
        # Send PagerDuty (Events API v2 requires event_action and a payload
        # with summary, source, and severity)
        curl -X POST https://events.pagerduty.com/v2/enqueue \
            -H "Content-Type: application/json" \
            -d "{\"routing_key\":\"...\",\"event_action\":\"trigger\",\"payload\":{\"summary\":\"$ALERT_MESSAGE\",\"source\":\"$(hostname)\",\"severity\":\"critical\"}}"
        ;;
    warning)
        # Send email
        echo "$ALERT_MESSAGE" | mail -s "Drift Warning" [email protected]
        ;;
esac
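Because the alert text is spliced directly into a JSON string, a message containing a double quote, a backslash, or a line break would produce an invalid payload. The sketch below escapes those three cases before the message is embedded. `json_escape` is a hypothetical helper name, and it does not handle other control characters such as tabs.

```shell
#!/bin/bash
# Sketch: escape a message before embedding it in a JSON string literal.
# json_escape is a hypothetical helper; it handles backslashes, double quotes,
# and newlines, which covers typical alert text.
json_escape() {
    printf '%s' "$1" \
        | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g' \
        | awk 'NR > 1 { printf "\\n" } { printf "%s", $0 }'
}
```

In the script above, the Slack payload would then be built as `"{\"text\":\"$(json_escape "$ALERT_MESSAGE")\"}"`.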
Conclusion
Configuration drift detection is critical for maintaining infrastructure consistency and security. By combining Ansible check mode for quick drift detection, Terraform plan analysis for infrastructure changes, file integrity monitoring with AIDE and Tripwire, and automated detection systems, you create a comprehensive drift detection and remediation framework. Automated remediation with proper alerting ensures infrastructure stays in the desired state while maintaining audit trails of all changes.


