Ansible Playbooks: Practical Examples for Real-World Infrastructure
Introduction
Ansible playbooks are the cornerstone of infrastructure automation, turning complex manual processes into repeatable, version-controlled configurations. While ad-hoc commands are handy for quick tasks, playbooks provide the power to orchestrate multi-step operations, manage complex infrastructure, and implement sophisticated deployment strategies.
This guide presents production-ready Ansible playbook examples that you can adapt to your infrastructure's needs. Each example is designed to solve real-world problems faced by system administrators and DevOps engineers, from deploying complete application stacks to implementing disaster recovery procedures.
Whether you manage a handful of servers or orchestrate thousands of cloud instances, these playbook examples will help you automate repetitive tasks, reduce human error, and apply infrastructure-as-code best practices. Each example includes detailed explanations, complete working code, and best practices you can apply to your own projects right away.
Understanding Playbook Structure
Before diving into the examples, let's look at the anatomy of a well-structured playbook:
---
# Top-level play
- name: Descriptive play name
hosts: target_hosts
become: yes # Privilege escalation
gather_facts: yes # Gather system information
vars:
# Play-specific variables
app_version: "1.0.0"
pre_tasks:
# Tasks that run before roles
- name: Update cache
apt:
update_cache: yes
roles:
# Reusable role includes
- common
- webserver
tasks:
# Main tasks
- name: Task description
module_name:
parameter: value
notify: handler_name
post_tasks:
# Tasks that run after everything
- name: Final verification
uri:
url: http://localhost
handlers:
# Event-driven tasks
- name: handler_name
systemd:
name: nginx
state: restarted
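Before running a playbook against real hosts, you can validate its structure locally. For example:
# Check YAML syntax and play structure without contacting any host
ansible-playbook playbooks/site.yml --syntax-check
# Optionally lint for common mistakes and style issues (requires ansible-lint)
ansible-lint playbooks/site.yml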
Prerequisites
To use these playbooks effectively, make sure you have:
- Ansible 2.9 or later installed on your control node
- SSH access to the managed nodes with key-based authentication
- Sudo/root privileges on the managed nodes
- A basic understanding of YAML syntax
- A properly configured inventory file (see the sample inventory below)
- Python 3.6+ on all managed nodes
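A minimal INI-style inventory defining the groups used throughout these examples might look like this (hostnames are placeholders):
# inventory/production (example)
[loadbalancers]
lb1.example.com

[webservers]
web1.example.com
web2.example.com

[appservers]
app1.example.com
app2.example.com

[databases]
db1.example.com

[monitoring]
monitor1.example.com

[all:vars]
ansible_user=deploy
ansible_python_interpreter=/usr/bin/python3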
Project Structure
Organize your Ansible project like this (an example ansible.cfg follows the tree):
ansible-project/
├── ansible.cfg
├── inventory/
│   ├── production
│   ├── staging
│   └── development
├── group_vars/
│   ├── all.yml
│   ├── webservers.yml
│   └── databases.yml
├── host_vars/
│   └── special-host.yml
├── playbooks/
│   ├── site.yml
│   ├── webservers.yml
│   └── databases.yml
├── roles/
│   ├── common/
│   ├── nginx/
│   └── postgresql/
├── files/
└── templates/
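A matching ansible.cfg for this layout might look like the following sketch (values are illustrative defaults):
# ansible.cfg (example)
[defaults]
inventory = inventory/production
roles_path = roles
forks = 20
timeout = 30
# Disable host key checking only in lab environments
host_key_checking = False

[privilege_escalation]
become = True
become_method = sudo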
Example 1: Full LEMP Stack Deployment
This playbook deploys a complete Linux, Nginx, MySQL (MariaDB), and PHP stack with security hardening:
---
# playbooks/lemp-stack.yml
- name: Deploy LEMP Stack
hosts: webservers
become: yes
vars:
php_version: "8.2"
mysql_root_password: "{{ vault_mysql_root_password }}"
app_user: "www-data"
app_domain: "example.com"
tasks:
# System preparation
- name: Update apt cache
apt:
update_cache: yes
cache_valid_time: 3600
- name: Install system dependencies
apt:
name:
- software-properties-common
- apt-transport-https
- ca-certificates
- curl
- gnupg
state: present
# Nginx installation and configuration
- name: Install Nginx
apt:
name: nginx
state: present
- name: Create web root directory
file:
path: "/var/www/{{ app_domain }}"
state: directory
owner: "{{ app_user }}"
group: "{{ app_user }}"
mode: '0755'
- name: Configure Nginx virtual host
template:
src: templates/nginx-vhost.j2
dest: "/etc/nginx/sites-available/{{ app_domain }}"
mode: '0644'
notify: reload nginx
- name: Enable Nginx site
file:
src: "/etc/nginx/sites-available/{{ app_domain }}"
dest: "/etc/nginx/sites-enabled/{{ app_domain }}"
state: link
notify: reload nginx
- name: Remove default Nginx site
file:
path: /etc/nginx/sites-enabled/default
state: absent
notify: reload nginx
# MariaDB installation
- name: Install MariaDB server
apt:
name:
- mariadb-server
- mariadb-client
- python3-pymysql
state: present
- name: Start and enable MariaDB
systemd:
name: mariadb
state: started
enabled: yes
- name: Set MariaDB root password
mysql_user:
name: root
password: "{{ mysql_root_password }}"
login_unix_socket: /var/run/mysqld/mysqld.sock
state: present
- name: Create MariaDB configuration for root
template:
src: templates/my.cnf.j2
dest: /root/.my.cnf
mode: '0600'
- name: Remove anonymous MariaDB users
mysql_user:
name: ''
host_all: yes
state: absent
- name: Remove MariaDB test database
mysql_db:
name: test
state: absent
# PHP installation
- name: Add PHP repository
apt_repository:
repo: "ppa:ondrej/php"
state: present
- name: Install PHP and extensions
apt:
name:
- "php{{ php_version }}-fpm"
- "php{{ php_version }}-mysql"
- "php{{ php_version }}-curl"
- "php{{ php_version }}-gd"
- "php{{ php_version }}-mbstring"
- "php{{ php_version }}-xml"
- "php{{ php_version }}-zip"
- "php{{ php_version }}-opcache"
state: present
- name: Configure PHP-FPM pool
template:
src: templates/php-fpm-pool.j2
dest: "/etc/php/{{ php_version }}/fpm/pool.d/www.conf"
mode: '0644'
notify: restart php-fpm
- name: Configure PHP settings
lineinfile:
path: "/etc/php/{{ php_version }}/fpm/php.ini"
regexp: "{{ item.regexp }}"
line: "{{ item.line }}"
loop:
- { regexp: '^;?upload_max_filesize', line: 'upload_max_filesize = 64M' }
- { regexp: '^;?post_max_size', line: 'post_max_size = 64M' }
- { regexp: '^;?memory_limit', line: 'memory_limit = 256M' }
- { regexp: '^;?max_execution_time', line: 'max_execution_time = 300' }
notify: restart php-fpm
# Security hardening
- name: Install and configure UFW
apt:
name: ufw
state: present
- name: Configure UFW defaults
ufw:
direction: "{{ item.direction }}"
policy: "{{ item.policy }}"
loop:
- { direction: 'incoming', policy: 'deny' }
- { direction: 'outgoing', policy: 'allow' }
- name: Allow SSH
ufw:
rule: allow
port: '22'
proto: tcp
- name: Allow HTTP
ufw:
rule: allow
port: '80'
proto: tcp
- name: Allow HTTPS
ufw:
rule: allow
port: '443'
proto: tcp
- name: Enable UFW
ufw:
state: enabled
# SSL certificate with Let's Encrypt
- name: Install Certbot
apt:
name:
- certbot
- python3-certbot-nginx
state: present
- name: Obtain SSL certificate
command: >
certbot --nginx --non-interactive --agree-tos
--email admin@{{ app_domain }}
-d {{ app_domain }} -d www.{{ app_domain }}
args:
creates: "/etc/letsencrypt/live/{{ app_domain }}/fullchain.pem"
- name: Setup SSL renewal cron job
cron:
name: "Renew Let's Encrypt certificates"
minute: "0"
hour: "3"
job: "certbot renew --quiet --post-hook 'systemctl reload nginx'"
# Deploy sample application
- name: Deploy index.php
copy:
content: |
<?php
phpinfo();
?>
dest: "/var/www/{{ app_domain }}/index.php"
owner: "{{ app_user }}"
group: "{{ app_user }}"
mode: '0644'
handlers:
- name: reload nginx
systemd:
name: nginx
state: reloaded
- name: restart php-fpm
systemd:
name: "php{{ php_version }}-fpm"
state: restarted
Required Template: nginx-vhost.j2
# templates/nginx-vhost.j2
server {
listen 80;
listen [::]:80;
server_name {{ app_domain }} www.{{ app_domain }};
root /var/www/{{ app_domain }};
index index.php index.html index.htm;
location / {
try_files $uri $uri/ =404;
}
location ~ \.php$ {
include snippets/fastcgi-php.conf;
fastcgi_pass unix:/var/run/php/php{{ php_version }}-fpm.sock;
fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
include fastcgi_params;
}
location ~ /\.ht {
deny all;
}
# Security headers
add_header X-Frame-Options "SAMEORIGIN" always;
add_header X-Content-Type-Options "nosniff" always;
add_header X-XSS-Protection "1; mode=block" always;
# Gzip compression
gzip on;
gzip_vary on;
gzip_proxied any;
gzip_comp_level 6;
gzip_types text/plain text/css text/xml text/javascript application/json application/javascript application/xml+rss;
}
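The play above also references templates/my.cnf.j2, which stores root's MariaDB client credentials so that the later mysql_* tasks can authenticate; a minimal sketch could be:
# templates/my.cnf.j2 (example sketch)
[client]
user=root
password={{ mysql_root_password }}
Once the vaulted variables are in place, a run along these lines works:
ansible-playbook -i inventory/production playbooks/lemp-stack.yml --ask-vault-pass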
Example 2: Multi-Tier Application Deployment
Deploy a complete application with a load balancer, web application servers, and a database tier:
---
# playbooks/multi-tier-app.yml
- name: Configure load balancers
hosts: loadbalancers
become: yes
tasks:
- name: Install HAProxy
apt:
name: haproxy
state: present
- name: Configure HAProxy
template:
src: templates/haproxy.cfg.j2
dest: /etc/haproxy/haproxy.cfg
mode: '0644'
validate: 'haproxy -f %s -c'
notify: restart haproxy
- name: Enable HAProxy
systemd:
name: haproxy
enabled: yes
state: started
handlers:
- name: restart haproxy
systemd:
name: haproxy
state: restarted
- name: Configure web application servers
hosts: appservers
become: yes
serial: 1 # Rolling deployment
vars:
app_name: "myapp"
app_version: "{{ deploy_version | default('latest') }}"
app_port: 3000
tasks:
- name: Install Node.js
apt:
name:
- nodejs
- npm
state: present
- name: Create application user
user:
name: "{{ app_name }}"
system: yes
shell: /bin/bash
home: "/opt/{{ app_name }}"
- name: Create app directory
file:
path: "/opt/{{ app_name }}"
state: directory
owner: "{{ app_name }}"
group: "{{ app_name }}"
mode: '0755'
- name: Deploy application code
git:
repo: "https://github.com/yourorg/{{ app_name }}.git"
dest: "/opt/{{ app_name }}/app"
version: "{{ app_version }}"
force: yes
become_user: "{{ app_name }}"
notify: restart app
- name: Install npm dependencies
npm:
path: "/opt/{{ app_name }}/app"
production: yes
become_user: "{{ app_name }}"
notify: restart app
- name: Create environment file
template:
src: templates/app-env.j2
dest: "/opt/{{ app_name }}/.env"
owner: "{{ app_name }}"
group: "{{ app_name }}"
mode: '0600'
notify: restart app
- name: Create systemd service
template:
src: templates/app-service.j2
dest: "/etc/systemd/system/{{ app_name }}.service"
mode: '0644'
notify:
- reload systemd
- restart app
- name: Enable and start application
systemd:
name: "{{ app_name }}"
enabled: yes
state: started
- name: Wait for application to be ready
uri:
url: "http://localhost:{{ app_port }}/health"
status_code: 200
register: result
until: result.status == 200
retries: 10
delay: 3
handlers:
- name: reload systemd
systemd:
daemon_reload: yes
- name: restart app
systemd:
name: "{{ app_name }}"
state: restarted
- name: Configure database servers
hosts: databases
become: yes
vars:
postgres_version: "15"
db_name: "myapp_production"
db_user: "myapp"
db_password: "{{ vault_db_password }}"
tasks:
- name: Install PostgreSQL
apt:
name:
- "postgresql-{{ postgres_version }}"
- "postgresql-contrib-{{ postgres_version }}"
- python3-psycopg2
state: present
- name: Ensure PostgreSQL is running
systemd:
name: postgresql
state: started
enabled: yes
- name: Create application database
postgresql_db:
name: "{{ db_name }}"
state: present
become_user: postgres
- name: Create application user
postgresql_user:
name: "{{ db_user }}"
password: "{{ db_password }}"
db: "{{ db_name }}"
priv: ALL
state: present
become_user: postgres
- name: Configure PostgreSQL for network access
lineinfile:
path: "/etc/postgresql/{{ postgres_version }}/main/postgresql.conf"
regexp: "^#?listen_addresses"
line: "listen_addresses = '*'"
notify: restart postgresql
- name: Allow application servers to connect
postgresql_pg_hba:
dest: "/etc/postgresql/{{ postgres_version }}/main/pg_hba.conf"
contype: host
users: "{{ db_user }}"
source: "{{ hostvars[item]['ansible_default_ipv4']['address'] }}/32"
databases: "{{ db_name }}"
method: md5
loop: "{{ groups['appservers'] }}"
notify: restart postgresql
handlers:
- name: restart postgresql
systemd:
name: postgresql
state: restarted
- name: Run database migrations
hosts: appservers[0]
become: yes
become_user: myapp
tasks:
- name: Run migrations
command: npm run migrate
args:
chdir: /opt/myapp/app
run_once: yes
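The templates/app-service.j2 unit referenced in the play is not shown above; a minimal sketch, assuming the application starts with a plain node app.js, could be:
# templates/app-service.j2 (example sketch)
[Unit]
Description={{ app_name }} application
After=network.target

[Service]
Type=simple
User={{ app_name }}
WorkingDirectory=/opt/{{ app_name }}/app
EnvironmentFile=/opt/{{ app_name }}/.env
Environment=PORT={{ app_port }}
# The entry point below is an assumption; adjust it to your real start command
ExecStart=/usr/bin/node app.js
Restart=on-failure

[Install]
WantedBy=multi-user.target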
Example 3: Backup and Disaster Recovery Automation
A complete backup solution with rotation and off-site storage:
---
# playbooks/backup-automation.yml
- name: Configure automated backups
hosts: all
become: yes
vars:
backup_dir: "/var/backups"
backup_retention_days: 7
backup_s3_bucket: "company-backups"
backup_schedule: "0 2 * * *" # 2 AM daily
tasks:
- name: Install backup tools
apt:
name:
- rsync
- borgbackup
- awscli
- pigz
state: present
- name: Create backup directory
file:
path: "{{ backup_dir }}"
state: directory
mode: '0700'
owner: root
group: root
- name: Create backup script
copy:
content: |
#!/bin/bash
set -euo pipefail
# Configuration
BACKUP_DIR="{{ backup_dir }}"
RETENTION_DAYS={{ backup_retention_days }}
S3_BUCKET="{{ backup_s3_bucket }}"
HOSTNAME=$(hostname -f)
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
# Logging
LOG_FILE="${BACKUP_DIR}/backup.log"
exec 1> >(tee -a "${LOG_FILE}")
exec 2>&1
echo "=== Backup started at $(date) ==="
# Backup system files
echo "Backing up system files..."
tar -czf "${BACKUP_DIR}/system_${TIMESTAMP}.tar.gz" \
/etc \
/home \
/root \
--exclude='/home/*/.cache' \
--exclude='/home/*/tmp'
{% if 'databases' in group_names %}
# Database backup
echo "Backing up databases..."
if systemctl is-active --quiet postgresql; then
sudo -u postgres pg_dumpall | pigz > "${BACKUP_DIR}/postgres_${TIMESTAMP}.sql.gz"
fi
if systemctl is-active --quiet mariadb; then
mysqldump --all-databases --single-transaction | pigz > "${BACKUP_DIR}/mysql_${TIMESTAMP}.sql.gz"
fi
{% endif %}
{% if 'webservers' in group_names %}
# Web content backup
echo "Backing up web content..."
tar -czf "${BACKUP_DIR}/web_${TIMESTAMP}.tar.gz" /var/www
{% endif %}
# Upload to S3
echo "Uploading to S3..."
aws s3 sync "${BACKUP_DIR}" "s3://${S3_BUCKET}/${HOSTNAME}/" \
--exclude "*.log" \
--storage-class STANDARD_IA
# Cleanup old local backups
echo "Cleaning up old backups..."
find "${BACKUP_DIR}" -name "*.tar.gz" -mtime +${RETENTION_DAYS} -delete
find "${BACKUP_DIR}" -name "*.sql.gz" -mtime +${RETENTION_DAYS} -delete
echo "=== Backup completed at $(date) ==="
dest: /usr/local/bin/automated-backup.sh
mode: '0700'
owner: root
group: root
- name: Configure AWS credentials
template:
src: templates/aws-credentials.j2
dest: /root/.aws/credentials
mode: '0600'
- name: Schedule backup cron job
cron:
name: "Automated system backup"
minute: "{{ backup_schedule.split()[0] }}"
hour: "{{ backup_schedule.split()[1] }}"
job: "/usr/local/bin/automated-backup.sh"
state: present
- name: Create backup monitoring script
copy:
content: |
#!/bin/bash
BACKUP_DIR="{{ backup_dir }}"
MAX_AGE_HOURS=26
LATEST_BACKUP=$(find "${BACKUP_DIR}" -name "*.tar.gz" -type f -printf '%T@ %p\n' | sort -n | tail -1 | cut -f2- -d" ")
if [ -z "$LATEST_BACKUP" ]; then
echo "CRITICAL: No backups found"
exit 2
fi
AGE_HOURS=$(( ($(date +%s) - $(stat -c %Y "$LATEST_BACKUP")) / 3600 ))
if [ $AGE_HOURS -gt $MAX_AGE_HOURS ]; then
echo "WARNING: Latest backup is ${AGE_HOURS} hours old"
exit 1
fi
echo "OK: Latest backup is ${AGE_HOURS} hours old"
exit 0
dest: /usr/local/bin/check-backup.sh
mode: '0755'
- name: Test backup script
command: /usr/local/bin/automated-backup.sh
async: 3600
poll: 0
register: backup_test
- name: Verify backup completion
async_status:
jid: "{{ backup_test.ansible_job_id }}"
register: job_result
until: job_result.finished
retries: 60
delay: 60
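The templates/aws-credentials.j2 file used above is not shown either; a minimal sketch, assuming the keys live in Ansible Vault under hypothetical variable names, could be:
# templates/aws-credentials.j2 (example sketch)
# vault_aws_access_key_id / vault_aws_secret_access_key are assumed vault variables
[default]
aws_access_key_id = {{ vault_aws_access_key_id }}
aws_secret_access_key = {{ vault_aws_secret_access_key }}
Note that the play writes this file to /root/.aws/credentials without creating the directory first, so you may want to add a file task with state: directory for /root/.aws before it.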
Example 4: Zero-Downtime Rolling Deployments
Implement zero-downtime rolling deployments with health checks and automatic rollback:
---
# playbooks/rolling-deployment.yml
- name: Blue-Green deployment with zero downtime
hosts: webservers
become: yes
serial: 1
max_fail_percentage: 0
vars:
app_name: "webapp"
app_version: "{{ deploy_version }}"
app_port: 8080
health_check_url: "http://localhost:{{ app_port }}/health"
health_check_retries: 30
health_check_delay: 2
pre_tasks:
- name: Remove from load balancer
haproxy:
state: disabled
host: "{{ inventory_hostname }}"
socket: /run/haproxy/admin.sock
backend: app_backend
delegate_to: "{{ item }}"
loop: "{{ groups['loadbalancers'] }}"
- name: Wait for connections to drain
wait_for:
timeout: 10
tasks:
- name: Stop current application
systemd:
name: "{{ app_name }}"
state: stopped
- name: Backup current version
command: >
mv /opt/{{ app_name }}/current
/opt/{{ app_name }}/rollback_{{ ansible_date_time.epoch }}
args:
removes: /opt/{{ app_name }}/current
ignore_errors: yes
- name: Deploy new version
git:
repo: "https://github.com/yourorg/{{ app_name }}.git"
dest: "/opt/{{ app_name }}/releases/{{ app_version }}"
version: "{{ app_version }}"
become_user: "{{ app_name }}"
- name: Install dependencies
npm:
path: "/opt/{{ app_name }}/releases/{{ app_version }}"
production: yes
become_user: "{{ app_name }}"
- name: Create symlink to current version
file:
src: "/opt/{{ app_name }}/releases/{{ app_version }}"
dest: "/opt/{{ app_name }}/current"
state: link
- name: Start application
systemd:
name: "{{ app_name }}"
state: started
- name: Wait for application health check
uri:
url: "{{ health_check_url }}"
status_code: 200
timeout: 5
register: health_check
until: health_check.status == 200
retries: "{{ health_check_retries }}"
delay: "{{ health_check_delay }}"
failed_when: false
- name: Rollback if health check fails
block:
- name: Stop failed deployment
systemd:
name: "{{ app_name }}"
state: stopped
- name: Restore previous version
shell: |
rm -f /opt/{{ app_name }}/current
ROLLBACK=$(ls -t /opt/{{ app_name }}/rollback_* | head -1)
mv "$ROLLBACK" /opt/{{ app_name }}/current
args:
executable: /bin/bash
- name: Start rolled back version
systemd:
name: "{{ app_name }}"
state: started
- name: Fail deployment
fail:
msg: "Deployment failed health check, rolled back to previous version"
when: health_check.status != 200
post_tasks:
- name: Add back to load balancer
haproxy:
state: enabled
host: "{{ inventory_hostname }}"
socket: /run/haproxy/admin.sock
backend: app_backend
delegate_to: "{{ item }}"
loop: "{{ groups['loadbalancers'] }}"
- name: Verify in load balancer rotation
uri:
url: "http://{{ hostvars[item]['ansible_default_ipv4']['address'] }}/haproxy?stats"
return_content: yes
delegate_to: "{{ item }}"
loop: "{{ groups['loadbalancers'] }}"
register: lb_status
failed_when: "'{{ inventory_hostname }}' not in lb_status.content"
- name: Cleanup old releases
shell: |
cd /opt/{{ app_name }}/releases
ls -t | tail -n +4 | xargs -r rm -rf
cd /opt/{{ app_name }}
ls -t rollback_* 2>/dev/null | tail -n +3 | xargs -r rm -rf
args:
executable: /bin/bash
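The pre_tasks and post_tasks above drive HAProxy through its admin socket and read its stats page, so the haproxy.cfg.j2 template from Example 2 must expose both. A minimal sketch of the relevant parts could be:
# templates/haproxy.cfg.j2 (example sketch)
global
    daemon
    stats socket /run/haproxy/admin.sock mode 660 level admin

defaults
    mode http
    timeout connect 5s
    timeout client 30s
    timeout server 30s

frontend app_frontend
    bind *:80
    default_backend app_backend

backend app_backend
    balance roundrobin
    stats enable
    stats uri /haproxy?stats
{% for host in groups['appservers'] %}
    server {{ host }} {{ hostvars[host]['ansible_default_ipv4']['address'] }}:{{ app_port | default(3000) }} check
{% endfor %}
The server names must match each host's inventory_hostname, since both the haproxy module and the stats-page check in the post_tasks look servers up by that name.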
Example 5: Infrastructure Monitoring Setup
Deploy a full monitoring stack with Prometheus, Grafana, and node exporters:
---
# playbooks/monitoring-stack.yml
- name: Deploy Prometheus monitoring
hosts: monitoring
become: yes
vars:
prometheus_version: "2.45.0"
grafana_version: "latest"
alertmanager_version: "0.26.0"
tasks:
- name: Create prometheus user
user:
name: prometheus
system: yes
shell: /bin/false
create_home: no
- name: Create prometheus directories
file:
path: "{{ item }}"
state: directory
owner: prometheus
group: prometheus
mode: '0755'
loop:
- /etc/prometheus
- /var/lib/prometheus
- name: Download Prometheus
get_url:
url: "https://github.com/prometheus/prometheus/releases/download/v{{ prometheus_version }}/prometheus-{{ prometheus_version }}.linux-amd64.tar.gz"
dest: /tmp/prometheus.tar.gz
- name: Extract Prometheus
unarchive:
src: /tmp/prometheus.tar.gz
dest: /tmp
remote_src: yes
- name: Copy Prometheus binaries
copy:
src: "/tmp/prometheus-{{ prometheus_version }}.linux-amd64/{{ item }}"
dest: "/usr/local/bin/{{ item }}"
mode: '0755'
remote_src: yes
loop:
- prometheus
- promtool
- name: Configure Prometheus
template:
src: templates/prometheus.yml.j2
dest: /etc/prometheus/prometheus.yml
owner: prometheus
group: prometheus
mode: '0644'
notify: reload prometheus
- name: Create Prometheus systemd service
copy:
content: |
[Unit]
Description=Prometheus
Wants=network-online.target
After=network-online.target
[Service]
User=prometheus
Group=prometheus
Type=simple
ExecStart=/usr/local/bin/prometheus \
--config.file=/etc/prometheus/prometheus.yml \
--storage.tsdb.path=/var/lib/prometheus/ \
--web.console.templates=/etc/prometheus/consoles \
--web.console.libraries=/etc/prometheus/console_libraries \
--web.listen-address=0.0.0.0:9090
[Install]
WantedBy=multi-user.target
dest: /etc/systemd/system/prometheus.service
mode: '0644'
notify:
- reload systemd
- restart prometheus
- name: Start Prometheus
systemd:
name: prometheus
state: started
enabled: yes
# Grafana installation
- name: Add Grafana repository
apt_repository:
repo: "deb https://packages.grafana.com/oss/deb stable main"
state: present
filename: grafana
- name: Add Grafana GPG key
apt_key:
url: https://packages.grafana.com/gpg.key
state: present
- name: Install Grafana
apt:
name: grafana
state: present
update_cache: yes
- name: Configure Grafana
template:
src: templates/grafana.ini.j2
dest: /etc/grafana/grafana.ini
mode: '0640'
owner: grafana
group: grafana
notify: restart grafana
- name: Start Grafana
systemd:
name: grafana-server
state: started
enabled: yes
- name: Configure firewall for Prometheus
ufw:
rule: allow
port: '9090'
proto: tcp
- name: Configure firewall for Grafana
ufw:
rule: allow
port: '3000'
proto: tcp
handlers:
- name: reload systemd
systemd:
daemon_reload: yes
- name: restart prometheus
systemd:
name: prometheus
state: restarted
- name: reload prometheus
systemd:
name: prometheus
state: reloaded
- name: restart grafana
systemd:
name: grafana-server
state: restarted
- name: Deploy Node Exporters
hosts: all
become: yes
vars:
node_exporter_version: "1.7.0"
tasks:
- name: Create node_exporter user
user:
name: node_exporter
system: yes
shell: /bin/false
create_home: no
- name: Download Node Exporter
get_url:
url: "https://github.com/prometheus/node_exporter/releases/download/v{{ node_exporter_version }}/node_exporter-{{ node_exporter_version }}.linux-amd64.tar.gz"
dest: /tmp/node_exporter.tar.gz
- name: Extract Node Exporter
unarchive:
src: /tmp/node_exporter.tar.gz
dest: /tmp
remote_src: yes
- name: Copy Node Exporter binary
copy:
src: "/tmp/node_exporter-{{ node_exporter_version }}.linux-amd64/node_exporter"
dest: /usr/local/bin/node_exporter
mode: '0755'
remote_src: yes
- name: Create Node Exporter systemd service
copy:
content: |
[Unit]
Description=Node Exporter
After=network.target
[Service]
User=node_exporter
Group=node_exporter
Type=simple
ExecStart=/usr/local/bin/node_exporter \
--collector.filesystem.mount-points-exclude=^/(sys|proc|dev|host|etc)($$|/) \
--collector.netclass.ignored-devices=^(veth.*|docker.*|br-.*)$$
[Install]
WantedBy=multi-user.target
dest: /etc/systemd/system/node_exporter.service
mode: '0644'
- name: Start Node Exporter
systemd:
name: node_exporter
state: started
enabled: yes
daemon_reload: yes
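The templates/prometheus.yml.j2 file referenced earlier is not included; a minimal sketch that scrapes Prometheus itself plus every node exporter installed above (default port 9100) could be:
# templates/prometheus.yml.j2 (example sketch)
global:
  scrape_interval: 15s
  evaluation_interval: 15s

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']

  - job_name: 'node'
    static_configs:
      - targets:
{% for host in groups['all'] %}
          - '{{ hostvars[host]["ansible_default_ipv4"]["address"] }}:9100'
{% endfor %}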
Example 6: Security Compliance and Hardening
Implement CIS benchmarks and security best practices:
---
# playbooks/security-hardening.yml
- name: Apply security hardening
hosts: all
become: yes
vars:
allowed_ssh_users: ["admin", "deploy"]
ssh_port: 22
max_auth_tries: 3
password_max_days: 90
password_min_days: 1
password_warn_age: 7
tasks:
# System updates
- name: Update all packages
apt:
upgrade: dist
update_cache: yes
autoremove: yes
autoclean: yes
- name: Install security tools
apt:
name:
- aide
- auditd
- fail2ban
- rkhunter
- lynis
state: present
# SSH hardening
- name: Configure SSH daemon
lineinfile:
path: /etc/ssh/sshd_config
regexp: "{{ item.regexp }}"
line: "{{ item.line }}"
state: present
validate: '/usr/sbin/sshd -t -f %s'
loop:
- { regexp: '^#?PermitRootLogin', line: 'PermitRootLogin no' }
- { regexp: '^#?PasswordAuthentication', line: 'PasswordAuthentication no' }
- { regexp: '^#?PubkeyAuthentication', line: 'PubkeyAuthentication yes' }
- { regexp: '^#?PermitEmptyPasswords', line: 'PermitEmptyPasswords no' }
- { regexp: '^#?X11Forwarding', line: 'X11Forwarding no' }
- { regexp: '^#?MaxAuthTries', line: 'MaxAuthTries {{ max_auth_tries }}' }
- { regexp: '^#?ClientAliveInterval', line: 'ClientAliveInterval 300' }
- { regexp: '^#?ClientAliveCountMax', line: 'ClientAliveCountMax 2' }
- { regexp: '^#?Protocol', line: 'Protocol 2' }
- { regexp: '^#?AllowUsers', line: 'AllowUsers {{ allowed_ssh_users | join(" ") }}' }
notify: restart sshd
# Password policies
- name: Configure password aging
lineinfile:
path: /etc/login.defs
regexp: "{{ item.regexp }}"
line: "{{ item.line }}"
loop:
- { regexp: '^PASS_MAX_DAYS', line: 'PASS_MAX_DAYS {{ password_max_days }}' }
- { regexp: '^PASS_MIN_DAYS', line: 'PASS_MIN_DAYS {{ password_min_days }}' }
- { regexp: '^PASS_WARN_AGE', line: 'PASS_WARN_AGE {{ password_warn_age }}' }
# Kernel hardening
- name: Configure sysctl security parameters
sysctl:
name: "{{ item.name }}"
value: "{{ item.value }}"
state: present
reload: yes
sysctl_file: /etc/sysctl.d/99-security.conf
loop:
# Network security
- { name: 'net.ipv4.conf.all.rp_filter', value: '1' }
- { name: 'net.ipv4.conf.default.rp_filter', value: '1' }
- { name: 'net.ipv4.icmp_echo_ignore_broadcasts', value: '1' }
- { name: 'net.ipv4.conf.all.accept_source_route', value: '0' }
- { name: 'net.ipv4.conf.default.accept_source_route', value: '0' }
- { name: 'net.ipv4.conf.all.accept_redirects', value: '0' }
- { name: 'net.ipv4.conf.default.accept_redirects', value: '0' }
- { name: 'net.ipv4.conf.all.secure_redirects', value: '0' }
- { name: 'net.ipv4.conf.default.secure_redirects', value: '0' }
- { name: 'net.ipv4.conf.all.send_redirects', value: '0' }
- { name: 'net.ipv4.conf.default.send_redirects', value: '0' }
- { name: 'net.ipv4.tcp_syncookies', value: '1' }
- { name: 'net.ipv4.tcp_timestamps', value: '0' }
# Kernel security
- { name: 'kernel.dmesg_restrict', value: '1' }
- { name: 'kernel.kptr_restrict', value: '2' }
- { name: 'kernel.yama.ptrace_scope', value: '1' }
- { name: 'fs.suid_dumpable', value: '0' }
# Fail2Ban configuration
- name: Configure Fail2Ban for SSH
copy:
content: |
[sshd]
enabled = true
port = {{ ssh_port }}
filter = sshd
logpath = /var/log/auth.log
maxretry = 3
bantime = 3600
findtime = 600
dest: /etc/fail2ban/jail.d/sshd.conf
mode: '0644'
notify: restart fail2ban
# Audit daemon
- name: Configure auditd rules
copy:
content: |
# Delete all existing rules
-D
# Buffer size
-b 8192
# Failure mode
-f 1
# Monitor user/group changes
-w /etc/group -p wa -k identity
-w /etc/passwd -p wa -k identity
-w /etc/gshadow -p wa -k identity
-w /etc/shadow -p wa -k identity
# Monitor system calls
-a always,exit -F arch=b64 -S adjtimex -S settimeofday -k time-change
-a always,exit -F arch=b32 -S adjtimex -S settimeofday -S stime -k time-change
# Monitor network environment
-a always,exit -F arch=b64 -S sethostname -S setdomainname -k system-locale
-a always,exit -F arch=b32 -S sethostname -S setdomainname -k system-locale
# Monitor login/logout events
-w /var/log/faillog -p wa -k logins
-w /var/log/lastlog -p wa -k logins
# Monitor sudo usage
-w /etc/sudoers -p wa -k sudo_changes
-w /etc/sudoers.d/ -p wa -k sudo_changes
dest: /etc/audit/rules.d/hardening.rules
mode: '0640'
notify: restart auditd
# File integrity monitoring
- name: Initialize AIDE database
command: aideinit
args:
creates: /var/lib/aide/aide.db.new
- name: Setup AIDE cron job
cron:
name: "AIDE file integrity check"
minute: "0"
hour: "5"
job: "/usr/bin/aide --check | mail -s 'AIDE Report' root@localhost"
# Disable unnecessary services
- name: Disable unnecessary services
systemd:
name: "{{ item }}"
state: stopped
enabled: no
loop:
- bluetooth
- cups
- avahi-daemon
ignore_errors: yes
# Remove unnecessary packages
- name: Remove unnecessary packages
apt:
name:
- telnet
- rsh-client
- rsh-redone-client
state: absent
purge: yes
handlers:
- name: restart sshd
systemd:
name: sshd
state: restarted
- name: restart fail2ban
systemd:
name: fail2ban
state: restarted
- name: restart auditd
systemd:
name: auditd
state: restarted
Best Practices for Production Playbooks
1. Use Ansible Vault for Secrets
# Create encrypted variable file
ansible-vault create group_vars/production/vault.yml
# Edit encrypted file
ansible-vault edit group_vars/production/vault.yml
# Content example:
vault_mysql_root_password: "super_secret_password"
vault_api_keys:
aws: "AKIAIOSFODNN7EXAMPLE"
sendgrid: "SG.example123"
Reference them in your playbooks:
vars:
mysql_root_password: "{{ vault_mysql_root_password }}"
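At run time Ansible needs the vault password to decrypt these values, supplied either interactively or from a protected file:
# Prompt for the vault password
ansible-playbook site.yml --ask-vault-pass
# Or read it from a password file kept outside version control
ansible-playbook site.yml --vault-password-file ~/.vault_pass.txt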
2. Implement Proper Error Handling
- name: Task with error handling
command: /usr/bin/some-command
register: result
failed_when: false
changed_when: result.rc == 0
- name: Handle errors gracefully
block:
- name: Risky operation
command: /usr/bin/risky-command
rescue:
- name: Handle failure
debug:
msg: "Command failed, rolling back"
- name: Rollback action
command: /usr/bin/rollback-command
always:
- name: Cleanup
file:
path: /tmp/tempfile
state: absent
3. Use Tags Strategically
- name: Full application setup
hosts: appservers
tasks:
- name: Install dependencies
apt:
name: "{{ packages }}"
tags: [install, packages]
- name: Deploy code
git:
repo: "{{ repo_url }}"
dest: /opt/app
tags: [deploy, code]
- name: Configure application
template:
src: config.j2
dest: /opt/app/config.yml
tags: [configure, config]
Run specific tags:
ansible-playbook site.yml --tags "deploy"
ansible-playbook site.yml --tags "install,configure"
4. Implement Testing and Validation
- name: Validate deployment
hosts: webservers
tasks:
- name: Check if service is running
systemd:
name: nginx
state: started
check_mode: yes
register: service_status
failed_when: false
- name: Verify HTTP response
uri:
url: http://localhost
status_code: 200
timeout: 5
register: http_check
until: http_check.status == 200
retries: 5
delay: 2
- name: Validate configuration syntax
command: nginx -t
changed_when: false
- name: Assert all checks passed
assert:
that:
- service_status.state == "started"
- http_check.status == 200
fail_msg: "Validation failed"
success_msg: "All validations passed"
5. Document with Comments and Metadata
---
# ============================================================================
# Playbook: production-deployment.yml
# Description: Deploy application to production environment
# Author: DevOps Team <[email protected]>
# Version: 2.1.0
# Last Updated: 2024-01-15
#
# Dependencies:
# - Ansible 2.9+
# - Python 3.6+
# - AWS CLI configured
#
# Variables Required:
# - deploy_version: Application version to deploy
# - environment: Target environment (production/staging)
#
# Usage:
# ansible-playbook production-deployment.yml -e deploy_version=v1.2.3
# ============================================================================
- name: Deploy application (v{{ deploy_version }})
hosts: production
# Task execution settings
serial: 2 # Deploy 2 servers at a time
max_fail_percentage: 10 # Fail if more than 10% of hosts fail
tasks:
# Each task should have a clear, descriptive name
- name: Validate deployment prerequisites
assert:
that:
- deploy_version is defined
- deploy_version is match('^v[0-9]+\.[0-9]+\.[0-9]+$')
fail_msg: "deploy_version must be in format v1.2.3"
Troubleshooting Playbook Runs
Debugging Failed Tasks
- name: Debug playbook execution
hosts: all
tasks:
- name: Run command with debugging
command: /usr/bin/my-command
register: command_result
ignore_errors: yes
- name: Display command output
debug:
var: command_result
verbosity: 2
- name: Show specific values
debug:
msg: "Return code: {{ command_result.rc }}, Output: {{ command_result.stdout }}"
Run with increasing verbosity:
ansible-playbook debug.yml -v # verbose
ansible-playbook debug.yml -vv # more verbose
ansible-playbook debug.yml -vvv # debug
ansible-playbook debug.yml -vvvv # connection debug
Dry Runs and Check Mode
# Test without making changes
ansible-playbook site.yml --check
# Show what would change
ansible-playbook site.yml --check --diff
# Step through playbook interactively
ansible-playbook site.yml --step
Conclusion
These practical Ansible playbook examples demonstrate real-world automation scenarios you can adapt to your infrastructure's needs. From a straightforward LEMP stack deployment to complex multi-tier applications with zero-downtime releases, Ansible provides the flexibility and power to automate virtually any infrastructure task.
Key takeaways:
- Structure playbooks for reuse and maintainability
- Implement proper error handling and rollback mechanisms
- Use variables and templates for environment-specific configuration
- Apply security best practices from the start
- Test thoroughly before deploying to production
- Document your playbooks thoroughly
- Keep all Ansible code under version control
As you build out your Ansible automation library, focus on writing idempotent, well-tested playbooks that can safely be run multiple times. Start with simple playbooks and gradually add complexity as you gain experience. Remember that the goal is not just to automate, but to create reliable, maintainable infrastructure as code that your whole team can understand and contribute to.
Keep exploring advanced topics such as custom modules, dynamic inventories, Ansible Tower/AWX for enterprise orchestration, and CI/CD pipeline integration to take your automation to the next level.


