Prometheus Installation and Configuration
Prometheus is an open-source monitoring and alerting toolkit designed for reliability and scalability. It collects metrics from configured targets at regular intervals, evaluates alerting rules, and can trigger alerts based on predefined conditions. This guide covers everything needed to install, configure, and secure Prometheus in your infrastructure.
Table of Contents
- Introduction
- System Requirements
- Installation
- Configuration
- Scrape Configuration
- Service Management
- Data Retention
- PromQL Basics
- Security Considerations
- Monitoring Prometheus
- Troubleshooting
- Conclusion
Introduction
Prometheus works by pulling metrics from instrumented applications and infrastructure components. Unlike traditional push-based monitoring, Prometheus' pull model provides better control, simpler architecture, and easier debugging. The metrics are stored in a time-series database and queried through PromQL.
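To make the pull model concrete: on every scrape, Prometheus issues an HTTP GET against a target's metrics endpoint and parses the plain-text response. The sketch below is illustrative only. `sample_metrics` and `sum_metric` are hypothetical helpers, the sample values are made up, and the parser deliberately ignores optional timestamps.

```shell
# Sample of the text exposition format that a target might serve at /metrics.
sample_metrics() {
  cat << 'EOF'
# HELP http_requests_total Total HTTP requests.
# TYPE http_requests_total counter
http_requests_total{method="get",status="200"} 1027
http_requests_total{method="post",status="500"} 3
EOF
}

# sum_metric NAME: add up every sample of NAME read from stdin.
sum_metric() {
  awk -v name="$1" '
    /^#/ { next }                        # skip HELP and TYPE comment lines
    {
      metric = $1
      sub(/[{].*/, "", metric)           # strip the label set from the name
      if (metric == name) total += $NF   # the value is the last field
    }
    END { print total + 0 }
  '
}
```

Against a live exporter, the same parser could read from `curl -s http://localhost:9100/metrics` instead of the sample function.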
System Requirements
Before installing Prometheus, make sure your system meets these requirements:
- Linux kernel 2.6.32 or later
- At least 1GB RAM (2GB+ recommended for production)
- At least 10GB storage (scale based on retention period and metric volume)
- Internet connectivity for downloading packages
- Root or sudo access
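The storage requirement above scales with retention and ingestion rate. The Prometheus documentation gives a rough capacity formula (retention in seconds × ingested samples per second × bytes per sample, with roughly 1 to 2 bytes per sample on average). A minimal sketch of that arithmetic follows; `estimate_tsdb_bytes` is a hypothetical helper name, and the 2-byte default is the conservative end of that range.

```shell
# estimate_tsdb_bytes RETENTION_DAYS SAMPLES_PER_SECOND [BYTES_PER_SAMPLE]
# Prints an approximate on-disk size in bytes.
estimate_tsdb_bytes() {
  local days="$1" samples_per_second="$2" bytes_per_sample="${3:-2}"
  # 86400 seconds per day; integer arithmetic is enough for a rough estimate.
  echo $(( days * 86400 * samples_per_second * bytes_per_sample ))
}
```

For example, `estimate_tsdb_bytes 30 10000` prints 51840000000, so about 52 GB for 30 days at 10,000 samples per second.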
Installation
Step 1: Download Prometheus
Start by downloading a stable release of Prometheus (v2.50.0 is used throughout this guide):
cd /tmp
wget https://github.com/prometheus/prometheus/releases/download/v2.50.0/prometheus-2.50.0.linux-amd64.tar.gz
tar -xvzf prometheus-2.50.0.linux-amd64.tar.gz
cd prometheus-2.50.0.linux-amd64
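It is good practice to verify the archive before extracting it. The sketch below compares the archive's SHA-256 digest with the matching entry in a checksum file. It assumes a `sha256sums.txt` file in the usual `<digest>  <filename>` layout has been downloaded from the same release page; `verify_release` is a hypothetical helper name.

```shell
# verify_release ARCHIVE SUMS_FILE
# Succeeds only if SUMS_FILE lists ARCHIVE's name with a matching digest.
verify_release() {
  local archive="$1" sums="$2" expected actual
  # Look up the expected digest by file name (second column).
  expected="$(awk -v f="$(basename "$archive")" '$2 == f { print $1 }' "$sums")"
  # Compute the digest of the downloaded file.
  actual="$(sha256sum "$archive" | awk '{ print $1 }')"
  [ -n "$expected" ] && [ "$expected" = "$actual" ]
}
```

A call such as `verify_release /tmp/prometheus-2.50.0.linux-amd64.tar.gz /tmp/sha256sums.txt && echo OK` would print OK only when the digests match.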
Step 2: Create System User and Directories
Create a dedicated user for Prometheus and set up the necessary directories:
sudo useradd --no-create-home --shell /bin/false prometheus
sudo mkdir -p /etc/prometheus /var/lib/prometheus
sudo chown prometheus:prometheus /etc/prometheus /var/lib/prometheus
Step 3: Copy Binaries and Files
Copy the Prometheus binaries and files to system locations:
sudo cp prometheus promtool /usr/local/bin/
sudo chown prometheus:prometheus /usr/local/bin/prometheus /usr/local/bin/promtool
sudo cp prometheus.yml /etc/prometheus/
sudo cp -r consoles /etc/prometheus/
sudo cp -r console_libraries /etc/prometheus/
sudo chown -R prometheus:prometheus /etc/prometheus/consoles /etc/prometheus/console_libraries
Step 4: Verify Installation
Verify that Prometheus is properly installed:
prometheus --version
promtool --version
Configuration
Basic Configuration
The main configuration file is located at /etc/prometheus/prometheus.yml. Here's a minimal production-ready configuration:
global:
  scrape_interval: 15s
  evaluation_interval: 15s
  external_labels:
    monitor: 'prometheus-prod'
    environment: 'production'
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - 'localhost:9093'
rule_files:
  - '/etc/prometheus/rules/*.yml'
scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']
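The `rule_files` glob above expects at least one rule file under /etc/prometheus/rules/. As a hedged starting point, the sketch below writes a single availability alert. The file name `availability.yml`, the group name, the `InstanceDown` alert, and its 5-minute threshold are all illustrative assumptions, and `write_example_rule` is a hypothetical helper.

```shell
# write_example_rule [RULES_DIR]
# Writes one example alerting rule file into RULES_DIR.
write_example_rule() {
  local rules_dir="${1:-/etc/prometheus/rules}"
  mkdir -p "$rules_dir"
  # The quoted 'EOF' keeps the shell from expanding $labels in the template.
  cat > "$rules_dir/availability.yml" << 'EOF'
groups:
  - name: availability
    rules:
      - alert: InstanceDown
        expr: up == 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: 'Instance {{ $labels.instance }} is down'
EOF
}
```

Writing to /etc/prometheus/rules requires appropriate privileges. The resulting file can then be validated with `promtool check rules /etc/prometheus/rules/availability.yml`.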
Advanced Configuration Options
For production environments, consider these additional settings:
global:
  scrape_interval: 15s
  evaluation_interval: 15s
  scrape_timeout: 10s
  external_labels:
    cluster: 'us-east-1'
    region: 'production'
remote_write:
  - url: 'http://localhost:9009/api/v1/push'
    queue_config:
      capacity: 10000
      max_shards: 200
      min_shards: 1
      max_samples_per_send: 500
      batch_send_deadline: 5s
      min_backoff: 30ms
      max_backoff: 100ms
remote_read:
  - url: 'http://localhost:9009/api/v1/read'
    read_recent: true
Scrape Configuration
Scrape Targets Configuration
Define the targets to monitor, either statically as shown here or through the service discovery methods in the next subsection:
scrape_configs:
  - job_name: 'node-exporter'
    static_configs:
      - targets: ['localhost:9100', '192.168.1.10:9100', '192.168.1.11:9100']
        labels:
          datacenter: 'us-east-1'
          rack: '1a'
      - targets: ['192.168.1.12:9100']
        labels:
          datacenter: 'us-west-1'
  - job_name: 'mysql-servers'
    static_configs:
      - targets:
          - '192.168.1.20:9104'
          - '192.168.1.21:9104'
        labels:
          environment: 'production'
  - job_name: 'postgres-servers'
    scrape_interval: 30s
    static_configs:
      - targets: ['localhost:9187']
    metric_relabel_configs:
      - source_labels: [__name__]
        regex: 'pg_stat_.*'
        action: drop
Service Discovery Methods
For dynamic environments, use service discovery:
scrape_configs:
  - job_name: 'consul-services'
    consul_sd_configs:
      - server: 'localhost:8500'
        datacenter: 'us-east-1'
    relabel_configs:
      - source_labels: [__meta_consul_service]
        target_label: service
  - job_name: 'docker-containers'
    docker_sd_configs:
      - host: 'unix:///var/run/docker.sock'
    relabel_configs:
      - source_labels: [__meta_docker_container_name]
        target_label: container
Relabeling Configuration
Use relabeling to add, drop, or modify labels:
scrape_configs:
  - job_name: 'kubernetes-pods'
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      - source_labels: [__meta_kubernetes_pod_label_app]
        action: keep
        regex: 'app-(web|api)'
      - source_labels: [__meta_kubernetes_pod_name]
        target_label: pod_name
      - source_labels: [__meta_kubernetes_namespace]
        target_label: namespace
      - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_port]
        action: replace
        regex: '([^:]+)(?::\d+)?;(\d+)'
        replacement: '$1:$2'
        target_label: __address__
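The last rule above is the least obvious one: Prometheus joins the values of `source_labels` with `;`, matches the fully anchored regex against the joined string, and writes the replacement into `__address__`. The sketch below imitates that rewrite with `sed`, using an equivalent POSIX extended regex because `sed` does not support `(?:...)` or `\d`. `rewrite_address` is a hypothetical helper, and the addresses in the usage note are made up.

```shell
# rewrite_address ADDRESS PORT_ANNOTATION
# Prints the address the relabel rule would produce.
rewrite_address() {
  local address="$1" port_annotation="$2" rewritten
  # Join the two source label values with ';' and apply the anchored regex.
  rewritten="$(printf '%s;%s\n' "$address" "$port_annotation" |
    sed -E -n 's/^([^:]+)(:[0-9]+)?;([0-9]+)$/\1:\3/p')"
  # When the regex does not match, the target label is left unchanged.
  printf '%s\n' "${rewritten:-$address}"
}
```

Here `rewrite_address 10.0.0.5:8080 9102` prints `10.0.0.5:9102`, while an empty annotation leaves the address as `10.0.0.5:8080`.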
Service Management
Create Systemd Service
Create a systemd service file for Prometheus:
sudo tee /etc/systemd/system/prometheus.service > /dev/null << 'EOF'
[Unit]
Description=Prometheus
Wants=network-online.target
After=network-online.target
[Service]
User=prometheus
Group=prometheus
Type=simple
ExecStart=/usr/local/bin/prometheus \
  --config.file=/etc/prometheus/prometheus.yml \
  --storage.tsdb.path=/var/lib/prometheus \
  --web.console.templates=/etc/prometheus/consoles \
  --web.console.libraries=/etc/prometheus/console_libraries \
  --web.listen-address=0.0.0.0:9090 \
  --web.enable-lifecycle
Restart=always
RestartSec=10
StandardOutput=journal
StandardError=journal
[Install]
WantedBy=multi-user.target
EOF
Enable and Start Service
sudo systemctl daemon-reload
sudo systemctl enable prometheus
sudo systemctl start prometheus
sudo systemctl status prometheus
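Once the service is running, two HTTP endpoints are handy for automation: `/-/ready` reports whether Prometheus can serve traffic, and `/-/reload` (available because the unit sets `--web.enable-lifecycle`) reloads the configuration without a restart. The sketch below wraps both. The function names, the 30-attempt default, and the one-second polling interval are illustrative assumptions.

```shell
# wait_until_ready [BASE_URL] [ATTEMPTS]
# Polls /-/ready once per second until it answers with HTTP 200.
wait_until_ready() {
  local base_url="${1:-http://localhost:9090}" attempts="${2:-30}" i=0
  while [ "$i" -lt "$attempts" ]; do
    # -f makes curl fail on non-2xx responses such as 503 during startup.
    if curl -fsS -o /dev/null "${base_url}/-/ready"; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}

# reload_prometheus [CONFIG_FILE] [BASE_URL]
# Validates the configuration first, then asks Prometheus to reload it.
reload_prometheus() {
  local config="${1:-/etc/prometheus/prometheus.yml}"
  local base_url="${2:-http://localhost:9090}"
  # Refuse to reload when the configuration does not validate.
  promtool check config "$config" || return 1
  # POST /-/reload only works when --web.enable-lifecycle is set.
  curl -fsS -X POST "${base_url}/-/reload"
}
```

A typical sequence would be `wait_until_ready && reload_prometheus` after editing prometheus.yml.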
View Service Logs
sudo journalctl -u prometheus -f
sudo journalctl -u prometheus --since "1 hour ago"
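Data Retention
Configure Retention Policy
Set retention time and size limits in the systemd service:
sudo systemctl edit prometheus
In the override that opens, clear the inherited command with an empty ExecStart= line, then repeat the full ExecStart command from the unit file with these flags appended: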
--storage.tsdb.retention.time=30d \
--storage.tsdb.retention.size=50GB
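Because `systemctl edit` creates a drop-in file rather than editing the unit itself, the override has to reset `ExecStart` before redefining it. The sketch below writes such a drop-in non-interactively, reusing the flags from the unit file shown earlier. The drop-in path is the one `systemctl edit prometheus` would normally use, and `write_retention_override` is a hypothetical helper name.

```shell
# write_retention_override [DROPIN_DIR]
# Writes an override.conf that adds the retention flags to ExecStart.
write_retention_override() {
  local dropin_dir="${1:-/etc/systemd/system/prometheus.service.d}"
  mkdir -p "$dropin_dir"
  # The quoted 'EOF' preserves the trailing backslashes literally.
  cat > "$dropin_dir/override.conf" << 'EOF'
[Service]
# An empty ExecStart= clears the command inherited from the main unit.
ExecStart=
ExecStart=/usr/local/bin/prometheus \
  --config.file=/etc/prometheus/prometheus.yml \
  --storage.tsdb.path=/var/lib/prometheus \
  --web.console.templates=/etc/prometheus/consoles \
  --web.console.libraries=/etc/prometheus/console_libraries \
  --web.listen-address=0.0.0.0:9090 \
  --web.enable-lifecycle \
  --storage.tsdb.retention.time=30d \
  --storage.tsdb.retention.size=50GB
EOF
}
```

After writing the file with root privileges, `sudo systemctl daemon-reload` followed by `sudo systemctl restart prometheus` applies the new limits.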
Monitor Storage Usage
du -sh /var/lib/prometheus/
df -h /var/lib/prometheus/
# Inspect the write-ahead log (WAL) segments
ls -la /var/lib/prometheus/wal/
# List the TSDB block directories
ls -la /var/lib/prometheus/
Cleanup and Maintenance
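Prometheus removes old data automatically according to the retention policy, so manual cleanup is not normally required. The following commands validate the configuration and inspect the on-disk blocks:
# Validate the configuration
promtool check config /etc/prometheus/prometheus.yml
# List the TSDB blocks and their time ranges
promtool tsdb list /var/lib/prometheus/
# Analyze churn and label cardinality in the most recent block (promtool has no repair subcommand; Prometheus attempts WAL repair itself at startup)
promtool tsdb analyze /var/lib/prometheus/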
PromQL Basics
Simple Queries
Retrieve current metric values:
# Get cumulative CPU time (a counter, in seconds)
node_cpu_seconds_total
# Get memory available
node_memory_MemAvailable_bytes
# Get specific instance
node_memory_MemAvailable_bytes{instance="192.168.1.10:9100"}
Range Vectors
Query metrics over time ranges:
# Last 5 minutes of CPU counter samples
node_cpu_seconds_total[5m]
# Last hour of memory usage
node_memory_MemAvailable_bytes[1h]
# Last 7 days
up[7d]
Aggregation and Functions
Perform calculations on metrics:
# Average per-second CPU time across all series
avg(rate(node_cpu_seconds_total[5m]))
# Sum of requests per second
sum(rate(http_requests_total[5m]))
# 5 instances with the most available memory (use bottomk for the least)
topk(5, node_memory_MemAvailable_bytes)
# Disk usage percentage
(node_filesystem_size_bytes - node_filesystem_avail_bytes) / node_filesystem_size_bytes * 100
Advanced PromQL Queries
Complex queries for real-world monitoring:
# CPU usage percentage
100 - (avg by (instance) (irate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)
# Request latency p95
histogram_quantile(0.95, rate(http_request_duration_seconds_bucket[5m]))
# Service error rate (percentage of requests returning 5xx; both sides are summed so their label sets match)
sum(rate(http_requests_total{status=~"5.."}[5m])) / sum(rate(http_requests_total[5m])) * 100
# Memory pressure
(node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) < 0.1
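Any of these expressions can also be evaluated from the command line through the HTTP API's instant-query endpoint, `/api/v1/query`. The sketch below is one hedged way to do that; `promql_instant` is a hypothetical helper, and the default URL assumes Prometheus is reachable on localhost:9090.

```shell
# promql_instant QUERY [BASE_URL]
# Evaluates a PromQL expression at the current time and prints the JSON reply.
promql_instant() {
  local query="$1" base_url="${2:-http://localhost:9090}"
  # -G turns the URL-encoded form data into GET query parameters.
  curl -fsS -G --data-urlencode "query=${query}" "${base_url}/api/v1/query"
}
```

For example, `promql_instant 'up == 0'` would return the targets that are currently down, in JSON form.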
Security Considerations
Network Security
Configure firewall rules to restrict access:
# Allow only specific IPs
sudo ufw allow from 192.168.1.0/24 to any port 9090
sudo ufw allow from 10.0.0.0/8 to any port 9090
# Allow local access only
sudo ufw allow from 127.0.0.1 to any port 9090
Authentication and Reverse Proxy
Use a reverse proxy for authentication:
# Install Nginx and the htpasswd utility (packaged as apache2-utils on Debian/Ubuntu)
sudo apt-get update
sudo apt-get install -y nginx apache2-utils
# Create basic auth file
sudo htpasswd -c /etc/nginx/.htpasswd prometheus_user
Configure Nginx for Prometheus:
upstream prometheus {
    server 127.0.0.1:9090;
}
server {
    listen 443 ssl http2;
    server_name prometheus.example.com;
    ssl_certificate /etc/ssl/certs/cert.pem;
    ssl_certificate_key /etc/ssl/private/key.pem;
    auth_basic "Prometheus";
    auth_basic_user_file /etc/nginx/.htpasswd;
    location / {
        proxy_pass http://prometheus;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}
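The proxy only adds protection if clients cannot bypass it. The systemd unit shown earlier binds Prometheus to 0.0.0.0:9090, so unless the firewall blocks that port, the unauthenticated endpoint stays reachable. The sketch below checks which address a unit file configures; `listens_on_loopback` is a hypothetical helper, and it treats a missing flag as non-loopback because Prometheus then listens on all interfaces by default.

```shell
# listens_on_loopback UNIT_FILE
# Succeeds only if --web.listen-address points at a loopback address.
listens_on_loopback() {
  local unit_file="$1" addr
  # Extract the value of the first --web.listen-address flag, if any.
  addr="$(sed -n 's/.*--web\.listen-address=\([^ \\]*\).*/\1/p' "$unit_file" | head -n 1)"
  case "$addr" in
    127.0.0.1:*|localhost:*|'[::1]':*) return 0 ;;
    *) return 1 ;;   # empty or any other address: reachable from outside
  esac
}
```

If `listens_on_loopback /etc/systemd/system/prometheus.service` fails, one option is to change the flag to `--web.listen-address=127.0.0.1:9090` so that only Nginx can reach Prometheus.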
File Permissions
Ensure proper file permissions:
sudo chown -R prometheus:prometheus /etc/prometheus
sudo chmod -R 750 /etc/prometheus
sudo chown -R prometheus:prometheus /var/lib/prometheus
sudo chmod -R 750 /var/lib/prometheus
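Note that a recursive `chmod 750` also marks every regular file as executable. A slightly stricter variant, sketched below, applies 750 to directories and 640 to files. `tighten_permissions` is a hypothetical helper, and the 640 mode is an assumption that suits configuration and data files that never need to be executed.

```shell
# tighten_permissions DIR
# Directories get 750 (traversable by owner and group); files get 640.
tighten_permissions() {
  local dir="$1"
  find "$dir" -type d -exec chmod 750 {} +
  find "$dir" -type f -exec chmod 640 {} +
}
```

It would be run with root privileges against /etc/prometheus and /var/lib/prometheus after the `chown` commands above.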
Monitoring Prometheus
Self-Monitoring
Configure Prometheus to monitor itself:
scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']
    relabel_configs:
      - source_labels: [__address__]
        target_label: instance
Key Metrics to Monitor
# Prometheus health
up{job="prometheus"}
# Scrape duration
scrape_duration_seconds{job="prometheus"}
# WAL corruptions detected
prometheus_tsdb_wal_corruptions_total
# Memory usage
process_resident_memory_bytes{job="prometheus"}
# Goroutine count
go_goroutines{job="prometheus"}
Troubleshooting
Configuration Validation
Before applying configuration changes:
promtool check config /etc/prometheus/prometheus.yml
promtool check config --lint-fatal /etc/prometheus/prometheus.yml
Verify Rules
Verify alerting rule syntax:
promtool check rules /etc/prometheus/rules/*.yml
Performance Issues
Check performance metrics:
# Check scrape duration per target
promtool query instant http://localhost:9090 'scrape_duration_seconds'
# View active targets
curl -s http://localhost:9090/api/v1/targets | jq .
# Check failed scrapes (active targets whose health is "down")
curl -s 'http://localhost:9090/api/v1/targets?state=active' | jq '.data.activeTargets[] | select(.health == "down")'
Storage Issues
Diagnose storage problems:
# List blocks with human-readable time ranges
promtool tsdb list --human-readable /var/lib/prometheus/
# Check block health
promtool tsdb analyze /var/lib/prometheus/
# Show only the first blocks
promtool tsdb list /var/lib/prometheus/ | head -20
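Debug Logging
Enable debug logging:
sudo systemctl edit prometheus
Add the following flag to the ExecStart command (in a drop-in override, clear the inherited command first with an empty ExecStart= line and repeat the full command):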
--log.level=debug
Then restart:
sudo systemctl restart prometheus
Conclusion
Prometheus provides a robust foundation for monitoring infrastructure and applications. By properly installing, configuring, and maintaining Prometheus with attention to security and performance, you create a reliable monitoring backbone. Backing up configuration files regularly, monitoring the monitoring system itself, and staying current with new releases ensure that your observability platform remains effective and secure. Start with basic monitoring, gradually add more exporters and complexity, and use PromQL to gain deep insight into your systems.


