Checkmk Installation and Configuration

Checkmk is an enterprise-grade monitoring solution that builds on Nagios while providing modern features such as automatic service discovery, automatic host detection, rule-based configuration, and a powerful web interface. This guide covers installation, site creation, agent deployment, host discovery, configuration management, and notifications.

Introduction

Checkmk modernizes infrastructure monitoring with automatic discovery, rule-based configuration, and streamlined operations. Unlike systems that rely on manually maintained configuration files, Checkmk detects services automatically and applies rules to enforce a consistent monitoring policy across thousands of hosts.

System Requirements

  • Linux (Ubuntu 20.04+, CentOS 8+, Debian 11+)
  • Minimum 2GB RAM
  • 10GB storage
  • 2+ CPU cores
  • Internet access for repository downloads
  • Root access for installation

Installation

Method 1: Official Package (Ubuntu/Debian)

# Download the Checkmk Raw Edition package (Ubuntu 22.04 "jammy" build)
wget https://download.checkmk.com/checkmk/2.2.0/check-mk-raw-2.2.0_0.jammy_amd64.deb

# Install the package together with its dependencies
sudo apt-get update
sudo apt-get install -y ./check-mk-raw-2.2.0_0.jammy_amd64.deb

# Verify installation
omd version

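Method 2: Docker Installation

# Pull Checkmk image
docker pull checkmk/check-mk-raw:2.2.0

# Run container (the web server inside the container listens on port 5000
# and the site data lives under /omd/sites)
docker run -d \
  -p 80:5000 \
  -v /opt/checkmk:/omd/sites \
  --name checkmk \
  checkmk/check-mk-raw:2.2.0

# Show the generated password for the cmkadmin user
docker logs checkmk

# Access web interface (the container creates a site named "cmk" by default)
# http://localhost/cmk/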

Method 3: CentOS/RHEL

# Download the package
wget https://download.checkmk.com/checkmk/2.2.0/check-mk-raw-2.2.0-el8-38.x86_64.rpm

# Enable EPEL, which provides some of the dependencies (CentOS)
sudo yum install -y epel-release

# Install the package together with its dependencies
sudo yum install -y ./check-mk-raw-2.2.0-el8-38.x86_64.rpm

Site Management

Create Monitoring Site

# Create new site (the output includes the generated password for the cmkadmin user)
sudo omd create mysite

# Enable and start site
sudo omd enable mysite
sudo omd start mysite

# Verify site
sudo omd status mysite

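Site Configuration

# Log in to the site shell
sudo su - mysite

# View site configuration
omd config show

# Stop the site before changing settings
omd stop

# Edit configuration (which options exist depends on the Checkmk version
# and edition; "omd config show" lists the ones available on this site)
omd config set APACHE_TCP_ADDR 0.0.0.0
omd config set DOKUWIKI on
omd config set GRAPHITE on
omd config set INFLUXDB on

# Apply changes
omd start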

Multi-Site Management

# Create additional sites
sudo omd create production
sudo omd create staging
sudo omd create development

# List all sites
sudo omd sites

# Configure site-specific settings (stop the site while changing them)
sudo su - production
omd stop
omd config set MKEVENTD on
omd start

Web Interface

Initial Access

  1. Navigate to http://your-server/mysite/
  2. Log in as cmkadmin with the password printed by "omd create" (Checkmk 2.x generates a random one; there is no fixed default)
  3. Change the password immediately

Main navigation:

Monitor > Overview > All hosts          # Host view
Monitor > Problems > Service problems   # Service monitoring
Monitor > Event Console > Events        # Event console
Setup > Hosts                           # Host configuration
Setup > Services                        # Service configuration and monitoring rules
Setup > Users                           # User management

Agent Installation

On Linux Hosts

# Download agent from Checkmk UI
# Setup > Agents > Linux

# Or download directly
wget http://your-checkmk-server/mysite/check_mk/agents/check-mk-agent_2.2.0-1_all.deb

# Install
sudo apt-get install -y ./check-mk-agent_2.2.0-1_all.deb

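# Since Checkmk 2.1 the agent listens on port 6556 through a systemd socket
# and falls back to xinetd only on systems without a suitable systemd.
# Only in that case, enable and start xinetd:
sudo systemctl enable xinetd
sudo systemctl start xinetd

# Verify agent (prints the first lines of its output, including the version)
sudo check_mk_agent | head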
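The agent answers a connection on port 6556 with plain text that is divided into sections, each introduced by a header line such as <<<check_mk>>>. The following is a minimal sketch of reading and splitting that output, assuming the agent still answers in plain text (an agent registered for TLS will not). The function names fetch_agent_output and split_sections are illustrative and not part of Checkmk.

```python
import socket


def fetch_agent_output(host: str, port: int = 6556, timeout: float = 10.0) -> str:
    """Read everything the agent sends on a plain TCP connection."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as conn:
        while True:
            data = conn.recv(4096)
            if not data:  # the agent closes the connection when it is done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def split_sections(agent_output: str) -> dict[str, list[str]]:
    """Group agent output lines by their <<<section>>> header."""
    sections: dict[str, list[str]] = {}
    current = None
    for line in agent_output.splitlines():
        if line.startswith("<<<") and line.endswith(">>>"):
            # Drop options such as ":sep(9)" that may follow the section name.
            current = line[3:-3].split(":", 1)[0]
            sections.setdefault(current, [])
        elif current is not None:
            sections[current].append(line)
    return sections
```

For example, split_sections(fetch_agent_output("localhost"))["check_mk"] would hold the lines that report the agent version and operating system.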

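Configure Agent for Checkmk

# Agent plugins are executable scripts in the agent's plugin directory.
# Copies ship with the server under ~/share/check_mk/agents/plugins/ in the site.
# Example: install the inventory plugin (copied from the server beforehand)
sudo install -m 755 mk_inventory.linux /usr/lib/check_mk_agent/plugins/

# Plugin configuration files live in /etc/check_mk/
ls /etc/check_mk/

# No restart is needed: the agent script runs anew on every query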

On Windows Hosts

# Download Windows agent from Checkmk
# Setup > Agents > Windows

# Install agent silently (run in an elevated shell)
msiexec /i check_mk_agent-2.2.0.msi /qn

# Custom settings such as the port go into the user configuration file
# C:\ProgramData\checkmk\agent\check_mk.user.yml

# Verify that the agent listens on port 6556
netstat -an | findstr 6556

# Check service
Get-Service -Name "*CheckMk*"

Agent Registration

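# Register the agent with the site for TLS-encrypted communication
# (run on the monitored host; the host must already exist under Setup > Hosts)
sudo cmk-agent-ctl register --hostname myhost1 --server your-checkmk-server --site mysite --user cmkadmin

# Check the registration
sudo cmk-agent-ctl status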

Host Discovery

Automatic Host Discovery

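# Via web interface:
# Setup > Hosts > (select host) > Service discovery

# Or via CLI
sudo su - mysite
cmk -II hostname   # drop the known services of the host and rediscover them
cmk -O             # reload the monitoring core to activate the result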

Configure Discovery Rules

Navigate to Setup > Services > Discovery rules:

Rule Name: Discover all TCP services
Description: Discover services on monitored hosts
Condition: If host name contains 'prod'
Discovery: Check TCP ports (1-1024)

Bulk Host Addition

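Create hosts.csv with one host per line:

hostname,ipaddress,alias
myhost1,192.168.1.10,Production Server 1
myhost2,192.168.1.11,Staging Server
db-server,192.168.1.20,Database Server

Import:

# The cmk command takes host names, not a host file, so import the list in the
# web interface: open the target folder (production, staging, database) under
# Setup > Hosts, choose Hosts > Import hosts via CSV file, and activate the changes.

sudo su - mysite

# Discover services on the new hosts
cmk -I myhost1 myhost2 db-server

# Deploy configuration
cmk -U
cmk -O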
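Larger host lists can also be sent to the Checkmk REST API instead of being imported by hand. The sketch below only builds the JSON body for the bulk-create action of the host_config endpoint. It assumes the host list is available as CSV with the column names hostname, ipaddress and alias, which are illustrative, and build_bulk_create_payload is a hypothetical helper, not a Checkmk function.

```python
import csv
import io


def build_bulk_create_payload(csv_text: str, folder: str = "/") -> dict:
    """Turn CSV host rows into the body of a REST API bulk-create request."""
    reader = csv.DictReader(io.StringIO(csv_text))
    entries = []
    for row in reader:
        attributes = {"ipaddress": row["ipaddress"]}
        if row.get("alias"):  # the alias column is optional
            attributes["alias"] = row["alias"]
        entries.append(
            {
                "host_name": row["hostname"],
                "folder": folder,
                "attributes": attributes,
            }
        )
    return {"entries": entries}
```

The resulting dictionary would be posted as JSON to http://your-checkmk-server/mysite/check_mk/api/1.0/domain-types/host_config/actions/bulk-create/invoke with an "Authorization: Bearer <user> <secret>" header of an automation user. The new hosts still need service discovery and an activation of changes afterwards.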

Configuration Management

Rule-Based Configuration

Navigate to Setup > Services > Service monitoring rules:

Rule Name: Linux Memory Warning
Description: Set memory thresholds for Linux hosts
Condition: If host is tagged linux
Parameter: Memory usage
Value: 80% warning, 90% critical
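The warning and critical values of such a rule map a measured value to a monitoring state. A minimal sketch of that mapping, assuming upper levels where reaching a level already triggers the state; State and check_levels are illustrative names and not Checkmk's own API.

```python
from enum import IntEnum


class State(IntEnum):
    """Monitoring states with the numeric codes used by Nagios-style cores."""
    OK = 0
    WARN = 1
    CRIT = 2


def check_levels(value: float, warn: float, crit: float) -> State:
    """Compare a value against upper warning and critical levels."""
    if value >= crit:
        return State.CRIT
    if value >= warn:
        return State.WARN
    return State.OK
```

With the rule above, check_levels(85.0, warn=80.0, crit=90.0) yields State.WARN, while 92.0 yields State.CRIT.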

Host Attributes

Set host properties:

Site: mysite
Network address: 192.168.1.10
Host labels: environment=production, team=platform
Alias: Web Server 1
Contact groups: admins, platform-team

Service Configuration

Auto-configure services for hosts:

Setup > Services > Service discovery

1. Discovery > Automatic service discovery
2. Run discovery on host
3. Review and finalize
4. Deploy configuration

View Configuration

# Check generated configuration
sudo su - mysite
ls -la etc/check_mk/conf.d/

# List configured hosts
cmk -l

# Dump the effective configuration of one host
cmk -D hostname

Monitoring Rules

Create Custom Monitoring Rule

Navigate to Setup > Services > Service monitoring rules:

Name: High CPU Alert
Description: Alert when CPU utilization exceeds thresholds
Measurement: CPU utilization
Condition:
  - If host is tagged prod
  - And service contains "CPU utilization"
Threshold:
  - Warning: 80%
  - Critical: 95%

Check Plugin Rules

Configure check behavior:

Setup > Services > Service monitoring rules

Plugin: mem
Levels: 
  Warning: 85%
  Critical: 95%
Infotext: Show absolute values

Discovery Rules

Control automatic service discovery:

Setup > Services > Discovery rules

Rule: Discover MySQL services
Condition: Port 3306 is open
Action: Automatically create service "MySQL"

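Notifications

Email Configuration

# Checkmk hands notification mails to the local MTA of the server (for example
# Postfix), so the SMTP relay is normally configured there. Alternatively, the
# HTML email notification method can deliver directly via SMTP (option
# "Enable synchronous delivery via SMTP" in the notification rule).
# Either way, these are the values to supply:
#   From address:    monitoring@example.com (placeholder)
#   Smarthost:       smtp.gmail.com, port 587
#   Authentication:  user@example.com (placeholder) with an app password
#   Encryption:      STARTTLS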

Configure Notification Rules

Navigate to Setup > Events > Notifications:

Name: Email on Critical
Condition:
  - Service status is CRITICAL
  - Host attribute: environment = production
Notify user: ops-team
Notification type: Email

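Webhook Integration

# Create custom notification script (run as the site user)
mkdir -p ~/local/share/check_mk/notifications/

cat > ~/local/share/check_mk/notifications/slack.py << 'EOF'
#!/usr/bin/env python3
# Slack webhook
import os

import requests

# Replace with your own incoming-webhook URL
webhook_url = "https://hooks.slack.com/services/YOUR/WEBHOOK"

# Checkmk passes the notification context as NOTIFY_* environment variables
host = os.environ.get("NOTIFY_HOSTNAME", "unknown host")
service = os.environ.get("NOTIFY_SERVICEDESC", "host alert")
message = f"Alert: {host} - {service}"

response = requests.post(webhook_url, json={"text": message}, timeout=10)
response.raise_for_status()
EOF

chmod +x ~/local/share/check_mk/notifications/slack.py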

Advanced Features

Event Console

Configure event processing:

sudo su - mysite

# Enable the Event Console and its built-in syslog receiver
omd stop
omd config set MKEVENTD on
omd config set MKEVENTD_SYSLOG on
omd start

# Event rules are then created in the web interface:
# Setup > Events > Event Console

Business Intelligence (BI)

Create business-oriented views:

Setup > Business Intelligence > Aggregations

Name: Website Availability
Description: Monitors website SLA
Components:
  - service "HTTP" on web-01
  - service "HTTP" on web-02
  - service "Database" on db-01
Aggregation: At least 2 of 3 must be up
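The "at least 2 of 3" logic can be pictured as counting the components that are OK. This is a simplified sketch of that idea and not Checkmk's BI implementation; aggregate_at_least is a hypothetical helper, and states use the usual codes 0 (OK), 1 (WARN), 2 (CRIT).

```python
def aggregate_at_least(component_states: dict[str, int], required_ok: int) -> int:
    """Return 0 (OK) if enough components are OK, otherwise 2 (CRIT)."""
    ok_count = sum(1 for state in component_states.values() if state == 0)
    return 0 if ok_count >= required_ok else 2


# Component names follow the aggregation above.
website = {
    "web-01/HTTP": 0,
    "web-02/HTTP": 2,      # one web server is down
    "db-01/Database": 0,
}
website_state = aggregate_at_least(website, required_ok=2)
```

Here website_state is 0, because two of the three components are still OK. A second failure would turn the aggregate to 2.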

Dashboard Creation

Create custom dashboards:

Customize > Visualization > Dashboards > Add dashboard

Widget 1: Host Status
  Show: production hosts
  
Widget 2: Service Status
  Filter: Critical services only

Widget 3: Service Levels
  Show: SLO violations

Backup and Recovery

# Backup site (as root)
sudo omd backup mysite /var/backups/mysite-backup.tar.gz

# Or manual backup of selected directories (as the site user)
sudo su - mysite
tar -czf ~/mysite-backup.tar.gz \
  etc/ var/log/ local/
exit

# Restore (the site must not exist, so remove the old one first)
sudo omd stop mysite
sudo omd rm mysite
sudo omd restore mysite /var/backups/mysite-backup.tar.gz
sudo omd start mysite

Troubleshooting

Check Site Status

# View site status
sudo omd status mysite

# Restart only the site's Apache
sudo omd restart mysite apache

# Monitor logs
tail -f /opt/omd/sites/mysite/var/log/apache/error_log

Check Agent Communication

# Test agent connection (runs all checks of the host verbosely without submitting results)
sudo su - mysite
cmk --debug -vvn hostname

# Or directly
nc -zv hostname 6556

# Check agent output
cmk -d hostname

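Configuration Validation

sudo su - mysite

# Regenerate the core configuration; errors in the configuration are reported here
cmk -U

# Reload the monitoring core with the new configuration
cmk -O

# List installed extension packages (on versions before 2.2: cmk -P list)
mkp list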

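Performance Issues

# Check site resource usage (every site runs under a Linux user of the same name)
top -u mysite

# Monitor Livestatus (run as the site user; the output has one line per service)
lq "GET services\nColumns: description" | wc -l

# Check how much disk space the site data uses
du -sh var/* | sort -h

# Analyze performance
cmk --debug -vvn hostname > /tmp/debug.out 2>&1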
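Livestatus can also be queried from a script through the UNIX socket of the site. The following is a minimal sketch, assuming the socket lives at /omd/sites/mysite/tmp/run/live and the script runs as the site user; build_query and query_livestatus are illustrative helper names.

```python
import json
import socket


def build_query(table: str, columns: list[str], filters: tuple[str, ...] = ()) -> str:
    """Assemble a Livestatus query; a blank line terminates the request."""
    lines = [f"GET {table}"]
    if columns:
        lines.append("Columns: " + " ".join(columns))
    lines.extend(f"Filter: {condition}" for condition in filters)
    lines.append("OutputFormat: json")
    return "\n".join(lines) + "\n\n"


def query_livestatus(socket_path: str, query: str) -> list:
    """Send one query to the Livestatus socket and decode the JSON reply."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(socket_path)
        sock.sendall(query.encode("utf-8"))
        sock.shutdown(socket.SHUT_WR)  # signal that the request is complete
        chunks = []
        while True:
            data = sock.recv(65536)
            if not data:
                break
            chunks.append(data)
    return json.loads(b"".join(chunks).decode("utf-8"))
```

For example, query_livestatus("/omd/sites/mysite/tmp/run/live", build_query("services", ["host_name", "description"], ("state = 2",))) would return one row per service that is currently CRIT.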

Conclusion

Checkmk brings modern, rule-based configuration to enterprise monitoring. By following this guide, you have deployed a scalable monitoring platform with automatic discovery and flexible configuration management. Focus on designing rule hierarchies that enforce consistent monitoring policies, keep agent versions up to date, and use the business intelligence features to align monitoring with organizational objectives. The combination of automatic discovery and powerful configuration rules makes Checkmk well suited to managing large, complex infrastructures.