Checkmk Installation and Configuration

Checkmk is an enterprise-grade monitoring solution that builds on Nagios and adds automatic service discovery, rule-based configuration, and a powerful web interface. This guide covers installation, site creation, agent deployment, host and service discovery, configuration management, and notifications.

Introduction

Checkmk modernizes infrastructure monitoring with automatic discovery, rule-based configuration, and streamlined operations. Unlike manual configuration-based systems, Checkmk detects services automatically and applies rules for consistent monitoring policy across thousands of hosts.

System Requirements

  • Linux (Ubuntu 20.04+, CentOS 8+, Debian 11+)
  • Minimum 2GB RAM
  • 10GB storage
  • 2+ CPU cores
  • Internet access for repository downloads
  • Root access for installation

Installation

Method 1: Using the Official Package

# Download Checkmk package (Ubuntu/Debian)
wget https://download.checkmk.com/checkmk/2.2.0/check-mk-raw-2.2.0_0.jammy_amd64.deb

# Install
sudo apt-get update
sudo apt-get install -y ./check-mk-raw-2.2.0_0.jammy_amd64.deb

# Verify installation
omd version

Method 2: Docker Installation

# Pull Checkmk image
docker pull checkmk/check-mk-raw:2.2.0

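# Run container (the web interface listens on port 5000 inside the container)
docker run -d \
  -p 8080:5000 \
  -p 8000:8000 \
  -v /opt/checkmk:/opt/omd/sites \
  --name checkmk \
  checkmk/check-mk-raw:2.2.0

# Show the generated cmkadmin password
docker logs checkmk

# Access web interface (the default site in the image is named cmk)
# http://localhost:8080/cmk/check_mk/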

Method 3: CentOS/RHEL

# Download the package
wget https://download.checkmk.com/checkmk/2.2.0/check-mk-raw-2.2.0-el8-38.x86_64.rpm

# Enable EPEL, which provides some of the dependencies
sudo yum install -y epel-release

# Install the package; yum resolves the remaining dependencies
sudo yum install -y ./check-mk-raw-2.2.0-el8-38.x86_64.rpm

Site Management

Create Monitoring Site

# Create new site
sudo omd create mysite

# Enable and start site
sudo omd enable mysite
sudo omd start mysite

# Verify site
sudo omd status mysite

Site Configuration

# Login to site shell
sudo su - mysite

# View site configuration
omd config show

# Stop the site before changing settings
omd stop

# Edit configuration
omd config set APACHE_TCP_ADDR 0.0.0.0
omd config set LIVESTATUS_TCP on
omd config set AUTOSTART on

# Apply changes
omd start

Multi-Site Management

# Create additional sites
sudo omd create production
sudo omd create staging
sudo omd create development

# List all sites
sudo omd sites

# Configure site-specific settings (the site must be stopped first)
sudo su - production
omd stop
omd config set MKEVENTD on
omd start

Web Interface

Initial Access

  1. Navigate to http://your-server/mysite/
  2. Log in as cmkadmin with the random password printed by omd create
  3. Change the password immediately (as site user: cmk-passwd cmkadmin)

Monitor > Overview > All hosts                 # Host view
Monitor > Overview > Service search            # Service monitoring
Monitor > Event Console > Events               # Event console
Setup > Hosts                                  # Host configuration
Setup > Services                               # Service configuration
Setup > Services > Service monitoring rules    # Monitoring rules
Setup > Users                                  # User management

Agent Installation

On Linux Hosts

# Download agent from Checkmk UI
# Setup > Agents > Linux

# Or download directly
wget http://your-checkmk-server/mysite/check_mk/agents/check-mk-agent_2.2.0-1_all.deb

# Install
sudo apt-get install -y ./check-mk-agent_2.2.0-1_all.deb

# With systemd, the package starts the agent controller on TCP port 6556
# (systems without systemd fall back to xinetd)
sudo systemctl status cmk-agent-ctl-daemon

# Verify agent output locally
sudo check_mk_agent | head -n 20

Configure Agent for Checkmk

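# Install an agent plugin, for example the inventory plugin
wget http://your-checkmk-server/mysite/check_mk/agents/plugins/mk_inventory.linux
sudo install -m 755 mk_inventory.linux /usr/lib/check_mk_agent/plugins/mk_inventory

# Plugin configuration files live in /etc/check_mk/
ls /etc/check_mk/

# No restart is needed: the agent script runs fresh for every query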

On Windows Hosts

# Download Windows agent from Checkmk
# Setup > Agents > Windows

# Install agent silently
msiexec /i check_mk_agent.msi /qn

# Port and other settings are changed afterwards in
# C:\ProgramData\checkmk\agent\check_mk.user.yml

# Verify installation
netstat -an | findstr 6556

# Check service
Get-Service -Name CheckMkService

Agent Registration

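# Create the host in Setup > Hosts first, then register the agent for TLS
sudo cmk-agent-ctl register --hostname myhost1 \
  --server your-checkmk-server --site mysite --user cmkadmin

# Confirm the registration
sudo cmk-agent-ctl status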

Host Discovery

Automatic Host Discovery

# Via web interface:
# Setup > Hosts, open the host, then run service discovery

# Or via CLI
sudo su - mysite
cmk -II hostname   # discard known services and rediscover them
cmk -O             # reload the monitoring core

Configure Discovery Rules

Navigate to Setup > Services > Discovery rules and create a rule, for example Periodic service discovery:

Rule Name: Periodic service discovery
Description: Re-check production hosts for new and vanished services
Condition: If host name contains 'prod'
Discovery: Run every 2 hours

Bulk Host Addition

Create hosts.csv:

host_name;ipaddress;alias
myhost1;192.168.1.10;Production Server 1
myhost2;192.168.1.11;Staging Server
db-server;192.168.1.20;Database Server

Import:

Open the target folder under Setup > Hosts (for example production, staging, or database), choose Hosts > Import hosts via CSV file, and map the columns to host name, IPv4 address, and alias. Import the hosts for each folder separately.

sudo su - mysite

# Discover services on the new hosts
cmk -I myhost1 myhost2 db-server

# Deploy configuration
cmk -O
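
The same host list can also be prepared for the Checkmk REST API. The following is a minimal sketch that only builds the JSON request body and sends nothing. The endpoint path and the payload shape (entries with host_name, folder, and attributes) are assumptions based on the REST API's bulk host creation call and should be checked against the API documentation of your version; build_bulk_payload is a hypothetical helper name.

```python
import json

# Hosts from this section: (name, folder, address, alias)
HOSTS = [
    ("myhost1", "production", "192.168.1.10", "Production Server 1"),
    ("myhost2", "staging", "192.168.1.11", "Staging Server"),
    ("db-server", "database", "192.168.1.20", "Database Server"),
]

# Assumed endpoint, relative to
# http://your-checkmk-server/mysite/check_mk/api/1.0
BULK_CREATE_PATH = "/domain-types/host_config/actions/bulk-create/invoke"


def build_bulk_payload(hosts):
    """Build the JSON body for a bulk host creation request."""
    entries = []
    for name, folder, address, alias in hosts:
        entries.append({
            "host_name": name,
            "folder": "/" + folder,
            "attributes": {"ipaddress": address, "alias": alias},
        })
    return {"entries": entries}


payload = build_bulk_payload(HOSTS)
print(json.dumps(payload, indent=2))
```

Keeping payload construction separate from the HTTP call makes it easy to review the generated entries before anything is written to the site.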

Configuration Management

Rule-Based Configuration

Navigate to Setup > Services > Service monitoring rules:

Rule Name: Linux Memory Warning
Description: Set memory thresholds for Linux hosts
Condition: If host is tagged linux
Parameter: Memory usage
Value: 80% warning, 90% critical
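
To illustrate how a rule like this is selected for a host, here is a simplified sketch in which the first rule whose tag condition matches the host supplies the levels. It is a model for explanation, not Checkmk's actual rule engine; the Rule class, the effective_levels function, and the "Fallback" rule are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    name: str
    required_tags: set   # the host must carry all of these tags
    levels: tuple        # (warning %, critical %)


def effective_levels(host_tags, rules, default=(None, None)):
    """Return the levels of the first rule whose tags all match the host."""
    for rule in rules:
        if rule.required_tags <= set(host_tags):
            return rule.levels
    return default


RULES = [
    Rule("Linux Memory Warning", {"linux"}, (80.0, 90.0)),
    Rule("Fallback", set(), (90.0, 95.0)),  # hypothetical catch-all rule
]

print(effective_levels({"linux", "prod"}, RULES))
print(effective_levels({"windows"}, RULES))
```

Because the first match wins in this model, rule order matters: specific rules belong above general ones.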

Host Attributes

Set host properties:

Site: mysite
Network address: 192.168.1.10
Host labels: environment=production, team=platform
Alias: Web Server 1
Contact groups: admins, platform-team

Service Configuration

Auto-configure services for hosts:

Setup > Hosts > (select host) > Run service discovery

1. Run the discovery on the host
2. Review the detected services
3. Accept the services to monitor
4. Activate pending changes

View Configuration

# Check generated configuration
sudo su - mysite
ls -la etc/check_mk/conf.d/

# List configured hosts
cmk -l

# Dump the effective configuration of one host
cmk -D hostname

Monitoring Rules

Create Custom Monitoring Rule

Navigate to Setup > Services > Service monitoring rules:

Name: High CPU Alert
Description: Alert when CPU exceeds thresholds
Measurement: CPU utilization
Condition:
  - If host is tagged prod
  - And service contains "CPU utilization"
Threshold:
  - Warning: 80%
  - Critical: 95%
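
The warning and critical thresholds translate into monitoring states as sketched below. This is a minimal illustration assuming upper levels that trigger when the value reaches the threshold; the numeric states follow the Nagios convention (0 = OK, 1 = WARN, 2 = CRIT), and check_upper_levels is a hypothetical function name.

```python
OK, WARN, CRIT = 0, 1, 2
STATE_NAMES = {OK: "OK", WARN: "WARN", CRIT: "CRIT"}


def check_upper_levels(value, warn, crit):
    """Map a measured value to a monitoring state using upper thresholds."""
    if value >= crit:
        return CRIT
    if value >= warn:
        return WARN
    return OK


for sample in (42.0, 80.0, 97.5):
    state = check_upper_levels(sample, warn=80.0, crit=95.0)
    print(f"{sample:5.1f}% -> {STATE_NAMES[state]}")
```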

Check Plugin Rules

Configure check behavior:

Setup > Services > Service monitoring rules

Plugin: mem
Levels:
  Warning: 85%
  Critical: 95%
Infotext: Show absolute values

Discovery Rules

Control automatic service discovery:

Setup > Services > Discovery rules

Rule: Discover MySQL services
Condition: The mk_mysql agent plugin is installed on the host
Action: MySQL services are discovered from the agent output

Notifications

Email Configuration

sudo su - mysite

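# Checkmk hands notification mail to the server's local MTA (for example Postfix).
# Send a test mail to confirm that delivery works
echo "Checkmk test" | mail -s "Test notification" alerts@example.com

# To send through an external SMTP server instead, set these values in the
# HTML Email notification method under "Enable synchronous delivery via SMTP":
#   Smarthost: smtp.gmail.com, port 587
#   Authentication: the sender account with an app password
#   Encryption: STARTTLS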

Configure Notification Rules

Navigate to Setup > Events > Notifications:

Name: Email on Critical
Condition:
  - Service status is CRITICAL
  - Host label: environment = production
Notify contact group: ops-team
Notification type: Email

Webhook Integration

# Create custom notification script
mkdir -p ~/local/share/check_mk/notifications/

cat > ~/local/share/check_mk/notifications/slack.py << 'EOF'
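#!/usr/bin/env python3
# Slack webhook
import os
import sys

import requests

webhook_url = "https://hooks.slack.com/services/YOUR/WEBHOOK"
# Checkmk passes the notification context as NOTIFY_* environment variables
host = os.environ.get("NOTIFY_HOSTNAME", "unknown host")
service = os.environ.get("NOTIFY_SERVICEDESC", "host check")
message = f"Alert: {host} - {service}"

response = requests.post(webhook_url, json={"text": message}, timeout=10)
# A non-zero exit code tells Checkmk that the notification failed
sys.exit(0 if response.ok else 2)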
EOF

chmod +x ~/local/share/check_mk/notifications/slack.py

Advanced Features

Event Console

Configure event processing:

sudo su - mysite

# Enable Event Console and its built-in syslog receiver
omd stop
omd config set MKEVENTD on
omd config set MKEVENTD_SYSLOG on
omd start

# Define event rules in the web interface:
# Setup > Events > Event Console
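
To test syslog reception, a message can be generated by hand. The sketch below builds a BSD-syslog style line and shows how the priority value is computed as facility * 8 + severity. The host name, tag, timestamp, and text are made-up sample values, the function names are hypothetical, and send_udp is defined but not called here; it assumes the Event Console listens for syslog on UDP port 514.

```python
import socket

FACILITY_LOCAL0 = 16   # syslog facility "local0"
SEVERITY_ERROR = 3     # syslog severity "error"


def syslog_message(facility, severity, timestamp, host, tag, text):
    """Build a BSD-syslog style line: <PRI>timestamp host tag: text."""
    pri = facility * 8 + severity
    return f"<{pri}>{timestamp} {host} {tag}: {text}"


def send_udp(message, server, port=514):
    """Send one syslog line over UDP (not called in this sketch)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(message.encode("utf-8"), (server, port))


msg = syslog_message(
    FACILITY_LOCAL0, SEVERITY_ERROR,
    "Dec 18 10:40:00", "myhost1", "testapp", "simulated disk failure",
)
print(msg)
```

Calling send_udp(msg, "your-checkmk-server") would deliver the line; an event only appears if an Event Console rule matches it.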

Business Intelligence (BI)

Create business-oriented views:

Setup > Business Intelligence > Aggregations

Name: Website Availability
Description: Monitors website SLA
Components:
  - service "HTTP" on web-01
  - service "HTTP" on web-02
  - service "Database" on db-01
Aggregation: At least 2 of 3 must be up
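
The "at least 2 of 3" logic can be sketched as a small function. This is an illustrative model of the aggregation, not the BI engine itself; aggregate_min_ok is a hypothetical name and the component states are sample values.

```python
OK, WARN, CRIT = 0, 1, 2


def aggregate_min_ok(states, min_ok):
    """Return OK if at least min_ok components are OK, otherwise CRIT."""
    ok_count = sum(1 for state in states.values() if state == OK)
    return OK if ok_count >= min_ok else CRIT


# Sample component states: one web server is down
components = {
    ("web-01", "HTTP"): OK,
    ("web-02", "HTTP"): CRIT,
    ("db-01", "Database"): OK,
}

print(aggregate_min_ok(components, min_ok=2))
```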

Dashboard Creation

Create custom dashboards:

Customize > Visualization > Dashboards > Add dashboard

Widget 1: Host Status
  Show: production hosts
  
Widget 2: Service Status
  Filter: Critical services only

Widget 3: Service Levels
  Show: SLO violations

Backup and Recovery

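# Backup site (as root)
sudo omd backup mysite /tmp/mysite-backup.tar.gz

# Or manual backup of configuration and local extensions (as site user)
sudo su - mysite
tar -czf ~/mysite-config.tar.gz \
  etc/ local/ var/check_mk/

# Restore (omd restore recreates the site from the archive)
sudo omd stop mysite
sudo omd rm mysite
sudo omd restore /tmp/mysite-backup.tar.gz
sudo omd start mysite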

Troubleshooting

Check Site Health

# View site status
sudo omd status mysite

# Check Apache
sudo omd status mysite apache

# Monitor logs
tail -f /opt/omd/sites/mysite/var/log/apache/error_log
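
For scripted health checks, the status output can be parsed. The sketch below assumes the machine-readable form produced by omd status --bare, with one "name code" pair per line and 0 meaning running; the sample text is illustrative, and both function names are hypothetical.

```python
def parse_bare_status(output):
    """Turn 'name code' lines into a {service: code} dictionary."""
    status = {}
    for line in output.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            status[parts[0]] = int(parts[1])
    return status


def stopped_services(status):
    """List services with a non-zero code, ignoring the OVERALL summary."""
    return [name for name, code in status.items()
            if name != "OVERALL" and code != 0]


# Illustrative sample; real service names depend on the site configuration
SAMPLE = """mkeventd 0
rrdcached 0
apache 1
OVERALL 2
"""

status = parse_bare_status(SAMPLE)
print(stopped_services(status))
```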

Verify Agent Communication

# Test agent connection
sudo su - mysite
cmk --debug -vvn hostname

# Or directly
nc -zv hostname 6556

# Check agent output
cmk -d hostname
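
Agent output is plain text divided into sections, each introduced by a header of the form <<<name>>>. When debugging, it can help to see which sections a host actually delivers. The following sketch groups lines by section; the sample text is a shortened, made-up excerpt, and parse_sections is a hypothetical helper.

```python
import re

# Matches headers such as <<<uptime>>> or <<<lnx_if:sep(58)>>>
SECTION_HEADER = re.compile(r"^<<<([^:>]+)(?::[^>]*)?>>>$")


def parse_sections(agent_output):
    """Group agent output lines by section name."""
    sections = {}
    current = None
    for line in agent_output.splitlines():
        match = SECTION_HEADER.match(line.strip())
        if match:
            current = match.group(1)
            sections.setdefault(current, [])
        elif current is not None:
            sections[current].append(line)
    return sections


SAMPLE = """<<<check_mk>>>
Version: 2.2.0
AgentOS: linux
<<<lnx_if:sep(58)>>>
eth0: 1000 2000
<<<uptime>>>
12345.67 54321.00
"""

sections = parse_sections(SAMPLE)
for name, lines in sections.items():
    print(f"{name}: {len(lines)} line(s)")
```

A missing section usually points to a plugin that is not installed or not executable on the monitored host.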

Configuration Validation

sudo su - mysite

# List configured hosts
cmk -l

# Regenerate the core configuration; errors are reported here
cmk -U

# Reload the monitoring core
cmk -O

# List available check plugins
cmk -L

Performance Issues

# Check site resource usage
top -u mysite

# Monitor Livestatus (as site user)
lq "GET services\nColumns: description" | wc -l

# Check metric and log storage size (as site user)
du -sh var/

# Analyze performance
cmk --debug -vvn hostname > /tmp/debug.out 2>&1
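
Livestatus queries are short text requests terminated by an empty line, so they are easy to build from a script. The sketch below assembles a query and includes a function that would send it to the site's UNIX socket; the socket path /omd/sites/mysite/tmp/run/live is an assumption based on the usual site layout, both function names are hypothetical, and run_query is not called here.

```python
import json
import socket


def build_query(table, columns, filters=()):
    """Assemble a Livestatus query that asks for JSON output."""
    lines = [f"GET {table}", "Columns: " + " ".join(columns)]
    lines += [f"Filter: {condition}" for condition in filters]
    lines.append("OutputFormat: json")
    return "\n".join(lines) + "\n\n"


def run_query(query, socket_path="/omd/sites/mysite/tmp/run/live"):
    """Send a query to the Livestatus socket and decode the JSON reply."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(socket_path)
        sock.sendall(query.encode("utf-8"))
        sock.shutdown(socket.SHUT_WR)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return json.loads(b"".join(chunks).decode("utf-8"))


# All services currently in state CRIT (2)
query = build_query("services", ["host_name", "description", "state"], ["state = 2"])
print(query)
```

Run as the site user, run_query(query) would return a list of rows, one per critical service.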

Conclusion

Checkmk brings modern, rule-based configuration to enterprise monitoring. By following this guide, you've deployed a scalable monitoring platform with automatic discovery and flexible configuration management. Focus on designing rule hierarchies that enforce consistent monitoring policies, regularly updating agent versions, and leveraging business intelligence features to align monitoring with organizational objectives. The combination of automatic discovery and powerful configuration rules makes Checkmk ideal for managing large, complex infrastructures.