Blackbox Exporter for Endpoint Monitoring
Blackbox Exporter enables external endpoint monitoring by probing targets and exposing metrics about their availability and performance. Unlike agent-based monitoring, blackbox monitoring tests actual user-facing endpoints without requiring installation on target systems. This guide covers installation, module configuration, Prometheus integration, and Grafana dashboards.
Table of Contents
- Introduction
- System Requirements
- Installation
- Module Configuration
- Prometheus Integration
- Grafana Dashboards
- Advanced Monitoring
- Alerting Rules
- Performance Tuning
- Troubleshooting
- Conclusion
Introduction
Blackbox Exporter simulates user interactions with endpoints, providing synthetic monitoring insights. It tests HTTP/HTTPS endpoints, TCP connections, DNS resolution, ICMP pings, and other protocols. This user perspective complements internal metrics for complete observability.
System Requirements
- Linux kernel 2.6.32 or later
- At least 256MB RAM
- 100MB disk space
- Network access to monitored endpoints
- Root privileges or the CAP_NET_RAW capability for ICMP (ping) probes
Installation
Step 1: Download and Install
# Create user
sudo useradd --no-create-home --shell /bin/false blackbox-exporter
# Download
cd /tmp
wget https://github.com/prometheus/blackbox_exporter/releases/download/v0.24.0/blackbox_exporter-0.24.0.linux-amd64.tar.gz
tar -xvzf blackbox_exporter-0.24.0.linux-amd64.tar.gz
cd blackbox_exporter-0.24.0.linux-amd64
# Install
sudo cp blackbox_exporter /usr/local/bin/
sudo chown blackbox-exporter:blackbox-exporter /usr/local/bin/blackbox_exporter
sudo chmod +x /usr/local/bin/blackbox_exporter
# Create directories
sudo mkdir -p /etc/blackbox-exporter
sudo chown blackbox-exporter:blackbox-exporter /etc/blackbox-exporter
Step 2: Create Configuration
sudo tee /etc/blackbox-exporter/blackbox.yml > /dev/null << 'EOF'
modules:
  http_2xx:
    prober: http
    timeout: 5s
    http:
      valid_http_versions: ["HTTP/1.1", "HTTP/2.0"]
      valid_status_codes: [200, 201, 202, 204, 206, 301, 302, 304, 307, 308]
      method: GET
      preferred_ip_protocol: "ip4"
  http_post_2xx:
    prober: http
    timeout: 5s
    http:
      method: POST
      headers:
        Content-Type: application/json
      body: '{"test":"data"}'
  tcp_connect:
    prober: tcp
    timeout: 5s
  tcp_connect_tls:
    prober: tcp
    timeout: 5s
    tcp:
      tls: true
      tls_config:
        insecure_skip_verify: false
  dns:
    prober: dns
    timeout: 5s
    dns:
      transport_protocol: "tcp"
      preferred_ip_protocol: "ip4"
      query_name: "www.prometheus.io"
      query_type: "A"
      validate_answer_rrs:
        fail_if_matches_regexp:
          - "nonexistent\\.invalid"
        fail_if_all_match_regexp:
          - "127\\.0\\.0\\.1"
        fail_if_none_matches_regexp:
          - ".*"
  icmp:
    prober: icmp
    timeout: 5s
    icmp:
      preferred_ip_protocol: "ip4"
  pop3s_banner:
    prober: tcp
    timeout: 5s
    tcp:
      query_response:
        - expect: "^\\+OK"
      tls: true
      tls_config:
        insecure_skip_verify: false
  ssh_banner:
    prober: tcp
    timeout: 5s
    tcp:
      query_response:
        - expect: "^SSH-2.0-"
EOF
sudo chown blackbox-exporter:blackbox-exporter /etc/blackbox-exporter/blackbox.yml
Step 3: Create Systemd Service
sudo tee /etc/systemd/system/blackbox-exporter.service > /dev/null << 'EOF'
[Unit]
Description=Blackbox Exporter
Wants=network-online.target
After=network-online.target
[Service]
User=blackbox-exporter
Group=blackbox-exporter
Type=simple
ExecStart=/usr/local/bin/blackbox_exporter \
--config.file=/etc/blackbox-exporter/blackbox.yml \
--web.listen-address=0.0.0.0:9115
Restart=always
RestartSec=10
StandardOutput=journal
StandardError=journal
SyslogIdentifier=blackbox-exporter
# Capabilities for ICMP
AmbientCapabilities=CAP_NET_RAW
CapabilityBoundingSet=CAP_NET_RAW
SecureBits=keep-caps
[Install]
WantedBy=multi-user.target
EOF
sudo systemctl daemon-reload
sudo systemctl enable blackbox-exporter
sudo systemctl start blackbox-exporter
Step 4: Verify Installation
# Check service
sudo systemctl status blackbox-exporter
# Test endpoint
curl http://localhost:9115/metrics | head -20
# Test probe
curl 'http://localhost:9115/probe?target=https://www.google.com&module=http_2xx'
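The /probe endpoint answers in the Prometheus text format, and the probe_success line is 1 for a passing probe and 0 for a failing one. As a minimal sketch of reading that response in a script (parse_probe_metrics is a hypothetical helper, and the sample text is illustrative rather than captured output):

```python
def parse_probe_metrics(text):
    """Parse unlabeled samples from Prometheus text exposition into a dict."""
    metrics = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, HELP/TYPE comments, and labeled series for simplicity.
        if not line or line.startswith("#") or "{" in line:
            continue
        name, _, value = line.partition(" ")
        metrics[name] = float(value)
    return metrics


# Illustrative sample only; a real response contains many more series.
SAMPLE = """\
# TYPE probe_success gauge
probe_success 1
probe_http_status_code 200
probe_duration_seconds 0.25
"""

result = parse_probe_metrics(SAMPLE)
print("up" if result["probe_success"] == 1 else "down")
```

In practice the text would come from the curl command above; the sketch ignores labeled series such as per-phase durations.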
Module Configuration
HTTP Monitoring
modules:
  # Basic HTTP check
  http_2xx:
    prober: http
    timeout: 5s
    http:
      method: GET
      valid_status_codes: [200, 301, 302]
  # HTTPS with custom headers
  https_with_auth:
    prober: http
    timeout: 10s
    http:
      method: GET
      headers:
        Authorization: "Bearer YOUR_TOKEN"
        User-Agent: "Blackbox Exporter"
      valid_status_codes: [200]
      # Fail the probe unless the response body matches this pattern
      fail_if_body_not_matches_regexp:
        - "success"
  # SSL certificate check
  https_cert:
    prober: http
    timeout: 5s
    http:
      # Fail the probe if the connection is not made over TLS
      fail_if_not_ssl: true
      tls_config:
        insecure_skip_verify: false
      follow_redirects: false
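To make the pass/fail logic of an HTTP module concrete, the sketch below models two of its checks: the status code must be in valid_status_codes (any 2xx when the list is empty), and every pattern listed under the exporter's fail_if_body_not_matches_regexp option must appear in the body. This is a simplified illustration, not the exporter's implementation, and http_probe_passes is a hypothetical name.

```python
import re


def http_probe_passes(status, body, valid_status_codes=(), required_body_patterns=()):
    """Simplified model of two HTTP module checks (not the exporter's code)."""
    # An empty valid_status_codes list means "accept any 2xx".
    if valid_status_codes:
        status_ok = status in valid_status_codes
    else:
        status_ok = 200 <= status <= 299
    # Every required pattern must be found somewhere in the body.
    body_ok = all(re.search(pattern, body) for pattern in required_body_patterns)
    return status_ok and body_ok


print(http_probe_passes(204, ""))
print(http_probe_passes(200, '{"status":"success"}', [200], ["success"]))
print(http_probe_passes(200, '{"status":"error"}', [200], ["success"]))
```

The third call fails even though the status is 200, which is the point of a body check: it catches endpoints that return an error payload with a success code.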
TCP Monitoring
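modules:
  # Plain connect check; no query_response is needed just to test that the port accepts connections
  tcp_database:
    prober: tcp
    timeout: 5s
    tcp:
      preferred_ip_protocol: "ip4"
  mysql_check:
    prober: tcp
    timeout: 5s
    tcp:
      preferred_ip_protocol: "ip4"
      tls: false
  # PostgreSQL negotiates TLS inside its own protocol, so a direct TLS
  # handshake (tls: true) fails against a stock server. Use a plain connect
  # check unless a TLS-terminating proxy sits in front of the database.
  postgresql_check:
    prober: tcp
    timeout: 5s
    tcp:
      preferred_ip_protocol: "ip4"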
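Conceptually, a TCP connect probe opens a socket to the target within the timeout and records whether it succeeded and how long it took. The sketch below illustrates that idea with the Python standard library; tcp_probe is a hypothetical helper rather than exporter code, and the demonstration targets a throwaway local listener so that no external host is contacted.

```python
import socket
import time


def tcp_probe(host, port, timeout=5.0):
    """Return (success, duration_seconds) for one TCP connect attempt."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            success = True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        success = False
    return success, time.monotonic() - start


# Demonstrate against a temporary listener on the loopback interface.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
listener.listen(1)
port = listener.getsockname()[1]
up, seconds = tcp_probe("127.0.0.1", port, timeout=2.0)
listener.close()
print(up, round(seconds, 3))
```

The two return values correspond roughly to probe_success and probe_duration_seconds for a tcp module.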
DNS Monitoring
modules:
  dns_a_record:
    prober: dns
    timeout: 5s
    dns:
      transport_protocol: "udp"
      preferred_ip_protocol: "ip4"
      query_name: "example.com"
      query_type: "A"
      validate_answer_rrs:
        fail_if_matches_regexp:
          - "127\\.0\\.0\\.1"
        fail_if_none_matches_regexp:
          - ".*"
  dns_mx_record:
    prober: dns
    timeout: 5s
    dns:
      query_name: "example.com"
      query_type: "MX"
ICMP Monitoring
modules:
  icmp:
    prober: icmp
    timeout: 5s
    icmp:
      preferred_ip_protocol: "ip4"
      dont_fragment: false
Prometheus Integration
Add Blackbox Scrape Configuration
# /etc/prometheus/prometheus.yml
scrape_configs:
  - job_name: 'blackbox-http'
    metrics_path: /probe
    params:
      module: [http_2xx]
    static_configs:
      - targets:
          - https://www.google.com
          - https://github.com
          - https://your-api.example.com/health
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: localhost:9115
  - job_name: 'blackbox-tcp'
    metrics_path: /probe
    params:
      module: [tcp_connect]
    static_configs:
      - targets:
          - 192.168.1.50:5432
          - 192.168.1.51:3306
          - 192.168.1.52:27017
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: localhost:9115
  - job_name: 'blackbox-icmp'
    metrics_path: /probe
    params:
      module: [icmp]
    static_configs:
      - targets:
          - 8.8.8.8
          - 1.1.1.1
          - 192.168.1.1
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: localhost:9115
  - job_name: 'blackbox-dns'
    metrics_path: /probe
    params:
      # dns_a_record must exist in blackbox.yml (see DNS Monitoring); the Step 2 file only defines "dns"
      module: [dns_a_record]
    static_configs:
      # For the dns prober, each target is the DNS server to query
      - targets:
          - 8.8.8.8
          - 1.1.1.1
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: localhost:9115
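The three relabel rules in each job are what turn a list of endpoints into scrapes of the exporter. As a rough illustration of their effect on one target (a plain-Python imitation, not Prometheus code; apply_blackbox_relabeling is a hypothetical name):

```python
def apply_blackbox_relabeling(target, exporter_address, module):
    """Mimic the three relabel steps for one static target."""
    labels = {"__address__": target, "__param_module": module}
    # Rule 1: copy the original target into the ?target= URL parameter.
    labels["__param_target"] = labels["__address__"]
    # Rule 2: keep the probed endpoint as the visible instance label.
    labels["instance"] = labels["__param_target"]
    # Rule 3: point the actual scrape at the exporter itself.
    labels["__address__"] = exporter_address
    return labels


labels = apply_blackbox_relabeling("https://github.com", "localhost:9115", "http_2xx")
# Shown unencoded for readability; Prometheus percent-encodes the parameters.
scrape_url = "http://{}/probe?module={}&target={}".format(
    labels["__address__"], labels["__param_module"], labels["__param_target"]
)
print(labels["instance"])
print(scrape_url)
```

The order matters: the target has to be copied out of __address__ before __address__ is overwritten with the exporter's address.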
Reload Prometheus
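# Requires Prometheus to be started with --web.enable-lifecycle;
# otherwise restart the service or send SIGHUP to the process
curl -X POST http://localhost:9090/-/reload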
Grafana Dashboards
Create Dashboard Panels
Endpoint Availability
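# HTTP status
probe_http_status_code
# Success rate (%); probe_success is a 0/1 gauge, so average it instead of taking a rate
avg_over_time(probe_success[5m]) * 100
# 95th percentile response time; probe_duration_seconds is a gauge, not a histogram
quantile_over_time(0.95, probe_duration_seconds[5m])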
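Because probe_success is a gauge that is 1 for a passing probe and 0 for a failing one, availability over a window is simply the mean of its samples, which is what PromQL's avg_over_time computes. A small worked example, using a hypothetical window of samples:

```python
def availability_percent(samples):
    """Mean of 0/1 probe_success samples, expressed as a percentage."""
    if not samples:
        raise ValueError("no samples in window")
    return 100.0 * sum(samples) / len(samples)


# Hypothetical window: 20 probes, of which 1 failed.
window = [1] * 19 + [0]
print(availability_percent(window))
```

A rate() over the same series would measure how fast the gauge changes, not how often the probe succeeds, so it is not a substitute.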
TCP Connectivity
# TCP success
probe_success{job="blackbox-tcp"}
# Connection time (the TCP prober reports the total probe duration)
probe_duration_seconds{job="blackbox-tcp"}
DNS Resolution
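# Time spent resolving the target hostname (exposed by all probers)
probe_dns_lookup_time_seconds
# Total DNS probe duration against each resolver
probe_duration_seconds{job="blackbox-dns"}
# DNS success rate (%)
avg_over_time(probe_success{job="blackbox-dns"}[5m]) * 100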
ICMP Ping
# Ping success
probe_success{job="blackbox-icmp"}
# RTT
probe_icmp_duration_seconds{phase="rtt"}
Advanced Monitoring
Custom Probe Targets
# Test specific endpoint
curl 'http://localhost:9115/probe?target=https://api.example.com/health&module=http_2xx'
# Test a TCP target (each request runs exactly one probe; the exporter does not retry)
curl 'http://localhost:9115/probe?target=database.example.com:5432&module=tcp_connect'
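When a target URL carries its own query string, characters such as & and ? must be percent-encoded, or the exporter will read them as part of its own parameters. A minimal sketch using the standard library (build_probe_url and the example target are hypothetical):

```python
from urllib.parse import urlencode


def build_probe_url(exporter, target, module):
    """Build a /probe URL with the target safely percent-encoded."""
    query = urlencode({"target": target, "module": module})
    return "http://{}/probe?{}".format(exporter, query)


# Hypothetical target whose own query string contains '&'.
url = build_probe_url(
    "localhost:9115", "https://api.example.com/health?x=1&y=2", "http_2xx"
)
print(url)
```

With curl, the same effect can be had by passing -G together with --data-urlencode for each parameter.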
Multi-Module Checks
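# Combine phases: DNS resolution plus TCP connect time per endpoint
# (probe_http_duration_seconds carries a phase label, so sum over the phases of interest)
sum by (instance) (probe_http_duration_seconds{phase=~"resolve|connect"})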
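The HTTP prober splits probe_http_duration_seconds into phases (resolve, connect, tls, processing, transfer), so combined timings are sums over phase labels. The sketch below shows the arithmetic on an illustrative response; the numbers are invented for the example and phase_durations is a hypothetical helper.

```python
import re

PHASE_LINE = re.compile(r'^probe_http_duration_seconds\{phase="(\w+)"\}\s+(\S+)$')


def phase_durations(text):
    """Collect per-phase HTTP durations from probe output."""
    phases = {}
    for line in text.splitlines():
        match = PHASE_LINE.match(line.strip())
        if match:
            phases[match.group(1)] = float(match.group(2))
    return phases


# Illustrative values, not measured output.
SAMPLE = """\
probe_http_duration_seconds{phase="resolve"} 0.125
probe_http_duration_seconds{phase="connect"} 0.25
probe_http_duration_seconds{phase="tls"} 0.5
probe_http_duration_seconds{phase="processing"} 1.0
probe_http_duration_seconds{phase="transfer"} 0.125
probe_success 1
"""

phases = phase_durations(SAMPLE)
network_setup = phases["resolve"] + phases["connect"]
total = sum(phases.values())
print(network_setup, total)
```

Comparing the setup share with the processing share helps separate network problems from slow application responses.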
Synthetic Monitoring
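Blackbox Exporter sends a single request per probe, so it cannot script a multi-step journey. Approximate one by probing a key step, such as the login endpoint, with a dedicated test account:
modules:
  login_check:
    prober: http
    timeout: 10s
    http:
      method: POST
      headers:
        Content-Type: application/json
      body: '{"username":"test","password":"test"}'
      valid_status_codes: [200]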
Alerting Rules
Create Alert Rules
# /etc/prometheus/alert_rules.yml
# Reference this file under rule_files in prometheus.yml
groups:
  - name: blackbox_alerts
    rules:
      - alert: EndpointDown
        expr: probe_success == 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Endpoint {{ $labels.instance }} is down"
          description: "Endpoint {{ $labels.instance }} has been down for 5 minutes"
      - alert: SlowEndpoint
        # probe_duration_seconds is a gauge, so use quantile_over_time, not histogram_quantile
        expr: quantile_over_time(0.95, probe_duration_seconds[10m]) > 2
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Endpoint {{ $labels.instance }} is slow"
          description: "Endpoint response time exceeds 2 seconds"
      - alert: SSLCertificateExpiring
        expr: probe_ssl_earliest_cert_expiry - time() < 7 * 86400
        for: 1h
        labels:
          severity: warning
        annotations:
          summary: "SSL certificate for {{ $labels.instance }} expires in less than 7 days"
      - alert: HighErrorRate
        # probe_http_status_code is a gauge holding the last status code and has no
        # code label; the subquery gives the fraction of recent probes that returned 5xx
        expr: avg_over_time((probe_http_status_code >= bool 500)[5m:]) > 0.05
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High error rate on {{ $labels.instance }}"
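The SSLCertificateExpiring expression is plain timestamp arithmetic: probe_ssl_earliest_cert_expiry is a Unix timestamp, time() is the current Unix time, and 7 * 86400 is seven days in seconds. The sketch below reproduces the comparison with a fixed example timestamp (cert_expiring is a hypothetical helper):

```python
SECONDS_PER_DAY = 86400


def cert_expiring(expiry_unix, now_unix, threshold_days=7):
    """True when fewer than threshold_days remain, like the alert expression."""
    return (expiry_unix - now_unix) < threshold_days * SECONDS_PER_DAY


now = 1_700_000_000  # fixed example timestamp so the result is reproducible
print(7 * SECONDS_PER_DAY)
print(cert_expiring(now + 6 * SECONDS_PER_DAY, now))
print(cert_expiring(now + 30 * SECONDS_PER_DAY, now))
```

Raising threshold_days, for example to 30, gives earlier warning for certificates that are renewed by hand.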
Performance Tuning
Concurrent Probes
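Blackbox Exporter runs one probe per incoming scrape request and has no flag that caps concurrency. To avoid overwhelming targets, control probe load from Prometheus with scrape_interval and the number of targets per job. The exporter flags below adjust timeout headroom and the number of probe results kept for debugging:
# Add to systemd service
ExecStart=/usr/local/bin/blackbox_exporter \
  --config.file=/etc/blackbox-exporter/blackbox.yml \
  --web.listen-address=0.0.0.0:9115 \
  --timeout-offset=0.5 \
  --history.limit=100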
Timeout Configuration
modules:
  # Fast checks
  http_quick:
    prober: http
    timeout: 2s
  # Slower operations
  http_full:
    prober: http
    timeout: 10s
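A module timeout is not the only limit on a probe. Prometheus passes its scrape timeout to the exporter in the X-Prometheus-Scrape-Timeout-Seconds header, and the exporter shortens the probe so that it finishes before the scrape is abandoned. The sketch below is a simplified model of that behavior, assuming the default --timeout-offset of 0.5 seconds and Prometheus's default scrape_timeout of 10 seconds; effective_timeout is a hypothetical helper, not exporter code.

```python
def effective_timeout(module_timeout, scrape_timeout, offset=0.5):
    """Simplified model: the probe gets the smaller of the two budgets."""
    budget = scrape_timeout - offset
    if budget <= 0:
        # No usable scrape budget; fall back to the module's own timeout.
        return module_timeout
    return min(module_timeout, budget)


# Assumed Prometheus scrape_timeout of 10 seconds for both modules.
print(effective_timeout(2, 10))   # http_quick
print(effective_timeout(10, 10))  # http_full
```

Under these assumptions the 10s module is cut to 9.5 seconds, so a job that uses http_full needs a scrape_timeout above 10s for the module timeout to take full effect.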
Troubleshooting
Check Probe Results
# Test HTTP probe (debug=true appends the probe log to the response)
curl 'http://localhost:9115/probe?target=https://example.com&module=http_2xx&debug=true'
# Test TCP probe
curl 'http://localhost:9115/probe?target=example.com:443&module=tcp_connect_tls'
# Test ICMP (requires capabilities)
curl 'http://localhost:9115/probe?target=8.8.8.8&module=icmp'
Debug Configuration
# Validate config (checks the file and exits instead of starting the exporter)
blackbox_exporter --config.check --config.file=/etc/blackbox-exporter/blackbox.yml
# Check logs
journalctl -u blackbox-exporter -f
# Verify metrics
curl http://localhost:9115/metrics | grep probe
Common Issues
# ICMP not working - check capabilities
getcap /usr/local/bin/blackbox_exporter
sudo setcap cap_net_raw=ep /usr/local/bin/blackbox_exporter
# TLS errors - for testing only, set tls_config.insecure_skip_verify: true in the module
# For production, fix the certificate chain or point tls_config.ca_file at the correct CA
# Firewall blocking - verify connectivity
telnet target 443
Conclusion
Blackbox Exporter provides comprehensive external endpoint monitoring without requiring agents. By following this guide, you've deployed synthetic monitoring that simulates user interactions. Focus on monitoring all critical user-facing endpoints, setting appropriate alerting thresholds based on SLOs, and regularly reviewing probe results to identify trends. Combined with internal metrics, blackbox monitoring provides complete visibility into application availability and performance.


