HAProxy Health Checks and Failover
HAProxy provides sophisticated health checking mechanisms to ensure traffic routes only to healthy backend servers. Unlike passive health checking that waits for failures, HAProxy actively probes backend servers, enabling rapid failover and automatic recovery. This guide covers HTTP and TCP health checks, configuration parameters, backup servers, sorry servers, and monitoring strategies.
Table of Contents
- Health Check Overview
- HTTP Health Checks
- TCP Health Checks
- Health Check Parameters
- Advanced Health Checks
- Backup Servers
- Sorry Servers
- Agent Checks
- Persistence During Failover
- Monitoring Health Status
- Troubleshooting
Health Check Overview
Health checks detect unhealthy servers before requests fail. HAProxy supports:
- Active Checks: Proactively send test requests to backends
- HTTP Checks: Verify HTTP response codes and content
- TCP Checks: Verify TCP connectivity
- Agent Checks: Custom agent-based checks
Active health checking enables:
- Immediate detection of failures
- Automatic removal from rotation
- Quick recovery when servers return
- Reduced client-side error rates
HTTP Health Checks
Basic HTTP health check configuration:
cat > /etc/haproxy/haproxy.cfg <<'EOF'
global
log stdout local0
stats socket /run/haproxy/admin.sock mode 660 level admin
defaults
mode http
timeout connect 5000
timeout client 50000
timeout server 50000
frontend web_in
bind *:80
default_backend web_servers
backend web_servers
balance roundrobin
option httpchk GET /health HTTP/1.1\r\nHost:\ example.com
server web1 192.168.1.100:8000 check
server web2 192.168.1.101:8000 check
server web3 192.168.1.102:8000 check
EOF
The option httpchk directive turns each health check into an HTTP request (here GET /health, with the Host header appended via \r\n) instead of a bare connection test. Reload HAProxy to apply the configuration:
sudo systemctl reload haproxy
Monitor health status (in the show stat CSV output, fields 1, 2, and 18 are the proxy name, server name, and status):
echo "show stat" | socat - /run/haproxy/admin.sock | cut -d, -f1,2,18
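The show stat output is plain CSV, so a small awk filter can reduce it to one status line per server. A sketch, using an illustrative (not captured) sample and assuming the standard column layout where field 18 is the status:

```shell
# Reduce 'show stat' CSV to "proxy/server: STATUS" lines.
# Field 1 = pxname, field 2 = svname, field 18 = status.
parse_status() { awk -F, '!/^#/ && NF >= 18 {print $1 "/" $2 ": " $18}'; }

# Illustrative sample, not captured from a live instance:
parse_status <<'CSV'
# pxname,svname,qcur,qmax,scur,smax,slim,stot,bin,bout,dreq,dresp,ereq,econ,eresp,wretr,wredis,status
web_servers,web1,0,0,0,0,,10,0,0,0,0,0,0,0,0,0,UP
web_servers,web2,0,0,0,0,,10,0,0,0,0,0,0,0,0,0,DOWN
CSV
```

In practice you would pipe the socat command from above into parse_status instead of a here-document.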
TCP Health Checks
Use TCP checks for non-HTTP services:
backend database_servers
balance roundrobin
option tcp-check
tcp-check connect port 5432
server db1 192.168.1.150:5432 check
server db2 192.168.1.151:5432 check
TCP checks verify only that the port accepts connections; they perform no application-level validation.
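What such a check does can be reproduced by hand. A minimal sketch, assuming bash and coreutils timeout are available and using a 2-second budget similar to a typical check timeout:

```shell
# Minimal stand-in for a TCP health check: succeed if the port accepts a connection.
# Usage: tcp_up HOST PORT
tcp_up() {
  timeout 2 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

tcp_up 127.0.0.1 5432 && echo "db1 up" || echo "db1 down"
```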
Tune the connect timeout for TCP checks with the timeout check directive (tcp-check connect itself accepts no timeout option):
backend cache_servers
balance roundrobin
option tcp-check
timeout check 2s
tcp-check connect port 6379
server redis1 192.168.1.160:6379 check
server redis2 192.168.1.161:6379 check
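TCP checks can also exchange payloads instead of merely connecting. For Redis, a common pattern is to send PING and expect +PONG; this is a sketch (the backend name redis_servers is illustrative, and you would need extra tcp-check send steps if your instance requires AUTH):

```
backend redis_servers
balance roundrobin
option tcp-check
tcp-check connect port 6379
tcp-check send PING\r\n
tcp-check expect string +PONG
server redis1 192.168.1.160:6379 check
server redis2 192.168.1.161:6379 check
```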
Health Check Parameters
Fine-tune health check behavior with specific parameters:
backend api_servers
balance roundrobin
option httpchk GET /api/health HTTP/1.1\r\nHost:\ api.example.com
server api1 192.168.1.100:8080 check inter 2000 fall 3 rise 2 weight 1
server api2 192.168.1.101:8080 check inter 2000 fall 3 rise 2 weight 1
server api3 192.168.1.102:8080 check inter 2000 fall 3 rise 2 weight 1 backup
Parameter explanations:
- check: enable health checking
- inter 2000: check interval in milliseconds (default 2000)
- fall 3: mark the server down after 3 consecutive failed checks
- rise 2: mark the server up after 2 consecutive successful checks
- weight 1: server weight for load balancing
- backup: use the server only when all primary servers are down
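These parameters determine how quickly failover happens. A back-of-the-envelope sketch (worst case; the final failed probe also waits up to the check timeout):

```shell
# Worst-case detection/recovery time derived from inter, fall, and rise.
inter_ms=2000; fall=3; rise=2
echo "marked down after ~$(( inter_ms * fall / 1000 ))s of consecutive failures"
echo "marked up after ~$(( inter_ms * rise / 1000 ))s of consecutive successes"
```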
Advanced Health Checks
Validate HTTP response status codes:
backend web_servers
balance roundrobin
option httpchk GET /health HTTP/1.1\r\nHost:\ example.com
http-check expect status 200
server web1 192.168.1.100:8000 check
server web2 192.168.1.101:8000 check
Check a specific response header (the name/value form requires HAProxy 2.2 or later):
backend api_servers
option httpchk GET /status HTTP/1.1\r\nHost:\ api.example.com
http-check expect status 200
http-check expect header name "Content-Type" value "application/json"
server api1 192.168.1.110:8080 check
server api2 192.168.1.111:8080 check
Validate response body content (the string match performs a substring search in the body):
backend web_servers
option httpchk GET /health HTTP/1.1\r\nHost:\ example.com
http-check expect status 200
http-check expect string OK
server web1 192.168.1.100:8000 check
server web2 192.168.1.101:8000 check
Negate a match for more complex logic: with rstring, the check fails whenever the body matches the regular expression (here, any body containing "error"). For fully scripted checks, HAProxy also provides the external-check directive.
backend dynamic_servers
option httpchk GET /health HTTP/1.1\r\nHost:\ example.com
http-check expect status 200
http-check expect ! rstring error
server srv1 192.168.1.100:8000 check
server srv2 192.168.1.101:8000 check
Backup Servers
Designate backup servers as failover targets:
backend web_servers
balance roundrobin
option httpchk GET /health HTTP/1.1\r\nHost:\ example.com
http-check expect status 200
# Primary servers
server web1 192.168.1.100:8000 check inter 2000 fall 3 rise 2
server web2 192.168.1.101:8000 check inter 2000 fall 3 rise 2
# Backup servers (used only if all primaries are down)
server web3 192.168.1.102:8000 check inter 2000 fall 3 rise 2 backup
server web4 192.168.1.103:8000 check inter 2000 fall 3 rise 2 backup
When all primary servers fail, HAProxy routes to backup servers.
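By default, when several backup servers are defined, HAProxy sends traffic only to the first available one. To load-balance across all backups once the primaries are down, enable option allbackups:

```
backend web_servers
balance roundrobin
option allbackups
```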
Sorry Servers
A "sorry server" displays a maintenance message when all real backends are unavailable:
backend web_servers
balance roundrobin
option httpchk GET /health HTTP/1.1\r\nHost:\ example.com
http-check expect status 200
server web1 192.168.1.100:8000 check
server web2 192.168.1.101:8000 check
server sorry_server 127.0.0.1:8888 backup
listen sorry_backend
bind 127.0.0.1:8888
mode http
errorfile 503 /etc/haproxy/sorry.http
Because this listener defines no servers, every request to it fails with a 503, and HAProxy serves the custom error page.
Create /etc/haproxy/sorry.http. An errorfile must be a complete raw HTTP response: status line, headers, a blank line, then the body. Like HAProxy's stock error files, this one uses HTTP/1.0, omits Content-Length, and closes the connection:
HTTP/1.0 503 Service Unavailable
Cache-Control: no-cache
Connection: close
Content-Type: text/html; charset=utf-8

<!DOCTYPE html>
<html>
<head>
<title>Maintenance</title>
<style>
body { font-family: Arial, sans-serif; text-align: center; padding: 50px; }
h1 { color: #333; }
</style>
</head>
<body>
<h1>Service Unavailable</h1>
<p>We are currently performing maintenance.</p>
<p>Please try again later.</p>
</body>
</html>
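If you do include a Content-Length header in an errorfile, it must match the body's byte count exactly. A small awk helper can compute it; this is a sketch (the function name is illustrative) that counts every byte after the first blank line and assumes the file ends with a newline:

```shell
# Print the body length in bytes (everything after the header/body blank line).
errorfile_body_length() {
  awk 'body { n += length($0) + 1 } /^\r?$/ { body = 1 } END { print n + 0 }' "$1"
}
```

Because length($0) excludes the trailing newline but keeps a carriage return, the count is correct for both LF and CRLF files.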
Agent Checks
Use HAProxy agent checks for more sophisticated health determination. The agent protocol is plain TCP, not HTTP: HAProxy connects to the agent port and reads a single ASCII line, either a weight such as 100% or a keyword such as up, down, drain, or maint. Deploy a small agent on each backend server:
cat > /usr/local/bin/haproxy-agent.py <<'EOF'
#!/usr/bin/env python3
"""HAProxy agent-check responder: one ASCII reply line per TCP connection."""
import socketserver

def check_health():
    # Implement custom health logic here (load average, disk, app state, ...)
    return {'healthy': True}

class AgentHandler(socketserver.BaseRequestHandler):
    def handle(self):
        # HAProxy reads a single line and closes; no HTTP is involved
        status = check_health()
        reply = b'100%\n' if status['healthy'] else b'down\n'
        self.request.sendall(reply)

if __name__ == '__main__':
    # Bind on all interfaces so HAProxy can reach the agent from the LB host
    socketserver.ThreadingTCPServer.allow_reuse_address = True
    with socketserver.ThreadingTCPServer(('0.0.0.0', 5555), AgentHandler) as server:
        server.serve_forever()
EOF
chmod +x /usr/local/bin/haproxy-agent.py
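The agent must be running whenever HAProxy probes it. A minimal systemd unit keeps it alive (a sketch: the unit name haproxy-agent.service is an assumption). Save it as /etc/systemd/system/haproxy-agent.service, then run systemctl daemon-reload and systemctl enable --now haproxy-agent:

```
[Unit]
Description=HAProxy agent check responder
After=network.target

[Service]
ExecStart=/usr/local/bin/haproxy-agent.py
Restart=always

[Install]
WantedBy=multi-user.target
```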
Configure HAProxy agent check:
backend api_servers
balance roundrobin
option httpchk GET /health HTTP/1.1\r\nHost:\ api.example.com
# Agent check for dynamic weight adjustment
server api1 192.168.1.100:8080 check agent-check agent-port 5555
server api2 192.168.1.101:8080 check agent-check agent-port 5555
The agent reply adjusts the server dynamically: a percentage such as 50% scales the configured weight, while keywords like down, drain, and maint change the server's administrative state, all without reloading HAProxy.
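The probing interval can be tuned per server with agent-inter (a standard server option; the 5s value here is illustrative):

```
server api1 192.168.1.100:8080 check agent-check agent-port 5555 agent-inter 5s
```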
Persistence During Failover
Maintain session persistence even when servers fail:
backend api_servers
balance roundrobin
option httpchk GET /health HTTP/1.1\r\nHost:\ api.example.com
http-check expect status 200
stick-table type string len 32 size 100k expire 30m
stick on req.cook(JSESSIONID)
stick on src if !{ req.hdr(Authorization) -m found }
server api1 192.168.1.100:8080 check
server api2 192.168.1.101:8080 check
server api3 192.168.1.102:8080 check
When a server holding a client's sticky session fails, HAProxy:
- Marks the server down
- Load-balances the client's next request to a healthy server
- Stores the new mapping in the stick table, so subsequent requests stick to the replacement server
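Pair persistence with option redispatch so that a client pinned to a server that just died is redistributed to a live one instead of receiving an error; retries controls how many connection attempts are made first:

```
backend api_servers
option redispatch
retries 3
```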
Monitoring Health Status
Use the stats page to monitor health:
listen stats
bind *:8404
mode http
stats enable
stats uri /stats
stats refresh 5s
stats show-legends
Access stats:
curl http://localhost:8404/stats
Extract health information via the admin socket (show servers state dumps per-server operational state; show backend lists the configured backends):
echo "show servers state" | socat - /run/haproxy/admin.sock
echo "show backend" | socat - /run/haproxy/admin.sock
Monitor specific backend:
watch -n 1 'echo "show stat" | socat - /run/haproxy/admin.sock | grep "api_servers"'
Troubleshooting
Check if health checks are running:
sudo tcpdump -i any -n "port 8000 and (tcp[tcpflags] & tcp-syn) != 0"
Verify health check connectivity manually:
curl -v http://192.168.1.100:8000/health
Test HTTP health check response:
curl -v "http://192.168.1.100:8000/health" \
-H "Host: example.com"
Check HAProxy logs for health check failures:
tail -f /var/log/haproxy.log | grep -i "health\|down\|up"
Review HAProxy configuration:
haproxy -f /etc/haproxy/haproxy.cfg -c
Monitor server state changes:
sudo journalctl -u haproxy -f | grep -i "server\|health"
Increase logging detail:
global
log stdout local0 debug
Reload and test:
sudo systemctl reload haproxy
curl http://localhost/test
Conclusion
HAProxy's robust health checking and failover mechanisms ensure high availability and reliability. By actively monitoring backend health, quickly detecting failures, and managing failover through backup and sorry servers, HAProxy maintains service availability even during infrastructure issues. Combined with sticky sessions and sophisticated check parameters, HAProxy provides production-grade resilience for critical applications.