Benchmarking with wrk and siege

Introduction

While Apache Bench (ab) is excellent for simple benchmarking, wrk and siege offer advanced features for more realistic and flexible load testing. Wrk provides Lua-scriptable tests with multi-threaded performance, while siege supports URL lists and complex test scenarios. Both tools are essential for thorough web server performance testing and capacity planning.

Wrk is a modern HTTP benchmarking tool capable of generating significant load from a single multi-core machine. It combines multi-threading with an event-based design (epoll/kqueue) to handle large numbers of connections efficiently. Siege specializes in realistic testing, with support for multiple URLs, random delays, and transaction-based tests that simulate real user behavior.

This guide covers installation, usage, scripting, result interpretation, and real-world test scenarios for wrk and siege. You will learn how to run advanced performance tests that accurately represent production workloads.

wrk: Modern HTTP Benchmarking

Installation

# Ubuntu/Debian - compile from source
apt-get install build-essential libssl-dev git -y
git clone https://github.com/wg/wrk.git
cd wrk
make
sudo cp wrk /usr/local/bin/

# Verify installation
wrk --version

Basic Usage

# Basic syntax
wrk -t<threads> -c<connections> -d<duration> URL

# Simple test: 12 threads, 400 connections, 30 seconds
wrk -t12 -c400 -d30s http://localhost/

# Example output:
Running 30s test @ http://localhost/
  12 threads and 400 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    42.15ms   18.34ms  312.45ms   87.23%
    Req/Sec   785.43     124.56     1.23k    68.45%
  282450 requests in 30.02s, 1.45GB read
Requests/sec:   9408.23
Transfer/sec:     49.56MB

Understanding wrk Output

# Thread Stats:
Latency    42.15ms   18.34ms  312.45ms   87.23%
# Avg: Average latency (42.15ms)
# Stdev: Standard deviation (18.34ms) - consistency measure
# Max: Maximum latency (312.45ms)
# +/- Stdev: 87.23% of requests within 1 standard deviation

Req/Sec   785.43     124.56     1.23k    68.45%
# Avg: Requests per second per thread (785)
# Stdev: Variation (124.56)
# Max: Peak requests per second (1,230)
# +/- Stdev: Consistency percentage

# Summary:
282450 requests in 30.02s, 1.45GB read
# Total requests completed
# Test duration
# Total data transferred

Requests/sec:   9408.23
# Overall throughput (most important metric)

Transfer/sec:     49.56MB
# Bandwidth used
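These summary figures can be cross-checked against each other: throughput multiplied by duration should land close to the reported request total. A quick check with the numbers above (any calculator works; awk is used here):

```shell
# Requests/sec × test duration should approximate the request total
awk 'BEGIN { printf "%.0f\n", 9408.23 * 30.02 }'
```

This prints 282435, within rounding of the 282450 requests reported (wrk rounds the per-second figure).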

Advanced Options

# Custom HTTP headers
wrk -t12 -c400 -d30s -H "Authorization: Bearer token123" http://localhost/api/

# Multiple headers
wrk -t12 -c400 -d30s \
    -H "Authorization: Bearer token123" \
    -H "Accept: application/json" \
    -H "User-Agent: wrk-benchmark" \
    http://localhost/api/

# Timeout configuration
wrk -t12 -c400 -d30s --timeout 10s http://localhost/

# Latency distribution
wrk -t12 -c400 -d30s --latency http://localhost/

# Output with latency:
  Latency Distribution
     50%   38.24ms
     75%   52.67ms
     90%   68.45ms
     99%  145.23ms

Lua Scripting with wrk

Basic Script Structure

-- request.lua
-- Customize request generation

wrk.method = "POST"
wrk.body   = '{"name":"test","email":"test@example.com"}'
wrk.headers["Content-Type"] = "application/json"

-- Request function (called for each request)
request = function()
    return wrk.format(wrk.method, wrk.path, wrk.headers, wrk.body)
end

-- Response function (called for each response)
response = function(status, headers, body)
    if status ~= 200 then
        print("Error: " .. status)
    end
end

-- Done function (called once when the test completes)
done = function(summary, latency, requests)
    print("Total requests: " .. summary.requests)
    print("Total errors: " .. (summary.errors.connect + summary.errors.read
        + summary.errors.write + summary.errors.timeout))
end

# Run with script
wrk -t12 -c400 -d30s -s request.lua http://localhost/api/users

Advanced Script: Multiple Endpoints

-- multi-url.lua
-- Test multiple endpoints randomly

local urls = {
    "/",
    "/products",
    "/products/1",
    "/api/users",
    "/api/orders"
}

-- wrk.path is not updated by wrk.format, so record each request's
-- path ourselves (shared per-thread across that thread's connections)
local current_path = "/"
local responses = {}

request = function()
    -- Select random URL
    current_path = urls[math.random(#urls)]
    return wrk.format("GET", current_path)
end

-- Track response counts by path
response = function(status, headers, body)
    local path = current_path
    if not responses[path] then
        responses[path] = {count = 0, errors = 0}
    end

    responses[path].count = responses[path].count + 1
    if status ~= 200 then
        responses[path].errors = responses[path].errors + 1
    end
end

-- Note: responses is per-thread state and done() runs in a separate
-- Lua context; run with -t1 (or aggregate via setup()/thread:get())
-- for exact totals
done = function(summary, latency, requests)
    io.write("Path Statistics:\n")
    for path, stats in pairs(responses) do
        io.write(string.format("  %s: %d requests, %d errors (%.2f%%)\n",
            path, stats.count, stats.errors,
            (stats.errors / stats.count) * 100))
    end
end

Authentication Script

-- auth.lua
-- Test with authentication

local token = nil

-- Setup function (called once before test)
setup = function(thread)
    thread:set("token", "Bearer abc123xyz789")
end

-- Init function (called for each thread)
init = function(args)
    token = args.token
end

request = function()
    wrk.headers["Authorization"] = token
    return wrk.format("GET", wrk.path)
end

POST with Dynamic Data

-- dynamic-post.lua
-- Generate unique POST data for each request

-- counter is per-thread: with -t12, each thread produces its own
-- sequence, so ids repeat across threads
local counter = 0

request = function()
    counter = counter + 1

    local body = string.format([[
        {"id": %d, "name": "User%d", "email": "user%d@example.com"}
    ]], counter, counter, counter)

    return wrk.format("POST", "/api/users", nil, body)
end

siege: Transaction-Based Benchmarking

Installation

# Ubuntu/Debian
apt-get install siege -y

# CentOS/Rocky Linux
dnf install siege -y

# From source
wget http://download.joedog.org/siege/siege-latest.tar.gz
tar -xzf siege-latest.tar.gz
cd siege-*/
./configure
make
sudo make install

# Verify installation
siege --version

Basic Usage

# Simple test: 25 concurrent users, 100 repetitions
siege -c25 -r100 http://localhost/

# Timed test: 25 concurrent users, 60 seconds
siege -c25 -t60s http://localhost/

# Example output:
Transactions:                  2450 hits
Availability:                 98.00 %
Elapsed time:                 59.87 secs
Data transferred:             12.45 MB
Response time:                 0.61 secs
Transaction rate:             40.93 trans/sec
Throughput:                    0.21 MB/sec
Concurrency:                  24.89
Successful transactions:      2450
Failed transactions:            50
Longest transaction:           3.45
Shortest transaction:          0.12

Understanding siege Output

Transactions:                  2450 hits
# Total successful requests

Availability:                 98.00 %
# Success rate (should be > 99%)

Response time:                 0.61 secs
# Average response time (lower is better)

Transaction rate:             40.93 trans/sec
# Throughput (requests per second)

Concurrency:                  24.89
# Average concurrent connections
# Should be close to -c value (25 in example)
# Lower = waiting for server responses

Longest transaction:           3.45
Shortest transaction:          0.12
# Response time range
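The concurrency figure follows Little's law: average concurrency ≈ transaction rate × average response time. Checking the sample output above:

```shell
# Little's law: concurrency ≈ transaction rate × response time
awk 'BEGIN { printf "%.2f\n", 40.93 * 0.61 }'
```

This prints 24.97, consistent with the reported concurrency of 24.89 once rounding in the reported figures is accounted for.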

Multiple URLs (URL Lists)

# Create URL list file
cat > urls.txt << 'EOF'
http://localhost/
http://localhost/products
http://localhost/about
http://localhost/contact
http://localhost/api/users
EOF

# Test URLs sequentially (default with -f)
siege -c50 -t60s -f urls.txt

# Test URLs randomly (internet mode)
siege -c50 -t60s -i -f urls.txt

# -i: Internet mode (random URL selection)
# Without -i: Sequential access

POST Requests

# Simple POST
siege -c10 -r50 "http://localhost/api/users POST {\"name\":\"test\"}"

# POST with URL list
cat > post-urls.txt << 'EOF'
http://localhost/api/users POST {"name":"User1","email":"user1@example.com"}
http://localhost/api/products POST {"name":"Product1","price":29.99}
http://localhost/api/orders POST {"user_id":1,"product_id":1}
EOF

siege -c25 -t60s -f post-urls.txt

Headers and Authentication

# Custom headers
siege -c25 -t30s \
    -H "Authorization: Bearer token123" \
    -H "Accept: application/json" \
    http://localhost/api/

# From configuration file
# Edit ~/.siege/siege.conf or /etc/siege/siege.conf
# Add:
# header = Authorization: Bearer token123
# header = Accept: application/json

Configuration Options

# View configuration
siege --config

# Common settings in siege.conf:
# connection = close          # or: connection = keep-alive
# timeout = 30                # Socket timeout
# failures = 1024             # Failures before giving up
# delay = 0                   # Delay between requests (seconds)
# chunked = true              # Handle chunked encoding
# verbose = false             # Verbose output
# show-logfile = true         # Show log file location
# logging = true              # Enable logging

Real-World Test Scenarios

Scenario 1: Before/After Optimization Comparison

wrk test:

#!/bin/bash
# compare-wrk.sh

URL="http://localhost/"

echo "=== BEFORE Optimization ==="
wrk -t12 -c400 -d30s --latency $URL | tee before-wrk.txt

read -p "Apply optimizations, then press Enter to continue..."

echo
echo "=== AFTER Optimization ==="
wrk -t12 -c400 -d30s --latency $URL | tee after-wrk.txt

# Extract key metrics
echo
echo "=== Comparison ==="
echo "Before:"
grep "Requests/sec" before-wrk.txt
grep "50%" before-wrk.txt
echo "After:"
grep "Requests/sec" after-wrk.txt
grep "50%" after-wrk.txt

Results:

BEFORE Optimization:
Requests/sec:   2,340.12
50%   145.23ms

AFTER Optimization:
Requests/sec:   8,920.45 (281% improvement)
50%   38.67ms (73% faster)

siege test:

#!/bin/bash
# compare-siege.sh

CONFIG="-c100 -t60s -i"
URLS="urls.txt"

echo "=== BEFORE Optimization ==="
siege $CONFIG -f $URLS 2>&1 | tee before-siege.txt

read -p "Apply optimizations, then press Enter..."

echo
echo "=== AFTER Optimization ==="
siege $CONFIG -f $URLS 2>&1 | tee after-siege.txt

echo
echo "=== Comparison ==="
echo "Before:"
grep "Transaction rate\|Response time\|Availability" before-siege.txt
echo "After:"
grep "Transaction rate\|Response time\|Availability" after-siege.txt

Scenario 2: Capacity Planning

#!/bin/bash
# capacity-test.sh

URL="http://localhost/"

echo "Testing capacity with increasing concurrency..."
echo "Concurrency,Requests/sec,Latency_p50,Latency_p99" > capacity-results.csv

for concurrency in 10 50 100 200 400 800 1600; do
    echo "Testing concurrency: $concurrency"

    result=$(wrk -t12 -c$concurrency -d30s --latency $URL 2>&1)
    rps=$(echo "$result" | grep "Requests/sec" | awk '{print $2}')
    p50=$(echo "$result" | grep "50%" | awk '{print $2}')
    p99=$(echo "$result" | grep "99%" | awk '{print $2}')

    echo "$concurrency,$rps,$p50,$p99" >> capacity-results.csv

    sleep 10  # Cool down between tests
done

echo "Results saved to capacity-results.csv"
cat capacity-results.csv

# Analyze results to find optimal concurrency

Sample Results:

Concurrency,Requests/sec,Latency_p50,Latency_p99
10,1250.23,8.45ms,24.12ms
50,4580.45,11.23ms,32.45ms
100,7840.12,13.67ms,45.23ms
200,9420.67,21.45ms,78.34ms
400,9680.34,41.23ms,156.78ms  <- Peak performance
800,8920.45,89.67ms,345.23ms  <- Degradation starts
1600,6450.23,245.78ms,890.45ms <- Severe degradation

Conclusion: Optimal concurrency around 400-500
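The peak row can be picked out of capacity-results.csv mechanically rather than by eye. A sketch that reports the concurrency level with the highest throughput:

```shell
# Report the concurrency level with the highest Requests/sec
awk -F, 'NR > 1 && $2+0 > best { best = $2+0; row = $1 }
         END { print "Peak at concurrency " row ": " best " req/s" }' capacity-results.csv
```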

Scenario 3: Sustained Load Test

Sustained test with wrk:

# 10-minute sustained load test
wrk -t12 -c400 -d600s --latency http://localhost/

# Monitor during test (separate terminal):
watch -n 5 '
    echo "=== System Resources ==="
    mpstat 1 1 | tail -2
    echo
    free -h
    echo
    ss -s
'

# Look for:
# - Memory leaks (memory usage increasing)
# - Connection exhaustion (connections growing)
# - Performance degradation (response time increasing)

Sustained test with siege:

# 15-minute test with realistic delays
siege -c100 -t900s -i -d1 -f urls.txt

# -d1: each simulated user pauses a random 0-1 seconds between requests
# Simulates real user behavior

# Check log file for detailed transaction data
cat /var/log/siege.log

Scenario 4: API Stress Test

wrk with POST data:

-- api-stress.lua
local requests = 0
local errors = 0

request = function()
    requests = requests + 1

    local bodies = {
        '{"action":"create","data":{"name":"Item1"}}',
        '{"action":"update","data":{"id":1,"name":"Updated"}}',
        '{"action":"delete","data":{"id":2}}',
        '{"action":"list","page":1,"limit":10}'
    }

    local body = bodies[math.random(#bodies)]

    return wrk.format("POST", "/api/actions", {["Content-Type"] = "application/json"}, body)
end

response = function(status, headers, body)
    if status ~= 200 and status ~= 201 then
        errors = errors + 1
        print("Error: " .. status .. " after " .. requests .. " requests")
    end
end

-- Note: errors is per-thread state and done() runs in a separate
-- Lua context; the totals below are only exact with -t1
done = function(summary, latency, requests)
    print("Total requests: " .. summary.requests)
    print("Total errors: " .. errors)
    print("Error rate: " .. string.format("%.2f%%", (errors / summary.requests) * 100))
end

# Run with script
wrk -t12 -c500 -d300s -s api-stress.lua http://localhost/

Scenario 5: Database Connection Pool Test

-- db-pool-test.lua
-- Test database connection handling
-- Assumes the application reports query time in ms via an
-- X-Response-Time response header

local slow_queries = 0  -- per-thread; exact totals in done() need -t1

response = function(status, headers, body)
    local latency = tonumber(headers["X-Response-Time"] or "0") or 0

    if latency > 1000 then  -- Queries > 1 second
        slow_queries = slow_queries + 1
    end
end

done = function(summary, latency, requests)
    print("\nDatabase Connection Pool Analysis:")
    print("Total queries: " .. summary.requests)
    print("Slow queries (>1s): " .. slow_queries)
    print("Slow query rate: " .. string.format("%.2f%%", (slow_queries / summary.requests) * 100))

    if slow_queries / summary.requests > 0.05 then
        print("\nWARNING: > 5% slow queries. Increase connection pool size.")
    end
end

Performance Comparison: ab vs wrk vs siege

#!/bin/bash
# tool-comparison.sh

URL="http://localhost/"
DURATION="30s"
CONCURRENCY=200

echo "=== Tool Comparison Test ==="
echo "URL: $URL"
echo "Duration: $DURATION"
echo "Concurrency: $CONCURRENCY"
echo

# Apache Bench
echo "1. Apache Bench:"
ab -n 10000 -c $CONCURRENCY -k $URL 2>&1 | grep -E "Requests per second|Time per request"
echo

# wrk
echo "2. wrk:"
wrk -t12 -c$CONCURRENCY -d$DURATION $URL 2>&1 | grep "Requests/sec"
echo

# siege
echo "3. siege:"
siege -c$CONCURRENCY -t$DURATION $URL 2>&1 | grep -E "Transaction rate|Response time"
echo

Typical results (same server, same test):

1. Apache Bench:
Requests per second: 8,450.23
Time per request: 23.67ms

2. wrk:
Requests/sec: 9,120.45

3. siege:
Transaction rate: 8,890.12 trans/sec
Response time: 0.02 secs

Observations:
- wrk: Highest throughput (multi-threaded efficiency)
- ab: Good baseline, widely available
- siege: Best for complex scenarios and URL lists

Monitoring and Analysis

Real-Time Monitoring During Tests

#!/bin/bash
# monitor-during-test.sh

# Start monitoring in background
(
    while true; do
        echo "$(date '+%H:%M:%S') $(free -m | awk 'NR==2{print $3"MB"}') $(mpstat 1 1 | awk 'NR==4{print $3"%"}') $(ss -s | grep TCP: | awk '{print $2}')"
        sleep 5
    done
) > monitor.log &
MONITOR_PID=$!

# Run test
echo "Starting test..."
wrk -t12 -c400 -d120s --latency http://localhost/ > test-results.txt

# Stop monitoring
kill $MONITOR_PID

# Analyze monitor log
echo
echo "=== Resource Usage During Test ==="
awk '{sum1+=$2; sum2+=$3; sum3+=$4; count++} END {print "Avg Memory:", sum1/count, "Avg CPU:", sum2/count"%", "Avg Connections:", sum3/count}' monitor.log

Analyzing Results

#!/bin/bash
# analyze-results.sh

RESULTS="$1"

# Extract key metrics (expects wrk output saved with --latency)
RPS=$(grep "Requests/sec" "$RESULTS" | awk '{print $2}')
LATENCY_AVG=$(awk '/Latency +[0-9]/{print $2; exit}' "$RESULTS")
LATENCY_P50=$(grep "50%" "$RESULTS" | awk '{print $2}')
LATENCY_P99=$(grep "99%" "$RESULTS" | awk '{print $2}')

# Generate report
cat << EOF
=== Performance Analysis ===

Throughput: $RPS requests/sec

Latency Analysis:
- Average: $LATENCY_AVG
- 50th percentile (median): $LATENCY_P50
- 99th percentile: $LATENCY_P99

Performance Rating:
EOF

# Rate performance
if (( $(echo "$RPS > 10000" | bc -l) )); then
    echo "- Excellent throughput (>10k req/s)"
elif (( $(echo "$RPS > 5000" | bc -l) )); then
    echo "- Good throughput (5k-10k req/s)"
elif (( $(echo "$RPS > 1000" | bc -l) )); then
    echo "- Moderate throughput (1k-5k req/s)"
else
    echo "- Low throughput (<1k req/s) - needs optimization"
fi

Best Practices

1. Realistic Test Configuration

# Bad: Unrealistic concurrency
wrk -t1 -c10000 -d30s http://localhost/
# One thread can't handle 10,000 connections efficiently

# Good: Threads = CPU cores, reasonable concurrency
wrk -t12 -c400 -d30s http://localhost/
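
Rather than hard-coding 12, the thread count can be derived from the machine (a sketch; assumes `nproc` from GNU coreutils is available):

```shell
# Match wrk threads to the number of available CPU cores
THREADS=$(nproc)
wrk -t"$THREADS" -c400 -d30s http://localhost/
```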

2. Warm-Up Period

# Warm up caches and JIT
echo "Warming up..."
wrk -t4 -c100 -d10s http://localhost/ > /dev/null

echo "Running actual test..."
wrk -t12 -c400 -d60s --latency http://localhost/

3. Progressive Load Testing

#!/bin/bash
# progressive-load.sh

for concurrency in 50 100 200 400 800; do
    echo "Testing with $concurrency concurrent connections..."
    wrk -t12 -c$concurrency -d30s http://localhost/ | grep "Requests/sec"

    # Cool down
    sleep 30
done

4. Multiple Test Runs

# Run 5 times and average
for i in {1..5}; do
    echo "Run $i:"
    wrk -t12 -c400 -d30s http://localhost/ | grep "Requests/sec"
done
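
To reduce the five runs to one number, capture the Requests/sec lines and average them (a sketch; runs.txt is an assumed scratch file):

```shell
# Collect throughput from each run, then average with awk
for i in {1..5}; do
    wrk -t12 -c400 -d30s http://localhost/ | grep "Requests/sec"
done | tee runs.txt

awk '/Requests\/sec/ { sum += $2; n++ }
     END { if (n) printf "Average: %.2f req/s over %d runs\n", sum/n, n }' runs.txt
```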

5. Document the Test Environment

# Document system specs
cat << EOF > test-environment.txt
Test Date: $(date)
Server: $(uname -n)
CPU: $(lscpu | grep "Model name" | cut -d: -f2)
RAM: $(free -h | grep Mem | awk '{print $2}')
Disk: $(df -h / | tail -1 | awk '{print $2}')
Network: $(ethtool eth0 2>/dev/null | grep Speed | cut -d: -f2)
OS: $(cat /etc/os-release | grep PRETTY_NAME | cut -d= -f2)

Web Server: $(nginx -v 2>&1 || apache2 -v 2>&1 | head -1)
PHP Version: $(php -v | head -1)
EOF

Conclusion

wrk and siege are powerful load-testing tools that offer advanced features beyond Apache Bench:

wrk advantages:

  • Multi-threaded performance
  • Lua scripting for complex scenarios
  • Detailed latency distribution
  • High throughput capacity
  • Low resource usage

siege advantages:

  • Multiple URL support
  • Transaction-based testing
  • Realistic user simulation
  • Simple URL list files
  • Internet mode (random URLs)

Use cases:

  • wrk: Peak performance testing, API load testing, scripted scenarios
  • siege: Realistic user behavior, multi-endpoint testing, availability testing
  • Both: A complete testing strategy

Key metrics to track:

  • Throughput (requests/sec)
  • Latency (p50, p95, p99)
  • Availability (success rate)
  • Resource usage (CPU, memory)
  • Error rate

By combining wrk's raw performance testing with siege's realistic scenario simulation, you can comprehensively evaluate and optimize your web server's performance for production workloads.