Structured Logging Best Practices for Servers

Structured logging replaces unformatted text logs with machine-readable JSON objects that carry consistent fields for timestamps, log levels, correlation IDs, and contextual metadata. This makes logs searchable, aggregatable, and directly ingestible by observability systems, and implementing it consistently across server applications dramatically reduces mean time to resolution when debugging production issues.

Prerequisites

  • Linux server running application workloads
  • Application source code access for implementing structured logging
  • Log aggregation system (Loki, Elasticsearch, Graylog, or similar)
  • Basic understanding of JSON format

JSON Log Format Standards

A well-structured log entry follows consistent field naming:

{
  "timestamp": "2024-01-15T10:30:45.123Z",
  "level": "error",
  "message": "Database connection failed",
  "service": "user-api",
  "version": "2.3.1",
  "environment": "production",
  "host": "web-01.example.com",
  "pid": 12345,
  "trace_id": "4bf92f3577b34da6a3ce929d0e0e4736",
  "span_id": "00f067aa0ba902b7",
  "user_id": "usr_abc123",
  "request_id": "req_def456",
  "error": {
    "type": "ConnectionTimeoutError",
    "message": "Connect timeout after 5000ms",
    "stack": "Error: Connect timeout...\n    at ..."
  },
  "db": {
    "host": "postgres.internal",
    "port": 5432,
    "name": "userdb",
    "operation": "SELECT",
    "duration_ms": 5001
  }
}

Standardize on these field names across all services:

# Required fields for every log entry:
# timestamp  - ISO 8601 UTC (never local time)
# level      - debug/info/warn/error/fatal (lowercase)
# message    - Human-readable description
# service    - Service or application name
# host       - Hostname of the server

# HTTP request fields (for web services):
# http.method, http.path, http.status, http.duration_ms
# http.request_id, http.user_agent, http.remote_addr

# Error fields:
# error.type, error.message, error.stack

# Database fields:
# db.type (mysql/postgres), db.host, db.name, db.operation, db.duration_ms
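A thin logging helper that stamps the required base fields onto every entry keeps services consistent. A minimal sketch in Python (the `log` helper and the hardcoded service name are illustrative, not a specific library's API):

```python
import json
import socket
import sys
from datetime import datetime, timezone

SERVICE = "user-api"  # illustrative service name

def log(level, message, **fields):
    """Emit one single-line JSON log entry carrying the required base fields."""
    entry = {
        # ISO 8601 UTC with millisecond precision, e.g. 2024-01-15T10:30:45.123Z
        "timestamp": datetime.now(timezone.utc)
                             .isoformat(timespec="milliseconds")
                             .replace("+00:00", "Z"),
        "level": level,                 # debug/info/warn/error/fatal, lowercase
        "message": message,             # human-readable description
        "service": SERVICE,
        "host": socket.gethostname(),
    }
    entry.update(fields)                # e.g. http.*, error.*, db.* namespaces
    sys.stdout.write(json.dumps(entry) + "\n")

log("info", "server started", port=8080)
```

Wrapping the standard once, rather than per call site, is what keeps field names from drifting between services.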

Log Levels and When to Use Them

# DEBUG: Detailed diagnostic information
# - Variable values during computation
# - Entering/exiting functions with parameters
# - Cache hits/misses
# - NOT enabled in production (performance impact)

# INFO: Normal operational events
# - Server started/stopped
# - User authenticated
# - Background job started/completed
# - Configuration loaded

# WARN: Unexpected but recoverable situations
# - Deprecated API endpoint called
# - Retry attempt
# - Near resource limits (disk 80% full)
# - Slow query threshold exceeded

# ERROR: Failures that affect functionality
# - Database connection failed
# - External API returned error
# - File not found (when required)
# - Unhandled exceptions

# FATAL: Critical failures requiring immediate shutdown
# - Cannot bind to port
# - Required service unavailable at startup
# - Data corruption detected
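The retry case above shows where the WARN/ERROR boundary sits: each recoverable attempt is a warning, and only exhausting all attempts is an error. A minimal sketch with Python's stdlib logging (the function name and attempt count are illustrative):

```python
import logging

log = logging.getLogger("user-api")

def fetch_with_retries(fetch, attempts=3):
    """WARN on each recoverable retry; ERROR only when every attempt fails."""
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except ConnectionError as exc:
            if attempt < attempts:
                # Recoverable: the next attempt may succeed
                log.warning("retry attempt %d/%d: %s", attempt, attempts, exc)
            else:
                # Functionality is now affected: this is an error
                log.error("all %d attempts failed: %s", attempts, exc)
                raise
```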

Log level configuration by environment:

# Production: INFO and above (no DEBUG)
LOG_LEVEL=info

# Staging: INFO and above, DEBUG for specific services
LOG_LEVEL=info
LOG_LEVEL_AUTH_SERVICE=debug

# Development: DEBUG
LOG_LEVEL=debug

# Changing the log level via systemd (example)
sudo systemctl edit --runtime myapp.service
# [Service]
# Environment="LOG_LEVEL=debug"
sudo systemctl restart myapp.service  # environment changes only apply at process start

# A truly restart-free change requires application support, e.g.:
sudo systemctl kill -s SIGUSR1 myapp.service  # only if the app reloads config on SIGUSR1
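The signal approach only works if the application installs a handler. A minimal sketch with Python's stdlib logging (the SIGUSR1 convention is an assumption, not a standard; note that a running process cannot see environment changes made from outside, so a real handler would typically re-read a config file instead):

```python
import logging
import os
import signal

# Initial level comes from the environment at startup
logging.basicConfig(level=os.getenv("LOG_LEVEL", "info").upper())

def reload_log_level(signum, frame):
    """Re-apply LOG_LEVEL to the root logger without restarting the process."""
    level = os.getenv("LOG_LEVEL", "info").upper()
    logging.getLogger().setLevel(level)
    logging.getLogger().warning("log level reloaded: %s", level)

# Trigger with: kill -USR1 <pid>
signal.signal(signal.SIGUSR1, reload_log_level)
```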

Correlation IDs and Request Tracing

Correlation IDs link log entries across multiple services:

# Nginx: Generate and pass trace IDs in headers
# ($request_id requires nginx 1.11.0+; note log_format and map are only valid
#  at the http level, which is where files in conf.d/ are included)
cat > /etc/nginx/conf.d/trace-headers.conf <<EOF
# Use the client-supplied X-Request-ID if present, otherwise nginx's own
map \$http_x_request_id \$request_id_final {
    default   \$request_id;
    "~*."     \$http_x_request_id;
}

# \$request_time is in seconds with millisecond resolution, hence duration_s
log_format structured escape=json
    '{"timestamp":"\$time_iso8601",'
    '"level":"info",'
    '"message":"HTTP request",'
    '"service":"nginx",'
    '"host":"\$hostname",'
    '"http":{'
        '"method":"\$request_method",'
        '"path":"\$request_uri",'
        '"status":\$status,'
        '"duration_s":\$request_time,'
        '"bytes_sent":\$bytes_sent,'
        '"remote_addr":"\$remote_addr",'
        '"request_id":"\$request_id_final",'
        '"user_agent":"\$http_user_agent"'
    '}}';

server {
    listen 80;
    server_name api.example.com;

    access_log /var/log/nginx/access.json structured;

    location / {
        # Pass the request ID to the upstream and log it
        proxy_set_header X-Request-ID \$request_id_final;
        proxy_pass http://backend;
    }
}
EOF

sudo nginx -t && sudo systemctl reload nginx
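The proxy only covers the first hop; each service must also forward the ID on the calls it makes downstream. A minimal sketch of that propagation (the helper names are illustrative; the header name matches the nginx config above):

```python
import uuid

def incoming_request_id(headers):
    """Extract the request ID set by the proxy, or mint one if absent."""
    return headers.get("X-Request-ID") or str(uuid.uuid4())

def outbound_headers(request_id, extra=None):
    """Headers for downstream HTTP calls, carrying the same correlation ID
    so every service in the call chain logs an identical request_id."""
    headers = {"X-Request-ID": request_id}
    if extra:
        headers.update(extra)
    return headers

# A handler would pass outbound_headers(request_id) to its HTTP client for
# every call it makes while serving the request.
```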

Context Enrichment

Add consistent context to every log entry using middleware:

# Python: Structlog with context enrichment
import structlog
import uuid
from functools import wraps

# Configure structlog for JSON output
structlog.configure(
    processors=[
        structlog.contextvars.merge_contextvars,  # include bound contextvars in output
        structlog.stdlib.filter_by_level,
        structlog.stdlib.add_log_level,
        structlog.stdlib.add_logger_name,
        structlog.processors.TimeStamper(fmt="iso"),
        structlog.processors.StackInfoRenderer(),
        structlog.processors.format_exc_info,
        structlog.processors.UnicodeDecoder(),
        structlog.processors.JSONRenderer()
    ],
    context_class=dict,
    logger_factory=structlog.stdlib.LoggerFactory(),
    wrapper_class=structlog.stdlib.BoundLogger,
    cache_logger_on_first_use=True,
)

log = structlog.get_logger()

# Flask middleware to add request context
from flask import Flask, g, jsonify, request
import os

app = Flask(__name__)

@app.before_request
def before_request():
    g.request_id = request.headers.get('X-Request-ID', str(uuid.uuid4()))
    g.trace_id = request.headers.get('X-Trace-ID', str(uuid.uuid4()))
    # Bind context to all log calls in this request
    structlog.contextvars.bind_contextvars(
        request_id=g.request_id,
        trace_id=g.trace_id,
        service='user-api',
        environment=os.getenv('ENVIRONMENT', 'production'),
        version=os.getenv('APP_VERSION', '1.0.0'),
    )

@app.after_request
def after_request(response):
    structlog.contextvars.clear_contextvars()
    return response

# Usage in route handlers
@app.route('/users/<user_id>')
def get_user(user_id):
    log.info("fetching user", user_id=user_id)
    
    try:
        user = db.get_user(user_id)
        log.info("user found", user_id=user_id, email=user.email)
        return jsonify(user.to_dict())
    except UserNotFoundError:
        log.warning("user not found", user_id=user_id)
        return jsonify({'error': 'Not found'}), 404
    except DatabaseError as e:
        log.error("database error", user_id=user_id, error=str(e), exc_info=True)
        return jsonify({'error': 'Internal error'}), 500

Logging in Different Languages

# Python with structlog - outputs JSON automatically
pip install structlog

import structlog
logger = structlog.get_logger()
logger.info("user.created", user_id="123", email="[email protected]", role="admin")
# Output: {"event":"user.created","user_id":"123","email":"[email protected]","role":"admin","timestamp":"2024-01-15T10:30:45.123Z","level":"info"}

// Node.js with pino - one of the fastest JSON loggers for Node
npm install pino pino-pretty

const pino = require('pino');
const logger = pino({
  level: process.env.LOG_LEVEL || 'info',
  base: {
    service: 'api-service',
    environment: process.env.NODE_ENV,
    version: process.env.npm_package_version,
  },
  formatters: {
    level: (label) => ({ level: label }),
  },
  timestamp: pino.stdTimeFunctions.isoTime,
});

// Child logger with bound context
const requestLogger = logger.child({ 
  request_id: 'req_123',
  user_id: 'usr_456' 
});

requestLogger.info({ http_path: '/api/users', method: 'GET' }, 'incoming request');
// inside a catch (err) handler:
requestLogger.error({ error: err.message, stack: err.stack }, 'request failed');

// Go with zerolog
import (
    "os"
    "time"

    "github.com/rs/zerolog"
    "github.com/rs/zerolog/log"
)

// Configure the global logger (ISO 8601 timestamps, per the field standard above)
zerolog.TimeFieldFormat = time.RFC3339
log.Logger = log.With().
    Str("service", "user-api").
    Str("environment", os.Getenv("ENVIRONMENT")).
    Logger()

// Usage
log.Info().
    Str("request_id", requestID).
    Str("user_id", userID).
    Str("http_path", r.URL.Path).
    Int("http_status", 200).
    Dur("duration_ms", time.Since(start)).
    Msg("request completed")

log.Error().
    Err(err).
    Str("user_id", userID).
    Str("operation", "database_query").
    Msg("database error")

System-Level Structured Logging

Configure system services to output structured logs:

# journald already stores structured metadata; persist it and set retention
mkdir -p /etc/systemd/journald.conf.d
cat > /etc/systemd/journald.conf.d/retention.conf <<EOF
[Journal]
Storage=persistent
MaxRetentionSec=1month
ForwardToSyslog=yes
EOF
sudo systemctl restart systemd-journald

# Read journald logs as JSON
journalctl -o json --since "1 hour ago" | head -5

# Key journald JSON fields:
# SYSLOG_IDENTIFIER - service name
# MESSAGE - log message
# PRIORITY - 0-7 (syslog severity)
# _HOSTNAME - server hostname
# _PID - process ID
# __REALTIME_TIMESTAMP - Unix microseconds
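To line journald output up with the field standard above, the numeric syslog severity must be mapped onto the lowercase level names. A sketch of that conversion (the function and mapping are illustrative, not part of any journald tooling):

```python
from datetime import datetime, timezone

# syslog severities 0-7 mapped onto this document's level names
PRIORITY_TO_LEVEL = {
    0: "fatal", 1: "fatal", 2: "fatal",   # emerg / alert / crit
    3: "error", 4: "warn", 5: "info",     # err / warning / notice
    6: "info", 7: "debug",                # info / debug
}

def journald_to_entry(record):
    """Convert one `journalctl -o json` record into the standard schema."""
    ts_us = int(record["__REALTIME_TIMESTAMP"])  # Unix microseconds
    return {
        "timestamp": datetime.fromtimestamp(ts_us / 1e6, timezone.utc)
                             .isoformat(timespec="milliseconds")
                             .replace("+00:00", "Z"),
        "level": PRIORITY_TO_LEVEL.get(int(record.get("PRIORITY", 6)), "info"),
        "message": record.get("MESSAGE", ""),
        "service": record.get("SYSLOG_IDENTIFIER", "unknown"),
        "host": record.get("_HOSTNAME", ""),
        "pid": int(record["_PID"]) if "_PID" in record else None,
    }
```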

# Ship journald to Loki with Promtail, which reads the journal directly
# (snippet from the Promtail config file, e.g. /etc/promtail/config.yml):
scrape_configs:
  - job_name: journald
    journal:
      max_age: 12h
      labels:
        job: journald
    relabel_configs:
      - source_labels: ['__journal__systemd_unit']
        target_label: unit
      - source_labels: ['__journal_syslog_identifier']
        target_label: service

# Nginx structured logging (already shown above)
# Apache structured logging
cat > /etc/apache2/conf-available/json-log.conf <<EOF
LogFormat "{\"timestamp\":\"%{%Y-%m-%dT%H:%M:%S}t.%{msec_frac}tZ\",\"level\":\"info\",\"service\":\"apache\",\"http\":{\"method\":\"%m\",\"path\":\"%U%q\",\"status\":%>s,\"duration_us\":%D,\"remote_addr\":\"%a\",\"request_id\":\"%{X-Request-ID}i\"}}" json
CustomLog /var/log/apache2/access.json json
EOF
sudo a2enconf json-log && sudo systemctl reload apache2
# Notes: %D is microseconds (hence duration_us), and %{...}t uses server
# local time - run the server in UTC for the trailing "Z" to be truthful

Integration with Aggregation Systems

Configure rsyslog to forward JSON logs:

# rsyslog: Forward JSON logs to Loki via HTTP
cat > /etc/rsyslog.d/50-loki.conf <<EOF
# Load the HTTP output module (shipped in the rsyslog-omhttp package, rsyslog 8.x)
module(load="omhttp")

# Forward every message as its own Loki push request; the template below
# builds a complete push body per message (omhttp also supports batching)
*.* action(
  type="omhttp"
  server="loki.example.com"
  serverport="3100"
  restpath="loki/api/v1/push"
  httpheaderkey="Content-Type"
  httpheadervalue="application/json"
  template="lokiTemplate"
  queue.type="LinkedList"
  queue.size="10000"
  queue.dequeuebatchsize="1000"
  action.resumeRetryCount="-1"
)

template(name="lokiTemplate" type="list") {
  constant(value="{\"streams\":[{\"stream\":{\"host\":\"")
  property(name="hostname")
  constant(value="\",\"job\":\"rsyslog\"},\"values\":[[\"")
  property(name="timegenerated" dateformat="unixtimestamp" date.inUTC="on")
  constant(value="000000000\",\"")
  property(name="msg" format="json")
  constant(value="\"]]}]}")
}
EOF

sudo systemctl restart rsyslog

# Configure Fluentd to parse and forward JSON logs
# (config path varies by packaging, e.g. /etc/fluent/ or /etc/td-agent/)
cat > /etc/fluent/conf.d/parse-json.conf <<EOF
<filter app.**>
  @type parser
  key_name message
  reserve_data true
  remove_key_name_field true
  <parse>
    @type json
  </parse>
</filter>

<match app.**>
  @type elasticsearch
  host elasticsearch.example.com
  port 9200
  logstash_format true
  logstash_prefix app-logs
  include_timestamp true
  flush_interval 5s
</match>
EOF

Troubleshooting

JSON parsing errors in log aggregator:

# Validate that log lines are valid JSON (json.tool parses a single document,
# so feed it one line at a time)
your-app 2>&1 | head -1 | python3 -m json.tool

# Check for multi-line logs breaking JSON parsing
# Solution: Use a single-line JSON format (no pretty-printing)

# Test with jq
your-app 2>&1 | jq -c . | head -5

Missing fields in aggregated logs:

# Verify log schema in Elasticsearch/Loki
curl -s http://elasticsearch:9200/app-logs-*/_mapping | jq '.[] | .mappings.properties | keys'

# Check if fields are being dropped (field limit hit)
# Elasticsearch defaults to 1000 fields per index; the setting only appears
# in _settings if it was explicitly changed
curl -s 'http://elasticsearch:9200/_settings?flat_settings=true' | jq '.[].settings."index.mapping.total_fields.limit"'

Performance impact of logging:

# Use async logging to avoid blocking application threads
# Python example: use asyncio or separate logging thread
# Node.js: pino is already async by design

# Measure logging overhead by comparing full runs at different levels
time your-app --log-level=info
time your-app --log-level=debug

# Limit log throughput with sampling
# Log 1 in 100 debug messages in production
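A counter-based sampler is one simple way to implement that; a sketch (the class name and rate are illustrative):

```python
import itertools

class DebugSampler:
    """Pass every Nth debug message through; drop the rest."""

    def __init__(self, rate=100):
        self.rate = rate
        self._counter = itertools.count()  # thread-safe increment in CPython

    def should_log(self):
        # True on calls 0, rate, 2*rate, ... i.e. 1 in `rate` messages
        return next(self._counter) % self.rate == 0

sampler = DebugSampler(rate=100)
# if sampler.should_log():
#     log.debug("cache miss", key=key)
```

Counter-based sampling is deterministic and cheap; random sampling gives the same average rate without the fixed stride, if bursts aligned with the counter are a concern.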

Conclusion

Structured JSON logging with consistent fields, correlation IDs, and contextual metadata transforms logs from text dumps into queryable, actionable observability data. The key discipline is consistency: every service must use the same field names for the same concepts so that cross-service correlation works in your aggregation system. Start with a base logging library wrapper that automatically adds service name, environment, and request context, and enforce this pattern across your entire stack for maximum observability.