Graylog Installation and Configuration

Graylog is a powerful centralized log management platform that aggregates, indexes, and analyzes log data from multiple sources, using Elasticsearch or OpenSearch for log storage and MongoDB for configuration data. With built-in support for syslog, GELF, and Beats inputs, plus streams, pipelines, and dashboards, Graylog provides a complete solution for centralized log management at scale.

Prerequisites

  • Ubuntu 22.04/20.04 or CentOS/Rocky Linux 8+
  • Minimum 4 GB RAM (8 GB+ recommended for production)
  • Minimum 4 CPU cores
  • OpenSearch 1.x/2.x (or Elasticsearch 7.10.2)
  • MongoDB 5.x or 6.x
  • Java 17 (bundled with Graylog 5.x)

Installing Dependencies

# Install MongoDB 6.x (Ubuntu/Debian)
curl -fsSL https://www.mongodb.org/static/pgp/server-6.0.asc | \
  sudo gpg -o /usr/share/keyrings/mongodb-server-6.0.gpg --dearmor
echo "deb [ arch=amd64,arm64 signed-by=/usr/share/keyrings/mongodb-server-6.0.gpg ] \
  https://repo.mongodb.org/apt/ubuntu $(lsb_release -cs)/mongodb-org/6.0 multiverse" | \
  sudo tee /etc/apt/sources.list.d/mongodb-org-6.0.list
sudo apt-get update
sudo apt-get install -y mongodb-org
sudo systemctl enable --now mongod

# Install OpenSearch 2.x (replaces Elasticsearch for Graylog 5+)
curl -o- https://artifacts.opensearch.org/publickeys/opensearch.pgp | \
  sudo gpg --dearmor --batch --yes -o /usr/share/keyrings/opensearch-keyring.gpg
echo "deb [signed-by=/usr/share/keyrings/opensearch-keyring.gpg] \
  https://artifacts.opensearch.org/releases/bundle/opensearch/2.x/apt stable main" | \
  sudo tee /etc/apt/sources.list.d/opensearch-2.x.list
sudo apt-get update
sudo apt-get install -y opensearch

# Configure OpenSearch for Graylog
sudo tee /etc/opensearch/opensearch.yml > /dev/null <<EOF
cluster.name: graylog
node.name: graylog-os-01
network.host: 127.0.0.1
http.port: 9200
discovery.type: single-node
plugins.security.disabled: true  # For single-node non-production setup
EOF

sudo systemctl enable --now opensearch

# Verify OpenSearch is running
curl -X GET "http://localhost:9200/"

Installing Graylog

# Install Graylog 5.x (Ubuntu/Debian)
wget https://packages.graylog2.org/repo/packages/graylog-5.1-repository_latest.deb
sudo dpkg -i graylog-5.1-repository_latest.deb
sudo apt-get update
sudo apt-get install -y graylog-server

# Generate password secret (must be at least 64 characters; install pwgen first if missing)
pwgen -N 1 -s 96

# Generate SHA-256 hash for admin password
echo -n "your-admin-password" | sha256sum | cut -d " " -f1
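If pwgen is unavailable, openssl (present on nearly all systems) produces an equivalent secret; a small sketch of both steps:

```shell
# Generate a 96-character password secret without pwgen
# (48 random bytes encoded as hex -> 96 characters)
SECRET=$(openssl rand -hex 48)
echo "password_secret = ${SECRET}"

# SHA-256 hash of the admin password, same as the sha256sum pipeline above
HASH=$(printf '%s' 'your-admin-password' | sha256sum | cut -d ' ' -f1)
echo "root_password_sha2 = ${HASH}"
```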

# Configure Graylog
sudo cp /etc/graylog/server/server.conf /etc/graylog/server/server.conf.bak

sudo tee /etc/graylog/server/server.conf > /dev/null <<EOF
is_leader = true
node_id_file = /etc/graylog/server/node-id
password_secret = PASTE-YOUR-96-CHAR-SECRET-HERE
root_username = admin
root_password_sha2 = PASTE-SHA256-HASH-HERE
root_email = [email protected]
root_timezone = UTC
bin_dir = /usr/share/graylog-server/bin
data_dir = /var/lib/graylog-server
plugin_dir = /usr/share/graylog-server/plugin
http_bind_address = 0.0.0.0:9000
http_external_uri = http://graylog.example.com:9000/
elasticsearch_hosts = http://127.0.0.1:9200
rotation_strategy = count
elasticsearch_max_docs_per_index = 20000000
elasticsearch_max_number_of_indices = 20
retention_strategy = delete
elasticsearch_shards = 1
elasticsearch_replicas = 0
elasticsearch_index_prefix = graylog
allow_leading_wildcard_searches = false
allow_highlighting = false
output_batch_size = 500
output_flush_interval = 1
output_fault_count_threshold = 5
output_fault_penalty_seconds = 30
processbuffer_processors = 5
outputbuffer_processors = 3
processor_wait_strategy = blocking
ring_size = 65536
inputbuffer_ring_size = 65536
inputbuffer_processors = 2
inputbuffer_wait_strategy = blocking
message_journal_enabled = true
message_journal_dir = /var/lib/graylog-server/journal
lb_recognition_period_seconds = 3
mongodb_uri = mongodb://localhost:27017/graylog
mongodb_max_connections = 1000
EOF

sudo systemctl enable graylog-server
sudo systemctl start graylog-server
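A quick sanity check on the buffer settings above: processbuffer, outputbuffer, and inputbuffer processors are separate thread pools, so their sum (5 + 3 + 2 in this config) should not exceed the host's core count:

```shell
# Buffer processor threads configured in server.conf above
PROCESS=5; OUTPUT=3; INPUT=2
TOTAL=$((PROCESS + OUTPUT + INPUT))
CORES=$(nproc)
echo "configured processor threads: ${TOTAL}, available cores: ${CORES}"
if [ "${TOTAL}" -gt "${CORES}" ]; then
  echo "warning: processor threads exceed core count; reduce the *_processors settings"
fi
```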

# Access Graylog at http://graylog.example.com:9000
# Login with admin / your-admin-password

Configuring Inputs

Set up inputs to receive log data:

# Via Graylog API - create a Syslog UDP input
# Note: the API rejects POSTs without an X-Requested-By header (CSRF protection),
# and binding ports below 1024 requires root -- if the input fails to start, use a
# high port such as 1514 and redirect 514 to it with iptables on the Graylog host.
curl -s -X POST \
  -H "Content-Type: application/json" \
  -H "X-Requested-By: cli" \
  -u admin:your-password \
  http://graylog.example.com:9000/api/system/inputs \
  -d '{
    "title": "Syslog UDP",
    "type": "org.graylog2.inputs.syslog.udp.SyslogUDPInput",
    "global": true,
    "configuration": {
      "bind_address": "0.0.0.0",
      "port": 514,
      "recv_buffer_size": 1048576,
      "number_worker_threads": 4,
      "override_source": null,
      "force_rdns": false,
      "allow_override_date": true,
      "store_full_message": false,
      "expand_structured_data": false
    }
  }'

# Create a GELF TCP input (for applications using GELF protocol)
curl -s -X POST \
  -H "Content-Type: application/json" \
  -H "X-Requested-By: cli" \
  -u admin:your-password \
  http://graylog.example.com:9000/api/system/inputs \
  -d '{
    "title": "GELF TCP",
    "type": "org.graylog2.inputs.gelf.tcp.GELFTCPInput",
    "global": true,
    "configuration": {
      "bind_address": "0.0.0.0",
      "port": 12201,
      "recv_buffer_size": 1048576,
      "number_worker_threads": 4,
      "tls_enable": false,
      "tcp_keepalive": false,
      "max_message_size": 2097152,
      "override_source": null,
      "decompress_size_limit": 8388608
    }
  }'
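GELF clients can also be scripted directly; a minimal sketch of the wire format, assuming the GELF TCP input above is running (custom fields carry a leading underscore, and each TCP frame ends with a null byte):

```shell
# Minimal GELF 1.1 payload; version, host, and short_message are required,
# and custom fields must be prefixed with "_"
GELF_MSG='{"version":"1.1","host":"web-01","short_message":"deploy finished","level":6,"_app":"webapp"}'

# GELF TCP frames are null-byte terminated -- a plain echo would be ignored
printf '%s\0' "${GELF_MSG}" | nc -w1 graylog.example.com 12201 || \
  echo "send failed (is the GELF input running and reachable?)"
```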

# Configure rsyslog to forward to Graylog (on log sources)
sudo tee /etc/rsyslog.d/90-graylog.conf > /dev/null <<EOF
# Forward all logs to Graylog as RFC 5424 syslog over UDP
*.* @graylog.example.com:514;RSYSLOG_SyslogProtocol23Format

# Or use TCP for reliability
#*.* @@graylog.example.com:514;RSYSLOG_SyslogProtocol23Format
EOF

sudo systemctl restart rsyslog

Streams and Routing

Streams route and categorize incoming messages:

# Create a stream for application errors via API
# Streams must reference an index set; look up the default one first
INDEX_SET_ID=$(curl -s -u admin:your-password \
  http://graylog.example.com:9000/api/system/indices/index_sets | \
  jq -r '.index_sets[] | select(.default == true) | .id')

STREAM_ID=$(curl -s -X POST \
  -H "Content-Type: application/json" \
  -H "X-Requested-By: cli" \
  -u admin:your-password \
  http://graylog.example.com:9000/api/streams \
  -d "{
    \"title\": \"Application Errors\",
    \"description\": \"All application error messages\",
    \"matching_type\": \"AND\",
    \"index_set_id\": \"${INDEX_SET_ID}\",
    \"remove_matches_from_default_stream\": true
  }" | jq -r '.stream_id')

echo "Created stream: $STREAM_ID"

# Add rule: match messages where level < 4, i.e. <= 3 (error/critical/alert/emergency)
# type 4 = "smaller than"; type 5 is "field presence" and would ignore the value
curl -s -X POST \
  -H "Content-Type: application/json" \
  -H "X-Requested-By: cli" \
  -u admin:your-password \
  http://graylog.example.com:9000/api/streams/${STREAM_ID}/rules \
  -d '{
    "field": "level",
    "type": 4,
    "value": "4",
    "inverted": false,
    "description": "Error level or higher"
  }'

# Add rule: match messages with specific application
curl -s -X POST \
  -H "Content-Type: application/json" \
  -H "X-Requested-By: cli" \
  -u admin:your-password \
  http://graylog.example.com:9000/api/streams/${STREAM_ID}/rules \
  -d '{
    "field": "application_name",
    "type": 1,
    "value": "webapp",
    "inverted": false,
    "description": "From webapp"
  }'
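For reference, the numeric stream-rule `type` codes map to match operators; the mapping below reflects the commonly documented values (worth verifying in your version's API browser):

```shell
# Stream rule type codes used by the /api/streams/<id>/rules endpoint
declare -A RULE_TYPE=(
  [1]="match exactly"
  [2]="match regular expression"
  [3]="greater than"
  [4]="smaller than"
  [5]="field presence"
  [6]="contain"
  [7]="always match"
)
echo "type 1 = ${RULE_TYPE[1]}, type 4 = ${RULE_TYPE[4]}"
```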

# Resume the stream (streams start paused)
curl -s -X POST \
  -H "Content-Type: application/json" \
  -H "X-Requested-By: cli" \
  -u admin:your-password \
  "http://graylog.example.com:9000/api/streams/${STREAM_ID}/resume"

Pipeline Processing

Pipelines transform and enrich messages before indexing:

# Create a pipeline rule that parses nginx access log lines with grok
curl -s -X POST \
  -H "Content-Type: application/json" \
  -H "X-Requested-By: cli" \
  -u admin:your-password \
  http://graylog.example.com:9000/api/system/pipelines/rule \
  -d '{
    "title": "Parse HTTP request fields",
    "description": "Extract fields from HTTP access log messages",
    "source": "rule \"parse_nginx\"\nwhen\n  has_field(\"message\") AND contains(to_string($message.source), \"nginx\")\nthen\n  let parsed = grok(\"%{COMBINEDAPACHELOG}\", to_string($message.message));\n  set_fields(parsed);\n  set_field(\"http_method\", parsed[\"verb\"]);\n  set_field(\"http_status\", to_long(parsed[\"response\"]));\n  set_field(\"response_bytes\", to_long(parsed[\"bytes\"]));\nend"
  }'

# Create a pipeline using the rule
curl -s -X POST \
  -H "Content-Type: application/json" \
  -H "X-Requested-By: cli" \
  -u admin:your-password \
  http://graylog.example.com:9000/api/system/pipelines/pipeline \
  -d '{
    "title": "Nginx Log Enrichment",
    "description": "Parse and enrich nginx access logs",
    "source": "pipeline \"nginx_enrichment\"\nstage 0 match either\n  rule \"parse_nginx\"\nend"
  }'
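A pipeline only processes messages from streams it is connected to, so creating it is not enough on its own. A hedged sketch of the connection call (the `/api/system/pipelines/connections/to_stream` endpoint and the placeholder ids are assumptions to verify against your version's API browser):

```shell
# Placeholder ids -- substitute the values returned when creating the stream/pipeline
STREAM_ID="000000000000000000000001"
PIPELINE_ID="000000000000000000000002"

# Build the connection payload
PAYLOAD=$(printf '{"stream_id":"%s","pipeline_ids":["%s"]}' "${STREAM_ID}" "${PIPELINE_ID}")
echo "${PAYLOAD}"

# Uncomment to apply against a live server:
# curl -s -X POST -H "Content-Type: application/json" -H "X-Requested-By: cli" \
#   -u admin:your-password \
#   http://graylog.example.com:9000/api/system/pipelines/connections/to_stream \
#   -d "${PAYLOAD}"
```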

Dashboards

# Create dashboards via the Graylog UI at http://graylog.example.com:9000/dashboards
# Note: in Graylog 4+ dashboards are "views", and the legacy /api/dashboards
# endpoint used by older guides no longer exists; the views API payload is
# complex, so the UI is the practical route.

# Recommended searches for Graylog dashboards:
# Error rate over time: level:<=3
# Top sources by message count: (in Graylog UI - Source Quick Values widget)
# HTTP 5xx errors: http_status:>=500
# Authentication failures: message:"authentication failure" OR message:"Failed password"

Alerting Rules

# Create event definition for error rate alert
curl -s -X POST \
  -H "Content-Type: application/json" \
  -H "X-Requested-By: cli" \
  -u admin:your-password \
  http://graylog.example.com:9000/api/events/definitions \
  -d '{
    "title": "High Error Rate",
    "description": "More than 100 errors in 5 minutes",
    "priority": 3,
    "alert": true,
    "config": {
      "type": "aggregation-v1",
      "query": "level:<=3",
      "query_parameters": [],
      "streams": [],
      "group_by": [],
      "series": [
        {
          "id": "count-errors",
          "function": "count",
          "field": null
        }
      ],
    "conditions": {
        "expression": {
          "expr": ">",
          "left": {
            "expr": "number-ref",
            "ref": "count-errors"
          },
          "right": {
            "expr": "number",
            "value": 100
          }
        }
      },
      "execute_every_ms": 300000,
      "search_within_ms": 300000,
      "use_cron_scheduling": false
    },
    "field_spec": {},
    "key_spec": [],
    "notification_settings": {
      "grace_period_ms": 300000,
      "backlog_size": 10
    },
    "notifications": [],
    "storage": []
  }'

# Configure an HTTP webhook notification
# Note: http-notification-v1 POSTs the raw event payload as JSON, which a Slack
# incoming webhook (expecting {"text": ...}) will not render; for Slack, prefer
# the dedicated Slack notification type under Alerts > Notifications in the UI.
curl -s -X POST \
  -H "Content-Type: application/json" \
  -H "X-Requested-By: cli" \
  -u admin:your-password \
  http://graylog.example.com:9000/api/events/notifications \
  -d '{
    "title": "Slack Alerts",
    "description": "Send alerts to Slack",
    "config": {
      "type": "http-notification-v1",
      "url": "https://hooks.slack.com/services/YOUR/SLACK/WEBHOOK"
    }
  }'

Troubleshooting

Graylog not receiving messages:

# Check input states (RUNNING/FAILED)
curl -s -u admin:your-password \
  http://graylog.example.com:9000/api/system/inputstates | jq '.states[].state'

# Test UDP syslog input
echo "<14>$(date '+%b %d %H:%M:%S') testhost testapp: Test message from rsyslog" | \
  nc -u -w1 graylog.example.com 514
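The `<14>` prefix in the test message above is the syslog PRI value, computed as facility * 8 + severity (RFC 5424): facility 1 (user) with severity 6 (informational) gives 14. A small helper:

```shell
# PRI = facility * 8 + severity (RFC 5424, section 6.2.1)
syslog_pri() {
  local facility=$1 severity=$2
  echo $(( facility * 8 + severity ))
}

syslog_pri 1 6    # user.info     -> 14
syslog_pri 0 2    # kern.crit     -> 2
syslog_pri 4 4    # auth.warning  -> 36
```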

# Test GELF TCP (frames must end with a null byte -- a plain echo is ignored)
printf '%s\0' '{"version":"1.1","host":"test","short_message":"Test","level":6}' | \
  nc -w1 graylog.example.com 12201

# Check Graylog logs
sudo journalctl -u graylog-server -n 100

Elasticsearch/OpenSearch connection errors:

# Verify OpenSearch is running
curl -X GET "http://localhost:9200/_cluster/health?pretty"

# Check Graylog can reach OpenSearch
curl -s "http://localhost:9200/_cat/indices/graylog_*" | head -10

# View Graylog index stats
curl -s -u admin:your-password \
  http://graylog.example.com:9000/api/system/indexer/overview | jq

High memory usage:

# Adjust Java heap size (default is 1GB, increase for production)
# Edit /etc/default/graylog-server:
GRAYLOG_SERVER_JAVA_OPTS="-Xms4g -Xmx4g -server -XX:+UseG1GC"

sudo systemctl restart graylog-server

# Optimize OpenSearch heap
# Edit /etc/opensearch/jvm.options:
# -Xms4g
# -Xmx4g
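A common sizing rule of thumb for the OpenSearch heap: roughly half of system RAM, capped below ~32 GB so the JVM can keep using compressed object pointers. A small helper illustrating the arithmetic:

```shell
# Suggested heap (GB) = min(total_ram_gb / 2, 31)
suggest_heap_gb() {
  local total_gb=$1
  local half=$(( total_gb / 2 ))
  if [ "${half}" -gt 31 ]; then half=31; fi
  echo "${half}"
}

suggest_heap_gb 8     # -> 4
suggest_heap_gb 128   # -> 31 (capped)
```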

Conclusion

Graylog provides a complete centralized log management solution with powerful search, stream-based routing, pipeline processing, and alerting built on top of proven Elasticsearch/OpenSearch and MongoDB backends. Start by configuring syslog and GELF inputs for broad log collection, create streams to categorize important log types, and build dashboards for the metrics your team monitors most. For production deployments, allocate sufficient RAM (8GB+) and configure index rotation policies to manage storage costs.