Centralized Logging with the ELK Stack: A Complete Guide

The ELK Stack (Elasticsearch, Logstash, Kibana) provides a comprehensive log aggregation, processing, and visualization solution. This guide covers installing Elasticsearch, configuring Logstash pipelines, deploying Filebeat for log collection, creating Kibana dashboards, and setting up index patterns for enterprise-scale log management.

Introduction

The ELK Stack provides powerful log analysis capabilities for troubleshooting, security, and operational insights. Unlike traditional log file searches, ELK enables complex queries across billions of log entries in milliseconds, making it essential for modern infrastructure.

Architecture

ELK Stack Flow

Applications/Infrastructure
        ↓
    Log Files
        ↓
    ┌───────────────────────────────┐
    │      Filebeat Agents          │
    │  ├─ Log Collection            │
    │  ├─ Filtering                 │
    │  └─ Forwarding                │
    └───────────────────────────────┘
             ↓
    ┌───────────────────────────────┐
    │   Logstash Pipelines          │
    │  ├─ Input (Beats, Syslog)     │
    │  ├─ Filter (Parsing, Enrich)  │
    │  └─ Output (Elasticsearch)    │
    └───────────────────────────────┘
             ↓
    ┌───────────────────────────────┐
    │     Elasticsearch             │
    │  ├─ Indexing                  │
    │  ├─ Storage                   │
    │  └─ Search                    │
    └───────────────────────────────┘
             ↓
    ┌───────────────────────────────┐
    │        Kibana                 │
    │  ├─ Visualization             │
    │  ├─ Dashboards                │
    │  ├─ Alerting                  │
    │  └─ Exploration               │
    └───────────────────────────────┘

System Requirements

  • Linux (Ubuntu 20.04+, CentOS 8+)
  • Java 11+ (optional; Elasticsearch and Logstash 8.x ship a bundled JDK)
  • Minimum 4GB RAM for Elasticsearch
  • At least 20GB storage (scales with log volume)
  • Network connectivity between components
  • Root or sudo access

Elasticsearch Installation

Install Java (Optional)

# Elasticsearch and Logstash 8.x bundle their own JDK; install a system
# Java only if other tooling on the host requires it
sudo apt-get update
sudo apt-get install -y openjdk-11-jre-headless
java -version

Install Elasticsearch

# Add Elastic repository (apt-key is deprecated; use a keyring file)
wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo gpg --dearmor -o /usr/share/keyrings/elasticsearch-keyring.gpg
echo "deb [signed-by=/usr/share/keyrings/elasticsearch-keyring.gpg] https://artifacts.elastic.co/packages/8.x/apt stable main" | sudo tee /etc/apt/sources.list.d/elastic-8.x.list

# Install Elasticsearch
sudo apt-get update
sudo apt-get install -y elasticsearch

# Configure
sudo nano /etc/elasticsearch/elasticsearch.yml

Elasticsearch Configuration

# /etc/elasticsearch/elasticsearch.yml

cluster.name: elk-cluster
node.name: es-node-1
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch

network.host: 0.0.0.0    # listens on all interfaces; restrict or firewall in production
http.port: 9200

# Disable security for development only (enable TLS and authentication in production)
xpack.security.enabled: false

# Cluster settings
discovery.type: single-node
action.auto_create_index: ".watches,.triggered_watches,.watcher-history-*"

# Performance tuning (the bulk thread pool was renamed to "write" in ES 6+)
indices.memory.index_buffer_size: 30%
thread_pool.write.queue_size: 500

Start Elasticsearch

sudo systemctl enable elasticsearch
sudo systemctl start elasticsearch
sudo systemctl status elasticsearch

# Verify
curl -X GET http://localhost:9200/

Logstash Installation

Install Logstash

# Repository was already added above
sudo apt-get install -y logstash

# The logstash service user is created automatically by the package

Logstash Directory Structure

# The package creates /etc/logstash/conf.d; ensure the log directory is writable
sudo mkdir -p /var/log/logstash
sudo chown logstash:logstash /var/log/logstash

Filebeat Installation

Install Filebeat

sudo apt-get install -y filebeat

# Create configuration
sudo tee /etc/filebeat/filebeat.yml > /dev/null << 'EOF'
filebeat.inputs:
# The "log" input type is deprecated in Filebeat 8; use "filestream",
# which needs a unique id per input
- type: filestream
  id: system-logs
  enabled: true
  paths:
    - /var/log/syslog
    - /var/log/auth.log

- type: filestream
  id: nginx-access
  enabled: true
  paths:
    - /var/log/nginx/access.log
  fields:
    service: nginx

output.logstash:
  hosts: ["localhost:5000"]

logging.level: info
logging.to_files: true
logging.files:
  path: /var/log/filebeat
  name: filebeat
  keepfiles: 7
  permissions: 0644
EOF

sudo systemctl enable filebeat
sudo systemctl start filebeat

Log Pipeline Configuration

Syslog Pipeline

sudo tee /etc/logstash/conf.d/01-syslog.conf > /dev/null << 'EOF'
input {
  beats {
    port => 5000
  }
}

filter {
  # Parse syslog format. Plain Filebeat inputs do not set [fileset][name]
  # (that field comes from Filebeat modules), so match on the source file path.
  if [log][file][path] =~ /syslog/ {
    grok {
      match => { "message" => "%{SYSLOGLINE}" }
    }
    date {
      match => [ "timestamp", "MMM  d HH:mm:ss", "MMM dd HH:mm:ss" ]
      remove_field => [ "timestamp" ]
    }
  }

  # Capture auth log entries
  if [log][file][path] =~ /auth\.log/ {
    grok {
      match => { "message" => "%{GREEDYDATA:auth_message}" }
    }
  }
  
  # Add metadata
  mutate {
    add_field => { "[@metadata][index_name]" => "logs-%{+YYYY.MM.dd}" }
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "%{[@metadata][index_name]}"
  }
  
  # Also print each event to stdout for debugging (very verbose; remove in production)
  stdout { codec => rubydebug }
}
EOF

sudo systemctl restart logstash
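
The date filter above accepts the two syslog timestamp layouts: a single-digit day padded with an extra space, and a two-digit day. As a quick offline sanity check (the sample timestamps are made up), GNU date parses both the same way:

```shell
# Two syslog timestamp variants; GNU date assumes the current year
for ts in "Jan  5 03:04:05" "Jan 15 10:00:00"; do
  date -d "$ts" "+%m-%dT%H:%M:%S"
done
```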

Nginx Pipeline

sudo tee /etc/logstash/conf.d/02-nginx.conf > /dev/null << 'EOF'
# Note: all files in conf.d are concatenated into a single pipeline, so a
# second beats input on port 5000 would fail to bind. The input declared in
# 01-syslog.conf is reused here; route nginx events with a conditional on
# the field added by Filebeat.

filter {
  if [fields][service] == "nginx" {
    grok {
      match => { "message" => "%{COMBINEDAPACHELOG}" }
    }
    
    # Convert numeric fields. COMBINEDAPACHELOG captures the status code as
    # "response" and the body size as "bytes" (names differ when ECS
    # compatibility mode is enabled).
    mutate {
      convert => {
        "response" => "integer"
        "bytes"    => "integer"
      }
    }

    # Parse the user agent string (captured as "agent")
    useragent {
      source => "agent"
      target => "ua"
    }

    # Geolocate the client IP
    geoip {
      source => "clientip"
    }

    # Bucket responses by status-code class
    mutate {
      add_field => { "response_category" => "%{response}" }
    }

    if [response_category] =~ /^2/ {
      mutate { update => { "response_category" => "Success" } }
    } else if [response_category] =~ /^3/ {
      mutate { update => { "response_category" => "Redirect" } }
    } else if [response_category] =~ /^4/ {
      mutate { update => { "response_category" => "Client Error" } }
    } else if [response_category] =~ /^5/ {
      mutate { update => { "response_category" => "Server Error" } }
    }
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "nginx-%{+YYYY.MM.dd}"
  }
}
EOF
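
The status-class bucketing in the filter above can be checked outside Logstash with plain shell, which is handy when adjusting the categories (the sample status code is arbitrary):

```shell
# Mirror the pipeline's status-code bucketing for a single sample value
status=404
case "$status" in
  2*) category="Success" ;;
  3*) category="Redirect" ;;
  4*) category="Client Error" ;;
  5*) category="Server Error" ;;
  *)  category="Other" ;;
esac
echo "$category"
```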

Application JSON Logs Pipeline

sudo tee /etc/logstash/conf.d/03-app-json.conf > /dev/null << 'EOF'
# Input is shared with 01-syslog.conf (conf.d files form a single pipeline).
# This assumes a Filebeat input for your application's JSON logs that adds
# "fields: { service: app }" (not shown in the Filebeat config above).

filter {
  if [fields][service] == "app" {
    # Parse JSON
    json {
      source => "message"
    }
    
    # Extract level
    if [level] {
      mutate {
        lowercase => [ "level" ]
      }
    }
    
    # Timestamp handling
    if [timestamp] {
      date {
        match => [ "timestamp", "ISO8601" ]
        remove_field => [ "timestamp" ]
      }
    }
    
    # Extract service and environment
    mutate {
      add_field => { "[@metadata][index_name]" => "app-logs-%{+YYYY.MM.dd}" }
    }
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "%{[@metadata][index_name]}"
  }
}
EOF
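
For reference, this pipeline expects one JSON object per log line, shaped roughly like the sample below. The field names beyond level and timestamp (service, trace_id) are assumptions chosen to line up with the index template mapping later in this guide:

```json
{"timestamp": "2024-01-15T10:32:45.123Z", "level": "ERROR", "service": "payments-api", "trace_id": "abc123", "message": "upstream timeout after 5000ms"}
```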

Kibana Dashboards

Install and Access Kibana

# Kibana comes from the Elastic repository added earlier
sudo apt-get install -y kibana
sudo systemctl enable kibana
sudo systemctl start kibana

Navigate to http://localhost:5601

Create Index Patterns

  1. Stack Management > Data Views (called Index Patterns before Kibana 8)
  2. Create Data View
  3. Index pattern: logs-* or nginx-*
  4. Timestamp field: @timestamp
  5. Save

Create Dashboard

  1. Dashboards > Create Dashboard
  2. Add panels:
Panel 1: Log Count Over Time
Data view: logs-*
Visualization: line chart of document count over @timestamp

Panel 2: Error Rate by Service
Data view: logs-*, KQL filter: level: "error"
Visualization: pie chart split by the service field

Panel 3: Top 10 HTTP Endpoints
Data view: nginx-*
Visualization: bar chart of the top 10 values of the request field

Panel 4: Response Time Distribution
Data view: nginx-*
Visualization: histogram of the response_time field

Index Management

Index Patterns

# List indices
curl -X GET http://localhost:9200/_cat/indices

# View index settings
curl -X GET http://localhost:9200/logs-2024.01.15/_settings

# Delete old indices
curl -X DELETE http://localhost:9200/logs-2024.01.01
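
A retention job usually decides which daily indices to delete before issuing the DELETE calls. A minimal sketch of that decision (the is_expired helper and index names are illustrative; GNU date is assumed):

```shell
# Return success when a logs-YYYY.MM.dd index is older than the retention window
is_expired() {
  local idx_date="${1#logs-}"          # strip the "logs-" prefix
  local cutoff idx_epoch
  cutoff=$(date -d "$2 days ago" +%s)  # retention boundary as epoch seconds
  idx_epoch=$(date -d "${idx_date//./-}" +%s)
  [ "$idx_epoch" -lt "$cutoff" ]
}

# Example: a January 2024 index against a 90-day retention window
if is_expired "logs-2024.01.01" 90; then
  echo "expired"
fi
```

In a real cleanup script you would feed the output of _cat/indices through this check and pipe the expired names to curl -X DELETE.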

Index Lifecycle Management (ILM)

# Create ILM policy. The phases must be nested under a "policy" object, and
# the rollover condition is max_primary_shard_size. Cold-phase searchable
# snapshots require a snapshot repository, so this policy just lowers the
# index priority there instead.
curl -X PUT http://localhost:9200/_ilm/policy/logs-policy -H 'Content-Type: application/json' -d '{
  "policy": {
    "phases": {
      "hot": {
        "min_age": "0ms",
        "actions": {
          "rollover": {
            "max_primary_shard_size": "50gb",
            "max_age": "1d"
          }
        }
      },
      "warm": {
        "min_age": "7d",
        "actions": {
          "set_priority": {
            "priority": 50
          }
        }
      },
      "cold": {
        "min_age": "30d",
        "actions": {
          "set_priority": {
            "priority": 0
          }
        }
      },
      "delete": {
        "min_age": "90d",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}'
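
One caveat: the rollover action only fires for indices written through an alias. The daily logs-YYYY.MM.dd indices created by the Logstash outputs above are addressed by name, so they age out purely via the delete phase. To use rollover, bootstrap a first index with a write alias and add a matching index.lifecycle.rollover_alias setting to the index template (the alias name "logs" here is illustrative):

```json
{
  "aliases": {
    "logs": { "is_write_index": true }
  }
}
```

Send this body with curl -X PUT to http://localhost:9200/logs-000001, then point writers at the "logs" alias rather than a dated index name.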

Index Templates

# Create index template
curl -X PUT http://localhost:9200/_index_template/logs-template -H 'Content-Type: application/json' -d '{
  "index_patterns": ["logs-*"],
  "template": {
    "settings": {
      "number_of_shards": 1,
      "number_of_replicas": 1,
      "index.lifecycle.name": "logs-policy",
      "index.lifecycle.rollover_alias": "logs"
    },
    "mappings": {
      "properties": {
        "timestamp": { "type": "date" },
        "level": { "type": "keyword" },
        "service": { "type": "keyword" },
        "message": { "type": "text" },
        "trace_id": { "type": "keyword" }
      }
    }
  }
}'

Performance Optimization

Elasticsearch Tuning

# JVM heap flags go in a jvm.options file, not elasticsearch.yml:
# /etc/elasticsearch/jvm.options.d/heap.options
-Xms4g
-Xmx4g

# Node-level settings in /etc/elasticsearch/elasticsearch.yml
indices.queries.cache.size: 40%
indices.fielddata.cache.size: 30%
thread_pool.write.queue_size: 500
thread_pool.search.queue_size: 1000

# refresh_interval and replica count are index-level settings; set them per
# index (or in a template) rather than in elasticsearch.yml. Zero replicas
# is only safe on a single-node development cluster.
curl -X PUT "http://localhost:9200/logs-*/_settings" -H 'Content-Type: application/json' -d '{
  "index": {
    "refresh_interval": "30s",
    "number_of_replicas": 0
  }
}'

Logstash Performance

# /etc/logstash/jvm.options
-Xms2g
-Xmx2g

# /etc/logstash/logstash.yml
pipeline.batch.size: 500
pipeline.batch.delay: 50

Troubleshooting

Check Component Health

# Elasticsearch health
curl http://localhost:9200/_cluster/health

# Node status
curl http://localhost:9200/_nodes/stats

# Logstash status
curl http://localhost:9600/

# Filebeat status
sudo systemctl status filebeat

Debug Log Processing

# View Logstash logs
tail -f /var/log/logstash/logstash-plain.log

# Check Filebeat output
tail -f /var/log/filebeat/filebeat

# Verify Elasticsearch indexing
curl http://localhost:9200/_cat/indices?v

Resolve Common Issues

# No data in Kibana
# 1. Check Filebeat is running: systemctl status filebeat
# 2. Verify Logstash processing: tail -f /var/log/logstash/logstash-plain.log
# 3. Confirm indices created: curl http://localhost:9200/_cat/indices

# High disk usage
# 1. Check index sizes: curl http://localhost:9200/_cat/indices?h=i,store.size
# 2. Apply ILM policies
# 3. Delete old indices

# Slow queries
# 1. Check shard count and sizes: curl http://localhost:9200/_cat/shards?v
# 2. Raise index.refresh_interval to cut refresh overhead
# 3. Adjust search thread pool and queue sizes

Conclusion

The ELK Stack provides enterprise-grade log aggregation and analysis. By following this guide, you've deployed a complete logging infrastructure. Focus on designing effective log parsing pipelines, setting up appropriate index retention policies, and creating dashboards that facilitate troubleshooting. Well-organized, searchable logs are invaluable for debugging production issues and understanding system behavior.