Loki Installation for Log Aggregation

Loki is a log aggregation system designed for DevOps and observability. Unlike traditional log aggregation systems, Loki uses a label-based index strategy similar to Prometheus, making it cost-effective and easy to operate. This guide covers Loki server deployment, Promtail configuration, label strategies, LogQL queries, and retention policies.

Table of Contents

Introduction

Loki provides horizontally-scalable, highly-available log aggregation without indexing the full content of logs. By indexing only metadata (labels), Loki dramatically reduces storage costs while maintaining powerful querying capabilities through LogQL. It integrates seamlessly with Prometheus and Grafana for complete observability.

Architecture

Component Overview

Applications/Services
        ↓
   Promtail Agents
        ↓
   Loki Server (3 modes)
   ├─ Single Binary
   ├─ Simple Scalable
   └─ Microservices
        ↓
   Time-Series Storage
   ├─ Object Storage (S3, GCS)
   └─ Database (BoltDB, Cassandra)
        ↓
   Grafana Visualization

System Requirements

  • Linux kernel 2.6.32 or later
  • Minimum 2GB RAM (4GB+ recommended)
  • 10GB+ storage (depends on retention policy)
  • Go 1.19+ (for compilation)
  • Object storage access (S3, GCS, MinIO) or local storage
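
The requirements above can be checked in one pass before installing. The snippet below is a rough sketch against the recommended minimums, assuming a typical GNU/Linux system (GNU coreutils, /proc available); adjust the thresholds to your environment.

```shell
# Preflight check against the recommended minimums (4GB RAM, 10GB free disk).
# Emits warnings only; it never aborts.
kernel=$(uname -r)
mem_kb=$(awk '/MemTotal/ {print $2}' /proc/meminfo)
disk_kb=$(df --output=avail -k / | tail -1 | tr -d ' ')

echo "Kernel:         $kernel"
echo "RAM (MB):       $(( mem_kb / 1024 ))"
echo "Free disk (GB): $(( disk_kb / 1024 / 1024 ))"

[ "$(( mem_kb / 1024 ))" -ge 4096 ] || echo "WARN: less than 4GB RAM"
[ "$(( disk_kb / 1024 / 1024 ))" -ge 10 ] || echo "WARN: less than 10GB free disk"
```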

Loki Installation

Step 1: Download and Install

# Create Loki user
sudo useradd --no-create-home --shell /bin/false loki

# Download latest release
cd /tmp
wget https://github.com/grafana/loki/releases/download/v2.9.0/loki-linux-amd64.zip
unzip loki-linux-amd64.zip

# Install binary
sudo mv loki-linux-amd64 /usr/local/bin/loki
sudo chown loki:loki /usr/local/bin/loki
sudo chmod +x /usr/local/bin/loki

# Verify installation
/usr/local/bin/loki --version

Step 2: Create Directories

# Create configuration and data directories
sudo mkdir -p /etc/loki /var/lib/loki
sudo chown loki:loki /etc/loki /var/lib/loki
sudo chmod 750 /etc/loki /var/lib/loki

# Create log directory
sudo mkdir -p /var/log/loki
sudo chown loki:loki /var/log/loki
sudo chmod 755 /var/log/loki

Step 3: Create the Configuration File

sudo tee /etc/loki/loki-config.yml > /dev/null << 'EOF'
auth_enabled: false

ingester:
  chunk_idle_period: 3m
  chunk_retain_period: 1m
  max_chunk_age: 1h
  chunk_encoding: snappy
  chunk_target_size: 524288
  lifecycler:
    ring:
      kvstore:
        store: inmemory
      replication_factor: 1

limits_config:
  enforce_metric_name: false
  reject_old_samples: true
  reject_old_samples_max_age: 168h
  max_line_size: 1024000
  ingestion_rate_mb: 12
  ingestion_burst_size_mb: 20
  max_entries_limit_per_second: 1000

schema_config:
  configs:
    - from: 2020-10-24
      store: boltdb-shipper
      object_store: filesystem
      schema: v11
      index:
        prefix: index_
        period: 24h

server:
  http_listen_port: 3100
  log_level: info

storage_config:
  boltdb_shipper:
    active_index_directory: /var/lib/loki/index
    shared_store: filesystem
    cache_location: /var/lib/loki/cache
  filesystem:
    directory: /var/lib/loki/chunks

chunk_store_config:
  max_look_back_period: 0s

table_manager:
  retention_deletes_enabled: true
  retention_period: 720h
EOF

sudo chown loki:loki /etc/loki/loki-config.yml
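
Before starting the service, sanity-check the file. Recent Loki releases accept a -verify-config flag (loki -config.file=/etc/loki/loki-config.yml -verify-config); the grep loop below is a cruder structural fallback, shown against a throwaway stub so the cfg path can be adapted.

```shell
# Confirm that all required top-level sections exist in a Loki config.
# Point cfg at /etc/loki/loki-config.yml in practice; a stub is generated here.
cfg=$(mktemp)
printf '%s\n' 'server:' 'ingester:' 'schema_config:' 'storage_config:' 'limits_config:' > "$cfg"

missing=0
for key in server ingester schema_config storage_config limits_config; do
  grep -q "^${key}:" "$cfg" || { echo "MISSING: $key"; missing=1; }
done
[ "$missing" -eq 0 ] && echo "all required sections present"
```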

Step 4: Create the Systemd Service

sudo tee /etc/systemd/system/loki.service > /dev/null << 'EOF'
[Unit]
Description=Loki
After=network.target

[Service]
User=loki
Group=loki
Type=simple
ExecStart=/usr/local/bin/loki -config.file=/etc/loki/loki-config.yml
Restart=on-failure
RestartSec=5

StandardOutput=journal
StandardError=journal
SyslogIdentifier=loki

[Install]
WantedBy=multi-user.target
EOF

sudo systemctl daemon-reload
sudo systemctl enable loki
sudo systemctl start loki
sudo systemctl status loki
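
Once the service reports active, Loki answers on two standard HTTP endpoints, /ready and /metrics. The probe below degrades gracefully when the server is not reachable, so it is safe to run anywhere.

```shell
# Probe Loki's readiness endpoint; fall back to a message instead of aborting
# when the server is down.
ready=$(curl -fsS --max-time 2 http://localhost:3100/ready 2>/dev/null) || ready="not reachable"
echo "readiness: $ready"
```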

Configuration

Single-Binary Configuration for Production

auth_enabled: false

server:
  http_listen_port: 3100
  grpc_listen_port: 9096
  log_level: info
  log_format: json

ingester:
  chunk_idle_period: 3m
  chunk_retain_period: 1m
  max_chunk_age: 2h
  chunk_encoding: snappy
  chunk_target_size: 1048576
  lifecycler:
    ring:
      kvstore:
        store: inmemory
      replication_factor: 1
    num_tokens: 128
    heartbeat_timeout: 5m

limits_config:
  enforce_metric_name: false
  reject_old_samples: true
  reject_old_samples_max_age: 168h
  max_line_size: 2097152
  ingestion_rate_mb: 50
  ingestion_burst_size_mb: 100
  max_entries_limit_per_second: 10000
  retention_period: 720h

schema_config:
  configs:
    - from: 2020-10-24
      store: boltdb-shipper
      object_store: s3
      schema: v11
      index:
        prefix: loki_index_
        period: 24h

storage_config:
  boltdb_shipper:
    active_index_directory: /var/lib/loki/index
    shared_store: s3
    cache_location: /var/lib/loki/cache
  s3:
    s3: s3://region/bucket
    endpoint: https://s3.amazonaws.com
    access_key_id: YOUR_KEY
    secret_access_key: YOUR_SECRET
    s3forcepathstyle: true

chunk_store_config:
  max_look_back_period: 0s

table_manager:
  retention_deletes_enabled: true
  retention_period: 720h
  poll_interval: 10m
  creation_grace_period: 10m

Promtail Installation

Step 1: Download and Install Promtail

# Create promtail user
sudo useradd --no-create-home --shell /bin/false promtail

# Download
cd /tmp
wget https://github.com/grafana/loki/releases/download/v2.9.0/promtail-linux-amd64.zip
unzip promtail-linux-amd64.zip

# Install
sudo mv promtail-linux-amd64 /usr/local/bin/promtail
sudo chmod +x /usr/local/bin/promtail

# Create directories
sudo mkdir -p /etc/promtail
sudo chown promtail:promtail /etc/promtail

Step 2: Create the Promtail Configuration

sudo tee /etc/promtail/promtail-config.yml > /dev/null << 'EOF'
server:
  http_listen_port: 9080
  grpc_listen_port: 0

positions:
  filename: /tmp/positions.yaml

clients:
  - url: http://localhost:3100/loki/api/v1/push

scrape_configs:
  - job_name: system
    static_configs:
      - targets:
          - localhost
        labels:
          job: system
          env: production
          host: ${HOSTNAME}  # expanded only when Promtail runs with -config.expand-env=true
          __path__: /var/log/*log

  - job_name: syslog
    static_configs:
      - targets:
          - localhost
        labels:
          job: syslog
          __path__: /var/log/syslog

  - job_name: auth
    static_configs:
      - targets:
          - localhost
        labels:
          job: auth
          __path__: /var/log/auth.log

  - job_name: kernel
    static_configs:
      - targets:
          - localhost
        labels:
          job: kernel
          __path__: /var/log/kern.log

  - job_name: nginx
    static_configs:
      - targets:
          - localhost
        labels:
          job: nginx
          __path__: /var/log/nginx/*.log
    pipeline_stages:
      - regex:
          expression: '^(?P<remote>[^ ]*) (?P<host>[^ ]*) (?P<user>[^ ]*) \[(?P<time>[^\]]*)\] "(?P<method>\w+) (?P<path>[^ ]*) (?P<version>[^"]*)" (?P<status>[^ ]*) (?P<size>[^ ]*)'
      - labels:
          status:
          method:
          path:
EOF

sudo chown promtail:promtail /etc/promtail/promtail-config.yml
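
Before relying on Promtail, it can help to push one line by hand and confirm Loki accepts it. The payload uses the standard /loki/api/v1/push JSON format; the job=smoke-test label is an arbitrary example, and the snippet tolerates a missing server.

```shell
# Build a push payload: one stream, one [timestamp_ns, line] entry.
now_ns=$(date +%s%N)
payload=$(printf '{"streams":[{"stream":{"job":"smoke-test"},"values":[["%s","hello from curl"]]}]}' "$now_ns")
echo "$payload"

# Send it; fall back to a message so the snippet is safe to run anywhere.
curl -fsS -X POST -H 'Content-Type: application/json' \
  -d "$payload" http://localhost:3100/loki/api/v1/push 2>/dev/null \
  && echo "pushed" || echo "push failed (is Loki running?)"
```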

Step 3: Create the Promtail Systemd Service

sudo tee /etc/systemd/system/promtail.service > /dev/null << 'EOF'
[Unit]
Description=Promtail
After=network.target

[Service]
User=promtail
Group=promtail
Type=simple
ExecStart=/usr/local/bin/promtail -config.file=/etc/promtail/promtail-config.yml
Restart=on-failure
RestartSec=5

StandardOutput=journal
StandardError=journal
SyslogIdentifier=promtail

[Install]
WantedBy=multi-user.target
EOF

sudo systemctl daemon-reload
sudo systemctl enable promtail
sudo systemctl start promtail
sudo systemctl status promtail

Label Strategy

Design Label Sets

Effective labeling is crucial for Loki performance and queryability:

scrape_configs:
  - job_name: application
    static_configs:
      - targets:
          - localhost
        labels:
          job: app
          env: production
          region: us-east-1
          team: backend
          service: api
          __path__: /var/log/app/*.log

  - job_name: database
    static_configs:
      - targets:
          - localhost
        labels:
          job: database
          env: production
          region: us-east-1
          team: database
          service: postgresql
          __path__: /var/log/postgresql/*.log
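
Every unique combination of label values creates a separate stream, so keep labels low-cardinality. A back-of-envelope estimate, using hypothetical value counts:

```shell
# Worst-case stream count = product of distinct values per label.
jobs=4; envs=3; regions=5; teams=6
echo "worst-case streams: $(( jobs * envs * regions * teams ))"   # → worst-case streams: 360
```

High-cardinality fields such as request IDs belong in the log line itself, where LogQL parsers can still query them, not in labels.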

Pipeline Stages for Label Extraction

Extract labels from log content:

scrape_configs:
  - job_name: application
    static_configs:
      - targets:
          - localhost
        labels:
          job: app
          __path__: /var/log/app/*.log
    pipeline_stages:
      # Parse JSON logs
      - json:
          expressions:
            level: level
            msg: message
            request_id: request_id
            user_id: user_id
      # Promote only low-cardinality fields to labels; values such as
      # request_id or user_id create a new stream per value and should
      # stay in the log line instead
      - labels:
          level:
      # Add timestamp
      - timestamp:
          source: timestamp
          format: 2006-01-02T15:04:05Z07:00
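
A sample line the pipeline above would process (field names are hypothetical but match the json stage expressions). The parse step is reproduced with Python's stdlib for illustration, assuming python3 is available:

```shell
# One JSON log line as the application might emit it.
line='{"timestamp":"2024-01-15T10:30:00Z","level":"error","message":"db timeout","request_id":"abc123","user_id":"42"}'

# Mimic what the json stage extracts from it.
parsed=$(echo "$line" | python3 -c 'import json,sys; d=json.load(sys.stdin); print("level:", d["level"])')
echo "$parsed"   # → level: error
```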

LogQL Queries

Basic Log Queries

# Get all logs from a job
{job="app"}

# Get logs with specific label value
{job="app", env="production"}

# Get logs from multiple services
{job=~"app|api", env="production"}

# Exclude logs
{job="app", env!="development"}

# Regex matching
{job="app", host=~".*prod.*"}

Filter by Content

# Show lines containing "error"
{job="app"} |= "error"

# Show lines NOT containing "200" (HTTP status)
{job="app"} != "200"

# Case-insensitive matching
{job="app"} |~ "(?i)failed"

# Regular expression filtering
{job="app"} |~ "error|exception|fatal"

Aggregation Queries

# Count of log lines per service
sum(count_over_time({job="app"}[5m])) by (service)

# Error rate
sum(count_over_time({job="app"} |= "error" [1m])) 
/ 
sum(count_over_time({job="app"}[1m]))

# Average response time from logs (requires a numeric response_time field)
avg_over_time({job="app"} | json | unwrap response_time [5m])

Label Aggregations

# Count logs by severity
sum(count_over_time({job="app"}[5m])) by (level)

# Top 10 users by log volume
topk(10, sum(count_over_time({job="app"}[5m])) by (user_id))

# Request count by endpoint
sum(count_over_time({job="api"} | json | uri != "" [5m])) by (uri)

Parsing and Extraction

# Extract duration from logs
{job="app"} | json | duration > 1000

# Parse key-value pairs
{job="app"} | logfmt | status >= 500

# Extract fields with the pattern parser
{job="app"} | pattern `<_> <_> <level> <_> <method> <path> <status>`
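
All of these queries can also be executed over HTTP against the /loki/api/v1/query endpoint; --data-urlencode handles the braces and quotes safely. The snippet tolerates an unreachable server:

```shell
# Execute a LogQL instant query via curl.
query='sum(count_over_time({job="app"}[5m]))'
resp=$(curl -fsS -G http://localhost:3100/loki/api/v1/query \
  --data-urlencode "query=$query" 2>/dev/null) || resp="query failed (is Loki running?)"
echo "$resp"
```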

Data Retention

Retention Configuration

Set retention policies in loki-config.yml:

table_manager:
  retention_deletes_enabled: true
  retention_period: 720h  # 30 days
  poll_interval: 10m
  creation_grace_period: 10m
  
limits_config:
  retention_period: 720h

compactor:
  retention_enabled: true
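
Retention values are duration strings, conventionally expressed in hours. A one-liner keeps the day-to-hour conversion honest when editing configs:

```shell
# 30 days expressed as the hour-based duration used throughout this guide.
days=30
echo "retention_period: $(( days * 24 ))h"   # → retention_period: 720h
```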

Configure Per-Label Retention

Advanced retention by labels:

limits_config:
  retention_period: 720h
  
  # Shorter retention for verbose logs
  # Per-stream retention is applied by the compactor, not the table manager
  retention_stream:
    - selector: '{level="debug"}'
      period: 48h
    - selector: '{job="debug-app"}'
      period: 24h
    - selector: '{env="production"}'
      period: 720h

Monitor Retention

# Check current log volume (list active series)
curl -sG http://localhost:3100/loki/api/v1/series --data-urlencode 'match[]={job=~".+"}' | jq .

# View label values
curl -s http://localhost:3100/loki/api/v1/label/job/values | jq .

# Check index size
du -sh /var/lib/loki/index/
du -sh /var/lib/loki/chunks/

Grafana Integration

Add Loki Data Source

  1. Grafana > Configuration > Data Sources
  2. Add New Data Source > Loki
  3. URL: http://localhost:3100
  4. Save & Test
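
The same data source can be set up declaratively with Grafana's standard provisioning format. The file is generated into a temp path here so the snippet is safe to run anywhere; on a real install it would go to /etc/grafana/provisioning/datasources/.

```shell
# Generate a Loki data source provisioning file (Grafana provisioning v1 format).
out=$(mktemp)
cat > "$out" <<'EOF'
apiVersion: 1
datasources:
  - name: Loki
    type: loki
    access: proxy
    url: http://localhost:3100
EOF
echo "wrote $out"
```

Copy the file into place and restart Grafana (sudo systemctl restart grafana-server) to register the source without touching the UI.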

Create Loki Dashboard

  1. Create Dashboard > Add Panel
  2. Data Source: Loki
  3. Query Example:
{job="nginx"} | json status="status_code" | status >= 400

Set Up Log Alerts

  1. Edit panel > Alert tab
  2. Create alert rule on query
  3. Example:
count_over_time({job="app"} |= "error" [5m]) > 10

Troubleshooting

Check Loki Status

# Service status
sudo systemctl status loki

# Metrics endpoint
curl -s http://localhost:3100/metrics | head -30

# Show the running configuration
curl -s http://localhost:3100/config

Debug Promtail

# Check Promtail status
sudo systemctl status promtail

# Dry-run the configuration (tails targets and prints entries instead of pushing)
/usr/local/bin/promtail -dry-run -config.file=/etc/promtail/promtail-config.yml

# View logs
sudo journalctl -u promtail -f

Verify Data Flow

# Check Loki ingestion (start/end default to the last hour when omitted)
curl -sG 'http://localhost:3100/loki/api/v1/query_range' --data-urlencode 'query={job="app"}' --data-urlencode 'limit=100' | jq .

# Check index files
ls -la /var/lib/loki/index/

# Monitor disk usage
df -h /var/lib/loki/

Performance Issues

# Check chunk age
curl -s http://localhost:3100/metrics | grep loki_ingester_chunks_age_seconds

# Monitor ingestion rate
curl -s http://localhost:3100/metrics | grep loki_ingester_ingested_lines_total

# Check memory usage
ps aux | grep loki

Conclusion

Loki provides cost-effective log aggregation by indexing only labels rather than full content. Combined with Promtail for collection and Grafana for visualization, it creates a complete observability solution. Focus on designing clear label hierarchies, leveraging pipeline stages for efficient parsing, and setting appropriate retention policies for your use case. This foundation supports growing log volumes while maintaining fast query performance and minimal operational overhead.