Grafana Mimir Scalable Metrics Backend

Grafana Mimir is a horizontally scalable, multi-tenant time series database designed for large-scale Prometheus metrics storage. It provides long-term retention and a global query view, and many teams find it simpler to operate at very large scale than a comparable Thanos deployment. This guide covers deploying Mimir in both monolithic and microservices modes, configuring multi-tenancy, remote write from Prometheus, storage backends, and query optimization.

Prerequisites

  • Ubuntu 20.04+ or CentOS 8+ / Rocky Linux 8+
  • 2 GB RAM minimum for monolithic mode; 8+ GB for microservices
  • S3-compatible object storage (AWS S3, MinIO, GCS)
  • Prometheus 2.x for remote write
  • Grafana for visualization

Installing Mimir (Monolithic Mode)

Monolithic mode runs all Mimir components in a single process, which is ideal for getting started or for moderate scale:

# Download Mimir binary
MIMIR_VERSION="2.12.0"
wget https://github.com/grafana/mimir/releases/download/mimir-${MIMIR_VERSION}/mimir-linux-amd64

chmod +x mimir-linux-amd64
sudo mv mimir-linux-amd64 /usr/local/bin/mimir
mimir --version

# Create directories and user
sudo useradd -r -s /sbin/nologin mimir
sudo mkdir -p /etc/mimir /var/lib/mimir/{tsdb,compactor,ruler,alertmanager}
sudo chown -R mimir:mimir /etc/mimir /var/lib/mimir
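The object storage buckets referenced in the configuration below must exist before Mimir starts. With MinIO they can be created using the mc client; a sketch, assuming the minioadmin credentials and the bucket names used throughout this guide:

```shell
# Point mc at the MinIO instance (the alias name "local" is arbitrary)
mc alias set local http://minio.example.com:9000 minioadmin minioadmin

# Create the buckets Mimir expects for blocks, ruler, and alertmanager state
mc mb local/mimir-blocks
mc mb local/mimir-ruler
mc mb local/mimir-alertmanager
```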

Configuration File

Create the Mimir configuration:

# /etc/mimir/mimir.yaml

# Use monolithic mode (all components in one process)
target: all,alertmanager

# Multi-tenancy: set to false to use a single "anonymous" tenant
multitenancy_enabled: false

# Limits
limits:
  ingestion_rate: 10000
  ingestion_burst_size: 200000
  max_global_series_per_user: 1500000
  compactor_blocks_retention_period: 1y

# Block storage (S3/MinIO)
blocks_storage:
  backend: s3
  s3:
    bucket_name: mimir-blocks
    endpoint: minio.example.com:9000
    access_key_id: minioadmin
    secret_access_key: minioadmin
    insecure: true  # MinIO here speaks plain HTTP; set to false when the endpoint uses TLS
  tsdb:
    dir: /var/lib/mimir/tsdb
    retention_period: 24h  # Local retention before uploading
  bucket_store:
    sync_dir: /var/lib/mimir/tsdb-sync

# Ruler storage
ruler_storage:
  backend: s3
  s3:
    bucket_name: mimir-ruler
    endpoint: minio.example.com:9000
    access_key_id: minioadmin
    secret_access_key: minioadmin
    insecure: true

# Alert manager storage
alertmanager_storage:
  backend: s3
  s3:
    bucket_name: mimir-alertmanager
    endpoint: minio.example.com:9000
    access_key_id: minioadmin
    secret_access_key: minioadmin
    insecure: true

# Compactor
compactor:
  data_dir: /var/lib/mimir/compactor
  sharding_ring:
    kvstore:
      store: memberlist

# Distributor and ingester ring
ingester:
  ring:
    replication_factor: 1  # Set to 3 for production HA
    kvstore:
      store: memberlist

distributor:
  ring:
    kvstore:
      store: memberlist

# Use memberlist for ring communication (no ZooKeeper/etcd needed)
memberlist:
  join_members: []  # Add other Mimir nodes here for clustering

# HTTP server
server:
  http_listen_port: 9009
  grpc_listen_port: 9095
  log_level: info

# Store gateway
store_gateway:
  sharding_ring:
    replication_factor: 1
    kvstore:
      store: memberlist

# Set correct permissions
sudo chmod 600 /etc/mimir/mimir.yaml
sudo chown mimir:mimir /etc/mimir/mimir.yaml

# Create systemd service (tee runs under sudo, so the redirect works as root)
sudo tee /etc/systemd/system/mimir.service > /dev/null << 'EOF'
[Unit]
Description=Grafana Mimir
After=network.target

[Service]
User=mimir
Group=mimir
ExecStart=/usr/local/bin/mimir -config.file=/etc/mimir/mimir.yaml
Restart=on-failure
RestartSec=10
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOF

sudo systemctl daemon-reload
sudo systemctl enable --now mimir
sudo systemctl status mimir

# Check health
curl http://localhost:9009/ready
curl http://localhost:9009/metrics | head -20

Configuring Prometheus Remote Write

Send Prometheus metrics to Mimir:

# prometheus.yml
global:
  scrape_interval: 15s

remote_write:
  - url: http://mimir-host:9009/api/v1/push
    # For multi-tenant Mimir, add the tenant header:
    # headers:
    #   X-Scope-OrgID: team-ops

    queue_config:
      capacity: 10000
      max_shards: 30
      max_samples_per_send: 10000
      batch_send_deadline: 5s
      min_backoff: 30ms
      max_backoff: 5s

    # Optional: filter metrics before sending
    write_relabel_configs:
      - source_labels: [__name__]
        regex: 'go_gc_.*'
        action: drop

For Grafana Agent as an alternative to Prometheus:

# /etc/grafana-agent.yaml
metrics:
  global:
    scrape_interval: 15s
    remote_write:
      - url: http://mimir-host:9009/api/v1/push
        headers:
          X-Scope-OrgID: my-tenant

  configs:
    - name: default
      scrape_configs:
        - job_name: node-exporter
          static_configs:
            - targets: ['localhost:9100']

Multi-Tenancy

Enable multi-tenancy in mimir.yaml:

multitenancy_enabled: true

Each tenant is identified by the X-Scope-OrgID HTTP header. Send metrics with different tenant IDs:

# prometheus.yml for team-ops
remote_write:
  - url: http://mimir-host:9009/api/v1/push
    headers:
      X-Scope-OrgID: team-ops

# prometheus.yml for team-dev
remote_write:
  - url: http://mimir-host:9009/api/v1/push
    headers:
      X-Scope-OrgID: team-dev

Per-tenant limits in mimir.yaml:

limits:
  # Default limits for all tenants
  ingestion_rate: 10000
  max_global_series_per_user: 1000000

# Per-tenant overrides live in a separate file that Mimir reloads at runtime
runtime_config:
  file: /etc/mimir/limits.yaml

# /etc/mimir/limits.yaml
overrides:
  team-ops:
    ingestion_rate: 50000
    max_global_series_per_user: 5000000
    compactor_blocks_retention_period: 2y
  team-dev:
    ingestion_rate: 5000
    max_global_series_per_user: 500000
    compactor_blocks_retention_period: 90d

Querying with Grafana

  1. In Grafana: Configuration → Data Sources → Add data source
  2. Select Prometheus
  3. Set URL: http://mimir-host:9009/prometheus
  4. For multi-tenant Mimir, add a custom header:
    • Header: X-Scope-OrgID
    • Value: team-ops
  5. Click Save & Test

Create a datasource per tenant to keep dashboards isolated.
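Data sources can also be provisioned declaratively instead of through the UI. A sketch of a Grafana provisioning file, assuming the default provisioning directory and the team-ops tenant from above:

```yaml
# /etc/grafana/provisioning/datasources/mimir.yaml
apiVersion: 1
datasources:
  - name: Mimir (team-ops)
    type: prometheus
    access: proxy
    url: http://mimir-host:9009/prometheus
    jsonData:
      httpHeaderName1: X-Scope-OrgID
    secureJsonData:
      httpHeaderValue1: team-ops
```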

Mimir-specific recording rules improve query performance. Upload rules via API:

# Create a rules file
cat > /tmp/rules.yaml << 'EOF'
groups:
  - name: http
    interval: 1m
    rules:
      - record: job:http_requests:rate5m
        expr: sum(rate(http_requests_total[5m])) by (job, status)
EOF

# Upload to Mimir (replace tenant-id as needed)
curl -X POST http://localhost:9009/prometheus/config/v1/rules/my-namespace \
  -H 'X-Scope-OrgID: team-ops' \
  -H 'Content-Type: application/yaml' \
  --data-binary @/tmp/rules.yaml
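Uploaded rules can be verified (or removed) through the same ruler configuration API; a sketch, with paths matching the upload call above:

```shell
# List all rule groups for the tenant
curl http://localhost:9009/prometheus/config/v1/rules \
  -H 'X-Scope-OrgID: team-ops'

# Delete an entire namespace if it is no longer needed
curl -X DELETE http://localhost:9009/prometheus/config/v1/rules/my-namespace \
  -H 'X-Scope-OrgID: team-ops'
```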

Microservices Deployment with Docker Compose

For production scale, deploy components separately:

# docker-compose.yaml (abbreviated)
version: "3.9"

services:
  minio:
    image: minio/minio
    command: server /data --console-address ":9001"
    environment:
      MINIO_ROOT_USER: mimir
      MINIO_ROOT_PASSWORD: supersecret
    volumes:
      - minio-data:/data

  mimir-distributor:
    image: grafana/mimir:2.12.0
    command: -config.file=/etc/mimir/mimir.yaml -target=distributor
    volumes:
      - ./mimir.yaml:/etc/mimir/mimir.yaml:ro

  mimir-ingester-1:
    image: grafana/mimir:2.12.0
    command: -config.file=/etc/mimir/mimir.yaml -target=ingester
    volumes:
      - ./mimir.yaml:/etc/mimir/mimir.yaml:ro
      - ingester-1-data:/data

  mimir-querier:
    image: grafana/mimir:2.12.0
    command: -config.file=/etc/mimir/mimir.yaml -target=querier
    volumes:
      - ./mimir.yaml:/etc/mimir/mimir.yaml:ro

  mimir-compactor:
    image: grafana/mimir:2.12.0
    command: -config.file=/etc/mimir/mimir.yaml -target=compactor
    volumes:
      - ./mimir.yaml:/etc/mimir/mimir.yaml:ro
      - compactor-data:/data

  mimir-store-gateway:
    image: grafana/mimir:2.12.0
    command: -config.file=/etc/mimir/mimir.yaml -target=store-gateway
    volumes:
      - ./mimir.yaml:/etc/mimir/mimir.yaml:ro
      - store-gateway-data:/data

In microservices mode, update memberlist.join_members to list all component hostnames.
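For example, with the Compose services above, each component's mimir.yaml would list every peer; Mimir's memberlist gossip defaults to port 7946 (a sketch, using the service names from the Compose file):

```yaml
memberlist:
  join_members:
    - mimir-distributor:7946
    - mimir-ingester-1:7946
    - mimir-querier:7946
    - mimir-compactor:7946
    - mimir-store-gateway:7946
```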

Storage Backends

Mimir supports multiple object storage backends:

# Local filesystem (testing only)
blocks_storage:
  backend: filesystem
  filesystem:
    dir: /var/lib/mimir/blocks

# Google Cloud Storage
blocks_storage:
  backend: gcs
  gcs:
    bucket_name: mimir-blocks

# Azure Blob Storage
blocks_storage:
  backend: azure
  azure:
    account_name: mystorageaccount
    account_key: base64encodedkey==
    container_name: mimir-blocks

Troubleshooting

Mimir not ready:

curl http://localhost:9009/ready
# If not ready, check logs:
sudo journalctl -u mimir -f
# Common issues: S3 connection failed, wrong bucket name

Remote write failing:

# Check Prometheus remote write metrics (the metric was renamed in newer Prometheus
# releases; older versions use prometheus_remote_storage_failed_samples_total)
curl 'http://prometheus:9090/api/v1/query?query=prometheus_remote_storage_samples_failed_total'

# Check Mimir distributor logs for ingestion errors
curl http://localhost:9009/metrics | grep cortex_distributor

High cardinality causing ingestion rejection:

# Check active series count
curl 'http://localhost:9009/api/v1/query?query=sum(cortex_ingester_active_series)' \
  -H 'X-Scope-OrgID: team-ops'

# Increase limits in mimir.yaml
limits:
  max_global_series_per_user: 5000000

Slow range queries:

# Enable query-frontend caching
# Add to mimir.yaml:
frontend:
  cache_results: true
  results_cache:
    backend: memcached
    memcached:
      addresses: dns+memcached:11211
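Beyond the results cache, Mimir can also cache index lookups, chunks, and bucket metadata on the store-gateway path, which tends to matter most for long range queries over object storage. A sketch, assuming the same memcached instance:

```yaml
blocks_storage:
  bucket_store:
    index_cache:
      backend: memcached
      memcached:
        addresses: dns+memcached:11211
    chunks_cache:
      backend: memcached
      memcached:
        addresses: dns+memcached:11211
    metadata_cache:
      backend: memcached
      memcached:
        addresses: dns+memcached:11211
```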

Conclusion

Grafana Mimir provides a production-grade, multi-tenant metrics backend that scales horizontally to handle billions of active time series while maintaining Prometheus API compatibility. Starting with the monolithic deployment mode simplifies operations, and migrating to microservices is straightforward once you need to scale individual components independently. The combination of Mimir with Grafana Agent for collection and Grafana for visualization creates a complete, self-hosted observability platform competitive with commercial offerings.