InfluxDB Installation and Configuration

InfluxDB is a purpose-built time-series database designed for storing and querying metrics, events, and telemetry data at high ingest rates. This guide covers installing InfluxDB 2.x on Linux, creating buckets and organizations, using the Flux query language, integrating with Telegraf for metrics collection, configuring retention policies, and visualizing data with Grafana.

Prerequisites

  • Ubuntu 20.04/22.04 or CentOS 8/Rocky Linux 8+
  • At least 2 GB RAM (8+ GB for production time-series workloads)
  • Fast storage (SSD strongly recommended for write-heavy workloads)
  • Root or sudo access
  • Port 8086 available

Install InfluxDB 2.x

Ubuntu/Debian:

# Add InfluxDB repository
curl -fsSL https://repos.influxdata.com/influxdata-archive_compat.key \
  | sudo gpg --dearmor -o /etc/apt/trusted.gpg.d/influxdata-archive_compat.gpg

echo "deb [signed-by=/etc/apt/trusted.gpg.d/influxdata-archive_compat.gpg] https://repos.influxdata.com/debian stable main" \
  | sudo tee /etc/apt/sources.list.d/influxdata.list

sudo apt update
sudo apt install -y influxdb2

sudo systemctl enable influxdb
sudo systemctl start influxdb
sudo systemctl status influxdb

CentOS/Rocky Linux:

# Add repository
sudo tee /etc/yum.repos.d/influxdata.repo > /dev/null << 'EOF'
[influxdata]
name=InfluxData Repository
baseurl=https://repos.influxdata.com/rhel/$releasever/$basearch/stable
enabled=1
gpgcheck=1
gpgkey=https://repos.influxdata.com/influxdata-archive_compat.key
EOF

sudo dnf install -y influxdb2
sudo systemctl enable --now influxdb

Open the firewall port:

sudo ufw allow 8086/tcp        # Ubuntu
sudo firewall-cmd --permanent --add-port=8086/tcp && sudo firewall-cmd --reload  # CentOS

Initial Setup and Organization

Run the setup wizard or use the CLI:

# CLI-based setup (non-interactive)
influx setup \
  --username admin \
  --password AdminPassword123 \
  --org myorg \
  --bucket metrics \
  --retention 30d \
  --force

# Or access the web UI at http://your-server:8086
# Complete the setup wizard with:
# - Username and password
# - Organization name
# - Initial bucket name
# - Retention period

# Get the admin token
influx auth list --user admin

# Store the token for later use
INFLUX_TOKEN=$(influx auth list --user admin --json | python3 -c "import sys,json; print(json.load(sys.stdin)[0]['token'])")
echo "Token: $INFLUX_TOKEN"

# Configure the CLI
influx config create \
  --config-name local \
  --host-url http://localhost:8086 \
  --org myorg \
  --token "$INFLUX_TOKEN" \
  --active

Buckets and Retention Policies

Buckets in InfluxDB 2.x combine the database and retention policy concepts from 1.x:

# Create a bucket with 7-day retention
influx bucket create \
  --name application-logs \
  --org myorg \
  --retention 7d

# Create a bucket for long-term metrics (1 year)
influx bucket create \
  --name annual-metrics \
  --org myorg \
  --retention 365d

# Create a bucket with no expiry (0 = infinite)
influx bucket create \
  --name permanent-data \
  --org myorg \
  --retention 0

# List buckets
influx bucket list --org myorg

# Update retention on an existing bucket (update addresses buckets by ID, not name)
influx bucket update \
  --id "$(influx bucket list --name metrics --hide-headers | awk '{print $1}')" \
  --retention 90d

# Delete a bucket
influx bucket delete --name application-logs --org myorg
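The CLI accepts human-readable durations like 7d, but the underlying /api/v2/buckets endpoint expects retention as a retentionRules entry with everySeconds. A minimal sketch of the conversion (the helper name is hypothetical, not part of any client library):

```python
# Hypothetical helper: convert CLI-style durations ("7d", "1h", "0") into the
# everySeconds value expected by the /api/v2/buckets HTTP endpoint.
UNITS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800}

def retention_seconds(duration: str) -> int:
    """Return retention in seconds; 0 means infinite retention."""
    duration = duration.strip()
    if duration == "0":
        return 0
    value, unit = duration[:-1], duration[-1]
    if unit not in UNITS:
        raise ValueError(f"unknown duration unit: {unit!r}")
    return int(value) * UNITS[unit]

print(retention_seconds("7d"))    # 604800
print(retention_seconds("365d"))  # 31536000
print(retention_seconds("0"))     # 0
```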

Writing Data

InfluxDB uses Line Protocol for data ingestion:

# Line Protocol format:
# measurement,tag_key=tag_value field_key=field_value timestamp
# measurement,tag1=val1,tag2=val2 field1=1.0,field2="string" 1609459200000000000

# Write via influx CLI
influx write \
  --bucket metrics \
  --org myorg \
  --precision ns \
  "cpu_usage,host=web01,region=us-east value=72.5 $(date +%s)000000000"

# Write multiple points from a file
cat > /tmp/sample_data.lp << 'EOF'
cpu_usage,host=web01 value=45.2
cpu_usage,host=web02 value=62.1
memory_usage,host=web01 used_percent=78.3,free_mb=512.0
memory_usage,host=web02 used_percent=55.1,free_mb=1024.0
EOF

influx write --bucket metrics --org myorg --file /tmp/sample_data.lp

# Write via HTTP API
curl -X POST "http://localhost:8086/api/v2/write?org=myorg&bucket=metrics&precision=s" \
  -H "Authorization: Token ${INFLUX_TOKEN}" \
  -H "Content-Type: text/plain; charset=utf-8" \
  --data-raw "server_load,host=db01 load1=1.5,load5=1.2,load15=0.9 $(date +%s)"
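Tag values containing commas, spaces, or equals signs must be backslash-escaped, and string field values double-quoted. An illustrative helper (not part of the official client library) that applies those escaping rules when building a line:

```python
# Illustrative sketch: build a line-protocol line with the documented escaping
# rules (commas/spaces in measurements; commas/equals/spaces in tag keys, tag
# values, and field keys; quotes and backslashes in string field values).
def escape(value: str, chars: str) -> str:
    # Prefix each special character with a backslash.
    for ch in chars:
        value = value.replace(ch, "\\" + ch)
    return value

def fmt_field(v):
    if isinstance(v, bool):          # check bool first: bool is a subclass of int
        return "true" if v else "false"
    if isinstance(v, int):
        return f"{v}i"               # integers carry an 'i' suffix
    if isinstance(v, float):
        return repr(v)
    s = str(v).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{s}"'                  # strings are double-quoted

def line_protocol(measurement, tags, fields, ts_ns=None):
    line = escape(measurement, ", ")
    for k in sorted(tags):           # sorted tag keys help server-side performance
        line += f",{escape(k, ',= ')}={escape(str(tags[k]), ',= ')}"
    line += " " + ",".join(f"{escape(k, ',= ')}={fmt_field(v)}"
                           for k, v in fields.items())
    if ts_ns is not None:
        line += f" {ts_ns}"
    return line

print(line_protocol("cpu_usage", {"host": "web 01"}, {"value": 72.5}))
# cpu_usage,host=web\ 01 value=72.5
```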

Write with Python:

# pip install influxdb-client
from influxdb_client import InfluxDBClient, Point
from influxdb_client.client.write_api import SYNCHRONOUS
import datetime

client = InfluxDBClient(
    url="http://localhost:8086",
    token="your-influx-token",
    org="myorg"
)

write_api = client.write_api(write_options=SYNCHRONOUS)

# Write a data point
point = (
    Point("server_metrics")
    .tag("host", "web01")
    .tag("region", "us-east")
    .field("cpu_usage", 45.2)
    .field("memory_mb", 2048)
    .time(datetime.datetime.now(datetime.timezone.utc))  # utcnow() is deprecated in Python 3.12+
)

write_api.write(bucket="metrics", org="myorg", record=point)
print("Data written successfully")

client.close()

Flux Query Language

Flux is InfluxDB's functional data scripting language:

# Query via CLI
influx query --org myorg '
from(bucket: "metrics")
  |> range(start: -1h)
  |> filter(fn: (r) => r._measurement == "cpu_usage")
  |> filter(fn: (r) => r.host == "web01")
  |> last()
'

# Aggregation: mean CPU over 5-minute windows
influx query --org myorg '
from(bucket: "metrics")
  |> range(start: -24h)
  |> filter(fn: (r) => r._measurement == "cpu_usage")
  |> aggregateWindow(every: 5m, fn: mean, createEmpty: false)
  |> yield(name: "mean_cpu")
'

# Multiple measurements in one query
influx query --org myorg '
cpu = from(bucket: "metrics")
  |> range(start: -1h)
  |> filter(fn: (r) => r._measurement == "cpu_usage")
  |> mean()

mem = from(bucket: "metrics")
  |> range(start: -1h)
  |> filter(fn: (r) => r._measurement == "memory_usage")
  |> mean()

join(tables: {cpu: cpu, mem: mem}, on: ["host"])
'

# Top N hosts by CPU
influx query --org myorg '
from(bucket: "metrics")
  |> range(start: -15m)
  |> filter(fn: (r) => r._measurement == "cpu_usage")
  |> group(columns: ["host"])
  |> mean()
  |> top(n: 5, columns: ["_value"])
'
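Application code often needs these same queries with different buckets or time ranges. A hypothetical convenience (not an official API) that assembles the Flux source as a string, which could then be run via the client's query API; values are interpolated naively, so only pass trusted inputs:

```python
# Hypothetical helper: assemble a parameterized Flux query string reusing the
# aggregateWindow pattern shown above.
def flux_mean_query(bucket: str, measurement: str,
                    start: str = "-1h", every: str = "5m") -> str:
    return (
        f'from(bucket: "{bucket}")\n'
        f'  |> range(start: {start})\n'
        f'  |> filter(fn: (r) => r._measurement == "{measurement}")\n'
        f'  |> aggregateWindow(every: {every}, fn: mean, createEmpty: false)'
    )

q = flux_mean_query("metrics", "cpu_usage", start="-24h")
print(q)
# Execute against a live server with the client from the Python example above:
# tables = client.query_api().query(q, org="myorg")
```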

Telegraf Integration

Telegraf is the official agent for collecting and forwarding metrics to InfluxDB:

# Install Telegraf
sudo apt install -y telegraf  # After adding influxdata repo

# Upload an existing Telegraf configuration to InfluxDB (agents can then fetch it)
influx telegrafs create \
  --name "server-metrics" \
  --org myorg \
  --file /etc/telegraf/telegraf.conf

# Or configure manually
sudo tee /etc/telegraf/telegraf.conf > /dev/null << 'EOF'
[global_tags]
  env = "production"

[agent]
  interval = "10s"
  round_interval = true
  metric_batch_size = 5000
  metric_buffer_limit = 50000
  flush_interval = "10s"

# InfluxDB v2 output
[[outputs.influxdb_v2]]
  urls = ["http://localhost:8086"]
  token = "your-influx-token"
  organization = "myorg"
  bucket = "metrics"

# System metrics
[[inputs.cpu]]
  percpu = true
  totalcpu = true
  collect_cpu_time = false

[[inputs.mem]]

[[inputs.disk]]
  ignore_fs = ["tmpfs", "devtmpfs", "overlay"]

[[inputs.diskio]]

[[inputs.net]]
  interfaces = ["eth0", "lo"]

[[inputs.system]]

[[inputs.processes]]

# Nginx metrics (if available)
[[inputs.nginx]]
  urls = ["http://localhost/nginx_status"]
EOF

sudo systemctl enable --now telegraf
sudo systemctl status telegraf

# Test Telegraf configuration
telegraf --config /etc/telegraf/telegraf.conf --test
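The --test run prints each metric as a line-protocol line. To sanity-check what will land in the bucket, a small parser sketch for simple lines (it does not handle escaped commas or spaces, so it is a diagnostic aid, not a full implementation):

```python
# Sketch: split one simple line of `telegraf --test` output (line protocol)
# into measurement, tags, fields, and optional timestamp.
def parse_line(line: str):
    head, field_str, *ts = line.strip().split(" ")
    measurement, *tag_pairs = head.split(",")
    tags = dict(p.split("=", 1) for p in tag_pairs)
    fields = dict(p.split("=", 1) for p in field_str.split(","))
    return measurement, tags, fields, (int(ts[0]) if ts else None)

m, tags, fields, ts = parse_line(
    "cpu,host=web01,cpu=cpu-total usage_idle=97.2 1609459200000000000"
)
print(m, tags, fields, ts)
```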

Backup and Restore

# Create a backup
influx backup /opt/influxdb-backup \
  --host http://localhost:8086 \
  --token "$INFLUX_TOKEN" \
  --org myorg

# List backup contents
ls -la /opt/influxdb-backup/

# Restore from backup
influx restore /opt/influxdb-backup \
  --host http://localhost:8086 \
  --token "$INFLUX_TOKEN" \
  --org myorg

# Schedule daily backups
sudo tee /etc/cron.daily/influxdb-backup > /dev/null << 'SCRIPT'
#!/bin/bash
BACKUP_DIR="/opt/influxdb-backups/$(date +%Y-%m-%d)"
mkdir -p "$BACKUP_DIR"
influx backup "$BACKUP_DIR" \
  --host http://localhost:8086 \
  --token "your-influx-token" \
  --org myorg

# Keep only 7 days of backups
find /opt/influxdb-backups -maxdepth 1 -type d -mtime +7 -exec rm -rf {} +
SCRIPT
sudo chmod +x /etc/cron.daily/influxdb-backup

Grafana Visualization

# Install Grafana (requires the Grafana APT repository; see the Grafana install docs)
sudo apt install -y grafana

sudo systemctl enable --now grafana-server

# Access Grafana at http://your-server:3000
# Default credentials: admin / admin

# Add InfluxDB data source in Grafana:
# Configuration > Data Sources > Add data source > InfluxDB
# Query Language: Flux
# URL: http://localhost:8086
# Organization: myorg
# Token: your-influx-token
# Default Bucket: metrics
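Instead of clicking through the UI, the data source can also be provisioned from a file. A sketch of /etc/grafana/provisioning/datasources/influxdb.yaml, assuming the org, bucket, and token used throughout this guide:

```yaml
# Grafana data source provisioning (loaded at grafana-server startup)
apiVersion: 1
datasources:
  - name: InfluxDB
    type: influxdb
    access: proxy
    url: http://localhost:8086
    jsonData:
      version: Flux
      organization: myorg
      defaultBucket: metrics
    secureJsonData:
      token: your-influx-token
```

Restart grafana-server after adding or changing provisioning files.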

Sample Grafana Flux query for a CPU panel:

from(bucket: "metrics")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r._measurement == "cpu")
  |> filter(fn: (r) => r._field == "usage_idle")
  |> filter(fn: (r) => r.cpu == "cpu-total")
  |> map(fn: (r) => ({r with _value: 100.0 - r._value}))
  |> aggregateWindow(every: v.windowPeriod, fn: mean, createEmpty: false)
  |> yield(name: "mean")

Troubleshooting

InfluxDB fails to start:

sudo journalctl -u influxdb -n 50
# Common issue: port 8086 in use
ss -tlnp | grep 8086

# Check disk space
df -h /var/lib/influxdb

Write errors: "partial write":

# Check for field type conflicts (field was string, now writing int)
influx query --org myorg 'from(bucket:"metrics") |> range(start:-5m) |> last()'

# A field's type is fixed per measurement once written (within a shard);
# conflicting points are rejected. Delete the conflicting data if needed:
influx delete --org myorg --bucket metrics \
  --start "1970-01-01T00:00:00Z" --stop "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
  --predicate '_measurement="cpu_usage"'
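These conflicts can be caught before writing. A sketch that scans a batch of line-protocol lines and flags fields whose inferred type changes between points (types are inferred the way line protocol defines them: trailing i = integer, quoted = string, true/false = boolean, otherwise float):

```python
# Sketch: detect field type conflicts (the usual cause of "partial write"
# errors) in a batch of simple line-protocol lines before submitting them.
def field_type(raw: str) -> str:
    if raw.startswith('"'):
        return "string"
    if raw in ("true", "false", "t", "f", "T", "F"):
        return "boolean"
    if raw.endswith("i"):
        return "integer"
    return "float"

def find_conflicts(lines):
    seen, conflicts = {}, []
    for line in lines:
        head, field_str = line.split(" ")[:2]
        measurement = head.split(",")[0]
        for pair in field_str.split(","):
            key, raw = pair.split("=", 1)
            t = field_type(raw)
            prev = seen.setdefault((measurement, key), t)
            if prev != t:
                conflicts.append((measurement, key, prev, t))
    return conflicts

print(find_conflicts([
    "cpu_usage,host=a value=72.5",
    "cpu_usage,host=b value=72i",   # integer after an earlier float -> conflict
]))
# [('cpu_usage', 'value', 'float', 'integer')]
```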

Flux query is slow:

# Keep range() immediately after from() so the planner can push the time
# bounds and filters down to the storage engine.
# Good:
from(bucket: "metrics") |> range(start: -1h) |> filter(...)

# Bad (range applied after filter defeats pushdown, so far more data is read):
from(bucket: "metrics") |> filter(...) |> range(start: -1h)

# High series cardinality (too many unique tag combinations) slows queries.
# Start a schema audit by listing the measurements in the bucket:
influx query --org myorg 'import "influxdata/influxdb/schema"
schema.measurements(bucket: "metrics")'

Conclusion

InfluxDB 2.x provides a complete time-series platform with a built-in UI, Flux query engine, and integrated task scheduler. Pair it with Telegraf for zero-code metrics collection from any server and Grafana for visualization. Configure bucket retention policies to automatically age out old data and keep storage costs predictable. For high-ingest production workloads, run InfluxDB on SSD storage and allocate sufficient RAM for the WAL and series cardinality index.