Telegraf + InfluxDB + Grafana: The TIG Stack
The TIG stack (Telegraf, InfluxDB, Grafana) provides a complete metrics collection, storage, and visualization pipeline. Unlike Prometheus, InfluxDB is a time-series database optimized for high-volume metric ingestion and long-term storage. This guide covers InfluxDB installation, Telegraf configuration, metrics collection, and Grafana dashboards.
Table of Contents
- Introduction
- Architecture
- System Requirements
- InfluxDB Installation
- Telegraf Installation
- Telegraf Input Plugins
- Telegraf Output Configuration
- InfluxDB Queries
- Grafana Integration
- Advanced Configurations
- Performance Optimization
- Troubleshooting
- Conclusion
Introduction
The TIG stack takes a different approach to metrics collection than Prometheus. Instead of being scraped, Telegraf agents push data to InfluxDB, a model well suited to cloud environments and high-frequency metrics. Telegraf's plugin architecture enables collection from hundreds of data sources with minimal configuration.
Architecture
TIG Stack Overview
┌──────────────────────────────────────┐
│ Applications & Infrastructure │
│ ┌────────────────────────────────┐ │
│ │ System Metrics │ │
│ │ Application Logs & Events │ │
│ │ Database Performance │ │
│ └────────────────┬───────────────┘ │
└─────────────────────┼──────────────────┘
│
┌────────────▼────────────┐
│ Telegraf Agents │
│ - Collection │
│ - Aggregation │
│ - Transformation │
└────────────┬────────────┘
│
│ Push (HTTP/TCP)
│
┌────────────▼─────────────┐
│ InfluxDB Server │
│ - Time-series Storage │
│ - Retention Policies │
│ - Downsampling │
└────────────┬─────────────┘
│
┌────────────▼─────────────┐
│ Grafana │
│ - Visualization │
│ - Dashboards │
│ - Alerting │
└──────────────────────────┘
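Telegraf pushes metrics to InfluxDB over HTTP using the line protocol (`measurement,tags fields timestamp`). A minimal sketch of how one metric is rendered — the host and field values here are illustrative, and this is a simplified model, not Telegraf's serializer:

```python
import time

def to_line_protocol(measurement, tags, fields, ts_ns=None):
    """Render one metric in InfluxDB line protocol:
    measurement,tag=val,... field=val,... timestamp"""
    tag_str = ",".join(f"{k}={v}" for k, v in sorted(tags.items()))
    field_str = ",".join(f"{k}={v}" for k, v in sorted(fields.items()))
    ts = ts_ns if ts_ns is not None else time.time_ns()
    return f"{measurement},{tag_str} {field_str} {ts}"

line = to_line_protocol("cpu",
                        {"host": "web01", "cpu": "cpu-total"},
                        {"usage_user": 12.5},
                        ts_ns=1700000000000000000)
print(line)
# cpu,cpu=cpu-total,host=web01 usage_user=12.5 1700000000000000000
```

This is the wire format you will see if you run `telegraf --test`, and the same shape the Flux queries later in this guide operate on.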
System Requirements
- Linux (Ubuntu 20.04+, CentOS 8+, Debian 11+)
- Minimum 2GB RAM for InfluxDB
- At least 10GB storage (scales with metrics volume and retention)
- Internet connectivity for downloads
- Root or sudo access
InfluxDB Installation
Step 1: Add the Repository and Install
# Add InfluxDB repository (Ubuntu/Debian)
wget -q https://repos.influxdata.com/influxdata-archive_compat.key
echo '943666ed83d68847d957f4db127ac0c2f3b7614b40ee23581f3842fda7537541' | sha256sum -c && cat influxdata-archive_compat.key | gpg --dearmor | sudo tee /etc/apt/trusted.gpg.d/influxdata-archive_compat.gpg > /dev/null
echo 'deb [signed-by=/etc/apt/trusted.gpg.d/influxdata-archive_compat.gpg] https://repos.influxdata.com/debian stable main' | sudo tee /etc/apt/sources.list.d/influxdata.list
# Install
sudo apt-get update
sudo apt-get install -y influxdb2
Step 2: Start InfluxDB
sudo systemctl enable influxdb
sudo systemctl start influxdb
sudo systemctl status influxdb
# Verify service
curl -I http://localhost:8086/health
Step 3: Initial Configuration
# Access CLI
influx setup \
--username admin \
--password admin_password \
--org myorg \
--bucket mybucket \
--retention 30d \
--force
Step 4: Create an API Token
# Generate API token via CLI
influx auth create \
--org myorg \
--description "Telegraf token" \
--write-buckets
# Save token for Telegraf configuration
export INFLUX_TOKEN="your-generated-token"
Telegraf Installation
Step 1: Install Telegraf
# Add repository
wget -q https://repos.influxdata.com/influxdata-archive_compat.key
echo '943666ed83d68847d957f4db127ac0c2f3b7614b40ee23581f3842fda7537541' | sha256sum -c && cat influxdata-archive_compat.key | gpg --dearmor | sudo tee /etc/apt/trusted.gpg.d/influxdata-archive_compat.gpg > /dev/null
echo 'deb [signed-by=/etc/apt/trusted.gpg.d/influxdata-archive_compat.gpg] https://repos.influxdata.com/debian stable main' | sudo tee /etc/apt/sources.list.d/influxdata.list
# Install
sudo apt-get update
sudo apt-get install -y telegraf
# Enable service
sudo systemctl enable telegraf
Step 2: Generate the Configuration
# Generate default config
telegraf config > telegraf.conf
# Copy to /etc/telegraf
sudo cp telegraf.conf /etc/telegraf/telegraf.conf
# Create directory for custom configs
sudo mkdir -p /etc/telegraf/telegraf.d
Telegraf Input Plugins
System Metrics
# Edit telegraf config
sudo nano /etc/telegraf/telegraf.conf
# Or create specific config
sudo tee /etc/telegraf/telegraf.d/system.conf > /dev/null << 'EOF'
# System CPU usage
[[inputs.cpu]]
percpu = true
totalcpu = true
interval = "10s"
# System memory
[[inputs.mem]]
interval = "10s"
# Disk metrics
[[inputs.disk]]
mount_points = ["/"]
ignore_fs = ["tmpfs", "devtmpfs", "devfs"]
interval = "30s"
# Disk I/O
[[inputs.diskio]]
interval = "10s"
# Network interfaces
[[inputs.net]]
interface_include = ["eth0", "eth1"]
interval = "10s"
# System processes
[[inputs.processes]]
interval = "10s"
# Load average
[[inputs.system]]
interval = "10s"
# Kernel metrics
[[inputs.linux_sysctl_fs]]
interval = "30s"
EOF
Application Monitoring
# MySQL monitoring
sudo tee /etc/telegraf/telegraf.d/mysql.conf > /dev/null << 'EOF'
[[inputs.mysql]]
servers = ["user:password@tcp(localhost:3306)/"]
  perf_events_statements_digest_text_limit = 120
  perf_events_statements_limit = 250
  perf_events_statements_time_limit = 86400
  metric_version = 2
interval = "30s"
EOF
# PostgreSQL monitoring
sudo tee /etc/telegraf/telegraf.d/postgres.conf > /dev/null << 'EOF'
[[inputs.postgresql]]
address = "host=localhost user=telegraf password=pwd dbname=postgres sslmode=disable"
databases = ["postgres"]
interval = "30s"
EOF
# Redis monitoring
sudo tee /etc/telegraf/telegraf.d/redis.conf > /dev/null << 'EOF'
[[inputs.redis]]
servers = ["tcp://localhost:6379"]
interval = "30s"
EOF
# Nginx monitoring
sudo tee /etc/telegraf/telegraf.d/nginx.conf > /dev/null << 'EOF'
[[inputs.nginx]]
urls = ["http://localhost/nginx_status"]
interval = "10s"
EOF
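The nginx input reads nginx's stub_status endpoint, which must be enabled in nginx first. A minimal sketch of that server block (the location path matches the URL above; listen address and ACL are assumptions for a local agent):

```nginx
server {
    listen 127.0.0.1:80;
    location /nginx_status {
        stub_status;        # exposes active connections, accepts, requests
        allow 127.0.0.1;    # restrict to the local Telegraf agent
        deny all;
    }
}
```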
# Docker monitoring
sudo tee /etc/telegraf/telegraf.d/docker.conf > /dev/null << 'EOF'
[[inputs.docker]]
endpoint = "unix:///var/run/docker.sock"
gather_services = false
timeout = "5s"
perdevice = true
total = true
interval = "30s"
EOF
Custom Metrics via the exec Plugin
sudo tee /etc/telegraf/telegraf.d/custom.conf > /dev/null << 'EOF'
# Execute custom scripts
[[inputs.exec]]
commands = [
"bash /opt/scripts/custom_metric.sh",
"python3 /opt/scripts/app_metrics.py"
]
timeout = "5s"
data_format = "json"
interval = "60s"
tag_keys = ["hostname", "service"]
EOF
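A sketch of what one of those scripts might emit — the path /opt/scripts/app_metrics.py above is illustrative, as are the service name and metric values here. With data_format = "json", Telegraf turns numeric values into fields, and the keys listed in tag_keys become tags:

```python
#!/usr/bin/env python3
# Minimal metrics script for Telegraf's exec input (data_format = "json").
import json
import socket

def collect():
    return {
        "hostname": socket.gethostname(),  # becomes a tag (in tag_keys)
        "service": "myapp",                # hypothetical service name; also a tag
        "queue_depth": 42,                 # example gauge -> field
        "requests_total": 1234,            # example counter -> field
    }

if __name__ == "__main__":
    # One JSON object per run; Telegraf invokes this every interval
    print(json.dumps(collect()))
```

Make the script executable (`chmod +x`) and test it with `telegraf --test --config /etc/telegraf/telegraf.d/custom.conf`.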
Telegraf Output Configuration
Main InfluxDB Output
sudo tee /etc/telegraf/telegraf.d/outputs.conf > /dev/null << 'EOF'
[[outputs.influxdb_v2]]
  urls = ["http://localhost:8086"]
  token = "your-api-token"
  organization = "myorg"
  bucket = "telegraf"
  # TLS (if needed)
  insecure_skip_verify = false
  tls_ca = "/etc/telegraf/ca.pem"
  tls_cert = "/etc/telegraf/cert.pem"
  tls_key = "/etc/telegraf/key.pem"
EOF
Note: batching (flush_interval, metric_buffer_limit) is configured in the [agent] section (see Performance Optimization below), and retention is a property of the InfluxDB bucket, not an option of the influxdb_v2 output.
Multiple Output Destinations
# Send to multiple InfluxDB instances
[[outputs.influxdb_v2]]
urls = ["http://primary:8086"]
token = "token1"
organization = "myorg"
bucket = "telegraf"
[[outputs.influxdb_v2]]
  urls = ["http://backup:8086"]
  token = "token2"
  organization = "myorg"
  bucket = "telegraf"
  # Only metrics tagged backup=true are sent to this output
  [outputs.influxdb_v2.tagpass]
    backup = ["true"]
InfluxDB Queries
InfluxQL Queries
# Connect to InfluxDB CLI
influx v1 shell
# List buckets
show databases
# Query metrics
SELECT * FROM cpu WHERE time > now() - 1h
# Aggregations
SELECT mean(usage_user) FROM cpu GROUP BY time(1m), host
# Query multiple measurements (InfluxQL does not support JOIN)
SELECT * FROM "cpu", "mem" WHERE time > now() - 1h
Flux Queries (InfluxDB 2.x)
# Basic query
from(bucket: "telegraf")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "cpu")
# Aggregation
from(bucket: "telegraf")
|> range(start: -24h)
|> filter(fn: (r) => r._measurement == "mem")
|> aggregateWindow(every: 1m, fn: mean)
# Multi-series
from(bucket: "telegraf")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement =~ /^(cpu|mem|disk)$/)
|> group(columns: ["_measurement"])
|> aggregateWindow(every: 5m, fn: mean)
# Downsampling
from(bucket: "telegraf")
|> range(start: -30d)
|> filter(fn: (r) => r._measurement == "cpu")
|> aggregateWindow(every: 1h, fn: mean)
|> to(bucket: "telegraf-downsampled")
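What aggregateWindow does can be modeled outside Flux: bucket the raw points into fixed time windows and apply the function to each window. A minimal Python sketch with synthetic timestamps in seconds (not InfluxDB's implementation, just the idea):

```python
from collections import defaultdict

def aggregate_window(points, every_s):
    """Mimic Flux aggregateWindow(fn: mean): group (ts, value) points
    into fixed-size windows and average each window."""
    buckets = defaultdict(list)
    for ts, val in points:
        buckets[ts - ts % every_s].append(val)   # window start time
    return {w: sum(v) / len(v) for w, v in sorted(buckets.items())}

# Raw points every 30s, aggregated into 60s windows
points = [(0, 10.0), (30, 20.0), (60, 30.0), (90, 50.0)]
agg = aggregate_window(points, 60)
print(agg)  # {0: 15.0, 60: 40.0}
```

Downsampling long-retention data this way (hourly means instead of 10-second samples) is what keeps the 30-day query above cheap.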
Grafana Integration
Add the InfluxDB Data Source
curl -X POST http://admin:admin@localhost:3000/api/datasources \
-H "Content-Type: application/json" \
-d '{
"name": "InfluxDB",
"type": "influxdb",
"url": "http://localhost:8086",
"access": "proxy",
"isDefault": true,
  "jsonData": {
    "version": "Flux",
    "organization": "myorg",
    "defaultBucket": "telegraf"
  },
  "secureJsonData": {
    "token": "your-api-token"
  }
}'
Create a Panel
Example Flux query for Grafana (using Grafana's built-in dashboard variables):
from(bucket: "telegraf")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r._measurement == "cpu")
  |> filter(fn: (r) => r._field == "usage_user")
  |> aggregateWindow(every: v.windowPeriod, fn: mean)
Advanced Configurations
Processor Plugins (Data Transformation)
sudo tee /etc/telegraf/telegraf.d/processors.conf > /dev/null << 'EOF'
# Rename fields
[[processors.rename]]
  [[processors.rename.replace]]
    field = "usage_user"
    dest = "cpu_usage_percent"

# Add static tags (via the override processor)
[[processors.override]]
  [processors.override.tags]
    environment = "production"

# Dropping fields is done per plugin, not via a processor:
# add fieldexclude = ["fieldname1", "fieldname2"] to an input or output.

# Rewrite tag values with a regular expression
[[processors.regex]]
  [[processors.regex.tags]]
    key = "host"
    pattern = "^(.+?)\\..*"
    replacement = "${1}"
EOF
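The regex processor applies a regex replacement to the tag value. To trim a host tag to its short name, the pattern must consume the rest of the string (e.g. `^(.+?)\..*` with replacement `${1}`), otherwise only the matched prefix is replaced. A Python sketch of the intended transformation:

```python
import re

def shorten_host(host: str) -> str:
    """Keep only the first DNS label, mimicking the regex processor's
    pattern "^(.+?)\\..*" with replacement "${1}"."""
    return re.sub(r"^(.+?)\..*", r"\1", host)

print(shorten_host("web01.example.com"))  # web01
print(shorten_host("standalone"))         # no dot, unchanged: standalone
```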
Aggregator Plugins (Windowing)
sudo tee /etc/telegraf/telegraf.d/aggregators.conf > /dev/null << 'EOF'
# Min/Max aggregation
[[aggregators.minmax]]
  period = "30s"
  drop_original = false

# Quantiles (Telegraf's aggregator is named quantile, with values in 0-1)
[[aggregators.quantile]]
  period = "60s"
  quantiles = [0.50, 0.90, 0.95, 0.99]
  fieldpass = ["response_time"]
EOF
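What these aggregators emit per period can be modeled with a nearest-rank percentile over one window. A sketch with synthetic response-time samples — this illustrates the idea, not Telegraf's exact estimation algorithm:

```python
import math

def percentile(values, q):
    """Nearest-rank percentile over one aggregation window."""
    s = sorted(values)
    k = max(0, math.ceil(q / 100 * len(s)) - 1)
    return s[k]

window = [120, 150, 110, 300, 220, 180, 250, 140, 160, 200]  # ms samples
print(min(window), max(window))  # what minmax reports per period: 110 300
print(percentile(window, 50))    # 160
print(percentile(window, 95))    # 300
```

Aggregating percentiles at the agent is cheaper than shipping every raw sample, at the cost of losing per-sample detail.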
Performance Optimization
Batch Configuration
sudo nano /etc/telegraf/telegraf.conf
# Batching settings (agent-level; they apply to all outputs)
[agent]
  metric_batch_size = 5000
  metric_buffer_limit = 20000
  flush_interval = "10s"
  flush_jitter = "0s"
Sampling and Filtering
sudo tee /etc/telegraf/telegraf.d/sampling.conf > /dev/null << 'EOF'
# Sample interval
[agent]
interval = "30s"
round_interval = true
# Collect only specific metrics
[[inputs.cpu]]
interval = "60s"
percpu = false
totalcpu = true
# Tag filtering: add these subtables to the influxdb_v2 output definition
#   [outputs.influxdb_v2.tagpass]
#     environment = ["production"]
#   [outputs.influxdb_v2.tagdrop]
#     test = ["true"]
EOF
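The tagpass/tagdrop semantics can be sketched: a metric matches if, for any condition key, its tag value is in the allowed list; tagpass keeps only matching metrics, tagdrop discards them. A minimal Python model:

```python
def tag_match(tags, conditions):
    """True if any condition key matches the metric's tag value."""
    return any(tags.get(k) in allowed for k, allowed in conditions.items())

def keep(tags, tagpass=None, tagdrop=None):
    """Decide whether an output would emit a metric with these tags."""
    if tagpass and not tag_match(tags, tagpass):
        return False  # tagpass set but no condition matched
    if tagdrop and tag_match(tags, tagdrop):
        return False  # a tagdrop condition matched
    return True

print(keep({"environment": "production"},
           tagpass={"environment": ["production"]}))             # True
print(keep({"environment": "production", "test": "true"},
           tagpass={"environment": ["production"]},
           tagdrop={"test": ["true"]}))                          # False
```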
Troubleshooting
Verify the Telegraf Service
# Service status
sudo systemctl status telegraf
# Check logs
sudo journalctl -u telegraf -f
# Test configuration
telegraf --test --config /etc/telegraf/telegraf.conf
Verify InfluxDB Connectivity
# Check health
curl -I http://localhost:8086/health
# Verify token
influx auth list
# Dry-run the cpu input (prints metrics to stdout; --test never writes)
telegraf --input-filter=cpu --output-filter=influxdb_v2 --test
Query Metrics
# Via influx CLI
influx query 'from(bucket: "telegraf") |> range(start: -1h) |> limit(n: 10)'
# Check bucket contents
influx bucket list
# View stored metrics
influx query 'from(bucket: "telegraf") |> group(columns: ["_measurement"])'
Retention and Cleanup
# Update retention policy
influx bucket update \
--id bucket-id \
--retention 30d
# List retention policies
influx bucket list --org myorg
Conclusion
The TIG stack provides a robust metrics collection and visualization platform with excellent scalability. By following this guide, you've deployed a high-performance monitoring system capable of handling thousands of metrics per second. Focus on efficient Telegraf configurations tailored to your infrastructure, leverage retention policies for cost-effective storage, and build Grafana dashboards that provide actionable insights. The flexibility of the TIG stack makes it ideal for cloud-native and high-volume metrics scenarios.


