OpenTelemetry Collector Configuration
The OpenTelemetry Collector is a vendor-agnostic service for receiving, processing, and exporting telemetry data (traces, metrics, and logs) from applications to a variety of backends. This guide covers installation, receiver configuration, processors for data transformation, exporters to multiple destinations, and service configuration for reliable telemetry processing.
Table of Contents
- Introduction
- Architecture
- System Requirements
- Installation
- Receivers Configuration
- Processors for Data Transformation
- Exporters to Backends
- Service Configuration
- Advanced Scenarios
- Performance Tuning
- Troubleshooting
- Conclusion
Introduction
The OpenTelemetry Collector solves the challenge of collecting telemetry from diverse sources and routing it to multiple backends. It decouples instrumentation from infrastructure choices, so you can switch observability platforms without changing application code.
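As a minimal sketch of that decoupling, a single config file wires one receiver through one processor to one exporter; swapping backends later only means editing the exporters section. The `debug` exporter used here (available in recent releases; older ones used `logging`) simply prints received telemetry to the collector's log, which is handy for smoke tests:

```yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317

processors:
  batch:

exporters:
  # Prints received telemetry to the collector's own log output
  debug:

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [debug]
```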
Architecture
Collector Pipeline
Applications
↓
Instrumented with OpenTelemetry SDKs
↓
OTLP Protocol (gRPC/HTTP)
↓
┌─────────────────────────────────┐
│ OpenTelemetry Collector │
├─────────────────────────────────┤
│ │
│ Receivers │
│ ├─ OTLP (gRPC/HTTP) │
│ ├─ Prometheus │
│ ├─ Jaeger │
│ └─ Syslog │
│ ↓ │
│ Processors │
│ ├─ Batch │
│ ├─ Memory Limiter │
│ ├─ Sampling │
│ └─ Attribute Processor │
│ ↓ │
│ Exporters │
│ ├─ Prometheus │
│ ├─ Jaeger │
│ ├─ OTLP Backends │
│ └─ Multiple Destinations │
│ │
└─────────────────────────────────┘
↓
Observability Platforms
(Prometheus, Jaeger, Datadog, etc.)
System Requirements
- Linux, macOS, or Windows
- Minimum 512MB RAM
- 100MB disk space
- Go 1.20+ (only if building the Collector from source)
- Network connectivity
- Telemetry-generating applications
Installation
Binary Installation
# Download latest release
OTEL_VERSION="0.88.0"
wget https://github.com/open-telemetry/opentelemetry-collector-releases/releases/download/v${OTEL_VERSION}/otelcol-contrib_${OTEL_VERSION}_linux_amd64.tar.gz
tar -xzf otelcol-contrib_${OTEL_VERSION}_linux_amd64.tar.gz
# Install
sudo mv otelcol-contrib /usr/local/bin/
sudo chmod +x /usr/local/bin/otelcol-contrib
# Verify
otelcol-contrib --version
Docker Installation
# Pull image
docker pull otel/opentelemetry-collector-contrib:latest
# Run container
docker run -d \
-p 4317:4317 \
-p 4318:4318 \
-p 9411:9411 \
-p 14250:14250 \
-p 55679:55679 \
-v $(pwd)/otel-collector-config.yml:/etc/otel-collector-config.yml \
--name otel-collector \
otel/opentelemetry-collector-contrib:latest \
--config=/etc/otel-collector-config.yml
Systemd Service
sudo tee /etc/systemd/system/otel-collector.service > /dev/null << 'EOF'
[Unit]
Description=OpenTelemetry Collector
After=network.target
[Service]
Type=simple
ExecStart=/usr/local/bin/otelcol-contrib --config=/etc/otel-collector/config.yml
Restart=on-failure
RestartSec=5
StandardOutput=journal
StandardError=journal
SyslogIdentifier=otel-collector
[Install]
WantedBy=multi-user.target
EOF
sudo systemctl daemon-reload
sudo systemctl enable otel-collector
sudo systemctl start otel-collector
Receivers Configuration
OTLP Receivers
receivers:
  # OpenTelemetry Protocol over gRPC
  otlp/grpc:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
  # OpenTelemetry Protocol over HTTP
  otlp/http:
    protocols:
      http:
        endpoint: 0.0.0.0:4318
        cors:
          allowed_origins: ["*"]
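Applications instrumented with OpenTelemetry SDKs can be pointed at these endpoints through the standard SDK environment variables; the values below assume the collector runs on localhost:

```shell
# OTLP over HTTP on the collector's 4318 port
export OTEL_EXPORTER_OTLP_ENDPOINT="http://localhost:4318"
export OTEL_EXPORTER_OTLP_PROTOCOL="http/protobuf"

# Or OTLP over gRPC on 4317:
# export OTEL_EXPORTER_OTLP_ENDPOINT="http://localhost:4317"
# export OTEL_EXPORTER_OTLP_PROTOCOL="grpc"
```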
Prometheus Receiver
receivers:
  prometheus:
    config:
      scrape_configs:
        - job_name: 'prometheus'
          static_configs:
            - targets: ['localhost:9090']
        - job_name: 'node-exporter'
          static_configs:
            - targets: ['localhost:9100']
          metric_relabel_configs:
            - source_labels: [__name__]
              regex: 'node_network_.*'
              action: drop
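The same receiver can also scrape the collector's own internal telemetry (exposed on port 8888 by default), which is useful for self-monitoring; the job name below is just a suggestion:

```yaml
scrape_configs:
  - job_name: 'otel-collector-self'
    scrape_interval: 15s
    static_configs:
      - targets: ['localhost:8888']
```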
Jaeger Receiver
receivers:
  jaeger:
    protocols:
      grpc:
        endpoint: 0.0.0.0:14250
      thrift_http:
        endpoint: 0.0.0.0:14268
Syslog Receiver
receivers:
  syslog:
    udp:
      # Ports below 1024 require elevated privileges
      listen_address: 0.0.0.0:514
    protocol: rfc5424
Filelog Receiver
receivers:
  filelog:
    include:
      - /var/log/app/*.log
      - /var/log/syslog
    multiline:
      line_start_pattern: '^\d{4}-\d{2}-\d{2}'
    operators:
      - type: json_parser
        parse_from: body
Processors for Data Transformation
Batch Processor
processors:
  batch:
    send_batch_size: 1024
    timeout: 10s
    send_batch_max_size: 2048
Memory Limiter
processors:
  memory_limiter:
    check_interval: 1s
    limit_mib: 512
    spike_limit_mib: 128
Sampling Processor
processors:
  # Head-based sampling: decides per trace ID as spans arrive
  probabilistic_sampler:
    sampling_percentage: 10  # Sample 10% of traces
  # Tail-based sampling: decides after buffering the whole trace
  tail_sampling:
    policies:
      - name: error-spans
        type: status_code
        status_code:
          status_codes: [ERROR]
      - name: slow-traces
        type: latency
        latency:
          threshold_ms: 1000
      - name: default-sampling
        type: probabilistic
        probabilistic:
          sampling_percentage: 10
Attribute Processor
processors:
  attributes:
    actions:
      - key: service.version
        value: 1.0.0
        action: insert
      - key: environment
        value: production
        action: insert
      - key: internal_id
        action: delete
      # Mask sensitive values rather than exporting them
      # (regex-based rewriting needs the transform or redaction processor)
      - key: db.password
        value: "****"
        action: update
Resource Processor
processors:
  resource:
    attributes:
      - key: service.name
        value: my-service
        action: insert
      - key: host.name
        from_attribute: hostname
        action: insert
Span Processor
processors:
  span:
    name:
      to_attributes:
        rules:
          - ^/api/(?P<method>[^/]+)
          - ^(?P<operation>[^/]+)
    status:
      code: Error
      description: Span has error status
Exporters to Backends
Prometheus Exporter
exporters:
  prometheus:
    # 8889 avoids clashing with the collector's own telemetry port (8888)
    endpoint: 0.0.0.0:8889
    resource_to_telemetry_conversion:
      enabled: true
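A Prometheus server then scrapes this endpoint like any other target; the fragment below assumes Prometheus can reach the collector host as localhost:

```yaml
scrape_configs:
  - job_name: 'otel-collector-exporter'
    static_configs:
      # Must match the prometheus exporter's endpoint port
      - targets: ['localhost:8889']
```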
Jaeger Exporter
Note: recent Collector releases have removed the dedicated Jaeger exporter, since Jaeger now ingests OTLP natively; on current builds, point an otlp exporter at Jaeger's OTLP port (4317) instead. The classic configuration looked like this:
exporters:
  jaeger/grpc:
    endpoint: localhost:14250
    tls:
      insecure: true
  jaeger/http:
    endpoint: http://localhost:14268/api/traces
OTLP Exporters
exporters:
  # Export to external OTLP backend over gRPC
  otlp/grpc:
    endpoint: otel-backend.example.com:4317
    tls:
      insecure: false
  # Export to Grafana Cloud (OTLP over HTTP uses the otlphttp exporter type)
  otlphttp/grafana:
    endpoint: https://otlp-gateway-prod-us-central-1.grafana.net/otlp
    headers:
      # base64-encoded "<instance-id>:<api-token>"
      Authorization: "Basic YOUR_ENCODED_CREDENTIALS"
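The Basic credential is just the base64 encoding of the instance ID and API token joined by a colon; the values below are obviously placeholders:

```shell
# Encode "<instance-id>:<api-token>" for the Authorization header
printf '123456:glc_token' | base64
# MTIzNDU2OmdsY190b2tlbg==
```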
Multiple Backends
exporters:
  prometheus:
    endpoint: 0.0.0.0:8889
  jaeger/grpc:
    endpoint: jaeger:14250
  datadog:
    api:
      key: YOUR_API_KEY
      site: datadoghq.com
Service Configuration
Basic Service Setup
service:
  pipelines:
    # Traces pipeline
    traces:
      receivers: [otlp/grpc, otlp/http, jaeger]
      processors: [memory_limiter, probabilistic_sampler, batch]
      exporters: [jaeger/grpc, otlp/grpc]
    # Metrics pipeline
    metrics:
      receivers: [otlp/grpc, otlp/http, prometheus]
      processors: [memory_limiter, batch]
      exporters: [prometheus, otlphttp/grafana]
    # Logs pipeline
    logs:
      receivers: [otlp/grpc, otlp/http, syslog, filelog]
      processors: [memory_limiter, batch]
      exporters: [otlp/grpc]
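The Troubleshooting section later probes a health endpoint on port 13133; that endpoint is served by the health_check extension, which must be both defined and registered under service. A minimal sketch:

```yaml
extensions:
  health_check:
    endpoint: 0.0.0.0:13133

service:
  extensions: [health_check]
  # pipelines: ... as above ...
```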
Multi-Backend Configuration
service:
  pipelines:
    traces:
      receivers: [otlp/grpc]
      processors: [memory_limiter, tail_sampling, batch]
      exporters:
        - jaeger/grpc
        - otlp/datadog
        - otlp/honeycomb
    metrics:
      receivers: [otlp/grpc, prometheus]
      processors: [memory_limiter, batch]
      exporters:
        - prometheus
        - otlp/datadog
        - otlphttp/grafana
    logs:
      receivers: [otlp/grpc, syslog, filelog]
      processors: [memory_limiter, batch]
      exporters:
        - otlphttp/grafana
        - otlp/datadog
Advanced Scenarios
Conditional Routing
processors:
  routing:
    from_attribute: environment
    default_exporters:
      - otlp/default
    table:
      - value: production
        exporters: [otlp/prod]
      - value: staging
        exporters: [otlp/staging]

service:
  pipelines:
    traces:
      receivers: [otlp/grpc]
      processors: [routing, batch]
      exporters: [otlp/default, otlp/prod, otlp/staging]
Multi-Region Deployment
receivers:
  otlp/us_east:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
  otlp/eu_west:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4318

service:
  pipelines:
    traces:
      receivers: [otlp/us_east, otlp/eu_west]
      processors: [memory_limiter, batch]
      exporters:
        - jaeger/us_east
        - jaeger/eu_west
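The jaeger/us_east and jaeger/eu_west exporters referenced above are not defined elsewhere in this guide; a sketch of what they might look like (hostnames are placeholders):

```yaml
exporters:
  jaeger/us_east:
    endpoint: jaeger-us-east.example.internal:14250
    tls:
      insecure: true
  jaeger/eu_west:
    endpoint: jaeger-eu-west.example.internal:14250
    tls:
      insecure: true
```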
Collector as Gateway
receivers:
  otlp/app1:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
  otlp/app2:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4318

processors:
  resource/app1:
    attributes:
      - key: app
        value: app1
        action: insert
  resource/app2:
    attributes:
      - key: app
        value: app2
        action: insert

service:
  pipelines:
    traces/app1:
      receivers: [otlp/app1]
      processors: [memory_limiter, resource/app1, batch]
      exporters: [jaeger/grpc]
    traces/app2:
      receivers: [otlp/app2]
      processors: [memory_limiter, resource/app2, batch]
      exporters: [jaeger/grpc]
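In a typical agent/gateway topology, a lightweight collector on each application host forwards everything to a central gateway like the one above over OTLP. A sketch of the agent-side config (the gateway hostname is a placeholder):

```yaml
# Agent-side configuration: receive locally, forward to the gateway
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317

exporters:
  otlp:
    endpoint: otel-gateway.internal:4317
    tls:
      insecure: true

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp]
```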
Performance Tuning
Memory Configuration
processors:
  memory_limiter:
    check_interval: 1s
    limit_mib: 1024        # Overall limit
    spike_limit_mib: 256   # Spike allowance
  batch:
    send_batch_size: 2048
    timeout: 10s
    send_batch_max_size: 4096

extensions:
  # Deprecated in newer releases in favor of the GOMEMLIMIT environment variable
  memory_ballast:
    size_mib: 512  # Reserve memory to smooth GC behavior
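On builds where memory_ballast is deprecated, the Go runtime's soft memory limit serves the same purpose; the value below is an example, typically set to roughly 80-90% of the memory available to the process:

```shell
# Cap the Go runtime's heap instead of using a memory ballast
export GOMEMLIMIT=900MiB
```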
Queue Configuration
exporters:
  otlp/grpc:
    endpoint: backend:4317
    # Retry configuration
    retry_on_failure:
      enabled: true
      initial_interval: 5s
      max_interval: 30s
      max_elapsed_time: 5m
    # Queue settings
    sending_queue:
      enabled: true
      num_consumers: 10
      queue_size: 5000
      storage: file_storage
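The storage: file_storage reference points at a storage extension that must be defined and registered for the queue to persist across restarts; a minimal sketch (the directory is an assumption and must exist and be writable by the collector):

```yaml
extensions:
  file_storage:
    directory: /var/lib/otelcol/storage

service:
  extensions: [file_storage]
```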
Telemetry Configuration
service:
  telemetry:
    logs:
      level: info
    metrics:
      level: detailed
  # Extensions must be listed here to be enabled
  extensions: [pprof, zpages]

extensions:
  pprof:
    endpoint: :1888
  zpages:
    endpoint: :55679
Troubleshooting
Check Collector Status
# Service status
systemctl status otel-collector
# View logs
journalctl -u otel-collector -f
# Check health (requires the health_check extension)
curl http://localhost:13133/healthz
Debug Configuration
# Validate configuration
otelcol-contrib validate --config=config.yml
# Enable HTTP/2 wire-level debug output (useful for gRPC issues)
GODEBUG=http2debug=1 otelcol-contrib --config=config.yml
# Enable pprof for profiling
curl http://localhost:1888/debug/pprof/
Monitor Metrics
# Export metrics endpoint
curl http://localhost:8888/metrics
# Check pipeline stats
curl http://localhost:8888/metrics | grep otelcol_
# Monitor memory usage
curl http://localhost:8888/metrics | grep process_runtime_go_
Conclusion
The OpenTelemetry Collector provides a flexible, vendor-neutral way to collect and process telemetry data. By following this guide, you've deployed a robust telemetry pipeline capable of handling traces, metrics, and logs from diverse sources. Focus on designing efficient processor chains, setting appropriate memory limits, and leveraging multiple exporters for comprehensive observability. The collector's flexibility makes it essential infrastructure in modern observability architectures.


