SigNoz Open-Source APM Installation
SigNoz is an open-source application performance monitoring (APM) tool built on OpenTelemetry. It provides distributed tracing, metrics dashboards, and log management in a single interface, making it a self-hosted alternative to Datadog or New Relic. This guide covers installing SigNoz with Docker, instrumenting applications with OpenTelemetry, analyzing traces, and setting up dashboards.
Prerequisites
- Ubuntu 20.04+ or CentOS 8+ / Rocky Linux 8+
- Docker and Docker Compose
- 8 GB RAM minimum (SigNoz uses ClickHouse which is memory-intensive)
- 50 GB disk space
- Root or sudo access
Installing SigNoz with Docker Compose
# Clone the SigNoz repository
git clone -b main https://github.com/SigNoz/signoz.git && cd signoz/deploy/
# Check the docker-compose file (the docker-swarm/ directory is for Docker Swarm only)
ls docker/clickhouse-setup/
# Start SigNoz (uses ClickHouse for storage)
docker compose -f docker/clickhouse-setup/docker-compose.yaml up -d
# Monitor startup (ClickHouse takes 2-3 minutes to initialize)
docker compose -f docker/clickhouse-setup/docker-compose.yaml logs -f
# Verify services are up
docker compose -f docker/clickhouse-setup/docker-compose.yaml ps
Services started:
- clickhouse - the storage backend (ports 9000/8123)
- query-service - SigNoz backend API (port 8080)
- frontend - SigNoz UI (port 3301)
- otel-collector - OpenTelemetry collector (ports 4317 gRPC, 4318 HTTP)
- alertmanager - alert routing
Check that everything is running:
curl http://localhost:8080/api/v1/health
# Expected: {"status":"ok"}
Accessing the SigNoz UI
Access the SigNoz dashboard at http://your-server:3301.
On first access, create an admin account:
- Enter your email and password
- Complete the organization setup
- You'll land on the Services dashboard
The default view shows services auto-discovered from OpenTelemetry data. Initially it's empty until you instrument your applications.
Expose SigNoz behind Nginx with TLS:
server {
    listen 443 ssl;
    server_name signoz.example.com;

    ssl_certificate /etc/letsencrypt/live/signoz.example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/signoz.example.com/privkey.pem;

    location / {
        proxy_pass http://127.0.0.1:3301;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }

    location /api {
        proxy_pass http://127.0.0.1:8080;
        proxy_set_header Host $host;
    }
}
Instrumenting Applications with OpenTelemetry
SigNoz receives data via the OpenTelemetry Collector on port 4317 (gRPC) or 4318 (HTTP).
Python (FastAPI/Flask):
pip install opentelemetry-distro opentelemetry-exporter-otlp
opentelemetry-bootstrap -a install
# app.py (FastAPI example)
from fastapi import FastAPI
from opentelemetry import trace
from opentelemetry.instrumentation.fastapi import FastAPIInstrumentor

app = FastAPI()
FastAPIInstrumentor.instrument_app(app)

@app.get("/users/{user_id}")
async def get_user(user_id: int):
    return {"user_id": user_id}
Run with auto-instrumentation:
OTEL_RESOURCE_ATTRIBUTES="service.name=my-python-api" \
OTEL_EXPORTER_OTLP_ENDPOINT="http://signoz-host:4317" \
OTEL_EXPORTER_OTLP_PROTOCOL="grpc" \
opentelemetry-instrument python app.py
Node.js:
npm install @opentelemetry/sdk-node @opentelemetry/auto-instrumentations-node \
@opentelemetry/exporter-trace-otlp-grpc
// tracing.js - require this FIRST before your app code
const { NodeSDK } = require("@opentelemetry/sdk-node");
const { getNodeAutoInstrumentations } = require("@opentelemetry/auto-instrumentations-node");
const { OTLPTraceExporter } = require("@opentelemetry/exporter-trace-otlp-grpc");
const { Resource } = require("@opentelemetry/resources");
const { SemanticResourceAttributes } = require("@opentelemetry/semantic-conventions");

const sdk = new NodeSDK({
  resource: new Resource({
    [SemanticResourceAttributes.SERVICE_NAME]: "my-node-api",
  }),
  traceExporter: new OTLPTraceExporter({
    url: "http://signoz-host:4317",
  }),
  instrumentations: [getNodeAutoInstrumentations()],
});

sdk.start();
node -r ./tracing.js app.js
Go:
// tracer.go
package main

import (
	"context"
	"log"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc"
	"go.opentelemetry.io/otel/sdk/resource"
	sdktrace "go.opentelemetry.io/otel/sdk/trace"
	semconv "go.opentelemetry.io/otel/semconv/v1.17.0"
)

func initTracer() *sdktrace.TracerProvider {
	ctx := context.Background()
	exporter, err := otlptracegrpc.New(ctx,
		otlptracegrpc.WithInsecure(),
		otlptracegrpc.WithEndpoint("signoz-host:4317"),
	)
	if err != nil {
		log.Fatalf("failed to create OTLP exporter: %v", err)
	}
	tp := sdktrace.NewTracerProvider(
		sdktrace.WithBatcher(exporter),
		sdktrace.WithResource(resource.NewWithAttributes(
			semconv.SchemaURL,
			semconv.ServiceName("my-go-api"),
		)),
	)
	otel.SetTracerProvider(tp)
	return tp
}
Java:
# Download the OpenTelemetry Java agent
wget https://github.com/open-telemetry/opentelemetry-java-instrumentation/releases/latest/download/opentelemetry-javaagent.jar
# Run your application with the agent
java -javaagent:opentelemetry-javaagent.jar \
-Dotel.service.name=my-java-app \
-Dotel.exporter.otlp.endpoint=http://signoz-host:4317 \
-Dotel.exporter.otlp.protocol=grpc \
-jar myapp.jar
Analyzing Traces
After instrumenting your applications, traces appear in the SigNoz UI:
1. Services tab - overview of all instrumented services with:
   - Request rate (RPS)
   - Error rate (%)
   - P99 latency
   - Apdex score
2. Traces tab - search and filter individual traces:
   - Filter by service, operation, status, duration
   - Search by trace ID or span tags
   - View the full distributed trace waterfall
3. Trace detail view - shows:
   - Span tree with timing for each operation
   - Database queries with execution time
   - HTTP calls to downstream services
   - Error spans highlighted in red
   - Span attributes (HTTP method, URL, DB query, etc.)
Metrics Dashboards
SigNoz auto-collects the following metrics from instrumented services:
- http_server_duration - histogram of HTTP request durations
- db_client_operation_duration - database query times
- rpc_server_duration - gRPC call durations
Create custom dashboards:
- Dashboards → + New Dashboard
- Add panels:
- Time series for metrics over time
- Value for single stat (current error rate)
- Table for top slow endpoints
Example PromQL query for error rate:
sum(rate(http_server_duration_count{http_status_code=~"5.."}[5m]))
/
sum(rate(http_server_duration_count[5m]))
* 100
Send custom metrics from your application:
# Python custom metrics
from opentelemetry import metrics
from opentelemetry.metrics import CallbackOptions, Observation

meter = metrics.get_meter("my-app")

request_counter = meter.create_counter("app.requests.total")

# Observable gauge callbacks receive CallbackOptions and yield Observations
def observe_queue_size(options: CallbackOptions):
    yield Observation(len(my_queue))

queue_size = meter.create_observable_gauge(
    "app.queue.size",
    callbacks=[observe_queue_size],
    description="Current queue size",
)

# In your request handler:
request_counter.add(1, {"endpoint": "/api/users", "method": "GET"})
Log Management
SigNoz can collect and search logs. Configure your application to send logs via OTLP:
# With Fluent Bit, forward logs to SigNoz
cat >> /etc/fluent-bit/fluent-bit.conf << 'EOF'
[OUTPUT]
    Name                 opentelemetry
    Match                *
    Host                 signoz-host
    Port                 4318
    Logs_uri             /v1/logs
    Log_response_payload True
    tls                  Off
    tls.verify           Off
EOF
Or use the OpenTelemetry Python logging handler:
import logging
from opentelemetry._logs import set_logger_provider
from opentelemetry.exporter.otlp.proto.grpc._log_exporter import OTLPLogExporter
from opentelemetry.sdk._logs import LoggerProvider, LoggingHandler
from opentelemetry.sdk._logs.export import BatchLogRecordProcessor
logger_provider = LoggerProvider()
logger_provider.add_log_record_processor(
BatchLogRecordProcessor(OTLPLogExporter(endpoint="http://signoz-host:4317"))
)
set_logger_provider(logger_provider)
handler = LoggingHandler(level=logging.NOTSET, logger_provider=logger_provider)
logging.getLogger().addHandler(handler)
Alerts
Set up alerts based on metrics:
- Alerts → + New Alert Rule
- Configure:
- Metric based: use PromQL conditions
- Log based: search for log patterns
- Trace based: alert on high P99 latency
Example: alert when error rate exceeds 5%:
Expression:
sum(rate(signoz_calls_total{service_name="my-api", status_code="STATUS_CODE_ERROR"}[5m]))
/
sum(rate(signoz_calls_total{service_name="my-api"}[5m])) > 0.05
Condition: above 0.05 for 5 minutes
Severity: critical
Configure notification channels in Settings → Alert Channels:
- Slack webhook
- PagerDuty integration key
- Email (SMTP)
- OpsGenie
Troubleshooting
No traces appearing:
# Verify the collector is receiving data
docker logs signoz-otel-collector | grep -i "error\|received"
# Test with a manual OTLP export
curl -X POST http://signoz-host:4318/v1/traces \
-H 'Content-Type: application/json' \
-d '{"resourceSpans":[]}'
# Should return HTTP 200
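Beyond the empty payload above, you can exercise the HTTP endpoint with a real span. A stdlib-only sketch; the host name and service name are placeholders for your deployment:

```python
import json
import time
import urllib.request

def make_test_span(service_name: str = "otlp-smoke-test") -> dict:
    """Build a minimal OTLP/JSON trace payload containing one finished span."""
    now = int(time.time() * 1e9)
    return {
        "resourceSpans": [{
            "resource": {"attributes": [
                {"key": "service.name", "value": {"stringValue": service_name}}
            ]},
            "scopeSpans": [{"spans": [{
                "traceId": "5b8aa5a2d2c872e8321cf37308d69df2",  # 16-byte hex id
                "spanId": "051581bf3cb55c13",                   # 8-byte hex id
                "name": "smoke-test-span",
                "kind": 1,  # SPAN_KIND_INTERNAL
                "startTimeUnixNano": str(now),
                "endTimeUnixNano": str(now + 1_000_000),  # 1 ms duration
            }]}],
        }]
    }

def send_test_span(host: str = "signoz-host") -> int:
    """POST the span to the collector; a 200 status means it was accepted."""
    req = urllib.request.Request(
        f"http://{host}:4318/v1/traces",
        data=json.dumps(make_test_span()).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status
```

If `send_test_span` returns 200 and an `otlp-smoke-test` service shows up in the UI shortly after, the whole path from the network through the collector to ClickHouse is working.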
ClickHouse out of memory:
# Check ClickHouse memory usage
docker stats signoz-clickhouse
# Increase the container's memory allocation (e.g. a deploy resources limit
# or mem_limit in docker-compose), or reduce data retention in the SigNoz UI
# under Settings → General to shrink the working set
Frontend not loading:
docker logs signoz-frontend
# Check if query-service is healthy
curl http://localhost:8080/api/v1/health
gRPC connection refused from application:
# Verify port 4317 is accessible
nc -zv signoz-host 4317
# Check firewall rules
sudo ufw status | grep 4317
Conclusion
SigNoz delivers a complete open-source APM stack built on industry-standard OpenTelemetry, eliminating vendor lock-in while providing the distributed tracing, metrics, and log correlation that teams need to debug production issues. The single Docker Compose deployment is straightforward to operate, and the OpenTelemetry auto-instrumentation libraries cover most frameworks with minimal code changes. For production deployments, provision at least 16 GB RAM for ClickHouse and set up data retention policies to manage disk usage over time.