SigNoz Open-Source APM Installation

SigNoz is an open-source application performance monitoring (APM) tool built on OpenTelemetry. It provides distributed tracing, metrics dashboards, and log management in a unified interface, making it a self-hosted alternative to Datadog or New Relic. This guide covers installing SigNoz with Docker, instrumenting applications with OpenTelemetry, analyzing traces, and setting up dashboards.

Prerequisites

  • Ubuntu 20.04+ or CentOS 8+ / Rocky Linux 8+
  • Docker and Docker Compose
  • 8 GB RAM minimum (SigNoz uses ClickHouse which is memory-intensive)
  • 50 GB disk space
  • Root or sudo access

Installing SigNoz with Docker Compose

# Clone the SigNoz repository
git clone -b main https://github.com/SigNoz/signoz.git && cd signoz/deploy/

# Check the docker-compose file (the docker-swarm/ directory is for Swarm clusters;
# with plain Docker Compose use docker/)
ls docker/clickhouse-setup/

# Start SigNoz (uses ClickHouse for storage)
docker compose -f docker/clickhouse-setup/docker-compose.yaml up -d

# Monitor startup (ClickHouse takes 2-3 minutes to initialize)
docker compose -f docker/clickhouse-setup/docker-compose.yaml logs -f

# Verify services are up
docker compose -f docker/clickhouse-setup/docker-compose.yaml ps

Services started:

  • clickhouse - the storage backend (port 9000/8123)
  • query-service - SigNoz backend API (port 8080)
  • frontend - SigNoz UI (port 3301)
  • otel-collector - OpenTelemetry collector (ports 4317 gRPC, 4318 HTTP)
  • alertmanager - alert routing

Check that everything is running:

curl http://localhost:8080/api/v1/health
# Expected: {"status":"ok"}

Accessing the SigNoz UI

Access the SigNoz dashboard at http://your-server:3301.

On first access, create an admin account:

  1. Enter your email and password
  2. Complete the organization setup
  3. You'll land on the Services dashboard

The default view lists services auto-discovered from OpenTelemetry data; it remains empty until you instrument your applications.

Expose SigNoz behind Nginx with TLS:

server {
    listen 443 ssl;
    server_name signoz.example.com;

    ssl_certificate /etc/letsencrypt/live/signoz.example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/signoz.example.com/privkey.pem;

    location / {
        proxy_pass http://127.0.0.1:3301;
        proxy_http_version 1.1;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }

    location /api {
        proxy_pass http://127.0.0.1:8080;
        proxy_set_header Host $host;
    }
}

Instrumenting Applications with OpenTelemetry

SigNoz receives data via the OpenTelemetry Collector on port 4317 (gRPC) or 4318 (HTTP).

Python (FastAPI/Flask):

pip install opentelemetry-distro opentelemetry-exporter-otlp
opentelemetry-bootstrap -a install

# app.py (FastAPI example)
from fastapi import FastAPI
from opentelemetry import trace
from opentelemetry.instrumentation.fastapi import FastAPIInstrumentor

app = FastAPI()
FastAPIInstrumentor.instrument_app(app)

@app.get("/users/{user_id}")
async def get_user(user_id: int):
    return {"user_id": user_id}

Run with auto-instrumentation:

OTEL_RESOURCE_ATTRIBUTES="service.name=my-python-api" \
OTEL_EXPORTER_OTLP_ENDPOINT="http://signoz-host:4317" \
OTEL_EXPORTER_OTLP_PROTOCOL="grpc" \
opentelemetry-instrument python app.py

Node.js:

npm install @opentelemetry/sdk-node @opentelemetry/auto-instrumentations-node \
    @opentelemetry/exporter-trace-otlp-grpc

// tracing.js - require this FIRST before your app code
const { NodeSDK } = require("@opentelemetry/sdk-node");
const { getNodeAutoInstrumentations } = require("@opentelemetry/auto-instrumentations-node");
const { OTLPTraceExporter } = require("@opentelemetry/exporter-trace-otlp-grpc");
const { Resource } = require("@opentelemetry/resources");
const { SemanticResourceAttributes } = require("@opentelemetry/semantic-conventions");

const sdk = new NodeSDK({
    resource: new Resource({
        [SemanticResourceAttributes.SERVICE_NAME]: "my-node-api",
    }),
    traceExporter: new OTLPTraceExporter({
        url: "http://signoz-host:4317",
    }),
    instrumentations: [getNodeAutoInstrumentations()],
});

sdk.start();

Run with:

node -r ./tracing.js app.js

Go:

// tracer.go
package main

import (
    "context"
    "log"

    "go.opentelemetry.io/otel"
    "go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc"
    "go.opentelemetry.io/otel/sdk/resource"
    sdktrace "go.opentelemetry.io/otel/sdk/trace"
    semconv "go.opentelemetry.io/otel/semconv/v1.17.0"
)

func initTracer() *sdktrace.TracerProvider {
    ctx := context.Background()

    exporter, err := otlptracegrpc.New(ctx,
        otlptracegrpc.WithInsecure(),
        otlptracegrpc.WithEndpoint("signoz-host:4317"),
    )
    if err != nil {
        log.Fatalf("failed to create OTLP exporter: %v", err)
    }

    tp := sdktrace.NewTracerProvider(
        sdktrace.WithBatcher(exporter),
        sdktrace.WithResource(resource.NewWithAttributes(
            semconv.SchemaURL,
            semconv.ServiceName("my-go-api"),
        )),
    )
    otel.SetTracerProvider(tp)
    return tp
}

Java:

# Download the OpenTelemetry Java agent
wget https://github.com/open-telemetry/opentelemetry-java-instrumentation/releases/latest/download/opentelemetry-javaagent.jar

# Run your application with the agent
java -javaagent:opentelemetry-javaagent.jar \
     -Dotel.service.name=my-java-app \
     -Dotel.exporter.otlp.endpoint=http://signoz-host:4317 \
     -Dotel.exporter.otlp.protocol=grpc \
     -jar myapp.jar

Analyzing Traces

After instrumenting your applications, traces appear in the SigNoz UI:

  1. Services tab - overview of all instrumented services with:

    • Request rate (RPS)
    • Error rate (%)
    • P99 latency
    • Apdex score
  2. Traces tab - search and filter individual traces:

    • Filter by service, operation, status, duration
    • Search by trace ID or span tags
    • View the full distributed trace waterfall
  3. Trace detail view shows:

    • Span tree with timing for each operation
    • Database queries with execution time
    • HTTP calls to downstream services
    • Error spans highlighted in red
    • Span attributes (HTTP method, URL, DB query, etc.)

Metrics Dashboards

SigNoz auto-collects the following metrics from instrumented services:

  • http_server_duration - histogram of HTTP request durations
  • db_client_operation_duration - database query times
  • rpc_server_duration - gRPC call durations

Create custom dashboards:

  1. Dashboards → + New Dashboard
  2. Add panels:
    • Time series for metrics over time
    • Value for single stat (current error rate)
    • Table for top slow endpoints

Example PromQL query for error rate:

sum(rate(http_server_duration_count{http_status_code=~"5.."}[5m]))
/
sum(rate(http_server_duration_count[5m]))
* 100

Send custom metrics from your application:

# Python custom metrics
from opentelemetry import metrics
from opentelemetry.metrics import CallbackOptions, Observation

meter = metrics.get_meter("my-app")
request_counter = meter.create_counter("app.requests.total")

def observe_queue_size(options: CallbackOptions):
    # Observable gauge callbacks receive CallbackOptions and yield Observations
    yield Observation(len(my_queue), {})

queue_size = meter.create_observable_gauge("app.queue.size",
    callbacks=[observe_queue_size],
    description="Current queue size",
)

# In your request handler:
request_counter.add(1, {"endpoint": "/api/users", "method": "GET"})

Log Management

SigNoz can collect and search logs. Configure your application to send logs via OTLP:

# With Fluent Bit, forward logs to SigNoz
cat >> /etc/fluent-bit/fluent-bit.conf << 'EOF'
[OUTPUT]
    Name        opentelemetry
    Match       *
    Host        signoz-host
    Port        4318
    Logs_uri    /v1/logs
    Log_response_payload True
    tls         Off
    tls.verify  Off
EOF

Or use the OpenTelemetry Python logging handler:

import logging
from opentelemetry._logs import set_logger_provider
from opentelemetry.exporter.otlp.proto.grpc._log_exporter import OTLPLogExporter
from opentelemetry.sdk._logs import LoggerProvider, LoggingHandler
from opentelemetry.sdk._logs.export import BatchLogRecordProcessor

logger_provider = LoggerProvider()
logger_provider.add_log_record_processor(
    BatchLogRecordProcessor(OTLPLogExporter(endpoint="http://signoz-host:4317"))
)
set_logger_provider(logger_provider)

handler = LoggingHandler(level=logging.NOTSET, logger_provider=logger_provider)
logging.getLogger().addHandler(handler)

Alerts

Set up alerts based on metrics:

  1. Alerts → + New Alert Rule
  2. Configure:
    • Metric based: use PromQL conditions
    • Log based: search for log patterns
    • Trace based: alert on high P99 latency

Example: alert when error rate exceeds 5%:

Expression: sum(rate(signoz_calls_total{service_name="my-api", status_code="STATUS_CODE_ERROR"}[5m]))
            /
            sum(rate(signoz_calls_total{service_name="my-api"}[5m])) > 0.05

Condition: above 0.05 for 5 minutes
Severity: critical

Configure notification channels in Settings → Alert Channels:

  • Slack webhook
  • PagerDuty integration key
  • Email (SMTP)
  • OpsGenie

Troubleshooting

No traces appearing:

# Verify the collector is receiving data
docker logs signoz-otel-collector | grep -i "error\|received"

# Test with a manual OTLP export
curl -X POST http://signoz-host:4318/v1/traces \
  -H 'Content-Type: application/json' \
  -d '{"resourceSpans":[]}'
# Should return HTTP 200

ClickHouse out of memory:

# Check ClickHouse memory usage
docker stats signoz-clickhouse

# Cap ClickHouse memory or reduce data retention. Memory is capped via the
# max_server_memory_usage setting in the ClickHouse server config
# (clickhouse-setup/clickhouse-config.xml), e.g.:
# <max_server_memory_usage>4294967296</max_server_memory_usage>  (4 GB)

Frontend not loading:

docker logs signoz-frontend
# Check if query-service is healthy
curl http://localhost:8080/api/v1/health

gRPC connection refused from application:

# Verify port 4317 is accessible
nc -zv signoz-host 4317
# Check firewall rules
sudo ufw status | grep 4317

Conclusion

SigNoz delivers a complete open-source APM stack built on industry-standard OpenTelemetry, eliminating vendor lock-in while providing the distributed tracing, metrics, and log correlation that teams need to debug production issues. The single Docker Compose deployment is straightforward to operate, and the OpenTelemetry auto-instrumentation libraries cover most frameworks with minimal code changes. For production deployments, provision at least 16 GB RAM for ClickHouse and set up data retention policies to manage disk usage over time.