A Practical Guide to Integrating OpenTelemetry with REST APIs

Instrument REST APIs with OpenTelemetry: traces, metrics, logs, context propagation, code examples (Node.js, Python, Java, Go), Collector, and best practices.

ASOasis

Overview

OpenTelemetry (OTel) gives you a vendor‑neutral way to instrument REST APIs for traces, metrics, and logs. With it, you can follow a request across microservices, measure latency and throughput, and correlate errors with code paths—all without locking into a single observability vendor. This guide explains the core concepts, shows architecture patterns, and provides step‑by‑step examples for Node.js, Python, Java, and Go. You’ll also learn about context propagation, sampling, logs correlation, and production hardening.

Why OpenTelemetry for REST APIs

  • Standardization: One API, SDK, and data model, plus a common wire protocol (OTLP), across languages and frameworks.
  • Deep insight: End‑to‑end request tracing with span attributes for HTTP, database, cache, and external calls.
  • Efficiency: Batch exporters and the Collector minimize overhead and simplify routing.
  • Portability: Send data to any OTel‑compatible backend today, switch later without re‑instrumentation.

Core Concepts You’ll Use

  • Traces and spans: A trace represents a request’s journey; spans are timed operations within that journey (e.g., HTTP server receive, DB query).
  • Context propagation: Carries the trace context across process and network boundaries, usually via W3C Trace Context headers.
  • Metrics: Quantify performance (e.g., request count, latency histograms). Exemplars can link metrics to specific traces.
  • Logs: Discrete events. With OTel, logs can include trace_id and span_id for seamless correlation.
  • Resources: Describe the service (service.name, service.version, deployment.environment) once, applied to all telemetry.
  • SDK in your API: Auto‑ and manual‑instrumentation create spans/metrics/logs.
  • Export via OTLP: Prefer OTLP/gRPC (4317) or OTLP/HTTP (4318).
  • OpenTelemetry Collector (optional but recommended):
    • Receives data from services.
    • Adds processors (batch, attributes, filtering, tail‑sampling).
    • Exports to your observability backend(s).

This split keeps app code simple and lets you change pipelines without redeploying services.
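To make the batching idea concrete, here is a minimal standard-library sketch of what a batch processor does conceptually: finished spans go into a bounded queue, are exported in groups, and are dropped rather than blocking the request thread when the queue is full. The names `BatchBuffer`, `max_queue`, and `max_batch` are illustrative assumptions, not the SDK's or the Collector's actual implementation.

```python
from collections import deque
from typing import Callable, Deque, List


class BatchBuffer:
    """Toy batch processor: bounded queue, grouped export, drop on overflow."""

    def __init__(self, export: Callable[[List[dict]], None],
                 max_queue: int = 2048, max_batch: int = 512) -> None:
        self._export = export
        self._queue: Deque[dict] = deque()
        self._max_queue = max_queue
        self._max_batch = max_batch
        self.dropped = 0

    def on_end(self, span: dict) -> None:
        """Called when a span finishes; never blocks the caller."""
        if len(self._queue) >= self._max_queue:
            self.dropped += 1  # shed load instead of stalling the request
            return
        self._queue.append(span)

    def flush(self) -> int:
        """Export everything queued, at most max_batch spans per export call."""
        batches = 0
        while self._queue:
            size = min(self._max_batch, len(self._queue))
            batch = [self._queue.popleft() for _ in range(size)]
            self._export(batch)
            batches += 1
        return batches
```

In a real SDK a background worker would call something like `flush` on a timer; here it is explicit so the behavior is easy to follow. Counting dropped spans, as this sketch does, is one way to notice that the queue is undersized.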

Semantic Conventions for HTTP

Adopt the HTTP semantic conventions so your data is consistent and queryable:

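  • http.request.method, http.response.status_code, url.path (named http.method, http.status_code, and http.target in older releases of the conventions)
  • http.route
  • server.address, server.port
  • user_agent.original

Name server spans after the method and the templated route, for example “GET /orders/{id}”, never the concrete path with real IDs. Record errors by setting the span status and adding exception details.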
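As a rough illustration of the naming rule, the sketch below builds a span name and an attribute dictionary for one handled request. It uses the current stable attribute names (http.request.method, url.path, http.response.status_code); the helper name `http_server_span` and its signature are assumptions made for this example, since instrumentation libraries normally do this for you.

```python
from typing import Dict, Optional, Tuple, Union

AttrValue = Union[str, int]


def http_server_span(method: str, route: Optional[str], path: str,
                     status_code: int) -> Tuple[str, Dict[str, AttrValue]]:
    """Return (span_name, attributes) for one handled request."""
    method = method.upper()
    # Fall back to the bare method when no route template matched,
    # so unmatched paths cannot inflate span-name cardinality.
    name = f"{method} {route}" if route else method
    attrs: Dict[str, AttrValue] = {
        "http.request.method": method,
        "url.path": path,  # the concrete path belongs in an attribute
        "http.response.status_code": status_code,
    }
    if route:
        attrs["http.route"] = route
    return name, attrs
```

The concrete path still gets recorded, but only as an attribute, so the span name stays groupable across requests.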

Context Propagation in REST

Use W3C headers so traces stitch across services:

  • traceparent: the current trace and span IDs plus sampling flags
  • tracestate: vendor‑specific info (optional)
  • baggage: key=value pairs for non‑sensitive context (avoid PII)

Example headers you might see:

traceparent: 00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01
baggage: user_tier=premium,region=us-east

Forward these headers on outbound HTTP calls so the child service continues the same trace.
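For a feel of what the propagator does with these headers, here is a small standard-library sketch that parses a version-00 traceparent and builds the header for an outbound call (same trace ID, new span ID, same flags). The function names are hypothetical, and real services should rely on the SDK's propagators rather than hand-rolled parsing.

```python
import re
import secrets
from typing import NamedTuple, Optional

# version - trace-id (32 hex) - parent-id (16 hex) - flags (2 hex)
_TRACEPARENT = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


class TraceParent(NamedTuple):
    version: str
    trace_id: str
    parent_id: str
    flags: str

    @property
    def sampled(self) -> bool:
        # Bit 0 of the flags byte is the "sampled" flag.
        return bool(int(self.flags, 16) & 0x01)


def parse_traceparent(header: str) -> Optional[TraceParent]:
    """Return the parsed header, or None if it is malformed."""
    match = _TRACEPARENT.match(header.strip())
    if match is None:
        return None
    tp = TraceParent(*match.groups())
    # Version ff and all-zero trace or parent IDs are invalid.
    if tp.version == "ff" or set(tp.trace_id) == {"0"} or set(tp.parent_id) == {"0"}:
        return None
    return tp


def child_traceparent(parent: TraceParent, span_id: Optional[str] = None) -> str:
    """Build the header for an outbound call: same trace, new span ID."""
    span_id = span_id or secrets.token_hex(8)  # 8 random bytes = 16 hex chars
    return f"00-{parent.trace_id}-{span_id}-{parent.flags}"
```

Applied to the example header above, `parse_traceparent` yields the trace ID `4bf92f3577b34da6a3ce929d0e0e4736` with the sampled flag set, and `child_traceparent` keeps that trace ID while swapping in the caller's own span ID.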

Step‑by‑Step: Language Examples

The examples below export directly via OTLP/HTTP for clarity. In production, point to a local Collector.

Node.js (Express)

Install:

npm i express pino @opentelemetry/api @opentelemetry/sdk-node @opentelemetry/sdk-metrics \
  @opentelemetry/auto-instrumentations-node \
  @opentelemetry/exporter-trace-otlp-http @opentelemetry/exporter-metrics-otlp-http \
  @opentelemetry/resources @opentelemetry/semantic-conventions

Create instrumentation loader (start it before your app):

// otel.js
'use strict';
const { NodeSDK } = require('@opentelemetry/sdk-node');
const { getNodeAutoInstrumentations } = require('@opentelemetry/auto-instrumentations-node');
const { OTLPTraceExporter } = require('@opentelemetry/exporter-trace-otlp-http');
const { OTLPMetricExporter } = require('@opentelemetry/exporter-metrics-otlp-http');
const { PeriodicExportingMetricReader } = require('@opentelemetry/sdk-metrics');
const { Resource } = require('@opentelemetry/resources');
const { SemanticResourceAttributes } = require('@opentelemetry/semantic-conventions');

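// `new Resource(...)` matches the 1.x SDK packages; the 2.x packages export
// resourceFromAttributes(...) from @opentelemetry/resources instead.
const sdk = new NodeSDK({
  resource: new Resource({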
    [SemanticResourceAttributes.SERVICE_NAME]: 'orders-api',
    [SemanticResourceAttributes.SERVICE_VERSION]: '1.3.0',
    'deployment.environment': process.env.NODE_ENV || 'dev'
  }),
  traceExporter: new OTLPTraceExporter({
    url: process.env.OTEL_EXPORTER_OTLP_TRACES_ENDPOINT || 'http://localhost:4318/v1/traces'
  }),
  metricReader: new PeriodicExportingMetricReader({
    exporter: new OTLPMetricExporter({
      url: process.env.OTEL_EXPORTER_OTLP_METRICS_ENDPOINT || 'http://localhost:4318/v1/metrics'
    })
  }),
  instrumentations: [getNodeAutoInstrumentations()]
});

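// start() is synchronous in current sdk-node releases and returns nothing,
// so chaining .then() on it would throw.
sdk.start();
console.log('OTel initialized');

// Flush buffered telemetry before the process exits.
process.on('SIGTERM', () => {
  sdk.shutdown()
    .catch((err) => console.error('OTel shutdown failed', err))
    .finally(() => process.exit(0));
});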

Express app (ensure you require otel.js before app code):

// server.js
require('./otel');
const express = require('express');
const pino = require('pino');
const { context, trace } = require('@opentelemetry/api');

const app = express();
const logger = pino();

app.get('/orders/:id', async (req, res) => {
  const span = trace.getSpan(context.active());
  const sc = span && span.spanContext ? span.spanContext() : undefined;
  if (sc) logger.info({ trace_id: sc.traceId, span_id: sc.spanId }, 'fetching order');
  // Simulate work
  await new Promise(r => setTimeout(r, 20));
  res.json({ id: req.params.id, status: 'ok' });
});

app.listen(3000, () => console.log('Listening on :3000'));

Environment suggestions:

export OTEL_TRACES_SAMPLER=parentbased_traceidratio
export OTEL_TRACES_SAMPLER_ARG=0.1
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318
export OTEL_PROPAGATORS=tracecontext,baggage
node server.js

Python (FastAPI)

Install:

pip install fastapi uvicorn opentelemetry-distro opentelemetry-exporter-otlp \
  opentelemetry-instrumentation-fastapi

Programmatic setup:

# app.py
import os
from fastapi import FastAPI
from opentelemetry import trace
from opentelemetry.instrumentation.fastapi import FastAPIInstrumentor
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter

resource = Resource.create({
    'service.name': 'payments-api',
    'service.version': '0.4.2',
    'deployment.environment': os.getenv('ENV', 'dev')
})
provider = TracerProvider(resource=resource)
provider.add_span_processor(BatchSpanProcessor(
    OTLPSpanExporter(endpoint=os.getenv('OTEL_EXPORTER_OTLP_TRACES_ENDPOINT', 'http://localhost:4318/v1/traces'))
))
trace.set_tracer_provider(provider)
tracer = trace.get_tracer(__name__)

app = FastAPI()
# Without this call no server spans are created and incoming
# traceparent headers are ignored.
FastAPIInstrumentor.instrument_app(app)

@app.get('/charge/{charge_id}')
async def charge(charge_id: str):
    # Manual child span for the business operation inside the request span.
    with tracer.start_as_current_span('charge-work') as span:
        span.set_attribute('charge.id', charge_id)
        return {'id': charge_id, 'status': 'captured'}

Run:

OTEL_TRACES_SAMPLER=parentbased_traceidratio OTEL_TRACES_SAMPLER_ARG=0.2 \
uvicorn app:app --reload --port 8000

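Alternatively, drop the programmatic setup and use the auto‑instrumentation launcher. Run it without --reload, which is known to interfere with auto‑instrumentation:

opentelemetry-instrument --traces_exporter otlp --metrics_exporter none \
  --service_name payments-api uvicorn app:app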

Java (Spring Boot with the Java Agent)

Add the agent and JVM options (no code changes required):

java -javaagent:/path/opentelemetry-javaagent.jar \
  -Dotel.service.name=inventory-api \
  -Dotel.exporter.otlp.protocol=http/protobuf \
  -Dotel.exporter.otlp.endpoint=http://localhost:4318 \
  -Dotel.metrics.exporter=otlp -Dotel.logs.exporter=otlp \
  -Dotel.propagators=tracecontext,baggage \
  -jar target/app.jar

The agent auto‑instruments Spring MVC/WebFlux, JDBC, HTTP clients, etc. For custom spans, obtain a tracer from the agent's global OpenTelemetry instance and wrap the specific work:

// Example inside a service method
var tracer = io.opentelemetry.api.GlobalOpenTelemetry.getTracer("custom");
var span = tracer.spanBuilder("reconcile-stock").startSpan();
try (var scope = span.makeCurrent()) {
    // business logic
} catch (Exception e) {
    span.recordException(e);
    span.setStatus(io.opentelemetry.api.trace.StatusCode.ERROR);
    throw e;
} finally {
    span.end();
}

Go (net/http)

Install:

go get go.opentelemetry.io/otel \
  go.opentelemetry.io/otel/sdk \
  go.opentelemetry.io/otel/attribute \
  go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp \
  go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp \
  go.opentelemetry.io/otel/propagation \
  go.opentelemetry.io/otel/semconv/v1.21.0

Initialize and serve:

package main

import (
  "context"
  "log"
  "net/http"

  "go.opentelemetry.io/otel"
  "go.opentelemetry.io/otel/attribute"
  "go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp"
  "go.opentelemetry.io/otel/propagation"
  sdktrace "go.opentelemetry.io/otel/sdk/trace"
  "go.opentelemetry.io/otel/sdk/resource"
  semconv "go.opentelemetry.io/otel/semconv/v1.21.0"
  "go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp"
)

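func initProvider(ctx context.Context) func(context.Context) error {
  exp, err := otlptracehttp.New(ctx, otlptracehttp.WithEndpoint("localhost:4318"), otlptracehttp.WithInsecure())
  if err != nil { log.Fatal(err) }
  res, err := resource.Merge(resource.Default(), resource.NewWithAttributes(
    semconv.SchemaURL,
    semconv.ServiceName("shipping-api"),
    attribute.String("service.version", "2.0.0"),
  ))
  // Merge reports problems such as a schema URL conflict as an error; log it instead of discarding it.
  if err != nil { log.Printf("resource merge: %v", err) }
  tp := sdktrace.NewTracerProvider(
    sdktrace.WithBatcher(exp),
    sdktrace.WithResource(res),
  )
  otel.SetTracerProvider(tp)
  // Propagate both W3C Trace Context and baggage, matching the other services.
  otel.SetTextMapPropagator(propagation.NewCompositeTextMapPropagator(
    propagation.TraceContext{}, propagation.Baggage{},
  ))
  return tp.Shutdown
}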

func orders(w http.ResponseWriter, r *http.Request) {
  w.Write([]byte("ok"))
}

func main() {
  ctx := context.Background()
  shutdown := initProvider(ctx)
  defer shutdown(ctx)

  mux := http.NewServeMux()
  mux.Handle("/orders", otelhttp.NewHandler(http.HandlerFunc(orders), "GET /orders"))

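  log.Println("listening on :8080")
  // Log the server error instead of discarding it; returning normally lets the deferred shutdown flush pending spans.
  if err := http.ListenAndServe(":8080", mux); err != nil {
    log.Println(err)
  }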
}

OpenTelemetry Collector: Minimal Pipeline

Run a local Collector to buffer, transform, and route telemetry:

receivers:
  otlp:
    protocols:
      http:
      grpc:
processors:
  batch:
exporters:
  # The debug exporter replaces the deprecated logging exporter.
  debug:
    verbosity: basic
  otlp:
    endpoint: your-backend:4317
    tls:
      insecure: true  # plaintext; acceptable only for local testing
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [debug, otlp]
    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [debug, otlp]

Point your services at the Collector (e.g., OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318 or 4317 for gRPC).
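One detail that often causes confusion with OTLP/HTTP is how the endpoint variables turn into URLs. The sketch below reflects my reading of the exporter configuration rules: a signal-specific variable is used as given, while the generic OTEL_EXPORTER_OTLP_ENDPOINT is treated as a base URL to which the signal path is appended. The helper name `otlp_http_url` is an assumption for illustration; SDKs do this resolution internally, and it does not apply to gRPC endpoints.

```python
import os
from typing import Mapping, Optional

_SIGNAL_PATHS = {"traces": "v1/traces", "metrics": "v1/metrics", "logs": "v1/logs"}


def otlp_http_url(signal: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Resolve the OTLP/HTTP URL for one signal from OTEL_* variables."""
    env = os.environ if env is None else env
    # A signal-specific variable is used exactly as given.
    specific = env.get(f"OTEL_EXPORTER_OTLP_{signal.upper()}_ENDPOINT")
    if specific:
        return specific
    # The generic variable is a base URL; the signal path is appended.
    base = env.get("OTEL_EXPORTER_OTLP_ENDPOINT", "http://localhost:4318")
    return base.rstrip("/") + "/" + _SIGNAL_PATHS[signal]
```

So with only OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318 set, traces would go to http://localhost:4318/v1/traces, which is the same URL the language examples hard-code as a fallback.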

Sampling Strategies

  • Development: always_on for full fidelity.
  • Production: parentbased_traceidratio for stable sampling (e.g., 5–20%).
  • Tail sampling (Collector): decide after seeing spans (e.g., keep errors, slow requests). Useful for high‑traffic APIs with spiky issues.
  • Consistency: make sure a trace that arrives sampled stays sampled downstream; otherwise traces end up with missing spans.
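The strategies above can be sketched in a few lines of standard-library Python. This is a simplified model, not any SDK's exact algorithm: `ratio_sampled` derives a deterministic decision from the trace ID, `parent_based` honors the caller's decision first, and `tail_keep` mimics a Collector-side policy that keeps traces with errors or slow spans. The function names, the span dictionary shape, and the 500 ms threshold are all assumptions for illustration.

```python
from typing import Any, Iterable, Mapping, Optional

_MAX_64 = 1 << 64


def ratio_sampled(trace_id_hex: str, ratio: float) -> bool:
    """Deterministic head decision: same trace ID, same answer everywhere."""
    low_64_bits = int(trace_id_hex, 16) & (_MAX_64 - 1)
    return low_64_bits < int(ratio * _MAX_64)


def parent_based(trace_id_hex: str, ratio: float,
                 parent_sampled: Optional[bool]) -> bool:
    """Honor the caller's decision; fall back to the ratio for root spans."""
    if parent_sampled is not None:
        return parent_sampled
    return ratio_sampled(trace_id_hex, ratio)


def tail_keep(spans: Iterable[Mapping[str, Any]], slow_ms: float = 500.0) -> bool:
    """Tail decision over a finished trace: keep errors and slow requests."""
    return any(
        s.get("status") == "ERROR" or float(s.get("duration_ms", 0)) >= slow_ms
        for s in spans
    )
```

Because the head decision depends only on the trace ID, every service that uses the same ratio reaches the same verdict, and the parent-based wrapper keeps downstream services from second-guessing an upstream choice.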

Error Handling and Status

When exceptions occur:

  • Record the exception on the current span.
  • Set span status to ERROR.

Example (Node.js):

const { trace, SpanStatusCode } = require('@opentelemetry/api');

try {
  // work
} catch (e) {
  const span = trace.getActiveSpan();
  if (span) {
    span.recordException(e);
    span.setStatus({ code: SpanStatusCode.ERROR, message: e.message });
  }
  throw e;
}

Logs Correlation

Include trace_id and span_id in logs so you can pivot from a log line to a trace.

  • Node.js (Pino): capture from the active span context.
  • Python: get_current_span() from opentelemetry.trace and enrich the log record.
  • Java: many logging appenders support automatic correlation when the agent is active.
  • Go: fetch trace.SpanFromContext(r.Context()) and print its SpanContext.

The Express handler above already shows a minimal Node.js example. For Python:

import logging
from opentelemetry.trace import get_current_span

logger = logging.getLogger(__name__)

sc = get_current_span().get_span_context()
if sc.is_valid:  # outside a span the IDs are all zeros
    logger.info('charging card trace_id=%s span_id=%s',
                format(sc.trace_id, '032x'), format(sc.span_id, '016x'))
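Adding the IDs by hand at every call site gets tedious, so one option is a logging filter that enriches every record. The sketch below uses only the standard library and takes a `get_ids` callable as a stand-in for the tracing API; `TraceContextFilter`, `build_logger`, and `get_ids` are hypothetical names for this example.

```python
import logging
from typing import Callable, Tuple


class TraceContextFilter(logging.Filter):
    """Attach trace_id/span_id to every record passing through a handler."""

    def __init__(self, get_ids: Callable[[], Tuple[int, int]]) -> None:
        super().__init__()
        self._get_ids = get_ids

    def filter(self, record: logging.LogRecord) -> bool:
        trace_id, span_id = self._get_ids()
        # Same hex widths as the W3C header: 32 chars and 16 chars.
        record.trace_id = format(trace_id, "032x")
        record.span_id = format(span_id, "016x")
        return True  # never drop the record, only enrich it


def build_logger(get_ids: Callable[[], Tuple[int, int]],
                 handler: logging.Handler) -> logging.Logger:
    """Wire the filter and a correlating format onto one handler."""
    handler.addFilter(TraceContextFilter(get_ids))
    handler.setFormatter(logging.Formatter(
        "%(levelname)s trace_id=%(trace_id)s span_id=%(span_id)s %(message)s"
    ))
    logger = logging.getLogger("payments-api")
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # avoid duplicate lines via the root logger
    return logger
```

In a real service, `get_ids` would wrap `get_current_span().get_span_context()` and return its `trace_id` and `span_id`, so the filter picks up whichever span is active when the log call happens.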

Testing and Troubleshooting

  • Verify headers: call your API with a known traceparent.
curl -H "traceparent: 00-aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa-bbbbbbbbbbbbbbbb-01" \
  http://localhost:3000/orders/42
  • Check exports: use a console span exporter in the SDK, or the Collector's debug exporter, to confirm locally that spans are created.
  • Version alignment: keep SDK, instrumentation libraries, and semantic conventions in compatible ranges.
  • Start order: load/init OTel before creating your HTTP server.
  • Time sync: ensure clocks are accurate (NTP) to avoid strange timelines.
  • Propagators: match propagators across services. Default is tracecontext,baggage; add B3 only if needed.

Performance and Security Considerations

  • Overhead: with batching and moderate sampling, overhead is typically minimal; measure in your environment.
  • Backpressure: use batch exporters and a local Collector to avoid blocking app threads.
  • Data minimization: avoid PII in spans and baggage. Prefer opaque IDs.
  • Transport security: encrypt traffic to remote Collectors/backends (mTLS/TLS). Avoid exposing Collector endpoints publicly.
  • Attribute hygiene: limit high‑cardinality labels (e.g., do not include raw user IDs in span names or metric labels).
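Data minimization is easier to enforce in one place than at every call site. The following sketch shows one possible shape for that: a helper that redacts known-sensitive keys and truncates long values before attributes are attached to spans or baggage. The key list, the `[REDACTED]` marker, and the 256-character cap are assumptions chosen for illustration, not recommendations from the OpenTelemetry project.

```python
from typing import Dict, Iterable

# Hypothetical deny-list; adapt it to the data your API actually handles.
SENSITIVE_KEYS = frozenset({"user.email", "user.name", "card.number"})


def scrub_attributes(attrs: Dict[str, str],
                     sensitive: Iterable[str] = SENSITIVE_KEYS,
                     max_len: int = 256) -> Dict[str, str]:
    """Return a copy that is safer to attach to spans or baggage."""
    blocked = set(sensitive)
    clean: Dict[str, str] = {}
    for key, value in attrs.items():
        if key in blocked:
            # Keep the key so the omission is visible, drop the value.
            clean[key] = "[REDACTED]"
        else:
            # Cap value length to bound payload size.
            clean[key] = value[:max_len]
    return clean
```

A deny-list only catches the keys you thought of; an allow-list of approved attribute names is stricter, and a Collector attributes processor can serve as a second line of defense.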

Migration Tips and Anti‑Patterns

  • Don’t name spans with concrete IDs (e.g., “/orders/123”). Use templated routes.
  • Don’t create one massive span for an entire request; use child spans for key operations.
  • Don’t log stack traces without linking to the trace; correlate logs instead.
  • Do enrich spans with business‑relevant attributes (order.amount, customer.tier) that are safe and low‑cardinality.
  • Do standardize resource attributes across services for clean service maps.
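When a framework does not expose the matched route template, a fallback is to derive one from the concrete path. The heuristic below, which is purely illustrative, replaces numeric and UUID-looking segments with `{id}`; the function name `template_path` is hypothetical, and the framework's own route template should be preferred whenever it is available.

```python
import re

_UUID = re.compile(r"^[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}$")


def template_path(path: str) -> str:
    """Replace ID-like segments with {id} to keep span names low-cardinality."""
    segments = []
    for segment in path.split("/"):
        if segment.isdigit() or _UUID.match(segment):
            segments.append("{id}")
        else:
            segments.append(segment)
    return "/".join(segments)
```

This does not strip query strings or recognize other ID formats such as slugs, so treat it as a starting point rather than a complete normalizer.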

Quick Checklist

  • Initialize OTel before your web framework.
  • Set service.name, service.version, deployment.environment.
  • Use HTTP semantic conventions and good span names.
  • Propagate W3C headers on outbound calls.
  • Export to a local Collector with batching enabled.
  • Configure production sampling (parent‑based ratio or tail sampling).
  • Correlate logs with trace_id/span_id.
  • Guard against PII and high‑cardinality attributes.

Conclusion

Integrating OpenTelemetry into a REST API is straightforward and pays dividends in reliability and speed of diagnosis. Start with auto‑instrumentation and sensible resource attributes, verify exports with a console or debug exporter, then deploy a Collector for control over sampling and routing. As you mature, add targeted manual spans, enrich attributes carefully, and correlate logs so every error and latency spike has a trace-backed explanation. With these practices, your REST APIs become observable, debuggable, and ready for growth.
