
This skill helps you implement OpenTelemetry observability by configuring traces, metrics, and logs with OTLP export and collector pipelines.

npx playbooks add skill bobmatnyc/claude-mpm-skills --skill opentelemetry


SKILL.md
---
name: opentelemetry
description: "OpenTelemetry observability patterns: traces, metrics, logs, context propagation, OTLP export, Collector pipelines, and troubleshooting"
version: 1.0.0
category: universal
author: Claude MPM Team
license: MIT
progressive_disclosure:
  entry_point:
    summary: "Instrument services with OpenTelemetry and export OTLP traces/metrics/logs through a Collector for correlation and troubleshooting"
    when_to_use: "When building production observability, adding tracing to distributed systems, or standardizing telemetry across languages"
    quick_start: "1. Set service.name 2. Add auto-instrumentation 3. Export OTLP 4. Deploy Collector 5. Correlate logs with trace IDs"
  token_estimate:
    entry: 150
    full: 9000
context_limit: 900
tags:
  - observability
  - opentelemetry
  - tracing
  - metrics
  - logs
  - otlp
requires_tools: []
---

# OpenTelemetry

## Quick Start (signal design)

- Export OTLP via an OpenTelemetry Collector (vendor-neutral endpoint).
- Standardize resource attributes: `service.name`, `service.version`, `deployment.environment`.
- Start with auto-instrumentation, then add manual spans and log correlation.
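
As a minimal sketch of the resource-attribute step, the standard `OTEL_RESOURCE_ATTRIBUTES` and `OTEL_EXPORTER_OTLP_ENDPOINT` environment variables can carry this configuration without code changes. The service values, the `resource_attributes_env` helper, and the local Collector address below are assumptions for illustration, not part of this skill.

```python
from urllib.parse import quote


def resource_attributes_env(attrs: dict[str, str]) -> str:
    """Format attributes as the comma-separated key=value list
    expected by OTEL_RESOURCE_ATTRIBUTES."""
    # Percent-encode values so commas or equals signs inside a value
    # cannot be mistaken for list separators.
    return ",".join(f"{key}={quote(value, safe='')}" for key, value in attrs.items())


# Hypothetical service identity, for illustration only.
attrs = {
    "service.name": "checkout",
    "service.version": "1.4.2",
    "deployment.environment": "staging",
}

# Environment a service process could be launched with.
env = {
    "OTEL_RESOURCE_ATTRIBUTES": resource_attributes_env(attrs),
    # Assumed: a Collector listening locally on the default OTLP/gRPC port.
    "OTEL_EXPORTER_OTLP_ENDPOINT": "http://localhost:4317",
}

print(env["OTEL_RESOURCE_ATTRIBUTES"])
# service.name=checkout,service.version=1.4.2,deployment.environment=staging
```

Keeping these values in the environment rather than in code makes it easier to apply the same attribute names across services written in different languages.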

## Load Next (References)

- `references/concepts.md` — traces/metrics/logs, context propagation, sampling, semantic conventions
- `references/collector-and-otlp.md` — Collector pipelines, processors, deployment patterns, tail sampling
- `references/instrumentation-and-troubleshooting.md` — manual spans, propagation pitfalls, cardinality, debugging

Overview

This skill codifies OpenTelemetry observability patterns for traces, metrics, logs, context propagation, OTLP export, Collector pipelines, and troubleshooting. It guides practical decisions: signal design, resource attributes, auto-instrumentation, and when to add manual spans and log correlation. The goal is vendor-neutral observability that scales with the system and makes production issues easier to debug.

How this skill works

The skill inspects instrumentation choices and recommends signal design: how to export OTLP to a Collector, which resource attributes to standardize, and where to use auto-instrumentation vs manual spans. It explains Collector pipeline components (receivers, processors, exporters), tail-sampling patterns, and propagation pitfalls. It also provides concrete troubleshooting steps for cardinality, propagation, and debug tracing.
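
To make the tail-sampling idea concrete, here is a rough sketch of the kind of decision such a policy makes once all spans of a trace have been buffered: keep every trace with an error or high latency, and keep a fixed share of the rest. It is not the Collector's implementation; the `Span` type, the thresholds, and the `keep_trace` function are assumptions chosen for illustration.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Span:
    trace_id: str
    duration_ms: float
    is_error: bool


def keep_trace(spans: list[Span],
               latency_threshold_ms: float = 500.0,
               baseline_ratio: float = 0.1) -> bool:
    """Decide whether to keep a complete trace (all spans already collected)."""
    if not spans:
        return False
    # Policy 1: always keep traces that contain an error.
    if any(span.is_error for span in spans):
        return True
    # Policy 2: always keep traces with at least one slow span.
    if max(span.duration_ms for span in spans) >= latency_threshold_ms:
        return True
    # Policy 3: keep a deterministic share of the remaining traces by
    # hashing the trace ID into the range [0, 1).
    digest = hashlib.sha256(spans[0].trace_id.encode()).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return bucket < baseline_ratio


failed = [Span("4bf92f3577b34da6a3ce929d0e0e4736", 12.0, True)]
print(keep_trace(failed))  # True: error traces are always kept
```

Hashing the trace ID rather than drawing a random number means every Collector instance reaches the same decision for the same trace, which matters when the decision could be made in more than one place.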

When to use it

  • Starting observability for a new service or microservice suite
  • Migrating vendor-specific tracing to a vendor-neutral OTLP Collector
  • Expanding from auto-instrumentation to manual spans and log correlation
  • Designing Collector pipelines for batching, sampling, or enrichment
  • Investigating propagation, sampling, or high-cardinality issues

Best practices

  • Export OTLP to an OpenTelemetry Collector as a stable, vendor-neutral endpoint
  • Standardize core resource attributes: service.name, service.version, deployment.environment
  • Start with auto-instrumentation for breadth, then add manual spans for critical business operations
  • Correlate logs with traces by writing the active trace ID and span ID into every log record
  • Control cardinality: keep unbounded values (for example, user or request IDs) out of metric attributes
  • Use tail sampling or probabilistic sampling in the Collector for cost-effective trace retention
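
The log-correlation practice above can be sketched with the standard `logging` module alone. In this illustration a `ContextVar` stands in for the active span context; with an OpenTelemetry SDK installed, the IDs would come from the current span instead. The logger name, log format, and ID values are assumptions.

```python
import contextvars
import io
import logging

# Stand-in for the active span context (assumption for this sketch).
current_ids = contextvars.ContextVar("current_ids", default=("0" * 32, "0" * 16))


class TraceContextFilter(logging.Filter):
    """Attach the current trace and span IDs to every log record."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.trace_id, record.span_id = current_ids.get()
        return True  # never drop the record, only enrich it


def build_logger(stream) -> logging.Logger:
    logger = logging.getLogger("checkout")  # hypothetical service logger
    logger.handlers.clear()
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter(
        "%(levelname)s trace_id=%(trace_id)s span_id=%(span_id)s %(message)s"))
    handler.addFilter(TraceContextFilter())
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger


stream = io.StringIO()
log = build_logger(stream)

# Simulate entering a span; the IDs are example values.
current_ids.set(("4bf92f3577b34da6a3ce929d0e0e4736", "00f067aa0ba902b7"))
log.info("payment authorized")
print(stream.getvalue(), end="")
```

With the IDs present in each line, a log search for a trace ID returns exactly the records emitted while that trace was active.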

Example use cases

  • Deploy an OTLP exporter from services to a central Collector that forwards to multiple backends
  • Add manual spans for payment or checkout flows while keeping auto-instrumentation elsewhere
  • Use the Collector to enrich resource attributes and apply batching and retry policies
  • Implement tail-sampling in the Collector to preserve important traces while lowering storage costs
  • Troubleshoot missing context propagation across async boundaries by verifying propagators and injected headers
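
For the propagation case in the last item, it can help to check the injected header by hand. The sketch below builds and parses a W3C Trace Context `traceparent` header (version `00` layout only). The `inject` and `extract` helpers are hypothetical and are not the OpenTelemetry propagator API.

```python
import re

# version - trace-id (32 hex) - parent-id (16 hex) - flags (2 hex)
TRACEPARENT_RE = re.compile(r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def inject(headers: dict, trace_id: str, span_id: str, sampled: bool = True) -> None:
    """Write a version-00 traceparent header into an outgoing header dict."""
    flags = "01" if sampled else "00"
    headers["traceparent"] = f"00-{trace_id}-{span_id}-{flags}"


def extract(headers: dict):
    """Return the parsed trace context, or None if the header is missing or invalid."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    match = TRACEPARENT_RE.match(lowered.get("traceparent", "").strip())
    if not match:
        return None
    version, trace_id, span_id, flags = match.groups()
    # Version ff and all-zero IDs are invalid per the Trace Context spec.
    if version == "ff" or trace_id == "0" * 32 or span_id == "0" * 16:
        return None
    return {
        "trace_id": trace_id,
        "span_id": span_id,
        "sampled": bool(int(flags, 16) & 0x01),
    }


outgoing = {}
inject(outgoing, "4bf92f3577b34da6a3ce929d0e0e4736", "00f067aa0ba902b7")
print(outgoing["traceparent"])
# 00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01
print(extract(outgoing))
```

If `extract` returns `None` on the receiving side, the header was either dropped in transit (for example by a queue or thread hand-off that does not carry context) or written in a different propagation format than the receiver expects.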

FAQ

Should I send telemetry directly to a backend or via a Collector?

Use a Collector as a vendor-neutral aggregation point. It lets you centralize batching, sampling, enrichment, and multi-backend export without changing service code.

When do I add manual spans if I already have auto-instrumentation?

Add manual spans for business-critical operations, external calls with custom semantics, or where you need clearer boundaries and attributes beyond auto-instrumentation.