opentelemetry-integrator skill

/plugins/babysitter/skills/babysit/process/specializations/sdk-platform-development/skills/opentelemetry-integrator

This skill integrates OpenTelemetry tracing and metrics into SDKs, adding distributed tracing, context propagation, and exporter configuration.

npx playbooks add skill a5c-ai/babysitter --skill opentelemetry-integrator

---
name: opentelemetry-integrator
description: Integrate OpenTelemetry tracing and metrics into SDKs
allowed-tools:
  - Read
  - Write
  - Edit
  - Glob
  - Grep
  - Bash
---

# OpenTelemetry Integrator Skill

## Overview

This skill integrates OpenTelemetry observability into SDKs, providing distributed tracing, metrics collection, and context propagation for comprehensive API monitoring.

## Capabilities

- Add tracing spans to SDK operations
- Export metrics (latency, errors, throughput)
- Configure context propagation (W3C Trace Context)
- Support multiple exporters (OTLP, Jaeger, Zipkin)
- Implement custom span attributes
- Configure sampling strategies
- Add semantic conventions for SDK operations
- Support baggage propagation

## Target Processes

- Observability Integration
- Telemetry and Analytics Integration
- Logging and Diagnostics

## Integration Points

- OpenTelemetry SDKs (all languages)
- Jaeger for distributed tracing
- Prometheus for metrics
- Grafana for visualization
- Cloud observability platforms

## Input Requirements

- Tracing requirements
- Metrics to collect
- Exporter configurations
- Sampling strategy
- Semantic convention mappings

## Output Artifacts

- OpenTelemetry instrumentation
- Custom span definitions
- Metrics collectors
- Exporter configurations
- Propagator setup
- Sampling configuration

## Usage Example

```yaml
skill:
  name: opentelemetry-integrator
  context:
    tracing:
      enabled: true
      propagator: w3c-trace-context  # W3C Trace Context (traceparent/tracestate headers)
      sampling: parentBased          # child spans follow the parent's sampling decision
      sampleRate: 0.1                # keep roughly 10% of traces
    metrics:
      enabled: true
      exportInterval: 30s
      metrics:
        - sdk.request.duration
        - sdk.request.count
        - sdk.error.count
    exporters:
      traces: otlp
      metrics: prometheus
    serviceName: "my-sdk"
```

## Best Practices

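1. Follow OpenTelemetry semantic conventions for span names and attributes
2. Set sampling rates per environment (lower in production than in development)
3. Propagate context across process and network boundaries
4. Include span attributes that aid debugging, such as operation IDs and error codes
5. Avoid high-cardinality attributes such as user IDs and full URLs
6. Configure exporters explicitly for production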

Overview

This skill integrates OpenTelemetry observability into SDKs to add distributed tracing, metrics collection, and context propagation. It makes SDK calls measurable and debuggable by producing spans, metrics, and exporter-ready configurations for common backends.

How this skill works

The skill instruments SDK operations by creating spans around requests, recording metrics (latency, errors, throughput), and attaching semantic attributes. It configures propagators (W3C Trace Context, baggage), sampling strategies, and exporter bindings (OTLP, Jaeger, Zipkin, Prometheus) so telemetry flows to your observability backend.
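As a rough illustration of the pattern described above, the sketch below wraps an SDK operation in a span-like record and updates request, error, and latency metrics. It uses only the Python standard library, so `SpanRecord`, `Telemetry`, `traced`, and the operation name `sdk.get_user` are hypothetical stand-ins, not the OpenTelemetry API; a real integration would presumably use the tracer and meter from the OpenTelemetry SDK for the target language.

```python
import time
from contextlib import contextmanager
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SpanRecord:
    """Hypothetical stand-in for a finished span."""
    name: str
    attributes: dict = field(default_factory=dict)
    status: str = "OK"
    duration_s: float = 0.0


@dataclass
class Telemetry:
    """Hypothetical in-memory sink for spans and metrics."""
    spans: list = field(default_factory=list)
    request_count: int = 0
    error_count: int = 0
    durations_s: list = field(default_factory=list)

    @contextmanager
    def traced(self, name: str, attributes: Optional[dict] = None):
        span = SpanRecord(name=name, attributes=dict(attributes or {}))
        start = time.perf_counter()
        try:
            yield span
        except Exception as exc:
            # Mark the span as failed, count the error, then re-raise.
            span.status = "ERROR"
            span.attributes["error.type"] = type(exc).__name__
            self.error_count += 1
            raise
        finally:
            # Runs on success and failure: record latency and throughput.
            span.duration_s = time.perf_counter() - start
            self.request_count += 1
            self.durations_s.append(span.duration_s)
            self.spans.append(span)


telemetry = Telemetry()

# Successful operation: the caller attaches attributes to the span.
with telemetry.traced("sdk.get_user", {"sdk.operation": "get_user"}) as span:
    span.attributes["http.response.status_code"] = 200

# Failing operation: the error is recorded and still reaches the caller.
try:
    with telemetry.traced("sdk.get_user"):
        raise TimeoutError("simulated failure")
except TimeoutError:
    pass
```

Recording the metrics in the `finally` branch means failed calls still contribute to latency and throughput, which keeps error rates comparable to request counts.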

When to use it

- Adding distributed tracing to an SDK so downstream services can correlate requests.
- Collecting latency, error, and throughput metrics from SDK operations.
- Standardizing context propagation across language SDKs using W3C Trace Context.
- Configuring exporters for local development and production observability platforms.
- Implementing sampling and semantic conventions to control cost and clarity.
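To make the W3C Trace Context item concrete, here is a minimal sketch of writing and reading the `traceparent` header, which has the form `version-traceid-parentid-flags`. The `SpanContext`, `inject`, and `extract` names are hypothetical, and the sketch handles only version `00`; OpenTelemetry SDKs ship their own propagators, so this is meant to show the wire format, not to replace them.

```python
import re
import secrets
from typing import NamedTuple, Optional

# Version 00 only: 32 hex chars of trace ID, 16 of parent span ID, 2 of flags.
_TRACEPARENT_RE = re.compile(r"00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})")


class SpanContext(NamedTuple):
    trace_id: str
    span_id: str
    sampled: bool


def inject(ctx: SpanContext, headers: dict) -> None:
    """Write the context into outgoing request headers."""
    flags = "01" if ctx.sampled else "00"
    headers["traceparent"] = f"00-{ctx.trace_id}-{ctx.span_id}-{flags}"


def extract(headers: dict) -> Optional[SpanContext]:
    """Read the context from incoming headers; None if absent or invalid."""
    match = _TRACEPARENT_RE.fullmatch(headers.get("traceparent", "").strip())
    if match is None:
        return None
    trace_id, span_id, flags = match.groups()
    # All-zero IDs are invalid according to the specification.
    if trace_id == "0" * 32 or span_id == "0" * 16:
        return None
    return SpanContext(trace_id, span_id, bool(int(flags, 16) & 0x01))


# Caller side: start a trace and attach it to an outgoing request.
parent = SpanContext(secrets.token_hex(16), secrets.token_hex(8), sampled=True)
outgoing = {}
inject(parent, outgoing)

# Callee side: continue the same trace with a new span ID.
received = extract(outgoing)
child = SpanContext(received.trace_id, secrets.token_hex(8), received.sampled)
```

The child keeps the caller's trace ID and sampling flag, which is what lets downstream services correlate their spans with the original request.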

Best practices

- Follow OpenTelemetry semantic conventions for span names and attributes.
- Set sampling rates per environment (lower in production than in development), and prefer parent-based strategies so a trace is kept or dropped as a whole.
- Propagate context and baggage across process and network boundaries to enable end-to-end traces.
- Avoid high-cardinality attributes (user IDs, full URLs) in spans to reduce storage and query cost.
- Choose histogram bucket boundaries that fit the expected value range, and set export intervals that balance resolution against overhead.
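Two of these practices lend themselves to a short sketch: collapsing ID-like URL segments into placeholders so an attribute stays low-cardinality, and assigning latencies to fixed histogram buckets. The helper names `route_template` and `bucket_index` and the boundaries in `BOUNDS_MS` are illustrative assumptions, not values prescribed by OpenTelemetry or by this skill.

```python
import re
from bisect import bisect_left

_UUID_RE = re.compile(r"[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}")


def route_template(path: str) -> str:
    """Drop the query string and replace ID-like segments with placeholders."""
    path = path.split("?", 1)[0]
    segments = []
    for segment in path.split("/"):
        if segment.isdigit():
            segments.append("{id}")
        elif _UUID_RE.fullmatch(segment):
            segments.append("{uuid}")
        else:
            segments.append(segment)
    return "/".join(segments)


# Illustrative upper bounds in milliseconds; each bucket includes its bound.
BOUNDS_MS = [5, 10, 25, 50, 100, 250, 500, 1000]


def bucket_index(latency_ms: float) -> int:
    """Index of the first bucket whose upper bound is >= latency_ms.

    Values above the last bound fall into the overflow bucket at
    index len(BOUNDS_MS).
    """
    return bisect_left(BOUNDS_MS, latency_ms)


counts = [0] * (len(BOUNDS_MS) + 1)
for latency in (3.2, 7.0, 10.0, 48.5, 2000.0):
    counts[bucket_index(latency)] += 1

print(route_template("/users/12345/orders/987?expand=items"))
print(counts)
```

With a templated route, thousands of distinct URLs map to a handful of attribute values, which is what keeps storage and query cost bounded.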

Example use cases

- Instrumenting an HTTP client in a JavaScript SDK to emit request spans and response latency metrics.
- Configuring the SDK to export traces over OTLP in production and to Jaeger for local debugging.
- Adding custom span attributes for SDK operation IDs and error codes to speed root-cause analysis.
- Enabling Prometheus metrics export for SDK throughput and error counts to drive alerting dashboards.
- Applying parent-based sampling to reduce telemetry volume while preserving representative traces.
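For the Prometheus use case, it may help to see what a scraped counter could look like. The sketch below renders a counter in the Prometheus text exposition format, replacing the dots in a name such as `sdk.request.count` with underscores and appending the conventional `_total` suffix. The functions `prometheus_name` and `render_counter` are hypothetical helpers, and the exact names a real OpenTelemetry Prometheus exporter emits (unit suffixes, for instance) may differ.

```python
import re


def prometheus_name(otel_name: str) -> str:
    """Replace characters outside [a-zA-Z0-9_:] (such as dots) with underscores."""
    return re.sub(r"[^a-zA-Z0-9_:]", "_", otel_name)


def escape_label_value(value: str) -> str:
    """Escape backslashes, double quotes, and newlines in a label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_counter(otel_name: str, help_text: str, samples: dict) -> str:
    """Render one counter family.

    `samples` maps a tuple of (label_name, label_value) pairs to a value;
    label names are assumed to be valid Prometheus label names already.
    """
    name = prometheus_name(otel_name)
    if not name.endswith("_total"):
        name += "_total"
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} counter"]
    for labels, value in samples.items():
        label_text = ",".join(
            f'{key}="{escape_label_value(val)}"' for key, val in labels
        )
        if label_text:
            lines.append(f"{name}{{{label_text}}} {value}")
        else:
            lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


exposition = render_counter(
    "sdk.request.count",
    "Total SDK requests",
    {
        (("operation", "get_user"),): 42,
        (("operation", "list_users"),): 7,
    },
)
print(exposition)
```

Keeping the label set small (operation name, not user or request ID) follows the same low-cardinality advice that applies to span attributes.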

FAQ

Which exporters are supported?

OTLP, Jaeger, and Zipkin for traces; Prometheus or OTLP for metrics. Exporter configurations are pluggable.

How do I control telemetry volume?

Use sampling strategies (ratio-based or parent-based), reduce metric cardinality, and lengthen export intervals.
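The sampling part of that answer can be sketched as follows: a parent-based decision that defers to the caller when a parent exists and otherwise applies a ratio derived from the trace ID. `ratio_sampled` and `should_sample` are hypothetical names, and comparing the low 64 bits of the trace ID against a threshold is one common approach assumed here, not necessarily what any given SDK does.

```python
import random
from typing import Optional


def ratio_sampled(trace_id_hex: str, rate: float) -> bool:
    """Deterministic decision from the trace ID, so every service agrees."""
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be between 0.0 and 1.0")
    low64 = int(trace_id_hex, 16) & 0xFFFFFFFFFFFFFFFF
    return low64 < int(rate * 2**64)


def should_sample(
    trace_id_hex: str, rate: float, parent_sampled: Optional[bool] = None
) -> bool:
    """Parent-based sampling: follow the parent if there is one."""
    if parent_sampled is not None:
        return parent_sampled
    # Root span: fall back to the ratio-based decision.
    return ratio_sampled(trace_id_hex, rate)


# Simulate 10,000 root traces at a 10% rate, using a seeded generator.
rng = random.Random(42)
trace_ids = [f"{rng.getrandbits(128):032x}" for _ in range(10_000)]
kept = sum(should_sample(t, 0.1) for t in trace_ids)
print(f"kept {kept} of {len(trace_ids)} root traces")
```

Because the decision depends only on the trace ID and the parent's flag, a trace is either kept in full or dropped in full, which avoids partially recorded traces.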