This skill helps you implement Python observability patterns for structured logging, metrics, and tracing to improve monitoring and diagnostics.
To add this skill to your agents, run `npx playbooks add skill 0xdarkmatter/claude-mods --skill python-observability-patterns`.
---
name: python-observability-patterns
description: "Observability patterns for Python applications. Triggers on: logging, metrics, tracing, opentelemetry, prometheus, observability, monitoring, structlog, correlation id."
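compatibility: "Python 3.10+. Requires structlog, opentelemetry-api, prometheus-client. The tracing example also imports opentelemetry-sdk and opentelemetry-exporter-otlp-proto-grpc; the middleware examples use FastAPI."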
allowed-tools: "Read Write"
depends-on: [python-async-patterns]
related-skills: [python-fastapi-patterns, python-cli-patterns]
---
# Python Observability Patterns
Logging, metrics, and tracing for production applications.
## Structured Logging with structlog
```python
import logging

import structlog

# Configure structlog once at startup: merge context variables, add level and
# timestamp, then render each event as a single JSON object.
structlog.configure(
    processors=[
        structlog.contextvars.merge_contextvars,
        structlog.processors.add_log_level,
        structlog.processors.TimeStamper(fmt="iso"),
        structlog.processors.JSONRenderer(),
    ],
    wrapper_class=structlog.make_filtering_bound_logger(logging.INFO),
    context_class=dict,
    logger_factory=structlog.PrintLoggerFactory(),
)
logger = structlog.get_logger()

# Usage
logger.info("user_created", user_id=123, email="user@example.com")
# Example output (key order and timestamp precision may vary):
# {"event": "user_created", "user_id": 123, "email": "user@example.com", "level": "info", "timestamp": "2024-01-15T10:00:00Z"}
```
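Each entry in the `processors` list is a plain callable that receives `(logger, method_name, event_dict)` and returns the event dict. As a minimal sketch, a custom processor that masks sensitive fields could look like the following. The name `redact_sensitive` and the key list are hypothetical and not part of structlog.

```python
from typing import Any

# Hypothetical list of keys to mask; adjust to your own payloads.
SENSITIVE_KEYS = {"password", "token", "authorization"}


def redact_sensitive(logger: Any, method_name: str, event_dict: dict[str, Any]) -> dict[str, Any]:
    """Replace the values of sensitive keys before the event is rendered."""
    for key in event_dict:
        if key.lower() in SENSITIVE_KEYS:
            event_dict[key] = "[REDACTED]"
    return event_dict


# Called directly here to show the effect. In a real configuration it would
# be listed in `processors=[...]` somewhere before the renderer.
event = redact_sensitive(None, "info", {"event": "login", "user_id": 123, "password": "hunter2"})
```

Placing such a processor before `JSONRenderer` means the sensitive value never reaches the rendered output.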
## Request Context Propagation
```python
from contextvars import ContextVar
from uuid import uuid4

import structlog
from fastapi import FastAPI, Request

app = FastAPI()
request_id_var: ContextVar[str] = ContextVar("request_id", default="")


def bind_request_context(request_id: str | None = None) -> str:
    """Bind request ID to logging context."""
    rid = request_id or str(uuid4())
    request_id_var.set(rid)
    structlog.contextvars.bind_contextvars(request_id=rid)
    return rid


# FastAPI middleware
@app.middleware("http")
async def request_context_middleware(request: Request, call_next):
    # Reuse the caller's ID when present; otherwise a new one is generated.
    request_id = bind_request_context(request.headers.get("X-Request-ID"))
    try:
        response = await call_next(request)
        response.headers["X-Request-ID"] = request_id
        return response
    finally:
        # Clear even when the handler raises, so context never leaks.
        structlog.contextvars.clear_contextvars()
```
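The middleware above relies on `contextvars` giving each asyncio task its own copy of the context. The following standard-library sketch illustrates that isolation without FastAPI or structlog. The `handle` and `main` functions are hypothetical stand-ins for concurrent request handlers.

```python
import asyncio
from contextvars import ContextVar
from uuid import uuid4

request_id_var: ContextVar[str] = ContextVar("request_id", default="")


async def handle(name: str, seen: dict[str, tuple[str, str]]) -> None:
    """Set a request ID, yield to other tasks, then read the ID back."""
    rid = f"{name}-{uuid4().hex[:8]}"
    request_id_var.set(rid)
    await asyncio.sleep(0)  # let the other task run and set its own ID
    seen[name] = (rid, request_id_var.get())


async def main() -> dict[str, tuple[str, str]]:
    seen: dict[str, tuple[str, str]] = {}
    # gather() wraps each coroutine in a task with its own context copy.
    await asyncio.gather(handle("a", seen), handle("b", seen))
    return seen


results = asyncio.run(main())
outer_value = request_id_var.get()  # still the default outside the tasks
```

Each task reads back the ID it set, and the value set inside the tasks does not leak into the outer context.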
## Prometheus Metrics
```python
import time

from fastapi import FastAPI, Request, Response
from prometheus_client import CONTENT_TYPE_LATEST, Counter, Gauge, Histogram, generate_latest

app = FastAPI()

# Define metrics
REQUEST_COUNT = Counter(
    "http_requests_total",
    "Total HTTP requests",
    ["method", "endpoint", "status"],
)
REQUEST_LATENCY = Histogram(
    "http_request_duration_seconds",
    "HTTP request latency",
    ["method", "endpoint"],
    buckets=[0.01, 0.05, 0.1, 0.5, 1.0, 5.0],
)
ACTIVE_CONNECTIONS = Gauge(
    "active_connections",
    "Number of active connections",
)


# Middleware to record metrics
@app.middleware("http")
async def metrics_middleware(request: Request, call_next):
    ACTIVE_CONNECTIONS.inc()
    start = time.perf_counter()
    try:
        response = await call_next(request)
    finally:
        # Decrement even when the handler raises, so the gauge cannot drift.
        ACTIVE_CONNECTIONS.dec()
    duration = time.perf_counter() - start
    # Raw paths such as /users/123 yield one label value per ID; group or
    # template the endpoint if your paths contain identifiers.
    REQUEST_COUNT.labels(
        method=request.method,
        endpoint=request.url.path,
        status=response.status_code,
    ).inc()
    REQUEST_LATENCY.labels(
        method=request.method,
        endpoint=request.url.path,
    ).observe(duration)
    return response


# Metrics endpoint
@app.get("/metrics")
async def metrics():
    return Response(
        content=generate_latest(),
        media_type=CONTENT_TYPE_LATEST,
    )
```
## OpenTelemetry Tracing
```python
from opentelemetry import trace
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor

# Setup: export spans in batches to an OTLP collector over gRPC.
provider = TracerProvider()
# insecure=True sends plaintext gRPC, which suits a local collector without TLS.
processor = BatchSpanProcessor(OTLPSpanExporter(endpoint="localhost:4317", insecure=True))
provider.add_span_processor(processor)
trace.set_tracer_provider(provider)
tracer = trace.get_tracer(__name__)


# Manual instrumentation. validate() and charge() are application-defined coroutines.
async def process_order(order_id: int):
    with tracer.start_as_current_span("process_order") as span:
        span.set_attribute("order_id", order_id)
        # Spans started inside this block become children of process_order.
        with tracer.start_as_current_span("validate_order"):
            await validate(order_id)
        with tracer.start_as_current_span("charge_payment"):
            await charge(order_id)
```
## Quick Reference
| Library | Purpose |
|---------|---------|
| structlog | Structured logging |
| prometheus-client | Metrics collection |
| opentelemetry | Distributed tracing |

| Metric Type | Use Case |
|-------------|----------|
| Counter | Total requests, errors |
| Histogram | Latencies, sizes |
| Gauge | Current connections, queue size |
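
To make the Histogram row more concrete: Prometheus histogram buckets are cumulative, so each bucket counts every observation less than or equal to its upper bound, and an implicit `+Inf` bucket counts them all. The sketch below reproduces that counting in plain Python using the bucket bounds from the metrics example. The helper `cumulative_bucket_counts` and the sample latencies are hypothetical and are not part of `prometheus_client`.

```python
import math

BUCKETS = [0.01, 0.05, 0.1, 0.5, 1.0, 5.0]  # same bounds as REQUEST_LATENCY


def cumulative_bucket_counts(observations: list[float], buckets: list[float] = BUCKETS) -> dict[float, int]:
    """Count observations <= each upper bound, plus the +Inf bucket."""
    bounds = [*buckets, math.inf]
    return {bound: sum(1 for value in observations if value <= bound) for bound in bounds}


# Five hypothetical request latencies in seconds.
counts = cumulative_bucket_counts([0.02, 0.3, 0.3, 2.0, 7.0])
```

Because the counts are cumulative, bounds should bracket the latencies you care about: an observation above the largest bound (7.0 here) is only visible in the `+Inf` bucket.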
## Additional Resources
- `./references/structured-logging.md` - structlog configuration, formatters
- `./references/metrics.md` - Prometheus patterns, custom metrics
- `./references/tracing.md` - OpenTelemetry, distributed tracing
## Assets
- `./assets/logging-config.py` - Production logging configuration
---
## See Also
**Prerequisites:**
- `python-async-patterns` - Async context propagation
**Related Skills:**
- `python-fastapi-patterns` - API middleware for metrics/tracing
- `python-cli-patterns` - CLI logging patterns
**Integration Skills:**
- `python-database-patterns` - Database query tracing
This skill provides pragmatic observability patterns for Python applications, covering structured logging, metrics, and distributed tracing. It supplies reusable middleware and configuration snippets for structlog, Prometheus client metrics, and OpenTelemetry tracing. The goal is reliable context propagation, actionable telemetry, and easy integration with common frameworks like FastAPI.
The skill offers code patterns to bind request context (request IDs) to logs, record Prometheus metrics in middleware, and emit OpenTelemetry spans for distributed traces. It includes examples for structlog configuration, request-context propagation with contextvars, Prometheus counters/histograms/gauges, and OTLP span exporting. Use the provided middleware and instrumentation snippets so that the same metadata flows consistently across logs, metrics, and traces.
How do I avoid high cardinality in Prometheus labels?
Use coarse-grained labels (method, endpoint group, status) and never label by user ID, request ID, or any other unbounded value. Capture per-request variability in histogram observations rather than in additional labels.
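One way to keep the endpoint label coarse is to collapse identifier-like path segments before using the path as a label value. This is a minimal sketch under the assumption that IDs appear as all-digit or UUID segments. The function name `normalize_endpoint` and the `{id}` / `{uuid}` placeholders are hypothetical.

```python
import re

# Matches a canonical 8-4-4-4-12 hexadecimal UUID.
_UUID_RE = re.compile(r"[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}")


def normalize_endpoint(path: str) -> str:
    """Replace numeric and UUID path segments with fixed placeholders."""
    parts = []
    for segment in path.split("/"):
        if segment.isdigit():
            parts.append("{id}")
        elif _UUID_RE.fullmatch(segment):
            parts.append("{uuid}")
        else:
            parts.append(segment)
    return "/".join(parts)


label = normalize_endpoint("/users/123/orders/550e8400-e29b-41d4-a716-446655440000")
```

The result could then be passed as the `endpoint` label in place of the raw `request.url.path`.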
Where should I bind and clear request context?
Bind the request ID in the entry (outermost) middleware and clear contextvars in a `finally` block once the response has been produced, so context cannot leak between requests even when a handler raises.
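The bind-then-clear discipline can be sketched with the standard library alone, using the token returned by `ContextVar.set` to restore the previous value. The `request_scope` context manager is a hypothetical helper, not a structlog or FastAPI API.

```python
from contextlib import contextmanager
from contextvars import ContextVar

request_id_var: ContextVar[str] = ContextVar("request_id", default="")


@contextmanager
def request_scope(request_id: str):
    """Bind a request ID for the duration of the block, then restore the old value."""
    token = request_id_var.set(request_id)
    try:
        yield request_id
    finally:
        # Runs on success and on error, so the ID never outlives the request.
        request_id_var.reset(token)


try:
    with request_scope("req-1"):
        inside = request_id_var.get()
        raise RuntimeError("handler failed")  # simulate a failing handler
except RuntimeError:
    pass

after = request_id_var.get()
```

Even though the simulated handler raises, the variable is back at its default once the block exits.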
How do I export traces to a backend?
Configure an OTLP exporter with an appropriate endpoint (collector), add a BatchSpanProcessor, and tune sampling and batching for production throughput.