observability skill

/skills/nestjs/observability

This skill implements structured logging with nestjs-pino, redacts sensitive data, and exposes Prometheus metrics for observability in NestJS applications.

npx playbooks add skill hoangnguyen0403/agent-skills-standard --skill observability

SKILL.md
---
name: NestJS Observability
description: Structured logging (Pino) and Prometheus metrics.
metadata:
  labels: [nestjs, logging, monitoring, pino]
  triggers:
    files: ['main.ts', '**/*.module.ts']
    keywords: [nestjs-pino, Prometheus, Logger, reqId]
---

# Observability Standards

## **Priority: P1 (OPERATIONAL)**

Logging, monitoring, and observability patterns for production applications.

- **Standard**: Use `nestjs-pino` for high-performance JSON logging.
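  - **Why**: Node's built-in `console.log` emits unstructured text, and its writes can be synchronous depending on where stdout is connected, which can block the event loop under load.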
- **Configuration**:
  - **Redaction**: Mandatory masking of sensitive fields (`password`, `token`, `email`).
  - **Context**: Always inject `Logger` and set the context to the owning class (e.g., `LoginService`).
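
The redaction rule above could be expressed as a plain options object along these lines. This is a minimal sketch, not the skill's actual configuration: the names `REDACTED_FIELDS` and `pinoHttpOptions`, the censor string, and the extra header paths are assumptions added for illustration. The `redact.paths` and `redact.censor` keys follow Pino's redaction options.

```typescript
// Fields the standard requires to be masked.
const REDACTED_FIELDS = ['password', 'token', 'email'] as const;

// Hypothetical options object. With nestjs-pino it would typically be wired in as
//   LoggerModule.forRoot({ pinoHttp: pinoHttpOptions })
const pinoHttpOptions = {
  level: 'info',
  redact: {
    paths: [
      // Top-level keys, e.g. logger.info({ password: '...' })
      ...REDACTED_FIELDS,
      // The same keys one level down, e.g. { user: { email: '...' } }
      ...REDACTED_FIELDS.map((field) => `*.${field}`),
      // Assumption: credentials carried in request headers should be masked too.
      'req.headers.authorization',
      'req.headers.cookie',
    ],
    // Replacement value written in place of the original (assumed wording).
    censor: '[REDACTED]',
  },
};
```

Wildcard paths only cover the nesting depth they spell out, so deeply nested payloads may need additional explicit paths.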

## Tracing (Correlation)

- **Request ID**: Every log line **must** include a `reqId` (Request ID).
  - `nestjs-pino` handles this automatically using `AsyncLocalStorage`.
  - **Propagation**: Pass `x-request-id` to downstream microservices and database queries; this is key to tracing a request's flow end to end.
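
To illustrate the correlation mechanism, here is a minimal sketch built directly on Node's `AsyncLocalStorage`. It is a standalone illustration, not `nestjs-pino`'s internal store, and the helper names (`runWithRequestId`, `currentRequestId`, `downstreamHeaders`) are hypothetical.

```typescript
import { AsyncLocalStorage } from 'node:async_hooks';
import { randomUUID } from 'node:crypto';

interface RequestContext {
  reqId: string;
}

// One store per process; each request runs inside its own context.
const requestContext = new AsyncLocalStorage<RequestContext>();

// Reuse the caller's x-request-id when present, otherwise generate a new one.
function runWithRequestId<T>(incomingId: string | undefined, fn: () => T): T {
  const reqId = incomingId && incomingId.length > 0 ? incomingId : randomUUID();
  return requestContext.run({ reqId }, fn);
}

// Returns undefined when called outside of a request context.
function currentRequestId(): string | undefined {
  return requestContext.getStore()?.reqId;
}

// Headers to attach to outbound HTTP calls so downstream logs share the same ID.
function downstreamHeaders(extra: Record<string, string> = {}): Record<string, string> {
  const reqId = currentRequestId();
  return reqId ? { ...extra, 'x-request-id': reqId } : { ...extra };
}
```

In an HTTP server, a middleware would call `runWithRequestId(req.headers['x-request-id'], next)` or similar, and any outbound client would merge `downstreamHeaders()` into its request.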

## Metrics

- **Exposure**: Use `@willsoto/nestjs-prometheus` to expose `/metrics` for Prometheus scraping.
- **Key Metrics**:
  1. `http_request_duration_seconds` (Histogram)
  2. `db_query_duration_seconds` (Histogram)
  3. `memory_usage_bytes` (Gauge)
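
A module registering these metrics might look roughly like the following configuration sketch. The module name `MetricsModule`, the help strings, label names, and bucket boundaries are assumptions chosen for illustration, and sourcing the gauge from `process.memoryUsage().heapUsed` is one possible reading of "memory usage". It assumes `@nestjs/common`, `@willsoto/nestjs-prometheus`, and `prom-client` are installed.

```typescript
import { Module } from '@nestjs/common';
import {
  PrometheusModule,
  makeGaugeProvider,
  makeHistogramProvider,
} from '@willsoto/nestjs-prometheus';

@Module({
  // register() serves the Prometheus text format at /metrics by default.
  imports: [PrometheusModule.register()],
  providers: [
    makeHistogramProvider({
      name: 'http_request_duration_seconds',
      help: 'Duration of HTTP requests in seconds',
      labelNames: ['method', 'route', 'status_code'], // assumed labels
      buckets: [0.01, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5], // assumed boundaries
    }),
    makeHistogramProvider({
      name: 'db_query_duration_seconds',
      help: 'Duration of database queries in seconds',
      labelNames: ['operation'], // assumed label
      buckets: [0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1], // assumed boundaries
    }),
    makeGaugeProvider({
      name: 'memory_usage_bytes',
      help: 'Heap memory currently in use, in bytes',
      // prom-client calls collect() on each scrape, so the value stays current.
      collect() {
        this.set(process.memoryUsage().heapUsed);
      },
    }),
  ],
})
export class MetricsModule {}
```

The histograms still need to be observed somewhere, for example from an interceptor that injects them with `@InjectMetric('http_request_duration_seconds')` and calls `observe()` with the elapsed seconds.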

## Health Checks

- **Terminus**: Implement explicit logic for "Liveness" (I'm alive) vs "Readiness" (I can take traffic).
  - **DB Check**: `TypeOrmHealthIndicator` / `PrismaHealthIndicator`.
  - **Memory Check**: Fail if heap usage exceeds 300 MB (helps prevent crash loops).
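
One way to split the two probes with `@nestjs/terminus` is sketched below. The route names (`health/live`, `health/ready`) and the indicator keys are assumptions, and placing the heap check under readiness rather than liveness is a judgment call the standard above does not settle. The sketch assumes `TerminusModule` is imported and a TypeORM connection is configured.

```typescript
import { Controller, Get } from '@nestjs/common';
import {
  HealthCheck,
  HealthCheckService,
  MemoryHealthIndicator,
  TypeOrmHealthIndicator,
} from '@nestjs/terminus';

const HEAP_LIMIT_BYTES = 300 * 1024 * 1024; // 300 MB threshold from the standard

@Controller('health')
export class HealthController {
  constructor(
    private readonly health: HealthCheckService,
    private readonly db: TypeOrmHealthIndicator,
    private readonly memory: MemoryHealthIndicator,
  ) {}

  // Liveness: the process is up and can answer HTTP; no dependencies are checked.
  @Get('live')
  @HealthCheck()
  liveness() {
    return this.health.check([]);
  }

  // Readiness: the service can take traffic (database reachable, heap under the limit).
  @Get('ready')
  @HealthCheck()
  readiness() {
    return this.health.check([
      () => this.db.pingCheck('database'),
      () => this.memory.checkHeap('memory_heap', HEAP_LIMIT_BYTES),
    ]);
  }
}
```

Keeping dependency checks out of the liveness probe avoids restarting healthy processes when only a downstream system is unavailable.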

Overview

This skill provides NestJS observability patterns focused on structured logging with Pino and Prometheus metrics. It codifies production-ready defaults: high-performance JSON logs, request correlation, sensitive-data redaction, and an opinionated metrics/health-check setup. The goal is repeatable, low-overhead instrumentation that works across services and deployments.

How this skill works

It recommends using nestjs-pino to emit structured JSON logs, automatically attaching a reqId via AsyncLocalStorage and encouraging contextual Logger injection. It prescribes redaction of sensitive fields and propagation of x-request-id to downstream systems. For metrics, it uses @willsoto/nestjs-prometheus to expose /metrics and registers key histograms and gauges. Terminus-based health checks distinguish liveness vs readiness and include DB and memory checks.

When to use it

  • New or existing NestJS services that need production-grade logging and metrics.
  • Microservices that require request correlation across HTTP and DB boundaries.
  • Applications that must expose Prometheus-compatible telemetry (/metrics).
  • Services operating under SLOs where latency and error metrics matter.
  • Deployments requiring clear liveness/readiness signals for orchestration.

Best practices

  • Use nestjs-pino for all application logs and inject Logger with a clear context (e.g., AuthService).
  • Redact sensitive fields (password, token, email) at the logging layer to avoid leaking secrets.
  • Ensure every log line includes reqId; propagate x-request-id to downstream calls and DB queries.
  • Expose /metrics via @willsoto/nestjs-prometheus and register histograms for request and DB durations.
  • Implement Terminus health checks: separate liveness and readiness; include DB and memory thresholds.

Example use cases

  • HTTP API where each request needs a unique reqId for debugging multi-service traces.
  • Backend service that records http_request_duration_seconds and db_query_duration_seconds for SLO reporting.
  • Deployment on Kubernetes that uses readiness probes and /metrics for autoscaling and alerting.
  • Service handling sensitive user data where logs must never contain raw passwords or tokens.
  • Microservice architecture where x-request-id is forwarded to downstream services and databases.

FAQ

Which logger should I use with NestJS?

Use nestjs-pino for high-performance structured JSON logging instead of console.log.

How do I ensure request correlation across services?

Include a reqId on every log line via AsyncLocalStorage and propagate x-request-id to downstream calls.