
This skill helps you implement comprehensive observability for Deepgram integrations, covering metrics, traces, logs, and alerts across services.

`npx playbooks add skill jeremylongshore/claude-code-plugins-plus-skills --skill deepgram-observability`

Review the skill file below or copy the command above to add this skill to your agents.

---
name: deepgram-observability
description: |
  Set up comprehensive observability for Deepgram integrations with metrics, traces, and alerts.
  Use when implementing monitoring for Deepgram operations, setting up dashboards,
  or configuring alerting for Deepgram integration health.
  Trigger with phrases like "deepgram monitoring", "deepgram metrics",
  "deepgram observability", "monitor deepgram", "deepgram alerts", "deepgram tracing".
allowed-tools: Read, Write, Edit, Bash(kubectl:*), Bash(curl:*)
version: 1.0.0
license: MIT
author: Jeremy Longshore <[email protected]>
---

# Deepgram Observability

## Overview
Implement comprehensive observability for Deepgram integrations including metrics, distributed tracing, logging, and alerting.

## Prerequisites
- Prometheus or compatible metrics backend
- OpenTelemetry SDK installed
- Grafana or similar dashboarding tool
- AlertManager configured

## Observability Pillars

| Pillar | Tool | Purpose |
|--------|------|---------|
| Metrics | Prometheus | Performance & usage tracking |
| Traces | OpenTelemetry | Request flow visibility |
| Logs | Structured JSON | Debugging & audit |
| Alerts | AlertManager | Incident notification |

## Instructions

### Step 1: Set Up Metrics Collection
Implement Prometheus counters, histograms, and gauges for key operations (see the Prometheus Metrics example below).

### Step 2: Add Distributed Tracing
Integrate OpenTelemetry for end-to-end request tracing (see the Instrumented Transcription Client and OpenTelemetry Configuration examples).

### Step 3: Configure Structured Logging
Set up JSON logging with consistent field names across services (see the Structured Logging example).

### Step 4: Create Alert Rules
Define alerting rules for error rates, latency, rate limiting, and cost anomalies (see the AlertManager Rules example).

## Examples

### Prometheus Metrics
```typescript
// lib/metrics.ts
import { Registry, Counter, Histogram, Gauge, collectDefaultMetrics } from 'prom-client';

export const registry = new Registry();
collectDefaultMetrics({ register: registry });

// Request counters
export const transcriptionRequests = new Counter({
  name: 'deepgram_transcription_requests_total',
  help: 'Total number of transcription requests',
  labelNames: ['status', 'model', 'type'],
  registers: [registry],
});

// Latency histogram
export const transcriptionLatency = new Histogram({
  name: 'deepgram_transcription_latency_seconds',
  help: 'Transcription request latency in seconds',
  labelNames: ['model', 'type'],
  buckets: [0.1, 0.5, 1, 2, 5, 10, 30, 60, 120],
  registers: [registry],
});

// Audio duration processed
export const audioProcessed = new Counter({
  name: 'deepgram_audio_processed_seconds_total',
  help: 'Total audio duration processed in seconds',
  labelNames: ['model'],
  registers: [registry],
});

// Active connections gauge
export const activeConnections = new Gauge({
  name: 'deepgram_active_connections',
  help: 'Number of active Deepgram connections',
  labelNames: ['type'],
  registers: [registry],
});

// Rate limit hits
export const rateLimitHits = new Counter({
  name: 'deepgram_rate_limit_hits_total',
  help: 'Number of rate limit responses',
  registers: [registry],
});

// Cost tracking
export const estimatedCost = new Counter({
  name: 'deepgram_estimated_cost_dollars',
  help: 'Estimated cost in dollars',
  labelNames: ['model'],
  registers: [registry],
});

// Metrics endpoint
export async function getMetrics(): Promise<string> {
  return registry.metrics();
}
```
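The `activeConnections` gauge and `rateLimitHits` counter above are not exercised by the later examples. Below is a minimal sketch of how a live-streaming wrapper might keep them current, assuming the Deepgram SDK v3 `listen.live` API; the error-inspection logic is illustrative, not the SDK's documented rate-limit signal.

```typescript
// lib/live-metrics.ts (illustrative sketch)
import { createClient, LiveTranscriptionEvents } from '@deepgram/sdk';
import { activeConnections, rateLimitHits } from './metrics';

export function openInstrumentedLiveConnection(apiKey: string) {
  const client = createClient(apiKey);
  const connection = client.listen.live({ model: 'nova-2', smart_format: true });

  connection.on(LiveTranscriptionEvents.Open, () => {
    // One more concurrent streaming session
    activeConnections.labels('live').inc();
  });

  connection.on(LiveTranscriptionEvents.Close, () => {
    activeConnections.labels('live').dec();
  });

  connection.on(LiveTranscriptionEvents.Error, (err) => {
    // Hypothetical check: treat 429-style errors as rate-limit hits
    if (String(err?.message ?? '').includes('429')) {
      rateLimitHits.inc();
    }
  });

  return connection;
}
```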

### Instrumented Transcription Client
```typescript
// lib/instrumented-client.ts
import { createClient, DeepgramClient } from '@deepgram/sdk';
import {
  transcriptionRequests,
  transcriptionLatency,
  audioProcessed,
  estimatedCost,
} from './metrics';
import { trace, context, SpanStatusCode } from '@opentelemetry/api';
import { logger } from './logger';

const tracer = trace.getTracer('deepgram-client');

const modelCosts: Record<string, number> = {
  'nova-2': 0.0043,
  'nova': 0.0043,
  'base': 0.0048,
};

export class InstrumentedDeepgramClient {
  private client: DeepgramClient;

  constructor(apiKey: string) {
    this.client = createClient(apiKey);
  }

  async transcribeUrl(url: string, options: { model?: string } = {}) {
    const model = options.model || 'nova-2';
    const startTime = Date.now();

    return tracer.startActiveSpan('deepgram.transcribe', async (span) => {
      span.setAttribute('deepgram.model', model);
      span.setAttribute('deepgram.audio_url', url);

      // Distinguish API-level errors (the { error } half of the response) from
      // thrown exceptions; the flag keeps the catch block from double-counting,
      // since the throw below is caught by this method's own catch.
      let apiErrorRecorded = false;

      try {
        const { result, error } = await this.client.listen.prerecorded.transcribeUrl(
          { url },
          { model, smart_format: true }
        );

        const duration = (Date.now() - startTime) / 1000;

        if (error) {
          apiErrorRecorded = true;
          transcriptionRequests.labels('error', model, 'prerecorded').inc();
          transcriptionLatency.labels(model, 'prerecorded').observe(duration);
          span.setStatus({ code: SpanStatusCode.ERROR, message: error.message });

          // pino takes the merge object first, then the message
          logger.error({ model, error: error.message, duration }, 'Transcription failed');

          throw error;
        }

        // Record metrics
        transcriptionRequests.labels('success', model, 'prerecorded').inc();
        transcriptionLatency.labels(model, 'prerecorded').observe(duration);

        const audioDuration = result.metadata.duration;
        audioProcessed.labels(model).inc(audioDuration);

        const cost = (audioDuration / 60) * (modelCosts[model] || 0.0043);
        estimatedCost.labels(model).inc(cost);

        span.setAttribute('deepgram.request_id', result.metadata.request_id);
        span.setAttribute('deepgram.audio_duration', audioDuration);
        span.setAttribute('deepgram.processing_time', duration);
        span.setStatus({ code: SpanStatusCode.OK });

        logger.info(
          {
            requestId: result.metadata.request_id,
            model,
            audioDuration,
            processingTime: duration,
            cost,
          },
          'Transcription completed'
        );

        return result;
      } catch (err) {
        const duration = (Date.now() - startTime) / 1000;

        // Skip metrics, status, and logging already handled by the API-error branch
        if (!apiErrorRecorded) {
          transcriptionRequests.labels('exception', model, 'prerecorded').inc();
          transcriptionLatency.labels(model, 'prerecorded').observe(duration);

          span.setStatus({
            code: SpanStatusCode.ERROR,
            message: err instanceof Error ? err.message : 'Unknown error',
          });

          logger.error(
            { model, error: err instanceof Error ? err.message : 'Unknown', duration },
            'Transcription exception'
          );
        }

        throw err;
      } finally {
        span.end();
      }
    });
  }
}
```
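Usage is a drop-in replacement for the raw SDK call. A minimal sketch, assuming an ESM context with top-level await; the audio URL is a placeholder:

```typescript
// example.ts (usage sketch)
import { InstrumentedDeepgramClient } from './lib/instrumented-client';

const client = new InstrumentedDeepgramClient(process.env.DEEPGRAM_API_KEY!);

// Metrics, the span, and the structured log entry are emitted as side effects
const result = await client.transcribeUrl('https://example.com/audio.wav', {
  model: 'nova-2',
});

console.log(result.results.channels[0].alternatives[0].transcript);
```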

### OpenTelemetry Configuration
```typescript
// lib/tracing.ts
import { NodeSDK } from '@opentelemetry/sdk-node';
import { getNodeAutoInstrumentations } from '@opentelemetry/auto-instrumentations-node';
import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-grpc';
import { Resource } from '@opentelemetry/resources';
import { SemanticResourceAttributes } from '@opentelemetry/semantic-conventions';

const sdk = new NodeSDK({
  resource: new Resource({
    [SemanticResourceAttributes.SERVICE_NAME]: 'deepgram-service',
    [SemanticResourceAttributes.SERVICE_VERSION]: process.env.VERSION || '1.0.0',
    [SemanticResourceAttributes.DEPLOYMENT_ENVIRONMENT]: process.env.NODE_ENV || 'development',
  }),
  traceExporter: new OTLPTraceExporter({
    url: process.env.OTEL_EXPORTER_OTLP_ENDPOINT || 'http://localhost:4317',
  }),
  instrumentations: [
    getNodeAutoInstrumentations({
      '@opentelemetry/instrumentation-http': {
        // Skip the noisy health and metrics endpoints when creating incoming spans
        ignoreIncomingRequestHook: (req) =>
          req.url === '/health' || req.url === '/metrics',
      },
    }),
  ],
});

export function initTracing(): void {
  sdk.start();

  process.on('SIGTERM', () => {
    sdk.shutdown()
      .then(() => console.log('Tracing terminated'))
      .catch((error) => console.error('Error terminating tracing', error))
      .finally(() => process.exit(0));
  });
}
```
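Auto-instrumentation only patches modules that load after the SDK has started, so `initTracing()` must run before Express or anything else that makes HTTP calls is imported. A minimal entrypoint sketch, assuming an ESM project; the file names and port are placeholders:

```typescript
// index.ts: start tracing before loading instrumented modules
import { initTracing } from './lib/tracing';
initTracing();

// Dynamic imports so the HTTP/Express instrumentation is active when these load
const { default: express } = await import('express');
const { default: healthRouter } = await import('./routes/health');

const app = express();
app.use(healthRouter);
app.listen(3000, () => console.log('listening on :3000'));
```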

### Structured Logging
```typescript
// lib/logger.ts
import pino from 'pino';

export const logger = pino({
  level: process.env.LOG_LEVEL || 'info',
  formatters: {
    level: (label) => ({ level: label }),
  },
  base: {
    service: 'deepgram-service',
    version: process.env.VERSION || '1.0.0',
    environment: process.env.NODE_ENV || 'development',
  },
  timestamp: pino.stdTimeFunctions.isoTime,
});

// Specialized loggers
export const transcriptionLogger = logger.child({ component: 'transcription' });
export const metricsLogger = logger.child({ component: 'metrics' });
export const alertLogger = logger.child({ component: 'alerts' });
```
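Child loggers inherit the base fields, so a single call emits one JSON line with `service`, `version`, `environment`, and `component` already attached. Note that pino takes the merge object first and the message second; the field values here are examples:

```typescript
import { transcriptionLogger } from './lib/logger';

// Merge object first, message second (pino convention)
transcriptionLogger.info(
  { requestId: 'req_123', model: 'nova-2', audioDuration: 42.5 },
  'Transcription completed'
);
// => {"level":"info","service":"deepgram-service","component":"transcription",
//     "requestId":"req_123","model":"nova-2",...}
```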

### Grafana Dashboard Configuration
```json
{
  "dashboard": {
    "title": "Deepgram Transcription Service",
    "panels": [
      {
        "title": "Request Rate",
        "type": "graph",
        "targets": [
          {
            "expr": "sum(rate(deepgram_transcription_requests_total[5m])) by (status)",
            "legendFormat": "{{status}}"
          }
        ]
      },
      {
        "title": "Latency (P95)",
        "type": "graph",
        "targets": [
          {
            "expr": "histogram_quantile(0.95, sum(rate(deepgram_transcription_latency_seconds_bucket[5m])) by (le, model))",
            "legendFormat": "{{model}}"
          }
        ]
      },
      {
        "title": "Audio Processed (per hour)",
        "type": "stat",
        "targets": [
          {
            "expr": "sum(increase(deepgram_audio_processed_seconds_total[1h]))/60",
            "legendFormat": "Minutes"
          }
        ]
      },
      {
        "title": "Error Rate",
        "type": "gauge",
        "targets": [
          {
            "expr": "sum(rate(deepgram_transcription_requests_total{status='error'}[5m])) / sum(rate(deepgram_transcription_requests_total[5m])) * 100"
          }
        ]
      },
      {
        "title": "Estimated Cost Today",
        "type": "stat",
        "targets": [
          {
            "expr": "sum(increase(deepgram_estimated_cost_dollars[24h]))"
          }
        ]
      },
      {
        "title": "Active Connections",
        "type": "graph",
        "targets": [
          {
            "expr": "deepgram_active_connections",
            "legendFormat": "{{type}}"
          }
        ]
      }
    ]
  }
}
```

### AlertManager Rules
```yaml
# prometheus/rules/deepgram.yml
groups:
  - name: deepgram-alerts
    rules:
      - alert: DeepgramHighErrorRate
        expr: |
          sum(rate(deepgram_transcription_requests_total{status="error"}[5m])) /
          sum(rate(deepgram_transcription_requests_total[5m])) > 0.05
        for: 5m
        labels:
          severity: critical
          service: deepgram
        annotations:
          summary: "High Deepgram error rate (> 5%)"
          description: "Error rate is {{ $value | humanizePercentage }}"
          runbook: "https://wiki.example.com/runbooks/deepgram-errors"

      - alert: DeepgramHighLatency
        expr: |
          histogram_quantile(0.95,
            sum(rate(deepgram_transcription_latency_seconds_bucket[5m])) by (le)
          ) > 30
        for: 5m
        labels:
          severity: warning
          service: deepgram
        annotations:
          summary: "High Deepgram latency (P95 > 30s)"
          description: "P95 latency is {{ $value | humanizeDuration }}"

      - alert: DeepgramRateLimited
        expr: increase(deepgram_rate_limit_hits_total[1h]) > 10
        for: 0m
        labels:
          severity: warning
          service: deepgram
        annotations:
          summary: "Deepgram rate limiting detected"
          description: "{{ $value }} rate limit hits in the last hour"

      - alert: DeepgramCostSpike
        expr: |
          sum(increase(deepgram_estimated_cost_dollars[1h])) >
          sum(increase(deepgram_estimated_cost_dollars[1h] offset 1d)) * 2
        for: 30m
        labels:
          severity: warning
          service: deepgram
        annotations:
          summary: "Deepgram cost spike detected"
          description: "Current hour cost is 2x yesterday's average"

      - alert: DeepgramNoRequests
        expr: |
          sum(rate(deepgram_transcription_requests_total[15m])) == 0
          and sum(deepgram_transcription_requests_total) > 0
        for: 15m
        labels:
          severity: warning
          service: deepgram
        annotations:
          summary: "No Deepgram requests in 15 minutes"
          description: "Service may be down or disconnected"
```

### Health Check Endpoint
```typescript
// routes/health.ts
import express from 'express';
import { createClient } from '@deepgram/sdk';
import { getMetrics, registry } from '../lib/metrics';

const router = express.Router();

interface HealthCheck {
  status: 'healthy' | 'degraded' | 'unhealthy';
  timestamp: string;
  checks: Record<string, {
    status: 'pass' | 'fail';
    latency?: number;
    message?: string;
  }>;
}

router.get('/health', async (req, res) => {
  const health: HealthCheck = {
    status: 'healthy',
    timestamp: new Date().toISOString(),
    checks: {},
  };

  // Check Deepgram API
  const startTime = Date.now();
  try {
    const client = createClient(process.env.DEEPGRAM_API_KEY!);
    const { error } = await client.manage.getProjects();

    health.checks.deepgram = {
      status: error ? 'fail' : 'pass',
      latency: Date.now() - startTime,
      message: error?.message,
    };
  } catch (err) {
    health.checks.deepgram = {
      status: 'fail',
      latency: Date.now() - startTime,
      message: err instanceof Error ? err.message : 'Unknown error',
    };
  }

  // Determine overall status
  const failedChecks = Object.values(health.checks).filter(c => c.status === 'fail');
  if (failedChecks.length > 0) {
    health.status = 'unhealthy';
  }

  const statusCode = health.status === 'healthy' ? 200 : 503;
  res.status(statusCode).json(health);
});

router.get('/metrics', async (req, res) => {
  // Use the registry's content type so Prometheus parses the exposition format
  res.set('Content-Type', registry.contentType);
  res.send(await getMetrics());
});

export default router;
```
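Prometheus still needs a scrape job pointing at the service. A minimal sketch; the job name, target host, and port are assumptions for your deployment:

```yaml
# prometheus/prometheus.yml (excerpt)
scrape_configs:
  - job_name: deepgram-service
    metrics_path: /metrics
    scrape_interval: 15s
    static_configs:
      - targets: ['deepgram-service:3000']
```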

## Resources
- [Prometheus Best Practices](https://prometheus.io/docs/practices/naming/)
- [OpenTelemetry Documentation](https://opentelemetry.io/docs/)
- [Grafana Dashboard Examples](https://grafana.com/grafana/dashboards/)

## Next Steps
Proceed to `deepgram-incident-runbook` for incident response procedures.

## Overview

This skill sets up comprehensive observability for Deepgram integrations with metrics, distributed tracing, structured logging, dashboards, and alerting. It provides ready-made Prometheus metrics, OpenTelemetry tracing configuration, JSON logging templates, Grafana dashboard panels, and AlertManager rules. Use it to instrument Deepgram clients, surface service health, and automate incident detection.

## How this skill works

The skill instruments Deepgram operations to emit Prometheus counters, histograms, and gauges for request rates, latency, audio processed, rate-limit hits, and estimated cost. It adds OpenTelemetry spans around API calls to capture request attributes, durations, and errors, and configures a JSON logger with consistent fields for troubleshooting and audit. Grafana panels visualize key metrics and AlertManager rules notify on high error rates, latency spikes, rate limiting, cost anomalies, and service silence. A health endpoint aggregates Deepgram API checks and exposes metrics for scraping.

## When to use it

- When adding monitoring to a Deepgram transcription or streaming integration
- When you need end-to-end traces to debug slow or failing requests
- When creating dashboards to track usage, cost, and system health
- When configuring alerts for error rate, latency, rate limits, or cost spikes
- When you want a standardized health check and Prometheus metrics endpoint

## Best practices

- Instrument the client library layer so every API call records metrics and spans
- Use semantic attributes (model, request_id, audio_duration) to make traces actionable
- Keep metric label cardinality low; avoid free-form IDs in primary labels
- Record histograms for latency and counters for success/error/rate-limit outcomes
- Log structured JSON with consistent field names and service metadata
- Drive alerts from error rate and high-percentile latency rather than individual request failures

## Example use cases

- Add the InstrumentedDeepgramClient wrapper to record metrics, traces, and cost estimates for prerecorded transcriptions
- Expose a /metrics endpoint for Prometheus scraping and a /health endpoint that checks Deepgram API reachability
- Create a Grafana dashboard showing request rate, P95 latency, active connections, and estimated daily cost
- Configure AlertManager rules to notify on >5% error rate, P95 latency >30s, frequent rate limits, cost spikes, or no requests for 15 minutes
- Use OpenTelemetry traces to correlate Deepgram processing time with upstream service latency

## FAQ

**What backends are required?**

You need a Prometheus-compatible metrics backend, an OTLP-compatible trace collector, and a dashboard/alerting system such as Grafana and AlertManager.

**How do I avoid high metric cardinality?**

Limit label values to finite sets (model, type, status). Avoid user IDs or long identifiers as labels; include them in logs or span attributes instead.
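For example, keep label values to small fixed sets and push per-request identifiers into logs or span attributes. The metric name and fields below are illustrative:

```typescript
import { Counter } from 'prom-client';
import { logger } from './lib/logger';

// Good: each label has only a handful of possible values
const requestsByOutcome = new Counter({
  name: 'deepgram_requests_by_outcome_total', // hypothetical metric
  help: 'Requests by outcome and model',
  labelNames: ['status', 'model'],
});

requestsByOutcome.labels('success', 'nova-2').inc();

// Bad: a per-request ID as a label creates one time series per request
// requestsByOutcome.labels('success', requestId).inc();

// Instead, record the identifier in a structured log or span attribute
logger.info({ requestId: 'req_123' }, 'Transcription completed');
```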