
datadog-cli skill


This skill helps you debug production issues by querying logs, metrics, and traces with the Datadog CLI.

npx playbooks add skill softaworks/agent-toolkit --skill datadog-cli

Run the command above to add this skill to your agents.

---
name: datadog-cli
description: Datadog CLI for searching logs, querying metrics, tracing requests, and managing dashboards. Use this when debugging production issues or working with Datadog observability.
---

# Datadog CLI

A CLI tool for AI agents to debug and triage using Datadog logs and metrics.

## Required Reading

**You MUST read the relevant reference docs before using any command:**
- [Log Commands](references/logs-commands.md)
- [Metrics](references/metrics.md)
- [Query Syntax](references/query-syntax.md)
- [Workflows](references/workflows.md)
- [Dashboards](references/dashboards.md)

## Setup

### Environment Variables (Required)

```bash
export DD_API_KEY="your-api-key"
export DD_APP_KEY="your-app-key"
```

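Get API keys from: https://app.datadoghq.com/organization-settings/api-keys

Application keys are managed on a separate page: https://app.datadoghq.com/organization-settings/application-keys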
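
Since every command depends on both variables, a preflight check can catch a missing key before any request is made. The following is a minimal sketch; `dd_check_env` is a hypothetical helper name and is not part of the CLI.

```bash
# Preflight check: fail fast if either required key is missing.
# dd_check_env is a hypothetical helper, not something the CLI provides.
dd_check_env() {
  local missing=0 name
  for name in DD_API_KEY DD_APP_KEY; do
    # ${!name} is bash indirect expansion: the value of the variable named by $name.
    if [ -z "${!name:-}" ]; then
      echo "missing environment variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

It could be chained in front of a command, for example `dd_check_env && npx @leoflores/datadog-cli errors --from 1h --pretty`.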

### Running the CLI

```bash
npx @leoflores/datadog-cli <command>
```

For non-US Datadog sites, use the `--site` flag:
```bash
npx @leoflores/datadog-cli logs search --query "*" --site datadoghq.eu
```
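
To avoid repeating `--site` on every call, a small shell wrapper can append it. This is only a sketch: `ddcli` and the `DD_CLI_SITE` variable are names invented here, and the CLI itself is not documented above to read any such variable.

```bash
# Hypothetical wrapper around the CLI.
# DD_CLI_SITE is a variable name made up for this sketch; when it is set,
# the wrapper appends the documented --site flag to the command.
ddcli() {
  if [ -n "${DD_CLI_SITE:-}" ]; then
    npx @leoflores/datadog-cli "$@" --site "$DD_CLI_SITE"
  else
    npx @leoflores/datadog-cli "$@"
  fi
}
```

With `export DD_CLI_SITE=datadoghq.eu`, calling `ddcli logs search --query "*"` would run the same command as the example above.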

## Commands Overview

| Command | Description |
|---------|-------------|
| `logs search` | Search logs with filters |
| `logs tail` | Stream logs in real-time |
| `logs trace` | Find logs for a distributed trace |
| `logs context` | Get logs before/after a timestamp |
| `logs patterns` | Group similar log messages |
| `logs compare` | Compare log counts between periods |
| `logs multi` | Run multiple queries in parallel |
| `logs agg` | Aggregate logs by facet |
| `metrics query` | Query timeseries metrics |
| `errors` | Quick error summary by service/type |
| `services` | List services with log activity |
| `dashboards` | Manage dashboards (CRUD) |
| `dashboard-lists` | Manage dashboard lists |


## Quick Examples

### Search Errors
```bash
npx @leoflores/datadog-cli logs search --query "status:error" --from 1h --pretty
```

### Tail Logs (Real-time)
```bash
npx @leoflores/datadog-cli logs tail --query "service:api status:error" --pretty
```

### Error Summary
```bash
npx @leoflores/datadog-cli errors --from 1h --pretty
```

### Trace Correlation
```bash
npx @leoflores/datadog-cli logs trace --id "abc123def456" --pretty
```

### Query Metrics
```bash
npx @leoflores/datadog-cli metrics query --query "avg:system.cpu.user{*}" --from 1h --pretty
```

### Compare Periods
```bash
npx @leoflores/datadog-cli logs compare --query "status:error" --period 1h --pretty
```

## Global Flags

| Flag | Description |
|------|-------------|
| `--pretty` | Human-readable output with colors |
| `--output <file>` | Export results to JSON file |
| `--site <site>` | Datadog site (e.g., `datadoghq.eu`) |

## Time Formats

- **Relative**: `30m`, `1h`, `6h`, `24h`, `7d`
- **ISO 8601**: `2024-01-15T10:30:00Z`
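
A script that accepts a time value from a user may want to check its shape before passing it on. The sketch below classifies a value as one of the two formats; the patterns are inferred only from the examples listed here (the CLI may accept more forms), and `time_format` is a hypothetical helper name.

```bash
# Classify a time value as "relative" (e.g. 30m, 1h, 7d) or "iso8601"
# (e.g. 2024-01-15T10:30:00Z). Patterns are inferred from the examples in
# this document, not taken from the CLI's own parser.
time_format() {
  local value="$1"
  if [[ "$value" =~ ^[0-9]+[mhd]$ ]]; then
    echo relative
  elif [[ "$value" =~ ^[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}Z$ ]]; then
    echo iso8601
  else
    echo unknown
    return 1
  fi
}
```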

## Incident Triage Workflow

```bash
# 1. Quick error overview
npx @leoflores/datadog-cli errors --from 1h --pretty

# 2. Is this new? Compare to previous period
npx @leoflores/datadog-cli logs compare --query "status:error" --period 1h --pretty

# 3. Find error patterns
npx @leoflores/datadog-cli logs patterns --query "status:error" --from 1h --pretty

# 4. Narrow down by service
npx @leoflores/datadog-cli logs search --query "status:error service:api" --from 1h --pretty

# 5. Get context around a timestamp
npx @leoflores/datadog-cli logs context --timestamp "2024-01-15T10:30:00Z" --service api --pretty

# 6. Follow the distributed trace
npx @leoflores/datadog-cli logs trace --id "TRACE_ID" --pretty
```

See [workflows.md](references/workflows.md) for more debugging workflows.
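
The first three steps of the workflow above can also be captured as files for a postmortem. The following is a rough sketch, assuming `--output` combines with these commands as the Global Flags table suggests; `triage_snapshot`, its arguments, and the file names are hypothetical.

```bash
# Hypothetical helper: saves the overview, comparison, and pattern steps
# of the triage workflow as JSON files. It only combines commands and flags
# shown in this document.
triage_snapshot() {
  local window="${1:-1h}" out_dir="${2:-triage}"
  mkdir -p "$out_dir" || return 1
  npx @leoflores/datadog-cli errors --from "$window" \
    --output "$out_dir/errors.json"
  npx @leoflores/datadog-cli logs compare --query "status:error" --period "$window" \
    --output "$out_dir/compare.json"
  npx @leoflores/datadog-cli logs patterns --query "status:error" --from "$window" \
    --output "$out_dir/patterns.json"
}
```

For example, `triage_snapshot 6h incident-files` would write three JSON files under `incident-files/`.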

Overview

This skill provides a Datadog CLI that helps search logs, query metrics, trace requests, and manage dashboards from the command line. It is designed for fast production debugging and observability workflows, giving agents and engineers a compact toolset for triage and investigation. Use it to gather context, correlate traces, and export results for reporting or automation.

How this skill works

The CLI issues Datadog API queries using the DD_API_KEY and DD_APP_KEY environment variables and supports site selection for non-US regions. Commands cover log search, real-time tailing, trace correlation, metric timeseries queries, error summaries, and CRUD operations for dashboards and lists. Global flags control human-readable output, JSON export, and site selection. Time ranges are set per command with flags such as --from and --period, and specialized commands support aggregation, pattern detection, and period comparison.

When to use it

  • Triage production incidents to quickly identify error spikes and affected services.
  • Correlate distributed traces with log entries to follow request flows across services.
  • Run ad-hoc metric queries when investigating resource or latency anomalies.
  • Stream real-time logs during a deploy or live debug session.
  • Generate JSON exports of queries for postmortems, reporting, or automated pipelines.

Best practices

  • Set DD_API_KEY and DD_APP_KEY in a secure environment before running commands.
  • Start with the errors overview and period comparison to determine scope and novelty.
  • Use narrow queries (service, status, trace id) and then broaden as needed to avoid noise.
  • Prefer --pretty for interactive work and --output for reproducible artifacts or automation.
  • Read the query syntax and logs/metrics reference before crafting complex queries.
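
The choice between --pretty and --output can be automated by checking whether standard output is a terminal. This is a sketch only; ddcli_auto is a hypothetical wrapper name, and the first argument is the JSON file to write in the non-interactive case.

```bash
# Hypothetical wrapper: use --pretty when stdout is a terminal,
# otherwise export JSON to the given file with --output.
ddcli_auto() {
  local file="$1"
  shift
  if [ -t 1 ]; then
    # Interactive session: colored, human-readable output.
    npx @leoflores/datadog-cli "$@" --pretty
  else
    # Piped or redirected (scripts, CI): keep a reproducible artifact.
    npx @leoflores/datadog-cli "$@" --output "$file"
  fi
}
```

For example, ddcli_auto errors.json errors --from 1h prints a readable summary in a terminal and writes errors.json when run from a script whose output is redirected.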

Example use cases

  • Quickly list services with recent error activity and drill down to the top offenders.
  • Tail API logs in real time while reproducing a bug on a staging or production replica.
  • Find all log events related to a distributed trace id and collect surrounding context.
  • Compare error counts between the last hour and the previous hour to detect regressions.
  • Query avg:system.cpu.user over the last 1h to confirm a CPU spike coincides with increased errors.
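
The drill-down in the first use case can be scripted as a loop over service names. In this sketch, errors_by_service is a hypothetical helper and the service names are supplied by hand; it does not parse the output of the services command, whose format is not described here.

```bash
# Hypothetical helper: run the documented error search once per service.
# Usage: errors_by_service <window> <service>...
errors_by_service() {
  local window="$1" svc
  shift
  for svc in "$@"; do
    echo "== $svc =="
    npx @leoflores/datadog-cli logs search \
      --query "status:error service:$svc" --from "$window" --pretty
  done
}
```

For example, errors_by_service 1h api checkout searches the last hour of error logs for two services named api and checkout.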

FAQ

What environment variables are required?

Set DD_API_KEY and DD_APP_KEY to authenticate with the Datadog API.

How do I target Datadog EU or other non-US sites?

Use the --site flag (for example --site datadoghq.eu) to point commands at a different Datadog site.