data-processing skill

This skill helps you parse, filter, and transform JSON, YAML, and TOML data from the command line using jq and yq.

npx playbooks add skill 0xdarkmatter/claude-mods --skill data-processing

SKILL.md
---
name: data-processing
description: "Process JSON with jq and YAML/TOML with yq. Filter, transform, query structured data efficiently. Triggers on: parse JSON, extract from YAML, query config, Docker Compose, K8s manifests, GitHub Actions workflows, package.json, filter data."
compatibility: "Requires jq and yq CLI tools; yq examples use the syntax of Mike Farah's Go-based yq v4. Install: brew install jq yq (macOS)."
allowed-tools: "Bash Read"
---

# Data Processing

Query, filter, and transform structured data (JSON, YAML, TOML) efficiently from the command line.

## Tools

| Tool | Command | Use For |
|------|---------|---------|
| jq | `jq '.key' file.json` | JSON processing |
| yq | `yq '.key' file.yaml` | YAML/TOML processing |

## jq Essentials

```bash
# Extract single field
jq '.name' package.json

# Extract nested field
jq '.scripts.build' package.json

# Extract from array (.dependencies is an object, so index an array field such as .keywords)
jq '.keywords[0]' package.json

# Extract multiple fields
jq '{name, version}' package.json

# Navigate deeply nested
jq '.data.users[0].profile.email' response.json

# Filter by condition
jq '.users[] | select(.active == true)' data.json

# Transform each element
jq '.users | map({id, name})' data.json

# Count elements
jq '.users | length' data.json

# Raw string output
jq -r '.name' package.json
```
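
To see how these filters combine, here is a minimal sketch run against a small, made-up dataset. The file contents and variable names (`data_file`, `user_count`, and so on) are assumptions for illustration only.

```bash
# Hypothetical sample data, written to a temp file so the commands run anywhere.
data_file="$(mktemp)"
cat > "$data_file" <<'EOF'
{
  "users": [
    {"id": 1, "name": "ada", "active": true},
    {"id": 2, "name": "bob", "active": false},
    {"id": 3, "name": "cy", "active": true}
  ]
}
EOF

# Count all users.
user_count="$(jq '.users | length' "$data_file")"

# Names of active users, one per line, without JSON quotes (-r).
active_names="$(jq -r '.users[] | select(.active == true) | .name' "$data_file")"

# Compact (-c) array of {id, name} objects for active users only.
active_summary="$(jq -c '[.users[] | select(.active) | {id, name}]' "$data_file")"

echo "count: $user_count"
echo "active: $active_names"
echo "summary: $active_summary"

rm -f "$data_file"
```

Wrapping the stream in `[...]` collects the filtered objects back into one array, which is usually easier for a downstream tool to consume than a sequence of separate JSON values.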

## yq Essentials

```bash
# Extract field
yq '.name' config.yaml

# Extract nested
yq '.services.web.image' docker-compose.yml

# List all keys
yq 'keys' config.yaml

# List all service names (Docker Compose)
yq '.services | keys' docker-compose.yml

# Get container images (K8s)
yq '.spec.template.spec.containers[].image' deployment.yaml

# Update value (in-place)
yq -i '.version = "2.0.0"' config.yaml

# TOML to JSON
yq -p toml -o json '.' config.toml
```

## Quick Reference

| Task | jq | yq |
|------|----|----|
| Get field | `jq '.key'` | `yq '.key'` |
| Array element | `jq '.[0]'` | `yq '.[0]'` |
| Filter array | `jq '.[] \| select(.x)'` | `yq '.[] \| select(.x)'` |
| Transform | `jq 'map(.x)'` | `yq 'map(.x)'` |
| Count | `jq 'length'` | `yq 'length'` |
| Keys | `jq 'keys'` | `yq 'keys'` |
| Pretty print | `jq '.'` | `yq '.'` |
| Compact | `jq -c` | `yq -o json -I0` |
| Raw output | `jq -r` | `yq -r` |
| In-place edit | - | `yq -i` |
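
jq has no in-place flag, which is why that cell is empty. A common workaround, sketched here with a hypothetical `package.json` in a scratch directory, is to write to a temporary file and then rename it over the original.

```bash
# Scratch directory with a hypothetical package.json.
work_dir="$(mktemp -d)"
printf '{"name":"demo","version":"1.0.0"}\n' > "$work_dir/package.json"

# jq only writes to stdout. Redirecting straight back onto the input file
# would truncate it before jq reads it, so go through a temp file instead.
# The temp file lives in the same directory so that mv is a plain rename.
tmp_file="$(mktemp "$work_dir/package.json.XXXXXX")"
jq '.version = "2.0.0"' "$work_dir/package.json" > "$tmp_file" \
  && mv "$tmp_file" "$work_dir/package.json"

new_version="$(jq -r '.version' "$work_dir/package.json")"
echo "version is now $new_version"

rm -rf "$work_dir"
```

Because of the `&&`, the original file is replaced only when jq exits successfully, so a typo in the filter leaves it untouched.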

## When to Use

- Reading package.json dependencies
- Parsing Docker Compose configurations
- Analyzing Kubernetes manifests
- Processing GitHub Actions workflows
- Extracting data from API responses
- Filtering large JSON datasets
- Config file manipulation
- Data format conversion

## Additional Resources

For complete pattern libraries, load:

- `./references/jq-patterns.md` - Arrays, filtering, transformation, aggregation, output formatting
- `./references/yq-patterns.md` - Docker Compose, K8s, GitHub Actions, TOML, YAML modification
- `./references/config-files.md` - package.json, tsconfig, eslint/prettier patterns

Overview

This skill provides command-line data-processing patterns using jq for JSON and yq for YAML/TOML. It helps you filter, extract, transform, and convert structured data quickly. The focus is practical, repeatable commands for developer workflows like config inspection, manifest analysis, and automation.

How this skill works

The skill catalogs concise jq and yq examples and idioms for common tasks: extracting fields, filtering arrays, mapping/transforming items, counting elements, and converting formats. It highlights in-place edits with yq, raw output flags, and compact vs. pretty printing to fit scripts and CI. Use the snippets directly in shell pipelines to parse API responses, configuration files, Docker Compose, Kubernetes manifests, and package metadata.

When to use it

  • Inspect package.json or package-lock.json to list scripts, dependencies, or versions
  • Query Docker Compose or Kubernetes manifests to list services and container images
  • Filter large JSON API responses to extract only required records for downstream processing
  • Automate config updates (version bumps, field edits) in CI or local scripts with in-place yq edits
  • Convert TOML to JSON for tools that only accept JSON
  • Quickly count, map, or aggregate items in logs or exported datasets

Best practices

  • Prefer -r (raw) for string output in scripts to avoid JSON quotes
  • Check structure with keys, length, and select() before transforming; jq returns null for a missing key instead of raising an error, so mistakes can pass silently
  • Test jq/yq expressions on small samples before running on full files or in CI
  • Chain simple filters rather than long monolithic expressions for readability and maintainability
  • Commit small, reversible config edits; use version control or CI dry-run steps when applying in-place changes
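
One way to apply the structure-check advice is a guard built on `jq -e`, which exits non-zero when the last output is `false` or `null`. This is a sketch under assumed data: the sample documents and the names `resp_file`, `names`, and `guard_result` are hypothetical.

```bash
# Hypothetical API response saved to a temp file.
resp_file="$(mktemp)"
printf '{"users":[{"id":1,"name":"ada"},{"id":2,"name":"bob"}]}\n' > "$resp_file"

# Guard: .users must be a non-empty array before we transform it.
shape_check='.users | type == "array" and length > 0'

if jq -e "$shape_check" "$resp_file" > /dev/null; then
  names="$(jq -r '.users | map(.name) | join(",")' "$resp_file")"
  echo "names: $names"
else
  echo "unexpected shape: .users is missing or empty" >&2
fi

# The same guard rejects a document that has no .users key.
if printf '{"items":[]}\n' | jq -e "$shape_check" > /dev/null; then
  guard_result="passed"
else
  guard_result="rejected"
fi
echo "guard on malformed input: $guard_result"

rm -f "$resp_file"
```

Without the guard, the transform on the second document would fail or print `null` deep inside a pipeline, where the cause is harder to trace.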

Example use cases

  • List all container images in a Kubernetes deployment for inventory: yq '.spec.template.spec.containers[].image' deployment.yaml
  • Extract build script and version from package.json for release automation: jq '{name, version, build: .scripts.build}' package.json
  • Filter active users from a large API dump: jq '.users[] | select(.active==true)' response.json
  • Update a config value in-place: yq -i '.version = "2.0.0"' config.yaml
  • Convert config.toml to JSON for a tool that only accepts JSON: yq -p toml -o json '.' config.toml

FAQ

When should I prefer yq over jq?

Use yq for YAML and TOML files or when you need in-place edits; use jq for pure JSON workflows and advanced JSON-specific operations.

How do I avoid breaking CI when changing configs in-place?

Run expressions locally against a copy, use version control to track changes, and add a CI dry-run that prints the result instead of applying edits.