
reporting-pipelines skill

/universal/data/reporting-pipelines

This skill generates timestamped CSV/JSON/markdown reports from analytics pipelines (such as GitFlow analytics), with summaries and post-processing.

npx playbooks add skill bobmatnyc/claude-mpm-skills --skill reporting-pipelines

Review the files below or copy the command above to add this skill to your agents.

SKILL.md
---
name: reporting-pipelines
description: Reporting pipelines for CSV/JSON/Markdown exports with timestamped outputs, summaries, and post-processing.
version: 1.0.0
category: universal
author: Claude MPM Team
license: MIT
progressive_disclosure:
  entry_point:
    summary: "Generate CSV/JSON/markdown reports with timestamped filenames and summary outputs."
    when_to_use: "Building reporting flows, exporting analytics results, or standardizing CSV/JSON/markdown outputs across projects."
    quick_start: "1. Run the CLI that produces base data. 2. Export CSV/JSON/markdown with timestamps. 3. Save outputs to reports/."
tags:
  - reporting
  - csv
  - json
  - markdown
  - analytics
---

# Reporting Pipelines

## Overview

Your reporting pattern is consistent across repos: run a CLI or script that emits structured data, then export CSV/JSON/markdown reports with timestamped filenames into `reports/` or `tests/results/`.

## GitFlow Analytics Pattern

```bash
# Basic run
gitflow-analytics -c config.yaml --weeks 8 --output ./reports

# Explicit analyze + CSV
gitflow-analytics analyze -c config.yaml --weeks 12 --output ./reports --generate-csv
```

Outputs include CSV + markdown narrative reports with date suffixes.

## EDGAR CSV Export Pattern

`edgar/scripts/create_csv_reports.py` reads a JSON results file and emits:

- `executive_compensation_<timestamp>.csv`
- `top_25_executives_<timestamp>.csv`
- `company_summary_<timestamp>.csv`

This script uses pandas for sorting and percentile calculations.
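
A minimal sketch of this kind of post-processing, assuming the results JSON contains a list of executive records; the input path and field names (`total_compensation`, `company`) are illustrative, not the script's actual schema:

```python
# Hypothetical sketch of an EDGAR-style export; field names are
# illustrative, not the real script's schema.
import json
from datetime import datetime

import pandas as pd

timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")

with open("results.json") as f:
    records = json.load(f)["executives"]  # assumed structure

df = pd.DataFrame(records)

# Sort by total compensation and add a percentile-rank column.
df = df.sort_values("total_compensation", ascending=False)
df["percentile"] = df["total_compensation"].rank(pct=True) * 100

df.to_csv(f"executive_compensation_{timestamp}.csv", index=False)
df.head(25).to_csv(f"top_25_executives_{timestamp}.csv", index=False)

# Per-company roll-up for the summary CSV.
summary = df.groupby("company")["total_compensation"].agg(["count", "median", "max"])
summary.to_csv(f"company_summary_{timestamp}.csv")
```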

## Standard Pipeline Steps

1. **Collect base data** (CLI or JSON artifacts)
2. **Normalize** into rows/records
3. **Export** CSV/JSON/markdown with timestamp suffixes
4. **Summarize** key metrics in stdout
5. **Store** outputs in `reports/` or `tests/results/`
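
A stdlib-only sketch of these five steps; the artifact path and record fields are placeholders:

```python
# Minimal end-to-end sketch of the five pipeline steps.
# Input path and field names are placeholders, not a real schema.
import csv
import json
from datetime import datetime
from pathlib import Path

# 1. Collect base data (here: a JSON artifact from an earlier CLI run).
raw = json.loads(Path("artifacts/results.json").read_text())

# 2. Normalize into flat rows.
rows = [{"name": r["name"], "score": r["score"]} for r in raw["items"]]

# 3. Export CSV/JSON/markdown with a timestamp suffix.
out_dir = Path("reports")
out_dir.mkdir(exist_ok=True)
stamp = datetime.now().strftime("%Y%m%d_%H%M%S")

csv_path = out_dir / f"comprehensive_export_{stamp}.csv"
with csv_path.open("w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["name", "score"])
    writer.writeheader()
    writer.writerows(rows)
(out_dir / f"comprehensive_export_{stamp}.json").write_text(json.dumps(rows, indent=2))
(out_dir / f"narrative_report_{stamp}.md").write_text(
    f"# Report {stamp}\n\nRows exported: {len(rows)}\n"
)

# 4. Summarize key metrics on stdout.
print(f"Exported {len(rows)} rows -> {csv_path}")

# 5. Outputs are already stored under reports/.
```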

## Naming Conventions

- Use `YYYYMMDD` or `YYYYMMDD_HHMMSS` suffixes
- Keep one output directory per repo (`reports/` or `tests/results/`)
- Prefer explicit prefixes (e.g., `narrative_report_`, `comprehensive_export_`)
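
One small helper that applies these conventions (names are illustrative):

```python
# Illustrative helper for prefix + timestamp naming.
from datetime import datetime
from pathlib import Path

def timestamped_path(out_dir: str, prefix: str, ext: str = "csv") -> Path:
    """Build e.g. reports/narrative_report_20240131_142500.csv."""
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    return Path(out_dir) / f"{prefix}{stamp}.{ext}"

print(timestamped_path("reports", "narrative_report_"))
```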

## Troubleshooting

- **Missing output**: ensure output directory exists and is writable.
- **Large CSVs**: filter or aggregate before export; keep summary CSVs for quick review.
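
For the large-CSV case, a hedged pandas sketch that writes an aggregated summary instead of the full export (input path and column names are placeholders):

```python
# Sketch: aggregate before export so reviewers get a small summary CSV.
# Column names ("category", "value") are placeholders.
from datetime import datetime

import pandas as pd

stamp = datetime.now().strftime("%Y%m%d")
df = pd.read_json("artifacts/results.json")  # assumed flat list of records

summary = (
    df.groupby("category")["value"]
    .agg(["count", "mean", "sum"])
    .reset_index()
)
summary.to_csv(f"reports/summary_{stamp}.csv", index=False)
```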

## Related Skills

- `universal/data/sec-edgar-pipeline`
- `toolchains/universal/infrastructure/github-actions`

Overview

This skill provides a set of reporting pipelines that export structured project outputs as CSV, JSON, and Markdown files with timestamped filenames, summaries, and optional post-processing. It standardizes a simple flow: collect, normalize, export, summarize, and store outputs in a single reports directory. The goal is repeatable, timestamped exports suitable for auditing and downstream processing.

How this skill works

Pipelines ingest CLI or JSON artifacts, normalize data into rows/records (often using pandas), and write exports to a designated output folder with YYYYMMDD or YYYYMMDD_HHMMSS suffixes. Each run emits CSV/JSON/Markdown files plus a short stdout summary and supports optional post-processing steps like sorting, percentile calculations, and aggregation. Filenames use explicit prefixes (e.g., narrative_report_, executive_compensation_) to make outputs discoverable.

When to use it

  • Produce repeatable, timestamped exports for analytics or auditing.
  • Generate human-readable narrative reports alongside raw CSV/JSON outputs.
  • Store results in a single repository folder for CI or nightly jobs.
  • Create summary CSVs for quick review of large datasets.

Best practices

  • Write outputs into a dedicated directory (reports/ or tests/results/) and ensure it is writable before the run.
  • Use YYYYMMDD or YYYYMMDD_HHMMSS suffixes to avoid name collisions and enable easy sorting.
  • Include explicit prefixes to signal report intent (narrative_report_, comprehensive_export_, etc.).
  • Aggregate or filter large datasets before exporting full CSVs; keep summary CSVs for fast inspection.
  • Log a concise stdout summary listing key metrics and generated filenames for traceability.
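
The last point can be as simple as printing metrics and generated paths at the end of a run; the values below are placeholders:

```python
# Sketch: concise stdout summary for traceability (placeholder values).
generated = [
    "reports/narrative_report_20240131.md",
    "reports/comprehensive_export_20240131.csv",
]
metrics = {"rows": 1240, "companies": 37}

print("Run summary:")
for key, value in metrics.items():
    print(f"  {key}: {value}")
print("Generated files:")
for path in generated:
    print(f"  {path}")
```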

Example use cases

  • Run a GitFlow analytics job that produces CSVs and Markdown narratives per release window with date suffixes.
  • Post-process EDGAR JSON outputs into multiple CSVs (executive compensation, top executives, company summaries) with timestamped filenames.
  • Nightly CI step that normalizes test artifacts and writes JSON/CSV snapshots to reports/ for downstream data pipelines.
  • Ad-hoc script that aggregates large logs into condensed summary CSVs and a narrative Markdown for stakeholders.

FAQ

What timestamp formats should I use?

Prefer YYYYMMDD or YYYYMMDD_HHMMSS so files sort chronologically and remain filesystem-friendly.

Where should I store outputs in my repo?

Use a single, dedicated folder such as reports/ or tests/results/ to keep exports organized and discoverable.