synthesize skill

This skill synthesizes content across multiple documents to reveal cross-document insights, compare sources, and generate coherent reports from a Collection.

npx playbooks add skill bdambrosio/cognitive_workbench --skill synthesize

Review the files below or copy the command above to add this skill to your agents.

Skill.md

---
name: synthesize
type: python
flattens_collections: true
description: "Integrate content across multiple documents to produce new understanding. Use for cross-document synthesis, comparison, and reporting from Collections."
---

# synthesize

Integrate content across multiple documents (or between two documents) to produce
new understanding. Always crosses the document boundary — this is the tool for
combining, comparing, and generating insight from a Collection.

## Input

- `target`: Collection (variable or ID) — the primary input. May also be a single Note
  when `other` is provided for two-input comparison.
- `other`: Optional second Note or Collection for explicit comparison. When provided,
  the operation compares target against other.
- `focus`: Optional string guiding what to attend to
  ("architectural improvements", "methodology differences", "emerging trends")
- `format`: Output format (optional, default: `"narrative"`):
  - `"narrative"`: Prose synthesis
  - `"comparison"`: Structured JSON with similarity_score, shared_themes,
    unique_to_first, unique_to_second, contradictions, relationship, summary
  - `"executive"`: High-level overview, 300-500 words
  - `"technical"`: Balanced detail with compression
  - `"comprehensive"`: Low compression, preserves nuance
- `compression_ratio`: Optional float (default: 3.0). Controls output length relative
  to input. Only meaningful for narrative/technical/comprehensive formats.
- `instruction`: Optional free-form instruction for specialized synthesis tasks.
  Overrides format-specific defaults when provided.
- `out`: Variable name for resulting Note

## Output

Success (`status: "success"`):
- `value`: Synthesized content as a new Note.
  - For `format="narrative"` / `"executive"` / `"technical"` / `"comprehensive"`: prose text
  - For `format="comparison"`: JSON string with structure:
    `{"similarity_score": 0.75, "shared_themes": [...], "unique_to_first": [...], "unique_to_second": [...], "contradictions": [...], "relationship": "...", "summary": "..."}`

Failure (`status: "failed"`):
- `reason`: `"target parameter required"` | `"target is empty"` |
  `"llm_generate_failed"` | `"comparison format requires 'other' parameter"`
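
A minimal sketch of how a caller might consume a `format="comparison"` result. It assumes the skill result arrives as a Python dict with `status`, `value`, and `reason` keys, and that `value` holds the JSON string described above. `parse_comparison` and the sample data are hypothetical, not part of the workbench API.

```python
import json

# Keys listed in the comparison structure documented above.
EXPECTED_KEYS = {
    "similarity_score", "shared_themes", "unique_to_first",
    "unique_to_second", "contradictions", "relationship", "summary",
}

def parse_comparison(result: dict) -> dict:
    """Return the parsed comparison payload, or raise if the step failed."""
    if result.get("status") != "success":
        raise RuntimeError(f"synthesize failed: {result.get('reason', 'unknown')}")
    payload = json.loads(result["value"])  # value is a JSON string, not a dict
    missing = EXPECTED_KEYS - payload.keys()
    if missing:
        raise ValueError(f"comparison output missing keys: {sorted(missing)}")
    return payload

# Illustrative result only; real values come from the skill.
sample = {
    "status": "success",
    "value": json.dumps({
        "similarity_score": 0.75,
        "shared_themes": ["attention"],
        "unique_to_first": ["sparse kernels"],
        "unique_to_second": ["linear attention"],
        "contradictions": [],
        "relationship": "complementary",
        "summary": "Both papers refine attention; they differ in approach.",
    }),
}
comparison = parse_comparison(sample)
print(comparison["shared_themes"])
```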

## Behavior

- Flattens Collection items, applies focus filtering if `focus` provided
- Uses hierarchical map-reduce for long inputs (auto-chunking at ~16k chars)
- Focus filtering applies relevance threshold — chunks below threshold excluded
- When `other` is provided: both inputs are processed, then compared/integrated
- When `format="comparison"` and `other` is NOT provided: fails with error
- Output may include observations, patterns, and integrative conclusions not
  present in any single input document — this is by design
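
The hierarchical map-reduce step can be pictured roughly as follows. This is a sketch under stated assumptions, not the skill's implementation: `summarize` stands in for the LLM call, chunks are cut at fixed character offsets, and the 16,000-character limit mirrors the approximate figure above.

```python
from typing import Callable, List

CHUNK_CHARS = 16_000  # approximate chunk size taken from the Behavior notes

def chunk_text(text: str, size: int = CHUNK_CHARS) -> List[str]:
    """Split text into consecutive pieces of at most `size` characters."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def hierarchical_synthesize(
    text: str,
    summarize: Callable[[str], str],  # hypothetical stand-in for the LLM call
    size: int = CHUNK_CHARS,
) -> str:
    """Summarize chunks (map), join the partials, and repeat until one chunk remains (reduce)."""
    if len(text) <= size:
        return summarize(text)
    partials = [summarize(chunk) for chunk in chunk_text(text, size)]
    combined = "\n\n".join(partials)
    if len(combined) >= len(text):
        # Summaries did not shrink the input; stop recursing so the loop terminates.
        return summarize(combined)
    return hierarchical_synthesize(combined, summarize, size)
```

Each level shrinks the text before the next pass, so the final call always sees an input that fits in a single chunk, unless the summaries fail to shrink it, in which case the guard ends the recursion early.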

## Planning Notes

**Use `synthesize` when:**
- Identifying themes and trends across a Collection of papers
- Comparing two documents or Collections
- Producing a report from multiple sources
- Aggregating per-item extractions into a coherent narrative

**Do NOT use `synthesize` when:**
- Extracting content from a single document → use `extract`
- Creating content with no source material → use `generate-note`
- Filtering or selecting items → use `filter-structured` or `filter-semantic`
- Structural operations on Collections → use `project`, `sort`, `head`, etc.

**Standard analytical pipeline:**
1. `map(extract)` — per-item fact extraction
2. `synthesize` — cross-item integration
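
The shape of this two-stage pipeline can be illustrated with plain Python stand-ins. The functions below are toy placeholders for `extract` and `synthesize`, chosen only to show the data flow; they are not the workbench primitives.

```python
from typing import Callable, List

def run_pipeline(
    items: List[str],
    extract: Callable[[str], str],
    synthesize: Callable[[List[str]], str],
) -> str:
    """Stage 1 runs once per item; stage 2 runs once across all extractions."""
    extracted = [extract(item) for item in items]  # map(extract)
    return synthesize(extracted)                   # synthesize

def first_sentence(doc: str) -> str:
    """Toy extractor: keep only the first sentence."""
    return doc.split(".")[0].strip()

def join_facts(facts: List[str]) -> str:
    """Toy synthesizer: concatenate the extracted facts."""
    return "; ".join(facts)

docs = [
    "Model A uses sparse attention. It was trained on code.",
    "Model B uses linear attention. It targets long inputs.",
]
report = run_pipeline(docs, first_sentence, join_facts)
print(report)
```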

**For comparison:** use `format="comparison"` with `other=` (requires two inputs)

## Examples

```json
{"type":"synthesize","target":"$papers","focus":"significant architectural improvements","format":"technical","out":"$report"}
{"type":"synthesize","target":"$paper_a","other":"$paper_b","format":"comparison","instruction":"focus on methodology differences","out":"$comparison"}
{"type":"synthesize","target":"$innovations","focus":"dominant trends","format":"executive","out":"$executive_summary"}
{"type":"synthesize","target":"$extracted_methods","focus":"how attention mechanisms have evolved","format":"narrative","compression_ratio":2.0,"out":"$attention_report"}
```
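
Steps like these could also be assembled and checked in code before submission. The helper below is a hypothetical sketch: `build_synthesize_step` is not part of the skill, and it only enforces the constraints stated in the Input and Behavior sections.

```python
import json
from typing import Optional

FORMATS = {"narrative", "comparison", "executive", "technical", "comprehensive"}

def build_synthesize_step(
    target: str,
    out: str,
    *,
    other: Optional[str] = None,
    focus: Optional[str] = None,
    fmt: str = "narrative",
    compression_ratio: Optional[float] = None,
    instruction: Optional[str] = None,
) -> dict:
    """Build a synthesize step dict, rejecting combinations the skill would fail on."""
    if not target:
        raise ValueError("target parameter required")
    if fmt not in FORMATS:
        raise ValueError(f"unknown format: {fmt}")
    if fmt == "comparison" and other is None:
        raise ValueError("comparison format requires 'other' parameter")
    step = {"type": "synthesize", "target": target, "format": fmt, "out": out}
    optional = {
        "other": other,
        "focus": focus,
        "compression_ratio": compression_ratio,
        "instruction": instruction,
    }
    # Only include optional parameters that were actually set.
    step.update({key: value for key, value in optional.items() if value is not None})
    return step

step = build_synthesize_step("$paper_a", "$comparison", other="$paper_b", fmt="comparison")
print(json.dumps(step))
```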

Overview

This skill integrates content across multiple documents or between two documents to produce new understanding. It combines, compares, and distills information from Collections to surface themes, contrasts, and integrative conclusions. Use it to create narratives, structured comparisons, or executive and technical summaries from aggregated source material.

How this skill works

The skill flattens Collection items and, when a focus is given, drops chunks that fall below a relevance threshold. Long inputs are processed with hierarchical map-reduce, auto-chunked at roughly 16k characters. When a second Note or Collection is supplied as other, both inputs are processed and then compared or integrated. The output format (narrative, comparison as JSON, executive, technical, or comprehensive) controls structure and compression, and a custom instruction overrides the format-specific defaults.
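
Focus filtering can be pictured as scoring each chunk against the focus string and dropping low scorers. The sketch below is an assumption-heavy illustration: it scores by simple word overlap, whereas the skill's actual relevance measure is not documented here, and the 0.5 threshold is a made-up value.

```python
import re
from typing import List

def relevance(chunk: str, focus: str) -> float:
    """Fraction of focus words that also appear in the chunk (toy metric)."""
    focus_terms = set(re.findall(r"\w+", focus.lower()))
    if not focus_terms:
        return 1.0  # no focus given: treat every chunk as relevant
    chunk_terms = set(re.findall(r"\w+", chunk.lower()))
    return len(focus_terms & chunk_terms) / len(focus_terms)

def filter_by_focus(chunks: List[str], focus: str, threshold: float = 0.5) -> List[str]:
    """Keep chunks whose relevance meets the (hypothetical) threshold."""
    return [chunk for chunk in chunks if relevance(chunk, focus) >= threshold]

chunks = [
    "Attention mechanisms evolved quickly after 2017.",
    "The venue served lunch at noon.",
]
kept = filter_by_focus(chunks, "attention mechanisms")
print(kept)
```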

When to use it

- Identify themes and trends across a set of research papers or notes
- Compare methodologies, results, or claims between two documents or Collections
- Produce an executive summary or technical synthesis from many sources
- Generate a structured JSON comparison highlighting shared and unique points
- Aggregate item-level extractions into a coherent report

Best practices

- Provide a clear focus string to guide relevance filtering when you need targeted synthesis
- Choose format based on audience: executive for leaders, technical for practitioners, comprehensive to preserve nuance
- Set `compression_ratio` to control brevity vs. detail for narrative/technical/comprehensive outputs
- When comparing, always supply the `other` parameter and use `format="comparison"` to get structured results
- Pre-run a `map(extract)` step if you need consistent per-item facts before synthesis

Example use cases

- Synthesize dominant research trends from a conference’s accepted papers into a 400-word executive overview
- Compare two competing system designs and produce a JSON report showing shared themes and contradictions
- Create a technical synthesis of methodological evolution across several studies with moderate compression
- Integrate product feedback notes into a narrative prioritizing recurring usability issues
- Aggregate per-article extractions into a comprehensive review preserving nuance for each source

FAQ

What happens if I omit the target?

The operation fails with reason "target parameter required". A target that is supplied but empty fails with reason "target is empty".

Can I control output length?

Yes. Set compression_ratio (default 3.0) for the narrative, technical, and comprehensive formats. Executive outputs aim for 300–500 words by default.
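
One plausible reading of compression_ratio is input length divided by output length. That interpretation is an assumption made for illustration; the skill only says the ratio controls output length relative to input. Under that reading, a rough length estimate looks like this (`estimated_output_words` is a hypothetical helper):

```python
def estimated_output_words(input_words: int, compression_ratio: float = 3.0) -> int:
    """Estimate output length, assuming ratio = input length / output length."""
    if compression_ratio <= 0:
        raise ValueError("compression_ratio must be positive")
    return round(input_words / compression_ratio)

# Under this assumption, 6,000 words of input at the default ratio of 3.0
# gives roughly 2,000 words of output; a ratio of 2.0 gives roughly 3,000.
default_estimate = estimated_output_words(6000)
looser_estimate = estimated_output_words(6000, 2.0)
print(default_estimate, looser_estimate)
```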