
extract skill

/src/tools/extract

This skill derives content from a single Note through instruction-driven extraction or transformation. The output is grounded in the input.

npx playbooks add skill bdambrosio/cognitive_workbench --skill extract


---
name: extract
type: python
flattens_collections: false
description: "Derive content from a single Note via LLM-guided extraction, compression, or transformation. Output is grounded entirely in the input — no new information is introduced."
---

# extract

Derive content from a single Note's text via LLM-guided extraction or transformation.
Output is grounded entirely in the input — no new information is introduced, no
cross-document synthesis.

## Input

- `target`: Note (variable, ID, or name) — MUST be a single Note
- `instruction`: String describing what to extract or how to transform (required)
- `out`: Variable name for resulting Note

## Output

Success (`status: "success"`):
- `value`: Extracted or transformed content as a new Note

Failure (`status: "failed"`):
- `reason`: `"instruction parameter required"` | `"target parameter required"` |
  `"target is empty"` | `"target must be a single Note, not a Collection; use map(extract) for Collections"` | `"llm_generate_failed"`
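A caller will usually branch on `status` before using `value`. The following is a minimal sketch, assuming the result arrives as a plain dict with the fields listed above. The function name `handle_extract_result` and the split into "fix the step" versus "retryable" reasons are illustrative assumptions, not part of the skill.

```python
from typing import Any

# Reason strings are copied from the skill doc. Grouping them into
# caller errors vs. retryable failures is an assumption for illustration.
CALLER_ERRORS = {
    "instruction parameter required",
    "target parameter required",
    "target is empty",
    "target must be a single Note, not a Collection; use map(extract) for Collections",
}
RETRYABLE = {"llm_generate_failed"}


def handle_extract_result(result: dict[str, Any]) -> Any:
    """Return the extracted value, or raise with a hint based on the reason."""
    if result.get("status") == "success":
        return result["value"]
    reason = result.get("reason", "unknown")
    if reason in RETRYABLE:
        # The model call failed; the same step may succeed on a retry.
        raise RuntimeError(f"retryable: {reason}")
    # Everything else points at a malformed step that must be edited.
    raise ValueError(f"fix the step: {reason}")
```

Raising different exception types lets a planner retry model failures while surfacing malformed steps immediately.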

## Invariants

- Output content is derived **only from existing input text**
- No new information is introduced
- Does NOT operate on Collections — use `map(extract)` for per-item extraction

## Planning Notes

**Use `extract` when:**
- Pulling specific facts or fields from a document
  ("extract the key architectural innovation as one sentence")
- Reshaping text format
  ("convert this abstract to bullet points", "rewrite as JSON with fields: method, result")
- Compressing a single document
  ("summarize this paper in 3 sentences", "extract only the methodology section")
- Normalizing or cleaning text
  ("remove citation markers", "standardize author name format")

**Do NOT use `extract` when:**
- Integrating across multiple documents → use `synthesize`
- Creating content from scratch → use `generate-note`
- Filtering items in a Collection → use `filter-structured` or `filter-semantic`
- Accessing structured metadata fields → use `project` or `pluck`

## Anti-Patterns

- ❌ `extract(target=$collection)` — Must be a single Note. Use `map(extract)` for Collections.
- ❌ `extract(target=$note, instruction="add a conclusion")` — Adds new content. Use `generate-note`.
- ❌ `extract(target=$note, instruction="compare with other papers")` — Cross-document. Use `synthesize`.

## Examples

```json
{"type":"extract","target":"$paper","instruction":"Extract the key architectural innovation as one sentence.","out":"$innovation"}
{"type":"extract","target":"$abstract","instruction":"Compress to 2-3 sentences retaining methodology and results.","out":"$compressed"}
{"type":"extract","target":"$paper","instruction":"Extract as JSON: {\"method\": ..., \"result\": ..., \"limitation\": ...}","out":"$structured"}
{"type":"map","target":"$papers","operation":"extract","instruction":"State the main contribution in one sentence.","out":"$contributions"}
```
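If plan steps are assembled programmatically, small builder functions keep the field names consistent. This sketch assumes steps are plain JSON objects, one per line, as in the examples above; the helper names `extract_step` and `map_extract_step` are hypothetical.

```python
import json


def extract_step(target: str, instruction: str, out: str) -> dict:
    """Build one extract step; rejects a blank instruction early."""
    if not instruction.strip():
        raise ValueError("instruction parameter required")
    return {"type": "extract", "target": target,
            "instruction": instruction, "out": out}


def map_extract_step(target: str, instruction: str, out: str) -> dict:
    """Build a map step that applies extract to each item of a Collection."""
    if not instruction.strip():
        raise ValueError("instruction parameter required")
    return {"type": "map", "target": target, "operation": "extract",
            "instruction": instruction, "out": out}


steps = [
    extract_step("$paper",
                 "Extract the key architectural innovation as one sentence.",
                 "$innovation"),
    map_extract_step("$papers",
                     "State the main contribution in one sentence.",
                     "$contributions"),
]

# Compact separators reproduce the one-object-per-line layout used above.
lines = [json.dumps(step, separators=(",", ":")) for step in steps]
print("\n".join(lines))
```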

Overview

This skill derives focused content from a single Note using LLM-guided extraction, compression, or transformation. Outputs are strictly grounded in the input Note — no new facts or cross-document synthesis are introduced. It is designed for precise reshaping or distillation of one document at a time.

How this skill works

You provide a single Note as the target and an instruction that specifies what to extract or how to transform the text. The skill runs an LLM-guided process that returns a new Note containing only content derived from the input. If the input is invalid or the model generation fails, the skill returns a failed status with a reason string, such as "target is empty" or "llm_generate_failed".
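The flow described above can be sketched roughly as follows. This is not the skill's actual source: `Note` and `Collection` are hypothetical stand-ins for the workbench types, the `llm` callable is injected for illustration, and the order of the checks, the prompt wording, and the handling of empty model output are all assumptions. Only the reason strings come from the skill documentation.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Note:  # hypothetical stand-in for the workbench Note type
    text: str


@dataclass
class Collection:  # hypothetical stand-in for a Collection of Notes
    items: list = field(default_factory=list)


def _failed(reason: str) -> dict:
    return {"status": "failed", "reason": reason}


def extract(target, instruction, llm: Callable[[str], str]) -> dict:
    # Check order is assumed; reason strings are from the skill doc.
    if not instruction or not instruction.strip():
        return _failed("instruction parameter required")
    if target is None:
        return _failed("target parameter required")
    if isinstance(target, Collection):
        return _failed("target must be a single Note, not a Collection; "
                       "use map(extract) for Collections")
    if not target.text.strip():
        return _failed("target is empty")
    # Assumed prompt shape: instruction first, then the source text.
    prompt = f"{instruction}\n\nUse only the text below.\n\n{target.text}"
    try:
        content = llm(prompt)
    except Exception:
        return _failed("llm_generate_failed")
    if not content:
        # Treating empty output as a generation failure is an assumption.
        return _failed("llm_generate_failed")
    return {"status": "success", "value": Note(text=content)}
```

Passing the model call in as a parameter keeps the validation logic testable without a live LLM.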

When to use it

  • Pulling specific facts or fields from one document (e.g., extract the main claim in one sentence).
  • Compressing or summarizing a single Note (e.g., 2–3 sentence summary focused on methods and results).
  • Reshaping format without adding information (e.g., convert a paragraph to JSON with specified keys).
  • Normalizing or cleaning text inside one Note (e.g., remove citation markers or standardize author names).
  • Converting one Note into bullets, abstracts, or structured fields for downstream processing.

Best practices

  • Always target a single Note; use map(extract) when you need the same extraction applied to multiple Notes.
  • Make instructions explicit about scope and allowable content — specify fields, sentence limits, or format.
  • Avoid prompts that request new information, comparisons across documents, or opinions not present in the Note.
  • Validate the result against the source Note to ensure no extraneous content was introduced.
  • Use structured output requests (JSON, bullets) when downstream automation expects strict formats.
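One cheap way to validate a result against its source is a lexical check for words that appear in the output but not in the input. The sketch below is a crude heuristic, not something the skill provides: the function name `novel_tokens` and the `min_len` threshold are assumptions, and legitimate paraphrases or inflected forms will be flagged, so treat hits as prompts for manual review.

```python
import re


def _tokens(text: str) -> set[str]:
    """Lowercased alphanumeric tokens in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def novel_tokens(source: str, output: str, min_len: int = 4) -> set[str]:
    """Tokens of at least min_len characters found in output but not in source."""
    source_tokens = _tokens(source)
    return {t for t in _tokens(output)
            if len(t) >= min_len and t not in source_tokens}


source = "The model uses sparse attention to cut memory use."
print(novel_tokens(source, "Uses sparse attention."))
print(novel_tokens(source, "Sparse attention beats transformers."))
```

The first call returns an empty set because every word comes from the source; the second flags the words that do not.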

Example use cases

  • Extract the methodology section from a research paper Note and save it as a separate Note.
  • Compress an article Note into three sentences that preserve methods and results.
  • Transform an abstract into JSON with keys: method, result, limitation.
  • Clean a Note by removing citation markers and normalizing references.
  • Map over a collection of Notes with map(extract) to produce one-line contributions per paper.

FAQ

Can extract operate on multiple Notes at once?

No. extract requires a single Note. For multiple Notes, use map(extract) to apply the same instruction to each item.

Will extract introduce new information not present in the Note?

No. The skill is constrained to derive output solely from the input text and must not add external facts or synthesis.