
filter-semantic skill

/src/tools/filter-semantic

This skill filters a collection by applying natural-language predicates to each item using an LLM, returning items that match.

npx playbooks add skill bdambrosio/cognitive_workbench --skill filter-semantic

Skill.md
---
name: filter-semantic
type: python
description: "Evaluate complex text criteria on each item in a Collection and return a new Collection of matching items. Use it to identify relevant items in a Collection."
---

# filter-semantic

Apply flexible, natural-language filtering criteria to Collections. Each item is evaluated against the specified conditions using an LLM.

## Input

- `target`: Collection ID or variable (required)
- `predicate`: String describing filter criteria (required)
- `mode`: "include" (return matches) or "exclude" (return non-matches) (optional, default: "include")

## Output

Success (`status: "success"`):
- `value`: Summary string (e.g., "3 items [Note_1, Note_2, Note_3]")
- `resource_id`: New Collection ID containing matching items

Failure (`status: "failed"`):
- `reason`: Error description
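
A caller might branch on the two result shapes roughly as follows. This is a minimal sketch that assumes the result arrives as a plain Python dict carrying the fields listed above; `describe_result` is a hypothetical helper and not part of the skill.

```python
def describe_result(result: dict) -> str:
    """Turn a filter-semantic result dict (assumed shape) into a one-line message."""
    status = result.get("status")
    if status == "success":
        # `value` is the summary string; `resource_id` is the new Collection ID.
        return f"ok: {result['value']} -> {result['resource_id']}"
    if status == "failed":
        # `reason` carries the error description.
        return f"failed: {result.get('reason', 'no reason given')}"
    raise ValueError(f"unexpected status: {status!r}")
```

Raising on an unknown status is a design choice for this sketch, so that a malformed result is not silently treated as a failure.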

## Behavior

- Evaluates each item individually using an LLM
- Supports semantic predicates ("has a positive tone"), boolean logic ("A AND B"), and numeric or date predicates ("published after 2024")
- If no items match, returns a new, empty Collection
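
The per-item evaluation and the two modes can be pictured with the sketch below. It is an illustration under stated assumptions, not the skill's actual implementation: `filter_items` is a hypothetical function, and `judge` stands in for the LLM call (whose prompt and wiring are not shown here), replaced by a simple keyword check so the example runs offline.

```python
from typing import Callable, List


def filter_items(items: List[str], judge: Callable[[str], bool],
                 mode: str = "include") -> List[str]:
    """Keep items the judge accepts ("include") or rejects ("exclude")."""
    if mode not in ("include", "exclude"):
        raise ValueError(f"mode must be 'include' or 'exclude', got {mode!r}")
    keep_when = mode == "include"
    # Each item is judged on its own; an empty list is a valid result.
    return [item for item in items if judge(item) == keep_when]


def mentions_safety(item: str) -> bool:
    """Stand-in for an LLM verdict on the predicate "mentions safety"."""
    return "safety" in item.lower()


notes = ["Safety review of the planner", "Notes on proof theory"]
matching = filter_items(notes, mentions_safety)
non_matching = filter_items(notes, mentions_safety, mode="exclude")
```

Here `matching` holds the first note and `non_matching` the second; with no matching items, the function returns an empty list, mirroring the empty Collection described above.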

## Planning Notes

- Use for content-based filtering beyond simple field comparisons
- For field-based selection, use `project` or `pluck` instead
- Predicate should describe what to match, not what to exclude (unless using `mode: "exclude"`)

## Examples

```json
{"type":"filter-semantic","target":"$collection","predicate":"contains code or implementation details","out":"$filtered"}
{"type":"filter-semantic","target":"$collection","predicate":"mentions safety AND published after 2024","out":"$filtered"}
{"type":"filter-semantic","target":"$collection","predicate":"purely theoretical","mode":"exclude","out":"$practical_only"}
```
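
If plan steps are generated programmatically, a small builder can keep them consistent with the examples above. This is a sketch only: `make_filter_action` is a hypothetical helper, and it assumes a step is just a JSON object with the keys shown in the examples.

```python
import json


def make_filter_action(target: str, predicate: str, out: str,
                       mode: str = "include") -> dict:
    """Build a filter-semantic step as a dict (hypothetical helper)."""
    if mode not in ("include", "exclude"):
        raise ValueError(f"mode must be 'include' or 'exclude', got {mode!r}")
    action = {"type": "filter-semantic", "target": target, "predicate": predicate}
    if mode != "include":
        # "include" is the default, so it is only written out when overridden.
        action["mode"] = mode
    action["out"] = out
    return action


step = make_filter_action("$collection", "purely theoretical",
                          out="$practical_only", mode="exclude")
step_json = json.dumps(step)  # serialized form, ready to place in a plan
```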

Overview

This skill evaluates complex, natural-language criteria against every item in a Collection and returns a new Collection with the matching items. It uses an LLM to interpret semantic predicates like tone, intent, or content features that go beyond simple field comparisons. Use it when you need flexible, content-aware filtering across heterogeneous items.

How this skill works

Provide a target Collection and a predicate string describing what to match. The skill evaluates the predicate against each item using an LLM, including boolean logic and numeric or date checks, and returns a new Collection containing the items that match (or the non-matches when mode is set to "exclude"). On success, the result carries a summary string and the new Collection's resource ID; on failure, it carries an error reason.

When to use it

  • Filter documents by tone, intent, or semantic content (e.g., "positive tone" or "mentions safety").
  • Select items that mention implementation details, examples, or code snippets.
  • Apply compound conditions using boolean logic (e.g., "A AND (B OR C)").
  • Find items with numeric or date-related properties described in natural language.
  • Exclude items that match a semantic descriptor by using mode: "exclude".

Best practices

  • Write predicates as what to match (not what to exclude) unless using mode: "exclude".
  • Be explicit with temporal or numeric terms (e.g., "published after 2024-01-01").
  • Keep predicates concise but specific to reduce ambiguity for the LLM.
  • Use this for content-based filtering; for field-based selection, use `project` or `pluck` instead.
  • Expect an empty Collection result when no items satisfy the predicate.

Example use cases

  • Extract notes that contain code or step-by-step implementation details for a developer review.
  • Build a Collection of safety-related articles published after a given date.
  • Remove purely theoretical pieces to create a practical-only dataset using mode: "exclude".
  • Find customer feedback items with a positive tone mentioning a specific feature.
  • Filter entries that reference specific numeric thresholds or dates described in text.

FAQ

What inputs are required?

Provide the target Collection and a predicate string. Mode is optional and defaults to "include".

What happens if nothing matches?

The skill returns success with a new, empty Collection and a summary indicating zero matches.