
assess skill

/src/tools/assess

This skill evaluates text content against natural language predicates, using chunked evaluation and returning "true" if any segment matches.

npx playbooks add skill bdambrosio/cognitive_workbench --skill assess

Review the files below or copy the command above to add this skill to your agents.

Files (2)
Skill.md
---
name: assess
description: Boolean test of text content against a natural language predicate. Features auto-chunking for long texts (returns "true" if ANY chunk matches).
type: python
flattens_collections: true
---

# assess

Semantic boolean testing. Evaluates a natural language predicate against text content using an LLM.

## Input

- `target`: String content to test (empty inputs return "false")
- `predicate`: Natural language question (e.g., "mentions specific dates?", "is critical of the author?")

## Output

Returns string `"true"` or `"false"` (lowercase string, not JSON boolean).

## Behavior

- **Auto-Chunking**: Texts >16k chars are split into boundary-aware chunks
- **OR Aggregation**: Returns `"true"` on first matching chunk (short-circuit), `"false"` only if all chunks fail
- **Fallback**: Returns `"false"` on ambiguous LLM responses

## Planning Notes

- Phrase predicates to detect *presence* rather than global summary (chunks are evaluated in isolation)
  - Good: "Contains mention of inflation?"
  - Risky: "Is the main topic inflation?"
- Every chunk requires an LLM call

## Example

```json
{"type":"assess","target":"$my_note","predicate":"is urgent?","out":"$urgency"}
```

Overview

This skill performs semantic boolean tests on text using natural language predicates. It returns the lowercase string "true" or "false" and handles very long inputs via boundary-aware auto-chunking. The design short-circuits on the first matching chunk to keep costs and latency low.

How this skill works

You provide a target text and a predicate phrased as a presence check. The engine splits inputs longer than the chunk threshold into boundary-aware chunks and evaluates each chunk with an LLM. If any chunk satisfies the predicate, the skill immediately returns "true"; if all chunks fail, or the LLM response is ambiguous, it returns "false".
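
A minimal Python sketch of this chunk-then-OR flow may help. It is not the skill's actual implementation; `split_into_chunks` and `llm_yes_no` are hypothetical stand-ins for the internal chunker and LLM call:

```python
import textwrap

def split_into_chunks(text: str, limit: int) -> list[str]:
    # Hypothetical stand-in for the skill's boundary-aware chunker;
    # a real splitter would prefer paragraph/sentence boundaries.
    return textwrap.wrap(text, limit, replace_whitespace=False, drop_whitespace=False)

def llm_yes_no(chunk: str, predicate: str) -> str:
    # Hypothetical stand-in for the skill's LLM call; it should answer
    # "true" or "false" for the chunk, and anything ambiguous counts as "false".
    raise NotImplementedError("call your LLM of choice here")

def assess(target: str, predicate: str, chunk_limit: int = 16_000) -> str:
    """Sketch of the assess control flow, not the actual implementation."""
    if not target:
        return "false"                 # empty inputs always return "false"
    chunks = ([target] if len(target) <= chunk_limit
              else split_into_chunks(target, chunk_limit))
    for chunk in chunks:
        if llm_yes_no(chunk, predicate) == "true":
            return "true"              # OR aggregation: short-circuit on first match
    return "false"                     # no chunk matched (or every response was ambiguous)
```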

When to use it

  • Detect whether a document or note contains a specific fact, phrase, or entity.
  • Quickly screen long transcripts or logs for the presence of a topic or claim.
  • Automate triage rules that require semantic matching rather than keyword search.
  • Feed downstream workflows that need a boolean flag per document.
  • Pre-filter content before more expensive or structured analysis.

Best practices

  • Phrase predicates as presence checks (e.g., "mentions a deadline?", not "is the main topic X?").
  • Keep predicates short and specific to reduce LLM ambiguity.
  • Expect one LLM call per chunk; control chunk size to trade cost against recall (see the sketch after this list).
  • Treat a returned "false" as "no confident match" rather than absolute absence.
  • Test predicates on representative samples to refine wording.
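
For rough cost planning, the call count scales with text length divided by the chunk threshold. A small sketch assuming the documented 16k-character threshold (boundary-aware splitting may yield slightly different counts, and short-circuiting on an early match can reduce actual calls):

```python
import math

CHUNK_LIMIT = 16_000  # documented auto-chunking threshold, in characters

def estimated_llm_calls(text: str) -> int:
    """Rough estimate; actual counts depend on boundary-aware splitting."""
    return math.ceil(len(text) / CHUNK_LIMIT) if text else 0

# A 50,000-character transcript needs roughly four calls in the worst case.
print(estimated_llm_calls("x" * 50_000))  # -> 4
```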

Example use cases

  • Mark support tickets that "mention a refund" to route to billing.
  • Scan meeting transcripts and flag any chunk that "mentions action items".
  • Filter user-submitted content with predicates like "contains hate speech?" before moderation.
  • Detect documents that "mention expiration dates" for archival workflows.
  • Quickly identify notes that "refer to project X" to build topic indexes.

FAQ

What does the skill return for empty input?

Empty targets always return "false".

How are long texts handled?

Texts above the chunk threshold are split into boundary-aware chunks and evaluated independently; any matching chunk yields "true".

Is the result a boolean type?

No. Results are the lowercase strings "true" or "false".
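
If you consume the result in Python (an assumption; your plan may simply pass it to another step as a string), compare against the literal string rather than relying on truthiness:

```python
result = "false"                 # value returned by an assess step
is_match = (result == "true")    # explicit comparison against the lowercase string
# Note: bool(result) would be True even here, because any non-empty string is truthy.
```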