plaited/agent-eval-harness

Evaluate AI agents with Unix-style pipeline commands. Schema-driven adapters for any CLI agent, trajectory capture, pass@k metrics, and multi-run comparison.

12 skills

typescript-lsp

plaited/agent-eval-harness

2
Search TypeScript SYMBOLS (functions, types, classes) - NOT text. Use Glob to find files, Grep for text search, LSP for symbol search. Provides type-aware results that understand imports, exports, and relationships.
optimize-agents-md

plaited/agent-eval-harness

2
Optimize AGENTS.md and rules for token efficiency. Auto-invoked when user asks about improving agent instructions, compressing AGENTS.md, or making rules more effective.
scaffold-rules

plaited/agent-eval-harness

2
Scaffold development rules for AI coding agents. Auto-invoked when user asks about setting up rules, coding conventions, or configuring their AI agent environment.
optimize-agents-md@plaited_development-skills

plaited/agent-eval-harness

2
This skill helps optimize AGENTS.md and rules for token efficiency by compressing content and standardizing verification patterns.
scaffold-rules@plaited_development-skills

plaited/agent-eval-harness

2
This skill scaffolds development rules for AI coding agents, enabling quick setup of consistent conventions and environments.
agent-eval-harness

plaited/agent-eval-harness

2
This skill helps you evaluate CLI agent trajectories by capturing full runs and providing structured JSONL for downstream scoring.
validate-skill

plaited/agent-eval-harness

2
Validate skill directories against the AgentSkills spec.
typescript-lsp@plaited_development-skills

plaited/agent-eval-harness

2
This skill enables type-aware TypeScript symbol exploration using LSP to quickly navigate, inspect types, and verify exports before editing.
code-documentation@plaited_development-skills

plaited/agent-eval-harness

2
This skill provides TSDoc templates and guidelines to write, review, and maintain TypeScript documentation across modules and APIs.
validate-skill@plaited_development-skills

plaited/agent-eval-harness

2
This skill validates skill directories against the AgentSkills specification to ensure proper frontmatter, fields, and naming conventions.