
test-json-sql-semantic-scholar skill

/src/saved_plans/test-json-sql-semantic-scholar

This skill tests the JSON SQL primitives (project, pluck, filter-structured, sort) against real semantic-scholar output to check that data extraction, filtering, and sorting behave as expected.

npx playbooks add skill bdambrosio/cognitive_workbench --skill test-json-sql-semantic-scholar

Review the files below or copy the command above to add this skill to your agents.

SKILL.md
---
name: test-json-sql-semantic-scholar
description: Test JSON SQL primitives with semantic-scholar output
type: plan
manual_only: true
parameters: []
---

# test-json-sql-semantic-scholar

Tests JSON SQL primitives (project, pluck, filter-structured, sort) with real semantic-scholar output.

## What it tests
- semantic-scholar returns Collection of paper Notes
- project extracts metadata.title, metadata.year, metadata.citations
- pluck extracts first title
- filter-structured filters by metadata.citations > 0
- sort orders by metadata.citations descending

Overview

This skill validates JSON-SQL primitives against real Semantic Scholar output to confirm that structured extraction and transformation work as expected. It covers four common operations (project, pluck, filter-structured, and sort) applied to a collection of paper records. The tests confirm correct handling of the metadata fields title, year, and citations.

How this skill works

The skill ingests a Collection of paper Notes returned by Semantic Scholar and applies a series of JSON-SQL operations. It uses project to extract metadata.title, metadata.year, and metadata.citations; pluck to retrieve the first title; filter-structured to keep papers with metadata.citations > 0; and sort to order results by metadata.citations in descending order. For a fixed input collection, each step produces a deterministic output that can be asserted in test cases.
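The four operations can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the workbench's implementation: the Note shape (a dict with a nested `metadata` dict), the sample papers, and the helper names `get_path`, `project`, `pluck`, `filter_structured`, and `sort_notes` are all hypothetical stand-ins for the real primitives.

```python
from typing import Any, Callable

# Hypothetical sample data; real semantic-scholar Notes may be shaped differently.
notes = [
    {"metadata": {"title": "Paper A", "year": 2020, "citations": 12}},
    {"metadata": {"title": "Paper B", "year": 2023, "citations": 0}},
    {"metadata": {"title": "Paper C", "year": 2018, "citations": 40}},
]


def get_path(record: dict, path: str) -> Any:
    """Resolve a dotted path such as 'metadata.title'; None if a key is missing."""
    current: Any = record
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def project(collection: list, paths: list) -> list:
    """Keep only the requested dotted paths from each record."""
    return [{p: get_path(r, p) for p in paths} for r in collection]


def pluck(collection: list, path: str) -> list:
    """Return the value found at `path` for every record."""
    return [get_path(r, path) for r in collection]


def filter_structured(collection: list, path: str, predicate: Callable) -> list:
    """Keep the records whose value at `path` satisfies `predicate`."""
    return [r for r in collection if predicate(get_path(r, path))]


def sort_notes(collection: list, path: str, descending: bool = False) -> list:
    """Return a new list ordered by the value at `path`."""
    return sorted(collection, key=lambda r: get_path(r, path), reverse=descending)


projected = project(notes, ["metadata.title", "metadata.year", "metadata.citations"])
first_title = pluck(notes, "metadata.title")[0]
cited = filter_structured(
    notes, "metadata.citations", lambda c: isinstance(c, int) and c > 0
)
ranked = sort_notes(notes, "metadata.citations", descending=True)
```

Each operation here is applied to the original collection independently. Whether the skill chains the steps or runs them separately is not stated, so the sketch takes the simpler reading.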

When to use it

  • When you need to validate JSON-SQL transformation primitives against real-world API output.
  • When verifying extraction of nested metadata fields from Semantic Scholar paper records.
  • When testing filtering logic that depends on numeric metadata like citation counts.
  • When confirming that sorting by a nested numeric field returns the correct order.

Best practices

  • Run tests with representative Semantic Scholar samples to cover edge cases (missing fields, zero citations).
  • Assert both structure and values after project and pluck operations to catch mapping errors early.
  • Include cases with identical citation counts to verify stable or defined sort behavior.
  • Use explicit type checks for numeric fields before comparison to avoid string/number issues.
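The edge cases listed above (missing fields, zero citations, string-valued counts, and ties) can be exercised with a small fixture. The sketch below is illustrative only: the `as_int` and `citations` helpers and the sample records are assumptions, and Python's built-in `sorted` stands in for the sort primitive, whose tie-breaking behavior may differ.

```python
def as_int(value):
    """Coerce a citation count to int; None when it is missing or not numeric."""
    if isinstance(value, bool):  # bool is a subclass of int, so reject it explicitly
        return None
    if isinstance(value, int):
        return value
    if isinstance(value, str):
        try:
            return int(value.strip())
        except ValueError:
            return None
    return None


# Hypothetical fixture covering the edge cases named in the list above.
papers = [
    {"metadata": {"title": "Tie 1", "citations": 5}},
    {"metadata": {"title": "No count"}},
    {"metadata": {"title": "String count", "citations": "7"}},
    {"metadata": {"title": "Tie 2", "citations": 5}},
    {"metadata": {"title": "Uncited", "citations": 0}},
]


def citations(paper):
    """Read metadata.citations with type coercion applied."""
    return as_int(paper.get("metadata", {}).get("citations"))


# Treat a missing or non-numeric count as 0, so those papers are filtered out.
cited = [p for p in papers if (citations(p) or 0) > 0]

# sorted() is stable even with reverse=True, so tied records keep their input order.
ranked = sorted(cited, key=citations, reverse=True)
titles = [p["metadata"]["title"] for p in ranked]
```

Asserting on `titles` checks both the filter result and the order of the tied records in one step.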

Example use cases

  • Automated test to ensure an ingestion pipeline extracts title, year, and citation counts correctly.
  • Regression test after changing JSON-SQL implementation to ensure filter-structured retains only cited papers.
  • Quality check that the first title returned by pluck matches the expected primary title.
  • Scale test that confirms sort by metadata.citations still produces descending order on large collections.
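An order check for the sort use case might look like the following. It is a sketch under assumptions: the synthetic collection, the `is_descending` helper, and the use of Python's `sorted` in place of the actual sort primitive are all hypothetical.

```python
import random


def is_descending(values):
    """True when every element is greater than or equal to the one after it."""
    return all(a >= b for a, b in zip(values, values[1:]))


# Fixed seed so the synthetic collection is identical on every run.
rng = random.Random(0)
collection = [
    {"metadata": {"title": f"Paper {i}", "citations": rng.randint(0, 500)}}
    for i in range(10_000)
]

# Stand-in for the sort primitive applied to metadata.citations, descending.
ranked = sorted(collection, key=lambda n: n["metadata"]["citations"], reverse=True)
counts = [n["metadata"]["citations"] for n in ranked]
```

A test would then assert `is_descending(counts)` and that `len(ranked)` equals the input size, so that a sort which silently drops records is also caught.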

FAQ

What fields are validated by the tests?

The tests validate the fields metadata.title, metadata.year, and metadata.citations, exercised through the project, pluck, filter-structured, and sort operations.

How does filter-structured decide which records to keep?

filter-structured retains papers where metadata.citations is greater than zero, ensuring only cited works remain.