
exa skill


This skill enables high-precision semantic search and content extraction using the Exa API to power deep research and code/document lookup.

npx playbooks add skill dianel555/dskills --skill exa


SKILL.md
---
name: exa
description: |
  High-precision semantic search and content retrieval via Exa API. Use when: (1) Deep research requiring semantic understanding, (2) Code documentation and examples lookup, (3) Company/professional research, (4) AI-powered comprehensive research tasks, (5) URL content extraction with structured output. Triggers: "research", "find papers", "code examples", "company info", "LinkedIn profiles", "deep analysis". Differentiator: Exa excels at semantic/neural search while grok-search is better for real-time news and general web content.
---

# Exa Search

High-precision semantic search via Exa API. Standalone CLI only (no MCP dependency).

## Execution Method

Run `scripts/exa_cli.py` via Bash:

```bash
# Prerequisites: pip install httpx tenacity
# Environment: EXA_API_KEY (required), EXA_API_URL (optional, default: https://api.exa.ai)
```

## Available Tools

### Search Tools

```bash
# Basic semantic search
python scripts/exa_cli.py web_search_exa --query "emerging patterns in TypeScript" [--num-results 10] [--type auto|keyword|neural] [--livecrawl always|fallback|never]

# Advanced search with filters
python scripts/exa_cli.py web_search_advanced_exa --query "machine learning papers" \
  [--include-domains arxiv.org,github.com] [--exclude-domains medium.com] \
  [--start-date 2024-01-01] [--end-date 2024-12-31] \
  [--text] [--highlights] [--summary] [--out results.json]

# Deep search with query expansion
python scripts/exa_cli.py deep_search_exa --objective "foundations of quantum error correction" [--additional-queries "query1|query2"]

# Company research
python scripts/exa_cli.py company_research_exa --company "Anthropic" [--num-results 10]

# LinkedIn profile search
python scripts/exa_cli.py linkedin_search_exa --query "AI researchers at Stanford" [--num-results 10]
```

### Content Tools

```bash
# Extract content from URL
python scripts/exa_cli.py crawling_exa --url "https://example.com/article" \
  [--max-chars 5000] [--livecrawl always|fallback|never] \
  [--text] [--highlights] [--summary] [--out content.json]

# Get code context (documentation, examples)
python scripts/exa_cli.py get_code_context_exa --query "React useState hook examples" [--tokens-num 10000] [--out code.json]
```

### Research Tools

```bash
# Start AI research task
python scripts/exa_cli.py deep_researcher_start --instructions "Analyze the impact of LLMs on software development" [--model exa-research|exa-research-pro]
# Returns: {"taskId": "abc123", ...}

# Check research status
python scripts/exa_cli.py deep_researcher_check --task-id "abc123" [--out report.json]
# Status: running → completed | failed
```

### Configuration

```bash
# Check config and test connection
python scripts/exa_cli.py get_config_info [--no-test]
```

## Tool Capability Matrix

| Tool | Required | Optional | Output |
|------|----------|----------|--------|
| `web_search_exa` | `query` | `num-results`, `type`, `livecrawl` | Search results JSON |
| `web_search_advanced_exa` | `query` | `include-domains`, `exclude-domains`, `start-date`, `end-date`, `text`, `highlights`, `summary` | Filtered results JSON |
| `deep_search_exa` | `objective` | `additional-queries` | Expanded search results |
| `company_research_exa` | `company` | `num-results` | Company info JSON |
| `linkedin_search_exa` | `query` | `num-results` | LinkedIn profiles JSON |
| `crawling_exa` | `url` | `max-chars`, `livecrawl`, `text`, `highlights`, `summary` | Page content JSON |
| `get_code_context_exa` | `query` | `tokens-num` (1000-50000) | Code context JSON |
| `deep_researcher_start` | `instructions` | `model` | Task ID |
| `deep_researcher_check` | `task-id` | - | Status + report |

## Tool Routing Guide

### Exa vs Grok-Search

| Use Case | Recommended Tool |
|----------|------------------|
| Real-time news, current events | grok-search |
| Semantic/conceptual research | **exa** |
| Code documentation lookup | **exa** (`get_code_context_exa`) |
| Company/professional research | **exa** |
| General web content fetch | grok-search |
| Academic papers, technical docs | **exa** |
| AI-powered deep research | **exa** (`deep_researcher_*`) |

## Workflow Patterns

### Pattern 1: Quick Semantic Search
```bash
python scripts/exa_cli.py web_search_exa --query "best practices for React hooks" --num-results 5
```

### Pattern 2: Filtered Research
```bash
python scripts/exa_cli.py web_search_advanced_exa --query "transformer architecture" \
  --include-domains arxiv.org,papers.nips.cc --start-date 2023-01-01 --text --summary
```

### Pattern 3: Deep Research Task
```bash
# Start research
python scripts/exa_cli.py deep_researcher_start --instructions "Compare RAG vs fine-tuning for domain adaptation"
# Poll until completed
python scripts/exa_cli.py deep_researcher_check --task-id "<taskId>" --out research_report.json
```
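The "poll until completed" step above can be wrapped in a small helper. A minimal sketch, assuming a callable that returns the task status as a dict; in practice the callable would shell out to `deep_researcher_check` and parse its JSON output (the stubbed `states` iterator below is purely illustrative):

```python
import time

def poll_until_done(check, interval=5.0, max_attempts=60):
    """Poll a status-returning callable until it reports completed/failed.

    `check` is any function returning a dict with a "status" key, e.g. a
    wrapper that runs `deep_researcher_check --task-id <id>` and parses stdout.
    """
    for _ in range(max_attempts):
        result = check()
        if result.get("status") in ("completed", "failed"):
            return result
        time.sleep(interval)
    raise TimeoutError("research task did not finish in time")

# Example with a stubbed check function in place of the real CLI call:
states = iter([{"status": "running"}, {"status": "running"}, {"status": "completed"}])
print(poll_until_done(lambda: next(states), interval=0.0))  # → {'status': 'completed'}
```

Tune `interval` and `max_attempts` to the expected runtime of the research model; `exa-research-pro` tasks may take longer than the default budget sketched here.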

## Error Handling

| Error | Recovery |
|-------|----------|
| `EXA_API_KEY not configured` | Set environment variable or use `--api-key` |
| HTTP 429 (Rate limit) | Automatic retry with exponential backoff |
| HTTP 401 (Unauthorized) | Verify API key is valid |
| Timeout | Retry or reduce `num-results` |
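The automatic retry on HTTP 429 can be approximated with plain exponential backoff. A minimal stdlib sketch of the pattern (the CLI itself uses `tenacity`, per the prerequisites, so this is an illustration of the behavior rather than the actual implementation):

```python
import time

def with_backoff(fn, retries=4, base_delay=1.0):
    """Retry `fn` on exception, doubling the delay after each failed attempt."""
    for attempt in range(retries):
        try:
            return fn()
        except Exception:
            if attempt == retries - 1:
                raise  # out of retries: surface the last error
            time.sleep(base_delay * (2 ** attempt))

# Simulated flaky call that fails twice (e.g. rate-limited) then succeeds:
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("HTTP 429")
    return "ok"

print(with_backoff(flaky, base_delay=0.0))  # → ok
```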

## Output Format

All commands output JSON to stdout. Use `--out <file>` to write to file instead.

```json
{
  "results": [
    {"title": "...", "url": "...", "text": "...", "publishedDate": "..."}
  ]
}
```
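Because every command emits this same JSON shape, downstream processing stays simple. A sketch that extracts titles and URLs; the inline `raw` string stands in for a file saved with `--out` (`results.json` is a hypothetical filename):

```python
import json

raw = '{"results": [{"title": "Example", "url": "https://example.com", "text": "...", "publishedDate": "2024-06-01"}]}'
# With a saved file instead: data = json.load(open("results.json"))
data = json.loads(raw)

for r in data["results"]:
    print(r["title"], r["url"])
```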

Overview

This skill provides high-precision semantic search and structured content retrieval via the Exa API. It exposes CLI tools for deep research, code documentation lookup, company and LinkedIn profile exploration, and URL content extraction. Use it when you need neural search quality, long-context code examples, or organized JSON outputs for pipelines.

How this skill works

The CLI calls Exa endpoints to perform semantic (neural) search, filtered advanced searches, and URL crawling, returning JSON results with highlights, summaries, and full text when requested. It supports query expansion for deep research, long-token code context extraction, and asynchronous research tasks that produce task IDs and downloadable reports. Configuration and API key handling are environment-driven, with built-in retries and error guidance for common HTTP failures.
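The request flow can be sketched with the standard library. Note the endpoint path (`/search`), header name (`x-api-key`), and body field names below are assumptions based on typical Exa usage, not taken from the CLI source; the request is constructed but deliberately not sent:

```python
import json
import os
import urllib.request

# Assumed request shape (path, header, and field names are illustrative):
# POST <EXA_API_URL>/search with the API key in an x-api-key header.
base = os.environ.get("EXA_API_URL", "https://api.exa.ai")
body = json.dumps({"query": "emerging patterns in TypeScript", "numResults": 10}).encode()
req = urllib.request.Request(
    f"{base}/search",
    data=body,
    headers={
        "x-api-key": os.environ.get("EXA_API_KEY", "<key>"),
        "Content-Type": "application/json",
    },
    method="POST",
)
print(req.full_url)  # request is built but not sent here
```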

When to use it

  • Deep research that requires semantic understanding beyond keyword match
  • Finding code documentation, examples, and large context code snippets
  • Company or professional research (organized company profiles, bios, filings)
  • Academic papers and technical documents search with domain filters
  • URL content extraction when you need structured JSON (text, highlights, summaries)

Best practices

  • Provide clear objectives or expanded queries for deep_search_exa to get broader coverage
  • Use include/exclude domain filters and date ranges for focused results with web_search_advanced_exa
  • Request summaries or highlights when consuming many results to speed triage
  • Use get_code_context_exa with a sufficient tokens-num for long codebases or complete examples
  • Store CLI JSON outputs with --out and pipe them into downstream analysis or reporting tools
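The last two practices combine naturally: save results with --out, then triage offline. A sketch that sorts saved results newest-first by publishedDate, with inline sample data standing in for a real results file:

```python
import json

raw = '''{"results": [
  {"title": "Older", "url": "https://example.com/a", "publishedDate": "2023-05-01"},
  {"title": "Newer", "url": "https://example.com/b", "publishedDate": "2024-02-10"}
]}'''
# With a real file instead: results = json.load(open("results.json"))["results"]
results = json.loads(raw)["results"]

# ISO 8601 dates sort correctly as strings, so no date parsing is needed:
for r in sorted(results, key=lambda r: r["publishedDate"], reverse=True):
    print(r["publishedDate"], r["title"])
```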

Example use cases

  • Research emerging technical patterns: web_search_exa on "emerging patterns in TypeScript" with num-results 10
  • Build a literature survey: web_search_advanced_exa scoped to arxiv.org and date range, with summaries
  • Extract article content programmatically: crawling_exa to get JSON text + highlights for ingestion
  • Collect code examples and docs: get_code_context_exa for React hooks or library APIs with high token limits
  • Run multi-step AI research: deep_researcher_start to create a task, then deep_researcher_check to fetch the completed report

FAQ

What do I need to run the CLI?

Install dependencies (httpx, tenacity), set EXA_API_KEY in the environment, and optionally EXA_API_URL.

How do I get structured output?

All commands emit JSON to stdout; use --out <file> to write results to disk for downstream processing.