This skill helps you manage complex, multi-step Exa research tasks efficiently, delivering structured insights and progress updates.
```bash
npx playbooks add skill benjaminjackson/exa-skills --skill research
```
---
name: exa-research
description: Use when the user mentions Exa research OR when the workflow benefits from complex, multi-step research and other exa-ai approaches are not yielding satisfactory results.
---
# Exa Research Tasks
Manage asynchronous research tasks with exa-ai for complex, multi-step research workflows.
**Use `--help` to see available commands and verify usage before running:**
```bash
exa-ai <command> --help
```
## Working with Complex Shell Commands
When using the Bash tool with complex shell syntax, follow these best practices for reliability:
1. **Run commands directly**: Capture JSON output directly rather than nesting command substitutions
2. **Parse in subsequent steps**: Use `jq` to parse output in a follow-up command if needed
3. **Avoid nested substitutions**: Complex nested `$(...)` can be fragile; break into sequential steps
Example:
```bash
# Less reliable: command substitution wrapping a pipeline
research_id=$(exa-ai research-start --instructions "query" | jq -r '.research_id')
# More reliable: run directly, then parse
exa-ai research-start --instructions "query"
# Then in a follow-up command if needed:
exa-ai research-get research_id | jq -r '.result'
```
## Cost Optimization
### Pricing
Research is the most expensive Exa endpoint:
- **Agent search**: $0.005 per search operation
- **Standard page read**: $0.005 per page
- **Pro page read**: $0.010 per page (2x standard)
- **Reasoning tokens**: $0.000005 per token
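Using the rates above, here is a rough back-of-the-envelope estimate; the task sizes are hypothetical, not measured from a real run:
```bash
# Hypothetical task size: 20 agent searches, 40 standard page reads, 50,000 reasoning tokens
awk 'BEGIN { printf "$%.2f\n", 20*0.005 + 40*0.005 + 50000*0.000005 }'
# prints $0.55; swapping in pro page reads ($0.010) raises the total to $0.75
```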
**Cost strategy:**
- **Avoid research unless required**: Most expensive option (2-10x cost premium over other endpoints)
- Use only for autonomous, multi-step reasoning tasks that justify the cost
- For simpler queries, use `search`, `answer`, or `get-contents` instead
- Consider using `exa-research` (standard) instead of `exa-research-pro` unless you need the higher quality
## Research Overview
Research tasks are asynchronous operations that allow you to:
- Run complex, multi-step research workflows
- Process large amounts of information over time
- Monitor progress of long-running research
- Get structured output from comprehensive research
### When to Use Research vs Search
**Use research-start** when:
- The research requires multiple steps or complex reasoning
- You need comprehensive analysis of a topic
- The task will take significant time to complete
- You want structured, synthesized output
**Use search** (from exa-core) when:
- You need immediate results
- The query is straightforward
- You want quick factual information
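A quick contrast, assuming the exa-core `search` command used elsewhere in this document (verify exact flags with `--help`):
```bash
# Quick factual lookup: search returns immediately and costs far less
exa-ai search "latest stable Ruby release" --num-results 3

# Multi-step synthesis: research-start is justified by the autonomous reasoning involved
exa-ai research-start --instructions "Survey Ruby 3.x JIT performance work and summarize the trade-offs"
```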
## Commands
### research-start
Initiate a new research task with instructions.
```bash
exa-ai research-start --instructions "Find the top 10 Ruby performance optimization techniques"
```
For detailed options and examples, consult [REFERENCE.md](REFERENCE.md#research-start).
### research-get
Check status and retrieve results of a research task.
```bash
exa-ai research-get research_abc123
```
For detailed options and examples, consult [REFERENCE.md](REFERENCE.md#research-get).
### research-list
List all your research tasks with pagination.
```bash
exa-ai research-list --limit 10
```
For detailed options and examples, consult [REFERENCE.md](REFERENCE.md#research-list).
## Research Models
- **exa-research** (default): Balanced speed and quality
- **exa-research-pro**: Higher quality, more comprehensive results
- **exa-research-fast**: Faster results, good for simpler research
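Model selection uses the same `--model` flag shown in the pro example below. A sketch of a lighter first pass, using the model names listed above:
```bash
# Faster, cheaper first pass; escalate to exa-research-pro only if the output is too shallow
exa-ai research-start \
  --instructions "Summarize the headline changes in the latest Rails release" \
  --model exa-research-fast
```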
## Quick Examples
### Simple Research
```bash
exa-ai research-start \
--instructions "Find the latest breakthroughs in quantum computing"
```
### Research with Structured Output
```bash
exa-ai research-start \
--instructions "Compare TypeScript vs Flow for type checking" \
--output-schema '{
"type":"object",
"properties":{
"typescript":{
"type":"object",
"properties":{
"pros":{"type":"array","items":{"type":"string"}},
"cons":{"type":"array","items":{"type":"string"}}
}
},
"flow":{
"type":"object",
"properties":{
"pros":{"type":"array","items":{"type":"string"}},
"cons":{"type":"array","items":{"type":"string"}}
}
}
}
}'
```
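Once the task completes, retrieve the structured output with `research-get` and parse it with `jq`. A sketch, assuming `research_abc123` is the returned task ID and that `.result` (the field used in the workflow below) holds the schema-shaped data:
```bash
# Hypothetical task ID; .result is assumed to contain the schema-shaped output
exa-ai research-get research_abc123 | jq '.result'
```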
### Background Research Workflow
```bash
# Start research
research_id=$(exa-ai research-start \
--instructions "Analyze competitor landscape for project management tools" | jq -r '.research_id')
# Check status later
status=$(exa-ai research-get "$research_id" | jq -r '.status')
# Get results when complete
if [ "$status" = "completed" ]; then
  exa-ai research-get "$research_id" | jq -r '.result'
fi
```
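If you prefer to wait for completion in a single step, a simple polling loop works as well. This is only a sketch built from the same `.status` / `completed` values used above; tune the sleep interval and add a timeout for real use:
```bash
# Poll every 30 seconds until the task reports "completed", then print the result
while [ "$(exa-ai research-get "$research_id" | jq -r '.status')" != "completed" ]; do
  sleep 30
done
exa-ai research-get "$research_id" | jq -r '.result'
```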
### Use Pro Model for Comprehensive Research
```bash
exa-ai research-start \
--instructions "Comprehensive analysis of microservices vs monolithic architecture with case studies" \
--model exa-research-pro \
--events
```
### Shared Requirements
<shared-requirements>
## Schema Design
### MUST: Use object wrapper for schemas
**Applies to**: answer, search, find-similar, get-contents
When using schema parameters (`--output-schema` or `--summary-schema`), always wrap properties in an object:
```json
{"type":"object","properties":{"field_name":{"type":"string"}}}
```
**DO NOT** use bare properties without the object wrapper:
```json
{"properties":{"field_name":{"type":"string"}}} // ❌ Missing "type":"object"
```
**Why**: The Exa API requires a valid JSON Schema with an object type at the root level. Omitting this causes validation errors.
**Examples**:
```bash
# ✅ CORRECT - object wrapper included
exa-ai search "AI news" \
--summary-schema '{"type":"object","properties":{"headline":{"type":"string"}}}'
# ❌ WRONG - missing object wrapper
exa-ai search "AI news" \
--summary-schema '{"properties":{"headline":{"type":"string"}}}'
```
---
## Output Format Selection
### MUST NOT: Mix toon format with jq
**Applies to**: answer, context, search, find-similar, get-contents
`toon` format produces YAML-like output, not JSON. DO NOT pipe toon output to jq for parsing:
```bash
# ❌ WRONG - toon is not JSON
exa-ai search "query" --output-format toon | jq -r '.results'
# ✅ CORRECT - use JSON (default) with jq
exa-ai search "query" | jq -r '.results[].title'
# ✅ CORRECT - use toon for direct reading only
exa-ai search "query" --output-format toon
```
**Why**: jq expects valid JSON input. toon format is designed for human readability and produces YAML-like output that jq cannot parse.
### SHOULD: Choose one output approach
**Applies to**: answer, context, search, find-similar, get-contents
Pick one strategy and stick with it throughout your workflow:
1. **Approach 1: toon only** - Compact YAML-like output for direct reading
- Use when: Reading output directly, no further processing needed
- Token savings: ~40% reduction vs JSON
- Example: `exa-ai search "query" --output-format toon`
2. **Approach 2: JSON + jq** - Extract specific fields programmatically
- Use when: Need to extract specific fields or pipe to other commands
- Token savings: ~80-90% reduction (extracts only needed fields)
- Example: `exa-ai search "query" | jq -r '.results[].title'`
3. **Approach 3: Schemas + jq** - Structured data extraction with validation
- Use when: Need consistent structured output across multiple queries
- Token savings: ~85% reduction + consistent schema
- Example: `exa-ai search "query" --summary-schema '{...}' | jq -r '.results[].summary | fromjson'`
**Why**: Mixing approaches increases complexity and token usage. Choosing one approach optimizes for your use case.
---
## Shell Command Best Practices
### MUST: Run commands directly, parse separately
**Applies to**: monitor, search (websets), research, and all skills using complex commands
When using the Bash tool with complex shell syntax, run commands directly and parse output in separate steps:
```bash
# ❌ WRONG - nested command substitution
webset_id=$(exa-ai webset-create --search '{"query":"..."}' | jq -r '.webset_id')
# ✅ CORRECT - run directly (tee keeps the output visible and saves it), then parse
exa-ai webset-create --search '{"query":"..."}' | tee output.json
# Then in a follow-up command:
webset_id=$(jq -r '.webset_id' output.json)
```
**Why**: Complex nested `$(...)` command substitutions can fail unpredictably in shell environments. Running commands directly and parsing separately improves reliability and makes debugging easier.
### MUST NOT: Use nested command substitutions
**Applies to**: All skills when using complex multi-step operations
Avoid nesting multiple levels of command substitution:
```bash
# ❌ WRONG - deeply nested
result=$(exa-ai search "$(cat query.txt | tr '\n' ' ')" --num-results $(cat config.json | jq -r '.count'))
# ✅ CORRECT - sequential steps
query=$(cat query.txt | tr '\n' ' ')
count=$(cat config.json | jq -r '.count')
exa-ai search "$query" --num-results $count
```
**Why**: Nested command substitutions are fragile and hard to debug when they fail. Sequential steps make each operation explicit and easier to troubleshoot.
### SHOULD: Break complex commands into sequential steps
**Applies to**: All skills when working with multi-step workflows
For readability and reliability, break complex operations into clear sequential steps:
```bash
# ❌ Less maintainable - everything in one line
exa-ai webset-create --search '{"query":"startups","count":1}' | jq -r '.webset_id' | xargs -I {} exa-ai webset-search-create {} --query "AI" --behavior override
# ✅ More maintainable - clear steps
exa-ai webset-create --search '{"query":"startups","count":1}'
webset_id=$(jq -r '.webset_id' < output.json)
exa-ai webset-search-create $webset_id --query "AI" --behavior override
```
**Why**: Sequential steps are easier to understand, debug, and modify. Each step can be verified independently.
</shared-requirements>