---
name: firecrawl-research
description: This skill should be used when the user requests to research topics using FireCrawl, enrich notes with web sources, search and scrape information, or write scientific/academic papers. It extracts research topics from markdown files, creates research documents with scraped sources, generates BibTeX bibliographies from research results, and provides Pandoc/MyST templates for academic writing with citation management.
---
# FireCrawl Research
## Overview
Enrich research documents by automatically searching and scraping web sources using the FireCrawl API. Extract research topics from markdown files and generate comprehensive research documents with source material.
## When to Use This Skill
Use this skill when the user:
- Says "Research this topic using FireCrawl"
- Requests to enrich notes or documents with web sources
- Wants to gather information about topics listed in a markdown file
- Needs to search and scrape multiple topics systematically
## How It Works
### 1. Topic Extraction
The script automatically extracts research topics from markdown files using two methods:
**Method 1: Headers**
```markdown
## Spatial Reasoning in AI
### Computer Vision Applications
```
Both `Spatial Reasoning in AI` and `Computer Vision Applications` become research topics.
**Method 2: Research Tags**
```markdown
- [research] Large Language Models for robotics
- [search] Theory of Mind in autonomous driving
```
Both tagged items become research topics.
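The two extraction methods above can be sketched with a pair of regular expressions. This is an illustrative sketch, not the script's actual code: the function name `extract_topics` and both patterns are assumptions, as is the choice to match only `##` and `###` headers (the examples in this document show only those levels).

```python
import re

# Hypothetical patterns: level-2/3 headers, and "- [research]" / "- [search]" list items.
HEADER_RE = re.compile(r"^#{2,3}\s+(.+?)\s*$")
TAG_RE = re.compile(r"^\s*[-*]\s*\[(?:research|search)\]\s+(.+?)\s*$", re.IGNORECASE)


def extract_topics(markdown_text: str) -> list[str]:
    """Return topics in document order, skipping exact repeats."""
    topics = []
    for line in markdown_text.splitlines():
        match = HEADER_RE.match(line) or TAG_RE.match(line)
        if match and match.group(1) not in topics:
            topics.append(match.group(1))
    return topics


sample = """# AI Research Topics
## Spatial Reasoning in Vision-Language Models
- [research] Embodied AI for robotics
- [research] Computer Use Agents
"""
print(extract_topics(sample))
```

Under these assumptions the top-level `#` title is not treated as a topic, which matches the three output files shown in the workflow example.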
### 2. Search and Scrape
For each topic:
1. Searches FireCrawl with the topic as query
2. Retrieves up to N results (default: 5)
3. Automatically scrapes full content from each result
4. Extracts markdown-formatted content (main content only)
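As a rough illustration of the steps above, the request body for one topic might be assembled as follows. The `scrapeOptions` fields come from this document; the base URL, the `query` and `limit` field names, and the helper name `build_search_payload` are assumptions to check against the FireCrawl API reference.

```python
import json

# Assumed base URL; this document only names the /v1/search path.
SEARCH_URL = "https://api.firecrawl.dev/v1/search"


def build_search_payload(topic: str, limit: int = 5) -> dict:
    """Build a search request that also scrapes each result as markdown."""
    return {
        "query": topic,   # the topic text is used as the search query
        "limit": limit,   # up to N results (default: 5)
        "scrapeOptions": {
            "formats": ["markdown"],   # markdown-formatted content
            "onlyMainContent": True,   # main content only
        },
    }


print(json.dumps(build_search_payload("Embodied AI for robotics"), indent=2))
```

The payload would then typically be sent as a JSON POST to `SEARCH_URL` with the API key in an `Authorization: Bearer ...` header.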
### 3. Output Generation
Creates new markdown files in the specified output directory:
- One file per topic
- Filename: `{topic}_{timestamp}.md`
- Contains: title, date, sources count, full scraped content
- Each source includes: title, URL, markdown content
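The `{topic}_{timestamp}.md` naming could be produced along these lines. This is a sketch only: the sanitizing rules and the function name `topic_filename` are assumptions, chosen so the result matches the filenames shown in the workflow example.

```python
import re
from datetime import datetime


def topic_filename(topic: str, now: datetime) -> str:
    """Turn a topic into a filename such as Topic_Name_20251122_140530.md."""
    # Assumed rule: drop characters other than word characters, spaces, and hyphens.
    cleaned = re.sub(r"[^\w\s-]", "", topic).strip()
    # Join the remaining words with underscores.
    slug = re.sub(r"\s+", "_", cleaned)
    return f"{slug}_{now.strftime('%Y%m%d_%H%M%S')}.md"


print(topic_filename(
    "Spatial Reasoning in Vision-Language Models",
    datetime(2025, 11, 22, 14, 5, 30),
))
# Spatial_Reasoning_in_Vision-Language_Models_20251122_140530.md
```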
## Usage
### Basic Usage
```bash
python scripts/firecrawl_research.py research.md
```
Writes output files to the current directory.
### Specify Output Directory
```bash
python scripts/firecrawl_research.py research.md ./output
```
Creates files in `./output/` folder.
### Limit Results Per Topic
```bash
python scripts/firecrawl_research.py research.md ./output 3
```
Retrieves a maximum of 3 results per topic.
## Configuration
### API Key Setup
1. Copy `.env.example` to `.env`:
```bash
cp .env.example .env
```
2. Add FireCrawl API key:
```
FIRECRAWL_API_KEY=fc-your-actual-api-key
```
The script automatically loads the API key from the skill's `.env` file.
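For illustration, the key lookup might behave like the sketch below. The script itself uses `python-dotenv`; this standard-library stand-in only shows the idea, and the function name `load_api_key` and the error message wording are assumptions.

```python
import os
from pathlib import Path


def load_api_key(env_path: Path) -> str:
    """Read KEY=VALUE pairs from a .env file and return FIRECRAWL_API_KEY."""
    if env_path.exists():
        for raw in env_path.read_text().splitlines():
            line = raw.strip()
            # Skip blank lines, comments, and anything that is not KEY=VALUE.
            if not line or line.startswith("#") or "=" not in line:
                continue
            name, value = line.split("=", 1)
            # Values already set in the environment take precedence.
            os.environ.setdefault(name.strip(), value.strip().strip('"').strip("'"))
    key = os.environ.get("FIRECRAWL_API_KEY")
    if not key:
        raise RuntimeError("API key not found: set FIRECRAWL_API_KEY in .env")
    return key
```

A caller would pass the path of the skill's `.env` file, for example `load_api_key(Path(".env"))`.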
### Rate Limiting
The script includes automatic rate limiting for FireCrawl's free tier:
- **Free tier limit:** 5 requests/minute
- **Built-in delay:** 12 seconds between topics (60 seconds / 5 requests)
- Helps avoid rate-limit errors and rapid credit consumption
When processing multiple topics, expect:
- 5 topics: ~1 minute
- 10 topics: ~2 minutes
- 20 topics: ~4 minutes
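The estimates above follow from one 12-second delay per topic. A minimal sketch of that arithmetic and of a throttled loop is shown here; the names `estimated_minutes` and `process_with_delay` are hypothetical, and the actual script may structure its loop differently.

```python
import time

DELAY_SECONDS = 12  # 60 seconds / 5 requests per minute


def estimated_minutes(topic_count: int) -> float:
    """Approximate run time, counting one delay per topic."""
    return topic_count * DELAY_SECONDS / 60


def process_with_delay(topics, handle, delay=DELAY_SECONDS, sleep=time.sleep):
    """Call handle(topic) for each topic, pausing between consecutive topics."""
    for index, topic in enumerate(topics):
        if index > 0:
            sleep(delay)  # wait before every topic except the first
        handle(topic)


for n in (5, 10, 20):
    print(n, "topics:", estimated_minutes(n), "min")
```

Passing `sleep` as a parameter keeps the loop easy to test without actually waiting.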
## Workflow Example
**User request:** "Research these AI topics using FireCrawl"
**Input file (`ai-research.md`):**
```markdown
# AI Research Topics
## Spatial Reasoning in Vision-Language Models
- [research] Embodied AI for robotics
- [research] Computer Use Agents
```
**Command:**
```bash
python scripts/firecrawl_research.py ai-research.md ./research_output 5
```
**Output:**
```
research_output/
├── Spatial_Reasoning_in_Vision-Language_Models_20251122_140530.md
├── Embodied_AI_for_robotics_20251122_140542.md
└── Computer_Use_Agents_20251122_140554.md
```
Each file contains:
- Topic title
- Timestamp
- Source count
- Full scraped content from up to 5 sources
- Source URLs
## Common Patterns
### Pattern 1: Quick Research
Extract topics from existing notes, research them, save to current folder:
```bash
python scripts/firecrawl_research.py my-notes.md
```
### Pattern 2: Organized Research
Create dedicated output folder for research results:
```bash
python scripts/firecrawl_research.py topics.md ./research_results
```
### Pattern 3: Deep Dive
Increase results per topic for comprehensive coverage:
```bash
python scripts/firecrawl_research.py topics.md ./deep_research 10
```
### Pattern 4: Obsidian Vault Integration
Direct output to vault's research folder:
```bash
python scripts/firecrawl_research.py topics.md ~/Brains/brain/Research
```
## Error Handling
### "API key not found"
Create a `.env` file in the skill folder containing `FIRECRAWL_API_KEY=...`
### "Rate limit exceeded"
- Free tier: 5 req/min
- Script has 12s delay built-in
- If still hitting limit, reduce topics or wait between runs
### "Insufficient credits"
- Check FireCrawl account credits
- Upgrade plan or wait for credit reset
### "No topics found"
Add topics to markdown using:
- `## Header format`
- `- [research] Topic format`
- `- [search] Topic format`
## Script Details
**Location:** `scripts/firecrawl_research.py`
**Dependencies:**
- `python-dotenv` - Environment variable management
- `requests` - HTTP requests to FireCrawl API
**Install dependencies:**
```bash
pip install python-dotenv requests
```
**FireCrawl Features Used:**
- `/v1/search` endpoint - Search with automatic scraping
- `scrapeOptions.formats: ['markdown']` - Markdown output
- `scrapeOptions.onlyMainContent: true` - Filter noise
## Academic Writing Templates
This skill includes templates for writing scientific papers in markdown format.
### Available Templates
**1. Pandoc Scholarly Paper** (`assets/templates/pandoc-scholarly-paper.md`)
- Standard academic paper format
- Compatible with Pandoc converter
- Supports citations via BibTeX
- Exports to PDF, DOCX, HTML
**2. MyST Scientific Paper** (`assets/templates/myst-scientific-paper.md`)
- MyST (Markedly Structured Text) format
- Advanced cross-referencing
- Professional scientific publishing
- Multi-format export (PDF, LaTeX, DOCX)
### Using Templates
**Copy template to your project:**
```bash
cp assets/templates/pandoc-scholarly-paper.md my-paper.md
# or
cp assets/templates/myst-scientific-paper.md my-paper.md
```
**Edit content:**
- Update YAML frontmatter (title, authors, affiliations)
- Write your content in sections
- Add citations using `[@AuthorYear]` (Pandoc) or `` {cite}`AuthorYear` `` (MyST)
**Convert to PDF/DOCX:**
```bash
python scripts/convert_academic.py my-paper.md pdf
python scripts/convert_academic.py my-paper.md docx
python scripts/convert_academic.py my-paper.md pdf --myst # For MyST
```
### Bibliography Generation
Convert FireCrawl research results into BibTeX bibliography entries:
```bash
python scripts/generate_bibliography.py research_output/*.md -o references.bib
```
**What it does:**
- Extracts URLs and titles from FireCrawl markdown files
- Generates BibTeX `@misc` entries
- Creates citation keys automatically
- Adds access dates
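To make the steps above concrete, one `@misc` entry could be built from a title and URL roughly as follows. This is a sketch under assumptions: the key scheme (first word of the title plus the access year), the choice of `howpublished` and `note` fields, and the function name `bibtex_misc` are illustrative, not the generator's confirmed behavior.

```python
import re
from datetime import date


def bibtex_misc(title: str, url: str, accessed: date) -> str:
    """Format a single BibTeX @misc entry for a web source."""
    words = title.split()
    # Hypothetical key scheme: first title word (letters/digits only) + access year.
    first_word = re.sub(r"\W", "", words[0]) if words else "source"
    key = f"{first_word}{accessed.year}"
    lines = [
        "@misc{" + key + ",",
        "  title = {" + title + "},",
        "  howpublished = {\\url{" + url + "}},",
        "  note = {Accessed: " + accessed.isoformat() + "}",
        "}",
    ]
    return "\n".join(lines)


print(bibtex_misc("Embodied AI for robotics", "https://example.com/a", date(2025, 11, 22)))
```

The resulting key (here `Embodied2025`) is what a paper would cite, for example `[@Embodied2025]` in Pandoc syntax.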
**Example workflow:**
```bash
# 1. Research topics
python scripts/firecrawl_research.py topics.md ./research
# 2. Generate bibliography
python scripts/generate_bibliography.py research/*.md -o refs.bib
# 3. Copy template
cp assets/templates/pandoc-scholarly-paper.md paper.md
# 4. Edit paper.md (add content, cite sources)
# 5. Convert to PDF
python scripts/convert_academic.py paper.md pdf
```
### Citation Examples
**Pandoc syntax:**
```markdown
Recent research [@Smith2024] shows...
Multiple studies [@Jones2023; @Brown2024] indicate...
```
**MyST syntax:**
```markdown
Recent research {cite}`Smith2024` shows...
Multiple studies {cite}`Jones2023,Brown2024` indicate...
```
### Example Bibliography File
An example bibliography is provided in `assets/references.bib` with common entry types:
- Journal articles (`@article`)
- Conference papers (`@inproceedings`)
- Books (`@book`)
- PhD theses (`@phdthesis`)
- Web resources (`@misc`)
- Preprints (`@article` with arXiv)
## Tips
1. **Organize topics hierarchically** - Use `##` for main topics, `###` for subtopics
2. **Use descriptive names** - Topic text becomes the filename, so make it clear
3. **Batch processing** - Group related topics in one file for efficiency
4. **Output organization** - Create separate folders for different research projects
5. **Content review** - Results are truncated at 3000 characters per source for readability; follow the source URL for the full text
6. **Academic workflow** - Use bibliography generator to cite research sources in papers
7. **Template customization** - Modify templates for your field's citation style
## Limitations
- **No summarization** - Returns raw scraped content, not summaries
- **No deduplication** - Duplicate sources may appear across topics
- **No quality ranking** - All results treated equally
- **New files only** - Does not append to existing files
- **Free tier constraints** - Rate limiting affects processing speed