
firecrawl-search skill


This skill enables fast web search and data extraction using the Firecrawl API to crawl sites, scrape pages, and gather structured data.

npx playbooks add skill openclaw/skills --skill firecrawl-search

Review the files below or copy the command above to add this skill to your agents.

SKILL.md
---
name: firecrawl
description: Web search and scraping via Firecrawl API. Use when you need to search the web, scrape websites (including JS-heavy pages), crawl entire sites, or extract structured data from web pages. Requires FIRECRAWL_API_KEY environment variable.
---

# Firecrawl

Web search and scraping via Firecrawl API.

## Prerequisites

Set `FIRECRAWL_API_KEY` in your environment or `.env` file:
```bash
export FIRECRAWL_API_KEY=fc-xxxxxxxxxx
```

## Quick Start

### Search the web
```bash
firecrawl_search "your search query" --limit 10
```

### Scrape a single page
```bash
firecrawl_scrape "https://example.com"
```

### Crawl an entire site
```bash
firecrawl_crawl "https://example.com" --max-pages 50
```

## API Reference

See [references/api.md](references/api.md) for detailed API documentation and advanced options.

## Scripts

- `scripts/search.py` - Search the web with Firecrawl
- `scripts/scrape.py` - Scrape a single URL
- `scripts/crawl.py` - Crawl an entire website
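
These scripts are thin wrappers over the HTTP API. As a rough illustration, a minimal search wrapper in the spirit of `scripts/search.py` might look like the sketch below; the endpoint, header, and body field names are assumptions based on Firecrawl's public v1 API, not this skill's actual code:

```python
# Hypothetical sketch of a search wrapper; endpoint and field names are
# assumptions based on Firecrawl's documented v1 API, not copied from
# this skill's scripts/search.py.
import json
import os
import urllib.request

API_URL = "https://api.firecrawl.dev/v1/search"  # assumed endpoint

def build_request(query: str, limit: int = 10) -> tuple[str, dict, bytes]:
    """Build the URL, headers, and JSON body for a search call."""
    key = os.environ.get("FIRECRAWL_API_KEY")
    if not key:
        raise RuntimeError("FIRECRAWL_API_KEY is not set")
    headers = {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
    body = json.dumps({"query": query, "limit": limit}).encode()
    return API_URL, headers, body

def search(query: str, limit: int = 10) -> dict:
    """POST the search request and return the parsed JSON response."""
    url, headers, body = build_request(query, limit)
    req = urllib.request.Request(url, data=body, headers=headers)
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)

if __name__ == "__main__":
    import sys
    print(json.dumps(search(sys.argv[1]), indent=2))
```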

Overview

This skill provides web search and scraping capabilities using the Firecrawl API. It lets you perform web searches, scrape single pages (including JavaScript-heavy content), and crawl entire websites to extract structured data. The skill requires a `FIRECRAWL_API_KEY` set in your environment to authenticate requests.

How this skill works

The skill calls the Firecrawl API to run searches, render and scrape pages, or follow links for site-wide crawls. Firecrawl renders pages in a headless browser so JavaScript-driven content is captured, then returns HTML or structured extraction results. Command-line scripts wrap the common flows: search, single-page scrape, and full-site crawl with pagination and page limits.
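
Concretely, each of those flows boils down to a small JSON request body. This sketch builds one for a single-page scrape; the option names (`formats`, `onlyMainContent`) are assumptions based on Firecrawl's public v1 `/scrape` documentation and should be verified against the current API reference:

```python
# Sketch: build the JSON body for an assumed Firecrawl v1 /scrape request.
# Option names ("formats", "onlyMainContent") are assumptions from
# Firecrawl's public docs, not taken from this skill's scripts.
def build_scrape_payload(url: str,
                         formats: tuple = ("markdown",),
                         only_main_content: bool = True) -> dict:
    """Return the request body for a single-page scrape."""
    return {
        "url": url,
        "formats": list(formats),              # e.g. ["markdown", "html"]
        "onlyMainContent": only_main_content,  # skip nav/footer boilerplate
    }
```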

When to use it

  • Perform broad web searches and gather ranked search results programmatically.
  • Extract content from pages that require JavaScript rendering (SPAs, dynamic loading).
  • Crawl and archive an entire site for backup, analysis, or offline indexing.
  • Collect structured data fields (titles, prices, articles) from multiple pages automatically.
  • Automate recurring scraping tasks with configurable page limits and filters.
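
For the programmatic-search case above, results usually need flattening into uniform records before feeding a dashboard or pipeline. The response shape assumed here (a `data` list of objects with `title`/`url` fields) is a guess at Firecrawl's search output, not a documented contract:

```python
# Sketch: flatten an assumed search response into title/url records.
def normalize_results(response: dict) -> list:
    """Reduce a search response to {title, url} records, dropping
    entries that lack a URL."""
    records = []
    for item in response.get("data", []):
        url = item.get("url")
        if not url:
            continue
        records.append({"title": item.get("title", ""), "url": url})
    return records
```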

Best practices

  • Always set `FIRECRAWL_API_KEY` as an environment variable; never commit it to version control or expose it publicly.
  • Respect robots.txt and the target site’s terms of service; configure crawl limits and polite rate limits.
  • Start with small limits (max pages) when testing to avoid excessive requests and to validate selectors.
  • Use structured extraction rules or post-process returned HTML to normalize fields across pages.
  • Log and monitor crawl results to detect changes in page structure that break extraction.
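
One generic way to keep requests polite and resilient is capped exponential backoff between retries. The helper below is a sketch of that practice, not part of the skill's scripts:

```python
# Sketch: capped exponential backoff schedule for polite retries.
def backoff_delays(retries: int = 5, base: float = 1.0,
                   cap: float = 30.0) -> list:
    """Seconds to sleep before each retry: base, 2*base, 4*base, ...
    capped at `cap` so a long outage doesn't stall the crawl forever."""
    return [min(cap, base * (2 ** attempt)) for attempt in range(retries)]

# Usage sketch: time.sleep(delay) between failed requests, e.g.
#   for delay in backoff_delays():
#       ... retry the request, sleep(delay) on failure ...
```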

Example use cases

  • Search for recent news articles matching a topic and extract headlines and links for a research dashboard.
  • Scrape product pages that render prices via JavaScript and export price and availability data to CSV.
  • Crawl a client’s website to produce an offline archive or sitemap with extracted metadata for migration.
  • Run scheduled crawls to track changes in competitor pages and alert when key fields change.
  • Aggregate content from multiple blogs for topic modeling or downstream NLP analysis.
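
As a sketch of the price-export case, scraped fields can be written out with the standard `csv` module. The record fields here (`url`, `price`, `in_stock`) are hypothetical names chosen for illustration:

```python
# Sketch: serialize scraped product records (hypothetical fields) to CSV.
import csv
import io

def records_to_csv(records: list) -> str:
    """Write url/price/in_stock records to CSV text."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["url", "price", "in_stock"])
    writer.writeheader()
    for rec in records:
        writer.writerow(rec)
    return buf.getvalue()
```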

FAQ

How do I authenticate requests?

Set `FIRECRAWL_API_KEY` in your environment or `.env` file (e.g., `export FIRECRAWL_API_KEY=fc-xxxxxxxxxx`). The scripts read this variable to authenticate with Firecrawl.

Can it scrape JavaScript-heavy pages?

Yes. Firecrawl executes headless rendering so JavaScript-driven content is captured, allowing extraction from SPAs and dynamically loaded elements.
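
For pages that load content asynchronously, a render wait is often needed. The sketch below assumes Firecrawl's documented `waitFor` option (a delay in milliseconds before capture); check the current API reference for the exact name and semantics:

```python
# Sketch: request body for scraping a JS-heavy page.
# "waitFor" is an assumption based on Firecrawl's public docs.
def build_js_scrape_payload(url: str, wait_ms: int = 2000) -> dict:
    """Body for a scrape that waits `wait_ms` ms for client-side rendering."""
    return {"url": url, "formats": ["markdown"], "waitFor": wait_ms}
```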