home / skills / firecrawl / cli / firecrawl-scrape
This skill extracts clean, llm-optimized markdown from one or more URLs, including JS-rendered pages, for quick content use.
npx playbooks add skill firecrawl/cli --skill firecrawl-scrapeReview the files below or copy the command above to add this skill to your agents.
---
name: firecrawl-scrape
description: |
Extract clean markdown from any URL, including JavaScript-rendered SPAs. Use this skill whenever the user provides a URL and wants its content, says "scrape", "grab", "fetch", "pull", "get the page", "extract from this URL", or "read this webpage". Handles JS-rendered pages, multiple concurrent URLs, and returns LLM-optimized markdown. Use this instead of WebFetch for any webpage content extraction.
allowed-tools:
- Bash(firecrawl *)
- Bash(npx firecrawl *)
---
# firecrawl scrape
Scrape one or more URLs. Returns clean, LLM-optimized markdown. Multiple URLs are scraped concurrently.
## When to use
- You have a specific URL and want its content
- The page is static or JS-rendered (SPA)
- Step 2 in the [workflow escalation pattern](firecrawl-cli): search → **scrape** → map → crawl → browser
## Quick start
```bash
# Basic markdown extraction
firecrawl scrape "<url>" -o .firecrawl/page.md
# Main content only, no nav/footer
firecrawl scrape "<url>" --only-main-content -o .firecrawl/page.md
# Wait for JS to render, then scrape
firecrawl scrape "<url>" --wait-for 3000 -o .firecrawl/page.md
# Multiple URLs (each saved to .firecrawl/)
firecrawl scrape https://example.com https://example.com/blog https://example.com/docs
# Get markdown and links together
firecrawl scrape "<url>" --format markdown,links -o .firecrawl/page.json
```
## Options
| Option | Description |
| ------------------------ | ---------------------------------------------------------------- |
| `-f, --format <formats>` | Output formats: markdown, html, rawHtml, links, screenshot, json |
| `-H` | Include HTTP headers in output |
| `--only-main-content` | Strip nav, footer, sidebar — main content only |
| `--wait-for <ms>` | Wait for JS rendering before scraping |
| `--include-tags <tags>` | Only include these HTML tags |
| `--exclude-tags <tags>` | Exclude these HTML tags |
| `-o, --output <path>` | Output file path |
## Tips
- **Try scrape before browser.** Scrape handles static pages and JS-rendered SPAs. Only escalate to browser when you need interaction (clicks, form fills, pagination).
- Multiple URLs are scraped concurrently — check `firecrawl --status` for your concurrency limit.
- Single format outputs raw content. Multiple formats (e.g., `--format markdown,links`) output JSON.
- Always quote URLs — shell interprets `?` and `&` as special characters.
- Naming convention: `.firecrawl/{site}-{path}.md`
## See also
- [firecrawl-search](../firecrawl-search/SKILL.md) — find pages when you don't have a URL
- [firecrawl-browser](../firecrawl-browser/SKILL.md) — when scrape can't get the content (interaction needed)
- [firecrawl-download](../firecrawl-download/SKILL.md) — bulk download an entire site to local files
This skill extracts clean, LLM-optimized Markdown from one or more URLs, including JavaScript-rendered single-page applications. It runs concurrent scrapes, supports selective tag inclusion/exclusion, and can wait for client-side rendering before capturing content. Use it whenever you need readable, structured page content for downstream analysis or indexing.
The skill loads pages (headless browser when needed) and renders JavaScript so dynamic content is captured. It strips navigation, sidebars, and footers when requested and can include multiple output formats (markdown, HTML, links, screenshots). Multiple URLs are fetched concurrently and returned as concise, LLM-friendly Markdown or JSON when multiple formats are selected.
Can it handle pages that require JavaScript to render?
Yes — the skill can wait for client-side rendering (use --wait-for) and uses a headless renderer to capture dynamic content.
What output formats are available?
Outputs include markdown, html, rawHtml, links, screenshot, and json. Request multiple formats to receive a JSON object containing each format.
How do I extract only the main article text?
Use the --only-main-content option to strip navigation, sidebars, and footers and return the primary content only.