This skill fetches and extracts clean, readable content from web pages using Jina Reader, returning title, text, and metadata for analysis.
Add this skill to your agents with `npx playbooks add skill vaayne/agent-kit --skill web-fetch`.
---
name: web-fetch
description: Fetch and extract clean content from URLs using Jina Reader API. Use when users need to read webpage content, extract article text, or fetch URL content for analysis. Triggers on "fetch this page", "read this URL", "extract content from", "get the content of", "what does this page say".
---
# Web Fetch
## Overview
Extract clean, readable content from any URL using Jina Reader API. Returns raw JSON with title, content, and metadata optimized for LLM consumption.
## When to Use
- User wants to read or analyze webpage content
- Need to extract article text from a URL
- Fetching documentation or reference pages
- Converting web pages to clean text for processing
## Workflow
1. Identify the URL from user request
2. Validate URL format
3. Run the fetch script
4. Present extracted content to user
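Step 2 can be sketched as follows. This is a minimal illustration, not the skill's actual validation logic: the helper name `is_valid_url` and the rule "absolute http(s) URL with a host" are assumptions.

```python
from urllib.parse import urlparse


def is_valid_url(url: str) -> bool:
    """Return True for absolute http(s) URLs that include a host.

    Hypothetical helper: the real script may apply different rules.
    """
    try:
        parsed = urlparse(url.strip())
    except ValueError:
        # urlparse raises ValueError on some malformed inputs (e.g. bad IPv6 literals)
        return False
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


print(is_valid_url("https://example.com"))  # True
print(is_valid_url("example.com"))          # False (no scheme)
```

Rejecting bad input before the network call keeps the "Invalid URL" failure cheap and distinct from timeouts and HTTP errors.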
## Usage
```bash
# Basic fetch
uv run --script scripts/web_fetch.py --url "https://example.com"
# With custom timeout
uv run --script scripts/web_fetch.py \
--url "https://example.com/article" \
--timeout 60
```
## Parameters
| Parameter | Default | Description |
| ----------- | ---------- | ------------------------------------- |
| `--url` | (required) | URL to fetch and extract content from |
| `--timeout` | 30 | Request timeout in seconds |
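For illustration, the two parameters above could be declared with `argparse` roughly like this. The function name `build_parser` and the integer type for `--timeout` are assumptions; the real script may parse its arguments differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical CLI definition mirroring the parameter table."""
    parser = argparse.ArgumentParser(
        description="Fetch clean page content via the Jina Reader API."
    )
    parser.add_argument(
        "--url", required=True, help="URL to fetch and extract content from"
    )
    parser.add_argument(
        "--timeout", type=int, default=30, help="Request timeout in seconds"
    )
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["--url", "https://example.com", "--timeout", "60"])
print(args.url, args.timeout)
```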
## Output Contract
| Scenario | stdout | stderr | exit code |
| ----------- | ------------------ | ------------------ | --------- |
| Success | Raw JSON from Jina | (empty) | 0 |
| Invalid URL | (empty) | Error message | 1 |
| Timeout | (empty) | Timeout error | 1 |
| HTTP Error | (empty) | HTTP error details | 1 |
Success output contains:
- Page title and description
- Clean extracted content (markdown-formatted)
- URL and metadata
- Token usage information
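A caller can rely on this contract when wrapping the script. The sketch below is one possible wrapper, not part of the skill: the function names are hypothetical, and the JSON field names (`data`, `title`, `url`, `content`) are assumptions about the Jina Reader response shape that should be checked against real output.

```python
import json
import subprocess


def interpret_result(returncode: int, stdout: str, stderr: str) -> dict:
    """Apply the output contract: exit code 0 means stdout holds JSON."""
    if returncode != 0:
        raise RuntimeError(f"web_fetch failed (exit {returncode}): {stderr.strip()}")
    return json.loads(stdout)


def run_web_fetch(url: str, timeout: int = 30) -> dict:
    """Run the script as shown in Usage and return the parsed JSON."""
    cmd = ["uv", "run", "--script", "scripts/web_fetch.py",
           "--url", url, "--timeout", str(timeout)]
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return interpret_result(proc.returncode, proc.stdout, proc.stderr)


def summarize(payload: dict) -> dict:
    """Pull a few fields out of the response, tolerating missing keys."""
    # ASSUMPTION: fields are nested under "data"; fall back to the top level.
    data = payload.get("data", payload)
    return {
        "title": data.get("title"),
        "url": data.get("url"),
        "content_chars": len(data.get("content") or ""),
    }
```

Separating `interpret_result` from the subprocess call lets the contract handling be exercised without network access or `uv` installed.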
## Prerequisites
- Uses Jina Reader API (no API key required)
- Requires `uv` for running PEP 723 scripts
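For readers unfamiliar with PEP 723, a script declares its requirements in a comment block at the top of the file, which `uv run --script` reads before executing it. The example below only shows the format; the Python version bound and the empty dependency list are assumptions, not the contents of `scripts/web_fetch.py`.

```python
# /// script
# requires-python = ">=3.9"
# dependencies = []
# ///
# The block above is PEP 723 inline metadata. The values are illustrative
# assumptions; the real script may list third-party packages here.
import sys

print(f"running under Python {sys.version_info.major}.{sys.version_info.minor}")
```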
## Examples
### Fetch a webpage
```bash
uv run --script scripts/web_fetch.py \
--url "https://docs.python.org/3/whatsnew/3.12.html"
```
### Fetch with longer timeout for slow pages
```bash
uv run --script scripts/web_fetch.py \
--url "https://example.com/large-article" \
--timeout 60
```
This skill fetches and extracts clean, readable content from any public URL using the Jina Reader API. It returns structured JSON containing title, markdown-formatted content, metadata, and token-usage info optimized for downstream LLM processing. Use it when you need reliable article text extraction or to convert web pages into text for analysis or summarization.
The skill identifies and validates the URL, sends a fetch request to the Jina Reader API, and prints the raw JSON response to stdout. That response carries the title, description, and main article body (cleaned and formatted as markdown), plus metadata such as the source URL and token counts. Invalid URLs, timeouts, and HTTP failures are reported on stderr with exit code 1.
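The flow described above might look like the following sketch, which uses only the standard library. It is not the skill's actual source: the `r.jina.ai` URL prefix and the `Accept: application/json` header reflect the publicly documented Jina Reader interface as best understood here, and the function names and error messages are hypothetical.

```python
import socket
import urllib.error
import urllib.request

# ASSUMPTION: Jina Reader is addressed by prefixing the target URL.
READER_PREFIX = "https://r.jina.ai/"


def build_request(url: str) -> urllib.request.Request:
    """Build the Reader request; the Accept header asks for a JSON response."""
    return urllib.request.Request(
        READER_PREFIX + url, headers={"Accept": "application/json"}
    )


def fetch(url: str, timeout: int = 30, opener=urllib.request.urlopen):
    """Return (exit_code, stdout_text, stderr_text) per the output contract."""
    try:
        with opener(build_request(url), timeout=timeout) as resp:
            return 0, resp.read().decode("utf-8"), ""
    except urllib.error.HTTPError as exc:
        # HTTPError must be caught before URLError, its parent class.
        return 1, "", f"HTTP error {exc.code}: {exc.reason}"
    except (TimeoutError, socket.timeout):
        return 1, "", f"Timeout after {timeout}s"
    except urllib.error.URLError as exc:
        # Connection-phase timeouts arrive wrapped in URLError.
        if isinstance(exc.reason, (TimeoutError, socket.timeout)):
            return 1, "", f"Timeout after {timeout}s"
        return 1, "", f"Request failed: {exc.reason}"
```

A command-line entry point would print the second element to stdout, the third to stderr, and exit with the first. The `opener` parameter exists only so the error mapping can be tested without network access.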
Do I need an API key to use the Jina Reader through this skill?
No API key is required; the skill uses the public Jina Reader API as provided.
What happens if the page is behind authentication or blocked?
The fetch will fail with an HTTP error; the skill reports the error details on stderr and returns a nonzero exit code.
Can I adjust request timeout?
Yes. Pass `--timeout` with a value in seconds (default 30) to accommodate slow pages.