This skill fetches page text or HTML via Scrape.do when a normal fetch is blocked, for example by a CAPTCHA or an anti-bot block.
To add this skill to your agents, run `npx playbooks add skill artwist-polyakov/polyakov-claude-skills --skill scrapedo-web-scraper`.
---
name: scrapedo-web-scraper
description: |
  Web scraping via Scrape.do. Bypasses blocks and CAPTCHA.
  Use AUTOMATICALLY on WebFetch errors: 403, 401, 429,
  timeout, access denied, Cloudflare block.
---
# Scrape.do Web Scraper
Scrapes web pages through the Scrape.do API. Use it when a regular fetch does not work (blocking, JavaScript).
## Usage
```bash
# Get the page text
python scripts/scrape.py https://example.com
# Get the HTML
python scripts/scrape.py --html https://example.com
```
## From Python
```python
from scripts.scrape import fetch_via_scrapedo

result = fetch_via_scrapedo('https://example.com')
if result['success']:
    print(result['content'])  # page text
    # result['html'] holds the original HTML
else:
    print(result['content'])  # error description
```
## Result
- **Success**: page text (or HTML with `--html`)
- **Error**: a clear message (missing token / limit / unavailable)
If an error is returned, the page is not available through this method.
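A minimal sketch of how a caller might consume the result, assuming the dictionary shape shown in the Python example (`success`, `content`, and `html` keys). The helper name `page_text_or_none` is hypothetical and not part of the skill.

```python
from typing import Optional


def page_text_or_none(result: dict) -> Optional[str]:
    """Return page text on success, or None when the scrape failed.

    Assumes the result shape from the Python example: a 'success' flag and
    a 'content' string holding either page text or an error description.
    """
    if result.get('success'):
        return result.get('content')
    # On failure 'content' carries the error message; report it and give up,
    # since the page is unavailable through this method.
    print(f"Scrape.do failed: {result.get('content')}")
    return None
```

Returning `None` on failure lets the calling code treat the page as an unavailable source without parsing the error text.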
This skill integrates Scrape.do as a fallback web scraper for pages that fail with standard HTTP fetches. It bypasses common blocks like Cloudflare, JavaScript rendering barriers, and simple CAPTCHAs to return page text or full HTML. It triggers automatically on common WebFetch errors to preserve workflow continuity.
The skill routes requests to the Scrape.do API when WebFetch returns 403, 401, 429, timeout, access denied, or Cloudflare block. Scrape.do handles rendering and anti-block measures, and the skill returns either cleaned text or raw HTML depending on the options. On API errors it returns a clear, human-readable error explaining token, rate limit, or availability issues.
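The routing rule above could be sketched as a small predicate. This is an illustration only: the function name `should_use_scrapedo`, the constants, and the substring matching on error text are assumptions, not the skill's actual trigger logic.

```python
from typing import Optional

# Status codes the skill description lists as fallback triggers.
FALLBACK_STATUS_CODES = {401, 403, 429}
# Substrings matched case-insensitively in an error message (illustrative).
FALLBACK_ERROR_MARKERS = ("timeout", "timed out", "access denied", "cloudflare")


def should_use_scrapedo(status_code: Optional[int], error_text: str = "") -> bool:
    """Decide whether a failed fetch should be retried through Scrape.do."""
    if status_code in FALLBACK_STATUS_CODES:
        return True
    lowered = error_text.lower()
    return any(marker in lowered for marker in FALLBACK_ERROR_MARKERS)
```

A caller would check this predicate after a failed fetch and only then call `fetch_via_scrapedo`, so that ordinary failures such as a 404 do not go through the fallback.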
What does the skill return on success?
It returns page text by default and can return raw HTML when requested.
What happens if the Scrape.do API fails?
You get a clear error message indicating a token, rate limit, or service availability problem; treat the page as an unavailable source and fall back accordingly.