
read-github skill

/skills/am-will/read-github

This skill helps you access GitHub project docs via gitmcp.io, delivering semantic search, structured outputs, and unified code/document aggregation.

npx playbooks add skill openclaw/skills --skill read-github

Run the command above to add this skill to your agents.

SKILL.md
---
name: read-github
description: >
  Read GitHub repos the RIGHT way - via gitmcp.io instead of raw scraping. Why this beats web search:
  (1) Semantic search across docs, not just keyword matching, (2) Smart code navigation with accurate
  file structure - zero hallucinations on repo layout, (3) Proper markdown output optimized for LLMs,
  not raw HTML/JSON garbage, (4) Aggregates README + /docs + code in one clean interface,
  (5) Respects rate limits and robots.txt. Stop pasting raw GitHub URLs - use this instead.
---

# Read GitHub Docs

Access GitHub repository documentation and code via the gitmcp.io MCP service.

## URL Conversion

Convert GitHub URLs to gitmcp.io:
- `github.com/owner/repo` → `gitmcp.io/owner/repo`
- `https://github.com/karpathy/llm-council` → `https://gitmcp.io/karpathy/llm-council`
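
The conversion is a host swap, so it can be sketched in a few lines of Python. The helper name `to_gitmcp_url` is hypothetical and not part of `scripts/gitmcp.py`; the sketch assumes the input is a `github.com/owner/repo` URL, with or without a scheme, and it drops anything after the repo segment.

```python
from urllib.parse import urlparse


def to_gitmcp_url(github_url: str) -> str:
    """Rewrite a github.com repo URL to gitmcp.io (hypothetical helper)."""
    # Accept scheme-less input such as "github.com/owner/repo".
    if "://" not in github_url:
        github_url = "https://" + github_url
    parsed = urlparse(github_url)
    if parsed.netloc.lower() not in ("github.com", "www.github.com"):
        raise ValueError(f"not a github.com URL: {github_url}")
    # Keep only the owner and repo path segments.
    parts = [p for p in parsed.path.split("/") if p]
    if len(parts) < 2:
        raise ValueError(f"expected owner/repo in: {github_url}")
    owner, repo = parts[0], parts[1]
    return f"https://gitmcp.io/{owner}/{repo}"
```

For example, `to_gitmcp_url("github.com/owner/repo")` yields `https://gitmcp.io/owner/repo`.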

## CLI Usage

The `scripts/gitmcp.py` script provides CLI access to repository docs.

### List Available Tools

```bash
python3 scripts/gitmcp.py list-tools owner/repo
```

### Fetch Documentation

Retrieve the repository's full documentation (README, docs, etc.):

```bash
python3 scripts/gitmcp.py fetch-docs owner/repo
```

### Search Documentation

Semantic search within repository documentation:

```bash
python3 scripts/gitmcp.py search-docs owner/repo "query"
```

### Search Code

Search code using the GitHub Search API (exact match):

```bash
python3 scripts/gitmcp.py search-code owner/repo "function_name"
```

### Fetch Referenced URL

Fetch content from URLs mentioned in documentation:

```bash
python3 scripts/gitmcp.py fetch-url owner/repo "https://example.com/doc"
```

### Direct Tool Call

Call any MCP tool directly:

```bash
python3 scripts/gitmcp.py call owner/repo tool_name '{"arg": "value"}'
```
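
Hand-quoting the JSON argument in a shell is easy to get wrong, so when driving the script from Python it may help to serialize the arguments with `json.dumps`. This is a minimal sketch, assuming the `call` subcommand takes the repo, tool name, and a JSON string in the order shown above. `build_call_argv` is a hypothetical helper, and the `arg` key is only the placeholder from the example, not a real tool parameter.

```python
import json
import sys
from typing import Any, Dict, List


def build_call_argv(repo: str, tool_name: str, arguments: Dict[str, Any]) -> List[str]:
    """Build the argv for `gitmcp.py call` (hypothetical helper)."""
    return [
        sys.executable,          # the current Python interpreter
        "scripts/gitmcp.py",     # script path as given in this document
        "call",
        repo,
        tool_name,
        json.dumps(arguments),   # JSON string, no shell quoting needed
    ]
```

The resulting list can be passed to `subprocess.run(argv, check=True)` from the skill's root directory.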

## Tool Names

Tool names are generated dynamically and embed the repo name, with hyphens replaced by underscores. The owner is not included:
- `karpathy/llm-council` → `fetch_llm_council_documentation`
- `facebook/react` → `fetch_react_documentation`
- `my-org/my-repo` → `fetch_my_repo_documentation`
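
The naming pattern in these examples can be sketched as follows. The helper `tool_names` is hypothetical, and the only normalization it applies is the hyphen-to-underscore rule visible above; how gitmcp.io treats dots, uppercase letters, or other characters is not stated here, so `list-tools` remains the way to confirm the real names.

```python
from typing import Dict


def tool_names(repo: str) -> Dict[str, str]:
    """Derive the expected MCP tool names for "owner/repo" (hypothetical helper)."""
    # Only the repo part is used; the owner is dropped, as in the examples.
    name = repo.split("/")[-1]
    # Assumption: hyphens become underscores; nothing else is normalized.
    slug = name.replace("-", "_")
    return {
        "fetch_docs": f"fetch_{slug}_documentation",
        "search_docs": f"search_{slug}_documentation",
        "search_code": f"search_{slug}_code",
        # This tool name does not depend on the repo.
        "fetch_url": "fetch_generic_url_content",
    }
```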

## Available MCP Tools

For any repository, these tools are available:

1. **fetch_{repo}_documentation** - Fetch entire documentation. Call first for general questions.
2. **search_{repo}_documentation** - Semantic search within docs. Use for specific queries.
3. **search_{repo}_code** - Search code via GitHub API (exact match). Returns matching files.
4. **fetch_generic_url_content** - Fetch any URL referenced in docs, respecting robots.txt.

## Workflow

1. When given a GitHub repo, first fetch documentation to understand the project
2. Use search-docs for specific questions about usage or features
3. Use search-code to find implementations or specific functions
4. Use fetch-url to retrieve external references mentioned in docs
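
The four steps above can be sketched as an ordered list of CLI invocations. `plan_workflow` is a hypothetical helper that only builds the commands; it assumes the subcommands and argument order shown in the CLI Usage section and does not run anything itself.

```python
import sys
from typing import Iterable, List, Optional


def plan_workflow(
    repo: str,
    question: Optional[str] = None,
    symbol: Optional[str] = None,
    urls: Iterable[str] = (),
) -> List[List[str]]:
    """Return the gitmcp.py invocations for the workflow, in order."""
    base = [sys.executable, "scripts/gitmcp.py"]
    # Step 1: always fetch the documentation first.
    steps = [base + ["fetch-docs", repo]]
    # Step 2: semantic search for a usage or feature question.
    if question:
        steps.append(base + ["search-docs", repo, question])
    # Step 3: exact-match code search for a function or symbol.
    if symbol:
        steps.append(base + ["search-code", repo, symbol])
    # Step 4: fetch external references mentioned in the docs.
    for url in urls:
        steps.append(base + ["fetch-url", repo, url])
    return steps
```

Each entry can be run with `subprocess.run(argv, capture_output=True, text=True, check=True)`, reading the output of one step before deciding whether the next is needed.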

Overview

This skill provides reliable access to GitHub repositories via the gitmcp.io MCP service, returning structured documentation and code results instead of raw HTML. It consolidates README, /docs, and code search into a single interface while honoring rate limits and robots.txt. Use it to avoid broken layouts, hallucinated repo structures, and noisy raw scraping outputs.

How this skill works

The skill converts GitHub URLs to gitmcp.io endpoints and exposes a set of MCP tools that fetch documentation, run semantic searches across docs, search code using the GitHub Search API, and retrieve referenced URLs. A CLI script (scripts/gitmcp.py) wraps those tools so you can list tools, fetch full docs, run semantic queries, search code, and call tools directly with JSON arguments. Tool names are generated from the repo name to avoid collisions.

When to use it

  • Research a repository quickly without opening GitHub in a browser
  • Get a single cleaned document that aggregates README and /docs for LLM consumption
  • Run semantic queries across a repo’s documentation (not just keyword matches)
  • Locate exact code implementations or function definitions via the GitHub Search API
  • Fetch external references mentioned in docs while respecting robots.txt and rate limits

Best practices

  • Always call fetch_{repo}_documentation first to get a complete context before asking targeted questions
  • Use search_{repo}_documentation for conceptual or usage questions and search_{repo}_code for implementation-level lookups
  • Prefer semantic search for intent-driven queries and exact-match code search for function names or signatures
  • Convert GitHub URLs to gitmcp.io form to ensure consistent results and correct tool name resolution
  • Respect rate limits: batch requests and avoid repeatedly fetching the same large docs
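
One way to avoid refetching the same large docs is to memoize them per repo within a session. The sketch below is illustrative only: `DocsCache` is a hypothetical class, and the `fetch` callable stands in for whatever actually retrieves the documentation (for instance, a wrapper around the `fetch-docs` command).

```python
from typing import Callable, Dict


class DocsCache:
    """Memoize full-documentation fetches per repo (illustrative only)."""

    def __init__(self, fetch: Callable[[str], str]) -> None:
        self._fetch = fetch              # caller-supplied fetch function
        self._docs: Dict[str, str] = {}  # repo -> documentation text

    def get(self, repo: str) -> str:
        # Fetch only on the first request for a given repo.
        if repo not in self._docs:
            self._docs[repo] = self._fetch(repo)
        return self._docs[repo]
```

The cache lives in memory only, so it does not pick up documentation changes made during the session.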

Example use cases

  • Summarize a project by fetching its full documentation and extracting key architecture and usage points
  • Find where a specific function is implemented by running search_{repo}_code with the function name
  • Answer how-to questions using semantic search across a repo’s /docs folder rather than raw README text
  • Retrieve the contents of a linked external spec or tutorial mentioned in docs using fetch_generic_url_content
  • Integrate the CLI into automation to archive or index multiple repos via gitmcp.io conversion

FAQ

Do I still need GitHub URLs?

Not in their raw form. You only need the owner/repo path: convert it to gitmcp.io/owner/repo and call the corresponding MCP tools.

Which tool should I call first?

Call fetch_{repo}_documentation first to build context, then use search_{repo}_documentation or search_{repo}_code for focused queries.