This skill converts Git repositories into structured plain-text digests optimized for large language model analysis and rapid insights.
To add this skill to your agents, run `npx playbooks add skill git-fg/thecattoolkit --skill ingesting-git`.
---
name: ingesting-git
description: "Transforms repositories into structured plain-text digests optimized for LLM consumption. Use when analyzing GitHub repositories, digesting codebases, or ingesting git repos for AI analysis."
allowed-tools: [Read, Write, Edit, Bash, Grep]
---
# GitIngest Protocol
## Quick Start
**Execute via Script:**
```bash
uv run --with gitingest scripts/ingest.py <url_or_path> [options]
```
**Examples:**
```bash
# Ingest remote repo
uv run --with gitingest scripts/ingest.py https://github.com/user/repo
# Ingest with filtering
uv run --with gitingest scripts/ingest.py . -i "*.py" -e "tests/*"
```
## Output Format
GitIngest returns **structured plain text** optimized for LLM consumption, in three sections:
**Section 1: Repository Summary**
```
Repository: owner/repo-name
Files analyzed: 42
Estimated tokens: 15.2k
```
**Section 2: Directory Structure**
```
Directory structure:
└── project-name/
    ├── src/
    │   ├── main.py
    │   └── utils.py
    ├── tests/
    │   └── test_main.py
    └── README.md
```
**Section 3: File Contents**
```
================================================
FILE: src/main.py
================================================
def hello_world():
    print("Hello, World!")
```
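The file-contents section can be split back into individual files. The sketch below is a minimal example that assumes every file header looks like the sample above (a line of `=` characters, a `FILE: <path>` line, another line of `=` characters); `split_file_blocks` is a hypothetical helper, not part of gitingest.

```python
import re

# Assumed header shape, taken from the sample above: a rule of '=' characters,
# a "FILE: <path>" line, and a second rule.
HEADER = re.compile(r"^={10,}\nFILE: (?P<path>.+)\n={10,}\n", re.MULTILINE)


def split_file_blocks(content: str) -> dict[str, str]:
    """Map each file path to the text that follows its header."""
    matches = list(HEADER.finditer(content))
    blocks = {}
    for i, match in enumerate(matches):
        # A block runs until the next header, or to the end of the digest.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(content)
        blocks[match.group("path").strip()] = content[match.end():end].strip("\n")
    return blocks


RULE = "=" * 48
sample = "\n".join([
    RULE, "FILE: src/main.py", RULE,
    "def hello_world():",
    '    print("Hello, World!")',
    "",
    RULE, "FILE: README.md", RULE,
    "# project-name",
    "",
])

blocks = split_file_blocks(sample)
for path, text in blocks.items():
    print(path, len(text.splitlines()), "line(s)")
```

A file whose own text contains the same header pattern would be split incorrectly, so treat this as a convenience for quick inspection rather than a strict parser.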
## Configuration Options
| Option | Purpose | Example |
|:-------|:--------|:--------|
| `-i` / `--include-pattern` | Include files matching patterns | `-i "*.py" -i "*.js"` |
| `-e` / `--exclude-pattern` | Exclude files matching patterns | `-e "node_modules/*"` |
| `-s` / `--max-size` | Maximum file size in bytes | `-s 102400` |
| `-b` / `--branch` | Specify branch | `-b main` |
| `-t` / `--token` | GitHub access token | `-t $GITHUB_TOKEN` |
| `-o` | Output file (or `-` for stdout) | `-o digest.txt` |
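
When the command is assembled from code rather than typed by hand, building it as an argument list avoids shell-quoting problems with glob patterns. The sketch below uses only the flags listed in the table; `build_ingest_command` is a hypothetical helper, and the token flag is left out on purpose so secrets do not end up in logged command lines.

```python
import shlex


def build_ingest_command(source, include=(), exclude=(), max_size=None,
                         branch=None, output=None):
    """Assemble the Quick Start command as an argument list (hypothetical helper)."""
    cmd = ["uv", "run", "--with", "gitingest", "scripts/ingest.py", source]
    for pattern in include:
        cmd += ["-i", pattern]      # one -i per include pattern
    for pattern in exclude:
        cmd += ["-e", pattern]      # one -e per exclude pattern
    if max_size is not None:
        cmd += ["-s", str(max_size)]
    if branch:
        cmd += ["-b", branch]
    if output:
        cmd += ["-o", output]
    return cmd


cmd = build_ingest_command(
    ".", include=["*.py"], exclude=["tests/*"], max_size=102400, output="digest.txt"
)
# shlex.join renders a copy-pasteable string with the patterns quoted.
print(shlex.join(cmd))
```

The list can be passed directly to `subprocess.run(cmd, check=True)`, which bypasses the shell so the patterns reach the script unexpanded.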
## Common Exclude Patterns
```
node_modules/* # Dependencies
*.log # Log files
dist/* # Build outputs
build/* # Build directories
*.min.js # Minified files
*.lock # Lock files
```
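To preview which paths a pattern list would drop before running an ingest, the patterns can be tried against sample paths. This sketch uses Python's `fnmatch` as a stand-in; gitingest's own matcher may treat some patterns differently, so read the result as an approximation. `is_excluded` and the sample paths are illustrative.

```python
from fnmatch import fnmatchcase

EXCLUDE = ["node_modules/*", "*.log", "dist/*", "build/*", "*.min.js", "*.lock"]


def is_excluded(path, patterns=EXCLUDE):
    """Return True if any glob pattern matches the path (fnmatch semantics)."""
    return any(fnmatchcase(path, pattern) for pattern in patterns)


paths = [
    "src/main.py",
    "node_modules/lodash/index.js",
    "dist/app.min.js",
    "yarn.lock",
    "logs/app.log",
    "package-lock.json",
]
kept = [p for p in paths if not is_excluded(p)]
print(kept)
```

Under these semantics `package-lock.json` survives, because `*.lock` only matches names that end in `.lock`; a separate pattern such as `package-lock.json` is needed to drop it.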
## Implementation Protocol
When executing the gitingest skill:
1. **Assess Requirements**
   - Determine if CLI or Python integration is needed
   - Identify repository size and scope
   - Plan filtering strategy (include/exclude patterns)
2. **Setup Environment**
   - Verify gitingest installation
   - Check authentication for private repositories
   - Configure output destination
3. **Execute Ingestion**
   - Run gitingest with appropriate parameters
   - Monitor for errors and timeouts
   - Apply filtering and size limits
4. **Process Results**
   - Parse the three-section output format
   - Analyze summary, tree, and content
   - Generate insights and reports
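
For the "Process Results" step, the summary section can be read into fields and checked against a token budget before the digest is handed to a model. This is a minimal sketch that assumes the `Key: value` layout and the `15.2k` style shown in the sample summary; `parse_summary`, `token_estimate`, and `TOKEN_BUDGET` are hypothetical names.

```python
import re

TOKEN_BUDGET = 100_000  # assumed budget, adjust to the target model


def parse_summary(summary: str) -> dict[str, str]:
    """Split 'Key: value' lines into a dictionary."""
    fields = {}
    for line in summary.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def token_estimate(text: str) -> int:
    """Convert strings such as '15.2k' or '1.2M' to an integer count."""
    match = re.fullmatch(r"([\d.]+)\s*([kKmM]?)", text.strip())
    if not match:
        raise ValueError(f"unrecognized token estimate: {text!r}")
    scale = {"": 1, "k": 1_000, "m": 1_000_000}[match.group(2).lower()]
    return round(float(match.group(1)) * scale)


summary = "Repository: owner/repo-name\nFiles analyzed: 42\nEstimated tokens: 15.2k\n"
fields = parse_summary(summary)
tokens = token_estimate(fields["Estimated tokens"])
needs_filtering = tokens > TOKEN_BUDGET
print(fields["Repository"], tokens, needs_filtering)
```

If the estimate exceeds the budget, tightening the include/exclude patterns or lowering the per-file size limit and re-running the ingest is usually cheaper than truncating the digest afterwards.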
## Extended Documentation
For detailed integration examples, error handling patterns, and best practices:
- **Integration Examples:** `references/integration-examples.md`
## Integration with CatToolkit
**Usage Examples:**
```bash
# "Ingest this repository for AI analysis"
# → Uses gitingest to create structured digest
# "Analyze the codebase without dependencies"
# → Uses gitingest with exclude-patterns for node_modules, dist, etc.
# "Generate documentation from this repo"
# → Uses gitingest + filtering to extract docs and code structure
```
The gitingest skill feeds its digest to other CatToolkit skills:
- **deep-analysis**: Process gitingest output for comprehensive insights
- **software-engineering**: Analyze ingested code for quality and security
- **prompt-engineering**: Use repository context to generate better prompts
This skill transforms Git repositories into structured plain-text digests optimized for large language model consumption. It produces a three-part output: a concise repository summary, a readable directory tree, and full file-content blocks. Use it to convert codebases into LLM-friendly context for analysis, documentation, or downstream tools.
The skill clones or reads a repository path, applies include/exclude patterns and size limits, and then emits a three-section plain-text digest (summary, directory structure, file contents). Configuration flags let you target branches, authenticate to private repos, and control output destination. The output is intentionally simple so other tools or LLMs can parse and consume it reliably.
**What does the output look like?**
Three plain-text sections: a short repository summary, an indented directory tree, and labeled file-content blocks separated by delimiter lines.
**How do I avoid ingesting dependencies or large files?**
Use exclude patterns (e.g., `node_modules/*`, `dist/*`), use include patterns to whitelist file types, and set a per-file size limit with `-s` / `--max-size`.