home / skills / yoanbernabeu / grepai-skills / grepai-ignore-patterns

grepai-ignore-patterns skill

/skills/configuration/grepai-ignore-patterns

This skill helps you tailor GrepAI indexing by configuring ignore patterns to exclude files and directories, reducing noise and index size.

npx playbooks add skill yoanbernabeu/grepai-skills --skill grepai-ignore-patterns

Review the files below or copy the command above to add this skill to your agents.

Files (1)
SKILL.md
5.5 KB
---
name: grepai-ignore-patterns
description: Configure ignore patterns in GrepAI. Use this skill when excluding files and directories from indexing.
---

# GrepAI Ignore Patterns

This skill covers how to configure ignore patterns to exclude files and directories from GrepAI indexing.

## When to Use This Skill

- Excluding test files from search results
- Ignoring generated or vendored code
- Reducing index size by excluding unnecessary files
- Customizing which files GrepAI indexes

## How Ignore Patterns Work

GrepAI uses two sources for ignore patterns:

1. **`.grepai/config.yaml`** - Custom patterns you define
2. **`.gitignore`** - Automatically respected

## Configuration Location

```yaml
# .grepai/config.yaml
ignore:
  - pattern1
  - pattern2
```

## Pattern Syntax

### Directory Patterns

```yaml
ignore:
  # Exact directory name (matches anywhere)
  - node_modules
  - vendor
  - __pycache__

  # With trailing slash (explicit directory)
  - dist/
  - build/
  - coverage/
```

### File Patterns

```yaml
ignore:
  # Exact filename
  - package-lock.json
  - yarn.lock

  # Wildcard patterns
  - "*.min.js"
  - "*.min.css"
  - "*.map"
  - "*.lock"
```

### Path Patterns

```yaml
ignore:
  # Paths containing substring
  - /tests/
  - /spec/
  - /__tests__/

  # Specific paths
  - src/generated/
  - api/swagger/
```

### Glob Patterns

```yaml
ignore:
  # Double star (recursive)
  - "**/test/**"
  - "**/mock/**"

  # Single star (single level)
  - "*.test.js"
  - "*.spec.ts"
  - "*_test.go"
```

## Default Ignore Patterns

GrepAI's default configuration includes:

```yaml
ignore:
  # Version control
  - .git
  - .svn
  - .hg

  # GrepAI itself
  - .grepai

  # Package managers
  - node_modules
  - vendor
  - .npm
  - .yarn

  # Build outputs
  - target
  - dist
  - build
  - out

  # Cache directories
  - __pycache__
  - .pytest_cache
  - .mypy_cache
  - .cache

  # Framework outputs
  - .next
  - .nuxt
  - .output
```

## Common Ignore Configurations

### JavaScript/TypeScript Project

```yaml
ignore:
  - node_modules
  - dist
  - build
  - coverage
  - .nyc_output
  - "*.min.js"
  - "*.bundle.js"
  - "*.map"
  - package-lock.json
  - yarn.lock
  - pnpm-lock.yaml
```

### Go Project

```yaml
ignore:
  - vendor
  - bin
  - "*.pb.go"       # Protobuf generated
  - "*_mock.go"     # Mocks
  - "mocks/"
  - go.sum
```

### Python Project

```yaml
ignore:
  - __pycache__
  - .pytest_cache
  - .mypy_cache
  - .venv
  - venv
  - env
  - "*.pyc"
  - "*.pyo"
  - .eggs
  - "*.egg-info"
  - dist
  - build
```

### Rust Project

```yaml
ignore:
  - target
  - Cargo.lock
  - "*.rlib"
```

### Java/Kotlin Project

```yaml
ignore:
  - target
  - build
  - .gradle
  - "*.class"
  - "*.jar"
  - "*.war"
```

### Monorepo

```yaml
ignore:
  # Common
  - node_modules
  - dist
  - build
  - coverage

  # Specific packages to exclude
  - packages/legacy/
  - packages/deprecated/

  # Generated
  - "**/generated/**"
  - "**/__generated__/**"
```

## Excluding Test Files

To focus search on production code:

```yaml
ignore:
  # Test directories
  - tests/
  - test/
  - __tests__/
  - spec/

  # Test files by pattern
  - "*.test.js"
  - "*.test.ts"
  - "*.spec.js"
  - "*.spec.ts"
  - "*_test.go"
  - "test_*.py"
  - "*_test.py"
```

**Alternative:** Use search boosting instead to penalize (not exclude) tests:

```yaml
search:
  boost:
    penalties:
      - pattern: /tests/
        factor: 0.5
      - pattern: _test.
        factor: 0.5
```

## Excluding Generated Code

```yaml
ignore:
  # Generated markers
  - "**/generated/**"
  - "*.generated.*"
  - "*.gen.*"

  # Specific generators
  - "*.pb.go"           # Protobuf
  - "*.graphql.ts"      # GraphQL codegen
  - "*.d.ts"            # TypeScript declarations
  - "swagger_*.go"      # Swagger
  - "openapi_*.ts"      # OpenAPI
```

## Excluding Documentation

```yaml
ignore:
  - docs/
  - documentation/
  - "*.md"
  - "*.mdx"
  - "*.rst"
```

## Verifying Ignore Patterns

Check what's being indexed:

```bash
# Check index status
grepai status

# Output shows file count
# If too high, add more ignore patterns
```

## Common Issues

❌ **Problem:** Index is too large
✅ **Solution:** Add more ignore patterns for dependencies and generated files

❌ **Problem:** Search returns vendor/test code
✅ **Solution:** Either ignore or use boosting penalties

❌ **Problem:** Pattern not working
✅ **Solution:** Check syntax - use quotes for patterns with special characters:
```yaml
ignore:
  - "*.min.js"  # Correct
  - *.min.js    # May cause YAML parsing issues
```

❌ **Problem:** Need to include previously ignored files
✅ **Solution:** Remove from ignore list and re-run `grepai watch`

## Best Practices

1. **Start with defaults:** Add patterns as needed
2. **Exclude dependencies:** Always ignore `node_modules`, `vendor`, etc.
3. **Exclude build outputs:** `dist`, `build`, `target`
4. **Exclude lock files:** Large, not useful for search
5. **Consider boosting vs ignoring:** Penalize instead of exclude for test files
6. **Quote special characters:** `"*.min.js"` not `*.min.js`

## Re-indexing After Changes

After modifying ignore patterns:

```bash
# Stop existing daemon
grepai watch --stop

# Clear index and restart
rm .grepai/index.gob
grepai watch
```

## Output Format

After configuring ignore patterns:

```
✅ Ignore Patterns Configured

   Patterns: 15 configured

   Categories:
   - Directories: node_modules, vendor, dist, build
   - File types: *.min.js, *.map, *.lock
   - Paths: /tests/, /docs/

   Also respecting: .gitignore

   Run 'grepai watch' to re-index with new patterns.
```

Overview

This skill shows how to configure ignore patterns for GrepAI so you can exclude files and directories from indexing. It explains the configuration location, pattern syntax, defaults, and common presets for different languages and monorepos. Use it to shrink your index, remove noise, and focus search on relevant source files.

How this skill works

GrepAI combines patterns defined in .grepai/config.yaml with entries from .gitignore to determine which files to skip. Patterns support exact names, trailing slashes for directories, globs (single * and recursive **), and path substrings. After updating ignore patterns you should stop the watcher, remove the index, and restart GrepAI so changes take effect.

When to use it

  • Exclude test files or directories to focus on production code
  • Ignore generated artifacts, vendored code, and large dependency trees
  • Reduce index size and indexing time for large repositories
  • Customize indexing behavior for language-specific build outputs
  • Temporarily exclude directories when troubleshooting search noise

Best practices

  • Start from GrepAI defaults and add project-specific patterns incrementally
  • Always ignore package managers and build outputs (node_modules, vendor, dist, target)
  • Prefer boosting penalties for tests when you want to keep them searchable but lower-ranked
  • Quote patterns with special characters (e.g. "*.min.js") to avoid YAML parsing errors
  • Re-index after changes: stop watcher, delete .grepai/index.gob, then run grepai watch

Example use cases

  • JavaScript/TypeScript app: ignore node_modules, dist, coverage, and lock files to cut index size
  • Python project: exclude __pycache__, .venv, *.pyc and test_*.py to remove ephemeral files
  • Go service: skip vendor, bin, *.pb.go and mocks/ to avoid generated and vendored code
  • Monorepo: exclude specific legacy packages plus **/generated/** to avoid duplicated generated sources
  • Documentation-heavy repo: ignore docs/, *.md, and *.rst files to keep code-centric search results

FAQ

Does GrepAI respect .gitignore automatically?

Yes. GrepAI automatically respects .gitignore in addition to patterns in .grepai/config.yaml.

My pattern isn’t excluding files. What should I check?

Verify YAML quoting for special patterns, confirm correct path or glob syntax, then restart the watcher and re-index.

Should I remove tests or just penalize them?

If you never need tests in search, ignore them. If you want them lower-ranked but searchable, use search boosting penalties instead.