mgrep-code-search skill

This skill enables semantic code search across large codebases using mgrep to locate concepts, features, and implementation details efficiently.

npx playbooks add skill intellectronica/agent-skills --skill mgrep-code-search

SKILL.md
---
name: mgrep-code-search
description: Semantic code search using mgrep for efficient codebase exploration. This skill should be used when searching or exploring codebases with more than 30 non-gitignored files and/or nested directory structures. It provides natural language semantic search that complements traditional grep/ripgrep for finding features, understanding intent, and exploring unfamiliar code.
---

# mgrep Code Search

## Overview

mgrep is a semantic search tool that enables natural language queries across code, text, PDFs, and images. It is particularly effective for exploring larger or complex codebases where traditional pattern matching falls short.

## When to Use This Skill

Use mgrep when:
- The codebase contains more than 30 non-gitignored files
- There are nested directory structures
- You are searching for concepts, features, or intent rather than exact strings
- You are exploring an unfamiliar codebase
- You need to understand "where" or "how" something is implemented

Use traditional grep/ripgrep when:
- Searching for exact patterns or symbols
- Performing regex-based refactoring
- Tracing specific function or variable names
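
The file-count rule above can be checked mechanically. The following is a minimal sketch, not part of mgrep: the helper names `count_searchable_files` and `suggest_search_tool` and the `FILE_THRESHOLD` variable are hypothetical, and the sketch covers only the file-count criterion, not directory nesting or the nature of the query.

```bash
# Threshold taken from the guidance above; all names here are hypothetical.
FILE_THRESHOLD=30

# Approximate the number of non-gitignored files.
# Inside a git work tree, ask git for tracked files plus untracked files that
# are not ignored; otherwise count every regular file under the current directory.
count_searchable_files() {
  if git rev-parse --is-inside-work-tree >/dev/null 2>&1; then
    git ls-files --cached --others --exclude-standard | wc -l | tr -d ' '
  else
    find . -type f | wc -l | tr -d ' '
  fi
}

# Print "mgrep" when the file count exceeds the threshold, otherwise "grep".
# An explicit count may be passed as the first argument.
suggest_search_tool() {
  count=${1:-$(count_searchable_files)}
  if [ "$count" -gt "$FILE_THRESHOLD" ]; then
    echo "mgrep"
  else
    echo "grep"
  fi
}
```

Calling `suggest_search_tool` with no argument inspects the current directory. Treat the result as a hint only, since an exact-symbol lookup is still better served by grep/ripgrep in a large repository.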

## Quick Start

### Indexing

Before searching, start the watcher to index the repository:

```bash
bunx @mixedbread/mgrep watch
```

The `watch` command indexes the repository and maintains synchronisation with file changes. It respects `.gitignore` and `.mgrepignore` patterns.

### Searching

```bash
bunx @mixedbread/mgrep "your natural language query" [path]
```

## Search Commands

### Basic Search

```bash
bunx @mixedbread/mgrep "where is authentication configured?"
bunx @mixedbread/mgrep "how do we handle errors in API calls?" src/
bunx @mixedbread/mgrep "database connection setup" src/lib
```

### Search Options

| Option | Description |
|--------|-------------|
| `-m <count>` | Maximum results (default: 10) |
| `-c, --content` | Display full result content |
| `-a, --answer` | Generate AI-powered synthesis of results |
| `-s, --sync` | Update index before searching |
| `--no-rerank` | Disable reranking of results by relevance |

### Examples with Options

```bash
# Get more results
bunx @mixedbread/mgrep -m 25 "user authentication flow"

# Show full content of matches
bunx @mixedbread/mgrep -c "error handling patterns"

# Get an AI-synthesised answer
bunx @mixedbread/mgrep -a "how does the caching layer work?"

# Sync index before searching
bunx @mixedbread/mgrep -s "payment processing" src/services
```

## Workflow

1. **Start watcher** (once per session or when files change significantly):
   ```bash
   bunx @mixedbread/mgrep watch
   ```

2. **Search semantically**:
   ```bash
   bunx @mixedbread/mgrep "what you're looking for" [optional/path]
   ```

3. **Refine as needed** using path constraints or options:
   ```bash
   bunx @mixedbread/mgrep -m 20 -c "refined query" src/specific/directory
   ```
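
Steps 2 and 3 can be wrapped in a small shell function so that every search syncs the index first and returns a consistent number of results. This is a sketch under stated assumptions: `search_code` and the `DRY_RUN` variable are hypothetical helpers of this example, not mgrep features, and the `-s -m 20` defaults are arbitrary choices built from the options documented above.

```bash
# Hypothetical wrapper: search_code "<query>" [path]
# Set DRY_RUN=1 to print the command instead of running it
# (DRY_RUN belongs to this sketch, not to mgrep).
search_code() {
  if [ "$#" -lt 1 ]; then
    echo "usage: search_code \"<query>\" [path]" >&2
    return 2
  fi
  query=$1
  shift
  # Rebuild the argument list: sync first (-s), cap results at 20 (-m 20),
  # then the query followed by the optional path.
  set -- bunx @mixedbread/mgrep -s -m 20 "$query" "$@"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}
```

For example, `DRY_RUN=1 search_code "payment processing" src/services` prints the command that would be executed, which is a cheap way to check the wrapper before pointing it at a real index.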

## Environment Variables

Configure defaults via environment variables:

| Variable | Purpose |
|----------|---------|
| `MGREP_MAX_COUNT` | Default result limit |
| `MGREP_CONTENT` | Enable content display (1/true) |
| `MGREP_ANSWER` | Enable AI synthesis (1/true) |
| `MGREP_SYNC` | Pre-search sync (1/true) |
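
As an illustration, the variables in the table could be exported from a shell profile or a project-level env file. The values below are arbitrary examples chosen for this sketch, not recommended defaults.

```bash
# Illustrative values only; adjust to the project.
export MGREP_MAX_COUNT=25   # default result limit
export MGREP_CONTENT=1      # enable content display
export MGREP_SYNC=1         # sync the index before each search

# Searches started from this shell then pick up the defaults, for example:
#   bunx @mixedbread/mgrep "user authentication flow"
```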

## Important Notes

- Always use `bunx @mixedbread/mgrep` to run commands (not npm/npx or direct installation)
- Run `bunx @mixedbread/mgrep watch` before searching to ensure the index is current
- mgrep respects `.gitignore` patterns automatically
- Create `.mgrepignore` for additional exclusions
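
A starter `.mgrepignore` might look like the one written by the sketch below. It assumes the file accepts `.gitignore`-style patterns and `#` comments, which should be verified against the mgrep documentation. The helper name `write_mgrepignore` and the listed directories are hypothetical examples.

```bash
# Hypothetical helper: write a starter .mgrepignore into the given directory
# (defaults to the current one). Overwrites an existing .mgrepignore.
# Assumption: patterns follow .gitignore-style syntax.
write_mgrepignore() {
  target_dir=${1:-.}
  cat > "$target_dir/.mgrepignore" <<'EOF'
# Generated or vendored output that would only add noise to the index
dist/
build/
coverage/
*.min.js
EOF
}
```

Run `write_mgrepignore` from the repository root, then edit the patterns to match the folders that are actually generated in the project.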

Overview

This skill provides semantic code search using mgrep to explore medium-to-large codebases quickly. It complements grep/ripgrep by answering natural language queries about features, intent, and implementation. Use it when repository size or nesting makes simple pattern search inefficient.

How this skill works

The skill indexes the repository and performs semantic matching of natural language queries against code, text, PDFs, and images. A watcher process keeps the index in sync with file changes and respects .gitignore and .mgrepignore. Search commands accept path constraints and options to control result count, content display, and AI-driven synthesis.

When to use it

  • Codebases with more than ~30 non-gitignored files or deep directory nesting
  • Looking for concepts, features, or intent rather than exact identifiers
  • Exploring an unfamiliar project to find where behavior is implemented
  • Tracing architectural responsibilities (e.g., auth, caching, error handling)
  • When grep/ripgrep returns too many false positives or misses semantic matches

Best practices

  • Start the watcher (bunx @mixedbread/mgrep watch) before heavy searching to keep the index current
  • Constrain searches to a path when possible to improve relevance and speed
  • Use -m to increase result count for broad queries and -c to inspect full matches
  • Enable AI synthesis (-a) when you need a concise summary of scattered findings
  • Add a .mgrepignore to exclude large generated folders that bloat the index

Example use cases

  • Find where authentication is configured across backend and frontend code
  • Locate all error handling patterns for API calls to standardize responses
  • Discover database connection and migration setup spread across modules
  • Understand how caching is implemented and where cache keys are defined
  • Quickly map feature ownership when joining an unfamiliar repository

FAQ

Do I need to install mgrep globally?

No. Run commands via bunx @mixedbread/mgrep as documented; that ensures the correct package is used.

How do I keep search results up to date?

Start the watcher (bunx @mixedbread/mgrep watch) to keep the index in sync with file changes. If the watcher is not running, pass -s (--sync) with a search to update the index before results are returned.