grepai-storage-gob skill

This skill configures GrepAI to use GOB local file storage for embeddings, metadata, and indexes, enabling simple single-machine setups.

npx playbooks add skill yoanbernabeu/grepai-skills --skill grepai-storage-gob

SKILL.md
---
name: grepai-storage-gob
description: Configure GOB local file storage for GrepAI. Use this skill for simple, single-machine setups.
---

# GrepAI Storage with GOB

This skill covers GOB (Go Binary), the default and simplest storage backend for GrepAI.

## When to Use This Skill

- Single developer projects
- Small to medium codebases
- Simple setup without external dependencies
- Local development environments

## What is GOB Storage?

GOB is Go's native binary serialization format. GrepAI uses it to store:
- Vector embeddings
- File metadata
- Chunk information

All of this lives in a single local file, `.grepai/index.gob`; the symbol index used by trace is stored next to it in `symbols.gob`.

## Advantages

| Benefit | Description |
|---------|-------------|
| 🚀 **Simple** | No external services needed |
| ⚡ **Fast setup** | Works immediately |
| 📁 **Portable** | Single file, easy to back up |
| 💰 **Free** | No infrastructure costs |
| 🔒 **Private** | Data stays local |

## Limitations

| Limitation | Description |
|------------|-------------|
| 📏 **Scalability** | Not ideal for very large codebases |
| 👤 **Single user** | No concurrent access |
| 🔄 **No sharing** | Can't share index across machines |
| 💾 **Memory** | Loads into RAM for searches |

## Configuration

### Default Configuration

GOB is the default backend. Minimal config:

```yaml
# .grepai/config.yaml
store:
  backend: gob
```

### Explicit Configuration

```yaml
store:
  backend: gob
  # Index stored in .grepai/index.gob (automatic)
```

## Storage Location

GOB storage creates files in your project's `.grepai/` directory:

```
.grepai/
├── config.yaml    # Configuration
├── index.gob      # Vector embeddings
└── symbols.gob    # Symbol index for trace
```

## File Sizes

Approximate `.grepai/index.gob` sizes:

| Codebase | Files | Chunks | Index Size |
|----------|-------|--------|------------|
| Small | 100 | 500 | ~5 MB |
| Medium | 1,000 | 5,000 | ~50 MB |
| Large | 10,000 | 50,000 | ~500 MB |
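
The table scales linearly at roughly 10 KB of index per chunk (5 MB for 500 chunks). A minimal sketch of that extrapolation follows. `estimate_index_mb` is a hypothetical helper, not a GrepAI command, and the ratio is read off the table rather than measured, so real sizes will likely differ with the embedding model and chunk contents.

```bash
# Rough index.gob size estimate, extrapolated from the table above:
# about 10 KB per chunk, i.e. roughly 1 MB per 100 chunks.
# estimate_index_mb is a hypothetical helper, not part of GrepAI.
estimate_index_mb() {
  echo $(( $1 / 100 ))
}

estimate_index_mb 5000   # prints 50
```

Treat the result as an order-of-magnitude figure; `grepai status` reports the actual size once the index exists.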

## Operations

### Creating the Index

```bash
# Initialize project
grepai init

# Start indexing (creates index.gob)
grepai watch
```

### Checking Index Status

```bash
grepai status

# Output:
# Index: .grepai/index.gob
# Files: 245
# Chunks: 1,234
# Size: 12.5 MB
# Last updated: 2025-01-28 10:30:00
```

### Backing Up the Index

```bash
# Simple file copy
cp .grepai/index.gob .grepai/index.gob.backup
```
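
A plain `cp` overwrites the previous backup each time. As a sketch, a timestamped copy keeps earlier backups around. `backup_index` is a hypothetical helper name, and the `.backup` suffix with a date stamp is only one possible naming convention.

```bash
# Timestamped backup of the GOB index so repeated backups do not
# overwrite each other. backup_index is a hypothetical helper,
# not part of GrepAI.
backup_index() {
  local src="${1:-.grepai/index.gob}"
  local dest
  dest="${src}.$(date +%Y%m%d-%H%M%S).backup"
  # Copy the file and print the backup path on success.
  cp -- "$src" "$dest" && echo "$dest"
}
```

Calling `backup_index` with no arguments copies `.grepai/index.gob` and prints the path of the new copy. Stopping `grepai watch` before copying is the cautious choice, on the assumption that the watcher may be writing to the file.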

### Clearing the Index

```bash
# Delete and re-index
rm .grepai/index.gob
grepai watch
```

### Moving to a New Machine

```bash
# Copy entire .grepai directory
cp -r .grepai /path/to/new/location/

# Note: only works with the same embedding model. Stored file paths are
# absolute, so the project may need to sit at the same path as before.
```

## Performance Considerations

### Memory Usage

GOB loads the entire index into RAM for searches, using roughly twice the on-disk size:

| Index Size | RAM Usage |
|------------|-----------|
| 10 MB | ~20 MB |
| 50 MB | ~100 MB |
| 500 MB | ~1 GB |
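
The table implies a RAM footprint of about twice the index file size. A small sketch applying that ratio to an existing index follows. `estimate_ram_mb` is a hypothetical helper, and the 2x factor is taken from the table, not measured.

```bash
# Estimate search-time RAM as ~2x the on-disk index size, the ratio
# implied by the table above. estimate_ram_mb is a hypothetical
# helper, not part of GrepAI. Result is in MiB, rounded down.
estimate_ram_mb() {
  local index="${1:-.grepai/index.gob}"
  local bytes
  # wc -c reads the byte count portably (Linux and macOS).
  bytes="$(wc -c < "$index")" || return 1
  echo $(( bytes * 2 / 1048576 ))
}
```

Run `estimate_ram_mb` from the project root to get a ballpark figure before searching on a memory-constrained machine.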

### Search Speed

GOB provides fast searches for typical codebases:

| Codebase Size | Search Time |
|---------------|-------------|
| Small (100 files) | <50ms |
| Medium (1K files) | <200ms |
| Large (10K files) | <1s |

### When to Upgrade

Consider PostgreSQL or Qdrant when:
- The index exceeds 1 GB
- You need concurrent access
- You want to share the index across a team
- The codebase has 50K+ files
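
The size criterion is easy to check from a script. The sketch below compares the index file against a byte limit that defaults to 1 GiB. `index_exceeds_limit` is a hypothetical helper, not a GrepAI command, and the default limit is an approximation of the 1 GB guideline.

```bash
# Succeeds (exit 0) when the index file is larger than the limit.
# index_exceeds_limit is a hypothetical helper, not part of GrepAI.
index_exceeds_limit() {
  local index="${1:-.grepai/index.gob}"
  local limit="${2:-1073741824}"   # 1 GiB in bytes
  local bytes
  # Exit code 2 signals that the index file could not be read.
  bytes="$(wc -c < "$index")" || return 2
  [ "$(( bytes ))" -gt "$(( limit ))" ]
}
```

For example, `index_exceeds_limit && echo "Consider PostgreSQL or Qdrant"` prints the reminder only once the index has outgrown the guideline.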

## .gitignore Configuration

Add `.grepai/` to your `.gitignore`:

```gitignore
# GrepAI (machine-specific index)
.grepai/
```

**Why:** The index is machine-specific because:
- Contains binary embeddings
- Tied to the embedding model used
- Each machine should generate its own
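
Adding the entry can be scripted so that it is safe to run more than once. The following is a sketch. `ignore_grepai` is a hypothetical helper name, and it only checks for an exact `.grepai/` line, not for equivalent patterns written differently.

```bash
# Append .grepai/ to .gitignore only if it is not already listed.
# ignore_grepai is a hypothetical helper, not part of GrepAI.
ignore_grepai() {
  local file="${1:-.gitignore}"
  touch "$file"
  # Already present as an exact line? Then there is nothing to do.
  grep -qxF '.grepai/' "$file" && return 0
  # Make sure the file ends with a newline before appending.
  [ -z "$(tail -c 1 "$file")" ] || printf '\n' >> "$file"
  printf '%s\n' '.grepai/' >> "$file"
}
```

Running `ignore_grepai` twice in the same repository leaves a single `.grepai/` entry.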

## Sharing Index (Not Recommended)

While you can copy the index file, it's not recommended because:
1. Both machines must use an identical embedding model
2. File paths are stored as absolute paths
3. Different machines may have different code versions

**Better approach:** Each developer runs their own `grepai watch`.

## Migrating to Other Backends

### To PostgreSQL

1. Update config:
```yaml
store:
  backend: postgres
  postgres:
    dsn: postgres://user:pass@localhost:5432/grepai
```

2. Re-index:
```bash
rm .grepai/index.gob
grepai watch
```

### To Qdrant

1. Update config:
```yaml
store:
  backend: qdrant
  qdrant:
    endpoint: localhost
    port: 6334
```

2. Re-index:
```bash
rm .grepai/index.gob
grepai watch
```

## Common Issues

❌ **Problem:** Index file too large
✅ **Solution:** Add more ignore patterns or migrate to PostgreSQL/Qdrant

❌ **Problem:** Slow searches on large codebase
✅ **Solution:** Migrate to Qdrant for better performance

❌ **Problem:** Corrupted index
✅ **Solution:** Delete and re-index:
```bash
rm .grepai/index.gob .grepai/symbols.gob
grepai watch
```

❌ **Problem:** "Index not found" error
✅ **Solution:** Run `grepai watch` to create the index

## Best Practices

1. **Use for small/medium projects:** Up to ~10K files
2. **Add to .gitignore:** Don't commit the index
3. **Backup before major changes:** Copy index.gob before experiments
4. **Re-index after model changes:** If you change embedding models
5. **Monitor file size:** Migrate if index exceeds 1 GB

## Output Format

GOB storage status:

```
✅ GOB Storage Configured

   Backend: GOB (local file)
   Index: .grepai/index.gob
   Size: 12.5 MB

   Contents:
   - Files: 245
   - Chunks: 1,234
   - Vectors: 1,234 × 768 dimensions

   Performance:
   - Search latency: <100ms
   - Memory usage: ~25 MB
```

Overview

This skill configures GOB local file storage as the backend for GrepAI, the default and simplest option for single-machine setups. It stores embeddings, file metadata, and chunk information in a single .grepai/index.gob file. Use it for quick, private, zero-dependency indexing and search on small to medium codebases.

How this skill works

The GOB backend uses Go's binary serialization to write the entire index into files under .grepai/ (index.gob and symbols.gob). On search, the index is loaded into RAM for fast, low-latency queries. Configuration is minimal (set store.backend to gob), and GrepAI creates and updates the local files when you run grepai init and grepai watch.

When to use it

  • Local development on a single machine
  • Small to medium-sized repositories (up to ~10K files)
  • When you want no external services or infrastructure
  • Quick start or proof-of-concept for semantic code search
  • When privacy and portability (single-file backup) matter

Best practices

  • Add .grepai/ to .gitignore to avoid committing the index
  • Back up .grepai/index.gob before major experiments or upgrades
  • Re-index after changing the embedding model to keep vectors consistent
  • Monitor index size and RAM usage; migrate if it approaches ~1 GB
  • Prefer per-developer local indexes rather than sharing files across machines

Example use cases

  • A solo developer enabling semantic code search on a personal repo
  • A small team prototyping GrepAI features without deploying a server
  • Local codebase analysis and call-graph lookups during development
  • Fast iterations on embeddings and indexing without external dependencies
  • Temporary indexing on a CI runner for isolated tests

FAQ

Can multiple developers share the same index.gob file?

Sharing is not recommended: the file is tied to the embedding model, stores absolute file paths, and reflects a specific code version. Each developer should run their own grepai watch.

When should I migrate from GOB to a hosted backend?

Migrate when your index grows beyond ~1 GB, you need concurrent access, you want to share an index across machines, or you have tens of thousands of files. PostgreSQL and Qdrant are the options for scalability and sharing.