
openviking skill


This skill enables rapid semantic search and memory management for AI agents by interfacing with OpenViking's context database.

npx playbooks add skill openclaw/skills --skill openviking

Run the command above to add this skill to your agents.

SKILL.md
---
name: openviking
description: RAG and semantic search via OpenViking Context Database MCP server. Query documents, search knowledge base, add files/URLs to vector memory. Use for document Q&A, knowledge management, AI agent memory, file search, semantic retrieval. Triggers on "openviking", "search documents", "semantic search", "knowledge base", "vector database", "RAG", "query pdf", "document query", "add resource".
---

# OpenViking - Context Database for AI Agents

OpenViking is ByteDance's open-source **Context Database** designed for AI Agents — a next-generation RAG system that replaces flat vector storage with a filesystem paradigm for managing memories, resources, and skills.

**Key Features:**
- **Filesystem paradigm**: Organize context like files with URIs (`viking://resources/...`)
- **Tiered context (L0/L1/L2)**: Abstract → Overview → Full content, loaded on demand
- **Directory recursive retrieval**: Better accuracy than flat vector search
- **MCP server included**: Full RAG pipeline via Model Context Protocol

---

## Quick Check: Is It Set Up?

```bash
test -f ~/code/openviking/examples/mcp-query/ov.conf && echo "Ready" || echo "Needs setup"
curl -s -o /dev/null http://localhost:2033/mcp && echo "Running" || echo "Not running"
```

## If Not Set Up → Initialize

Run the init script (one-time):

```bash
bash ~/.openclaw/skills/openviking-mcp/scripts/init.sh
```

This will:
1. Clone OpenViking from `https://github.com/volcengine/OpenViking`
2. Install dependencies with `uv sync`
3. Create `ov.conf` template
4. **Pause for you to add API keys** (embedding.dense.api_key, vlm.api_key)

**Required: Volcengine/Ark API Keys**

| Config Key | Purpose |
|------------|---------|
| `embedding.dense.api_key` | Semantic search embeddings |
| `vlm.api_key` | LLM for answer generation |

Get keys from: https://console.volcengine.com/ark
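
A hypothetical sketch of the two entries to fill in, written as dotted keys matching the table above; the init script generates the real `ov.conf` template, and its exact file syntax may differ:

```
embedding.dense.api_key = "YOUR_ARK_API_KEY"
vlm.api_key = "YOUR_ARK_API_KEY"
```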

## Start the Server

```bash
cd ~/code/openviking/examples/mcp-query
uv run server.py
```

Options:
- `--port 2033` - Listen port
- `--host 127.0.0.1` - Bind address
- `--data ./data` - Data directory

Server will be at: `http://127.0.0.1:2033/mcp`
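
If you script the startup, you can poll the endpoint until it answers before connecting a client. A minimal sketch; `wait_for_url` is a helper defined here, not part of OpenViking:

```bash
# Poll an HTTP endpoint until it responds, or give up after N tries.
wait_for_url() {
  url=$1; tries=${2:-10}
  i=1
  while [ "$i" -le "$tries" ]; do
    if curl -s -o /dev/null --max-time 2 "$url"; then
      echo "up"; return 0
    fi
    i=$((i+1)); sleep 1
  done
  echo "down"; return 1
}

wait_for_url http://127.0.0.1:2033/mcp 3 || echo "start the server first"
```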

## Connect to Claude

```bash
claude mcp add --transport http openviking http://localhost:2033/mcp
```

Or add to `~/.mcp.json`:
```json
{
  "mcpServers": {
    "openviking": {
      "type": "http",
      "url": "http://localhost:2033/mcp"
    }
  }
}
```

## Tools Available

| Tool | Description |
|------|-------------|
| `query` | Full RAG pipeline — search + LLM answer |
| `search` | Semantic search only, returns docs |
| `add_resource` | Add files, directories, or URLs |
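
For a low-level sanity check you can POST a bare JSON-RPC request to the endpoint. This is only a sketch: real MCP clients perform an `initialize` handshake first, so the server may reject this request even while healthy — any response still proves it is reachable:

```bash
# Bare JSON-RPC probe of the MCP endpoint (no initialize handshake; an
# error response from the server still shows it is up).
curl -s -X POST http://localhost:2033/mcp \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json, text/event-stream' \
  -d '{"jsonrpc":"2.0","id":1,"method":"tools/list"}' \
  || echo "server not reachable"
```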

## Example Usage

Once connected via MCP:

```
"Query: What is OpenViking?"
"Search: machine learning papers"
"Add https://example.com/article to knowledge base"
"Add ~/documents/report.pdf"
```

## Troubleshooting

| Issue | Fix |
|-------|-----|
| Port in use | `uv run server.py --port 2034` |
| Auth errors | Check API keys in ov.conf |
| Server not found | Ensure it's running: `curl localhost:2033/mcp` |
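
To see whether the default port is already taken before starting, a quick probe can help. A sketch only; `port_in_use` is a helper defined here and relies on bash's `/dev/tcp` redirection (on shells without it, the check reports "free"):

```bash
# Report whether a local TCP port already has a listener (bash-only).
port_in_use() {
  if (exec 3<>"/dev/tcp/127.0.0.1/$1") 2>/dev/null; then
    echo "in use"
  else
    echo "free"
  fi
}

port_in_use 2033
```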

## Files

- `ov.conf` - Configuration (API keys, models)
- `data/` - Vector database storage
- `server.py` - MCP server implementation

Overview

This skill integrates OpenViking as a Context Database for RAG and semantic search, exposing a local MCP server to query documents, add files or URLs, and run document Q&A. It organizes context with a filesystem paradigm and tiered context levels for efficient retrieval and on-demand loading. Use it to build reliable vector-backed knowledge stores and agent memory accessible via standard MCP clients.

How this skill works

The skill runs an OpenViking MCP server that stores resources as file-like URIs and maintains tiered context (L0/L1/L2) so summaries and full content are loaded when needed. It supports semantic search (embeddings), recursive directory retrieval, and a full query tool that combines search results with an LLM for answers. Resources are added via add_resource and become part of the vector memory used by search and query.

When to use it

  • Build a local RAG pipeline for document question answering
  • Create a searchable knowledge base from PDFs, web pages, and folders
  • Provide persistent agent memory or skill resources for AI agents
  • Replace flat vector storage with a structured filesystem-style context store
  • Run semantic search across large directory trees with better relevance

Best practices

  • Provision required embedding and LLM API keys in the configuration before starting the server
  • Organize resources into clear directories and use descriptive URIs (viking://resources/...) to leverage recursive retrieval
  • Use tiered content: store summaries at L1 and full content at L2 to reduce cost and latency
  • Run the MCP server on a dedicated port and monitor with simple curl health checks
  • Add resources incrementally and validate search results before relying on automated agents

Example use cases

  • Ask natural language questions over a corpus of research PDFs and receive sourced answers
  • Ingest a project folder and let agents recall design notes, specs, and meeting docs
  • Archive web articles and query them semantically for competitive research
  • Attach OpenViking as an MCP server to a multi-agent setup to share a common knowledge memory
  • Use add_resource to import URLs and local PDFs during onboarding to build a searchable knowledge base

FAQ

What keys are required to run the server?

You need an embeddings API key and an LLM (VLM) API key configured in ov.conf for semantic embeddings and answer generation.

How do I check the server is running?

Use a simple HTTP health check such as curl http://localhost:2033/mcp; the MCP endpoint should respond when the server is running.