home / skills / jezweb / claude-skills / cloudflare-vectorize
/skills/cloudflare-vectorize
This skill helps you implement Cloudflare Vectorize V2 for fast semantic search, robust metadata filtering, and reliable RAG workflows.
npx playbooks add skill jezweb/claude-skills --skill cloudflare-vectorizeReview the files below or copy the command above to add this skill to your agents.
---
name: cloudflare-vectorize
description: |
Build semantic search with Cloudflare Vectorize V2. Covers async mutations, 5M vectors/index, 31ms latency, returnMetadata enum changes, and V1 deprecation. Prevents 14 errors including dimension mismatches, TypeScript types, testing setup.
Use when: building RAG or semantic search, troubleshooting returnMetadata, V2 timing, metadata index, dimension errors, vitest setup, or wrangler --json output.
user-invocable: true
---
# Cloudflare Vectorize
Complete implementation guide for Cloudflare Vectorize - a globally distributed vector database for building semantic search, RAG (Retrieval Augmented Generation), and AI-powered applications with Cloudflare Workers.
**Status**: Production Ready ✅
**Last Updated**: 2026-01-21
**Dependencies**: cloudflare-worker-base (for Worker setup), cloudflare-workers-ai (for embeddings)
**Latest Versions**: [email protected], @cloudflare/[email protected]
**Token Savings**: ~70%
**Errors Prevented**: 14
**Dev Time Saved**: ~4 hours
## What This Skill Provides
### Core Capabilities
- ✅ **Index Management**: Create, configure, and manage vector indexes
- ✅ **Vector Operations**: Insert, upsert, query, delete, and list vectors (list-vectors added August 2025)
- ✅ **Metadata Filtering**: Advanced filtering with 10 metadata indexes per index
- ✅ **Semantic Search**: Find similar vectors using cosine, euclidean, or dot-product metrics
- ✅ **RAG Patterns**: Complete retrieval-augmented generation workflows
- ✅ **Workers AI Integration**: Native embedding generation with @cf/baai/bge-base-en-v1.5
- ✅ **OpenAI Integration**: Support for text-embedding-3-small/large models
- ✅ **Document Processing**: Text chunking and batch ingestion pipelines
- ✅ **Testing Setup**: Vitest configuration with Vectorize bindings
### Templates Included
1. **basic-search.ts** - Simple vector search with Workers AI
2. **rag-chat.ts** - Full RAG chatbot with context retrieval
3. **document-ingestion.ts** - Document chunking and embedding pipeline
4. **metadata-filtering.ts** - Advanced filtering patterns
---
## ⚠️ Vectorize V2 Breaking Changes (September 2024)
**IMPORTANT**: Vectorize V2 became GA in September 2024 with significant breaking changes.
### What Changed in V2
**Performance Improvements**:
- **Index capacity**: 200,000 → **5 million vectors** per index
- **Query latency**: 549ms → **31ms** median (18× faster)
- **TopK limit**: 20 → **100** results per query
- **Scale limits**: 100 → **50,000 indexes** per account
- **Namespace limits**: 100 → **50,000 namespaces** per index
**Breaking API Changes**:
1. **Async Mutations** - All mutations now asynchronous:
```typescript
// V2: Returns mutationId
const result = await env.VECTORIZE_INDEX.insert(vectors);
console.log(result.mutationId); // "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
// Vector inserts/deletes may take a few seconds to be reflected
```
2. **returnMetadata Parameter** - Boolean → String enum:
```typescript
// ❌ V1 (deprecated)
{ returnMetadata: true }
// ✅ V2 (required)
{ returnMetadata: 'all' | 'indexed' | 'none' }
```
3. **Metadata Indexes Required Before Insert**:
- V2 requires metadata indexes created BEFORE vectors inserted
- Vectors added before metadata index won't be indexed
- Must re-upsert vectors after creating metadata index
**V1 Deprecation Timeline**:
- **December 2024**: Can no longer create V1 indexes
- **Existing V1 indexes**: Continue to work (other operations unaffected)
- **Migration**: Use `wrangler vectorize --deprecated-v1` flag for V1 operations
**Wrangler Version Required**:
- **Minimum**: [email protected] for V2 commands
- **Recommended**: [email protected]+ (latest)
### Check Mutation Status
```typescript
// Get index info to check last mutation processed
const info = await env.VECTORIZE_INDEX.describe();
console.log(info.mutationId); // Last mutation ID
console.log(info.processedUpToMutation); // Last processed timestamp
```
---
## Critical Setup Rules
### ⚠️ MUST DO BEFORE INSERTING VECTORS
```bash
# 1. Create the index with FIXED dimensions and metric
npx wrangler vectorize create my-index \
--dimensions=768 \
--metric=cosine
# 2. Create metadata indexes IMMEDIATELY (before inserting vectors!)
npx wrangler vectorize create-metadata-index my-index \
--property-name=category \
--type=string
npx wrangler vectorize create-metadata-index my-index \
--property-name=timestamp \
--type=number
```
**Why**: Metadata indexes MUST exist before vectors are inserted. Vectors added before a metadata index was created won't be filterable on that property.
### Index Configuration (Cannot Be Changed Later)
```bash
# Dimensions MUST match your embedding model output:
# - Workers AI @cf/baai/bge-base-en-v1.5: 768 dimensions
# - OpenAI text-embedding-3-small: 1536 dimensions
# - OpenAI text-embedding-3-large: 3072 dimensions
# Metrics determine similarity calculation:
# - cosine: Best for normalized embeddings (most common)
# - euclidean: Absolute distance between vectors
# - dot-product: For non-normalized vectors
```
## Wrangler Configuration
**wrangler.jsonc**:
```jsonc
{
"name": "my-vectorize-worker",
"main": "src/index.ts",
"compatibility_date": "2025-10-21",
"vectorize": [
{
"binding": "VECTORIZE_INDEX",
"index_name": "my-index"
}
],
"ai": {
"binding": "AI"
}
}
```
## TypeScript Types
```typescript
export interface Env {
VECTORIZE_INDEX: VectorizeIndex;
AI: Ai;
}
interface VectorizeVector {
id: string;
values: number[] | Float32Array | Float64Array;
namespace?: string;
metadata?: Record<string, string | number | boolean | string[]>;
}
interface VectorizeMatches {
matches: Array<{
id: string;
score: number;
values?: number[];
metadata?: Record<string, any>;
namespace?: string;
}>;
count: number;
}
```
## Metadata Filter Operators (V2)
Vectorize V2 supports advanced metadata filtering with range queries:
```typescript
// Equality (implicit $eq)
{ category: "docs" }
// Not equals
{ status: { $ne: "archived" } }
// In/Not in arrays
{ category: { $in: ["docs", "tutorials"] } }
{ category: { $nin: ["deprecated", "draft"] } }
// Range queries (numbers) - NEW in V2
{ timestamp: { $gte: 1704067200, $lt: 1735689600 } }
// Range queries (strings) - prefix searching
{ url: { $gte: "/docs/workers", $lt: "/docs/workersz" } }
// Nested metadata with dot notation
{ "author.id": "user123" }
// Multiple conditions (implicit AND)
{ category: "docs", language: "en", "metadata.published": true }
```
## Metadata Best Practices
### 1. Cardinality Considerations
**Low Cardinality (Good for $eq filters)**:
```typescript
// Few unique values - efficient filtering
metadata: {
category: "docs", // ~10 categories
language: "en", // ~5 languages
published: true // 2 values (boolean)
}
```
**High Cardinality (Avoid in range queries)**:
```typescript
// Many unique values - avoid large range scans
metadata: {
user_id: "uuid-v4...", // Millions of unique values
timestamp_ms: 1704067200123 // Use seconds instead
}
```
### 2. Metadata Limits
- **Max 10 metadata indexes** per Vectorize index
- **Max 10 KiB metadata** per vector
- **String indexes**: First 64 bytes (UTF-8)
- **Number indexes**: Float64 precision
- **Filter size**: Max 2048 bytes (compact JSON)
### 3. Vector Dimension Limit
**Current Limit**: 1536 dimensions per vector
**Source**: [GitHub Issue #8729](https://github.com/cloudflare/workers-sdk/issues/8729)
**Supported Embedding Models**:
- Workers AI `@cf/baai/bge-base-en-v1.5`: 768 dimensions ✅
- OpenAI `text-embedding-3-small`: 1536 dimensions ✅
- OpenAI `text-embedding-3-large`: 3072 dimensions ❌ (requires dimension reduction)
**Unsupported Models** (>1536 dimensions):
- `nomic-embed-code`: 3584 dimensions
- `Qodo-Embed-1-7B`: >1536 dimensions
**Workaround**:
Use dimensionality reduction (e.g., PCA) to compress embeddings to 1536 or fewer dimensions, though this may reduce semantic quality.
**Feature Request**: Higher dimension support is under consideration. Use [Limit Increase Request Form](https://forms.gle/nyamy2SM9zwWTXKE6) if this blocks your use case.
### 4. Key Restrictions
```typescript
// ❌ INVALID metadata keys
metadata: {
"": "value", // Empty key
"user.name": "John", // Contains dot (reserved for nesting)
"$admin": true, // Starts with $
"key\"with\"quotes": 1 // Contains quotes
}
// ✅ VALID metadata keys
metadata: {
"user_name": "John",
"isAdmin": true,
"nested": { "allowed": true } // Access as "nested.allowed" in filters
}
```
## Best Practices
### Batch Insert Performance
**Critical**: Use batch size of 5000 vectors for optimal performance.
**Performance Data**:
- **Individual inserts**: 2.5M vectors in 36+ hours (incomplete)
- **Batch inserts (5000)**: 4M vectors in ~12 hours
- **18× faster with proper batching**
**Why 5000?**
- Vectorize's internal Write-Ahead Log (WAL) optimized for this size
- Avoids Cloudflare API rate limits
- Balances throughput and memory usage
**Optimal Pattern**:
```typescript
const BATCH_SIZE = 5000;
async function insertVectors(vectors: VectorizeVector[]) {
for (let i = 0; i < vectors.length; i += BATCH_SIZE) {
const batch = vectors.slice(i, i + BATCH_SIZE);
const result = await env.VECTORIZE.insert(batch);
console.log(`Inserted batch ${i / BATCH_SIZE + 1}, mutationId: ${result.mutationId}`);
// Optional: Rate limiting delay
if (i + BATCH_SIZE < vectors.length) {
await new Promise(resolve => setTimeout(resolve, 100));
}
}
}
```
**Sources**:
- [Community Report](https://community.cloudflare.com/t/performance-issues-with-large-scale-inserts-in-vectorize/788917)
- [Official Best Practices](https://developers.cloudflare.com/vectorize/best-practices/insert-vectors/)
---
### Query Accuracy Modes
Vectorize uses approximate nearest neighbor (ANN) search by default with ~80% accuracy compared to exact search.
**Default Mode**: Approximate scoring (~80% accuracy)
- Faster latency
- Good for RAG, search, recommendations
- topK up to 100
**High-Precision Mode**: Near 100% accuracy
- Enabled via `returnValues: true`
- Higher latency
- Limited to topK=20
**Trade-off Example**:
```typescript
// Fast, ~80% accuracy, topK up to 100
const results = await env.VECTORIZE.query(embedding, {
topK: 50,
returnValues: false // Default
});
// Slower, ~100% accuracy, topK max 20
const preciseResults = await env.VECTORIZE.query(embedding, {
topK: 10,
returnValues: true // High-precision scoring
});
```
**When to Use High-Precision**:
- Critical applications (fraud detection, legal compliance)
- Small result sets (topK < 20)
- Accuracy is higher priority than latency
**Source**: [Cloudflare Blog - Building Vectorize](https://blog.cloudflare.com/building-vectorize-a-distributed-vector-database-on-cloudflare-developer-platform/)
---
## Common Errors & Solutions
### Error 1: Metadata Index Created After Vectors Inserted
```
Problem: Filtering doesn't work on existing vectors
Solution: Delete and re-insert vectors OR create metadata indexes BEFORE inserting
```
### Error 2: Dimension Mismatch
```
Problem: "Vector dimensions do not match index configuration"
Solution: Ensure embedding model output matches index dimensions:
- Workers AI bge-base: 768
- OpenAI small: 1536
- OpenAI large: 3072
```
### Error 3: Invalid Metadata Keys
```
Problem: "Invalid metadata key"
Solution: Keys cannot:
- Be empty
- Contain . (dot)
- Contain " (quote)
- Start with $ (dollar sign)
```
### Error 4: Filter Too Large
```
Problem: "Filter exceeds 2048 bytes"
Solution: Simplify filter or split into multiple queries
```
### Error 5: Range Query on High Cardinality
```
Problem: Slow queries or reduced accuracy
Solution: Use lower cardinality fields for range queries, or use seconds instead of milliseconds for timestamps
```
### Error 6: Insert vs Upsert Confusion
```
Problem: Updates not reflecting in index
Solution: Use upsert() to overwrite existing vectors, not insert()
```
### Error 7: Missing Bindings
```
Problem: "VECTORIZE_INDEX is not defined"
Solution: Add [[vectorize]] binding to wrangler.jsonc
```
### Error 8: Namespace vs Metadata Confusion
```
Problem: Unclear when to use namespace vs metadata filtering
Solution:
- Namespace: Partition key, applied BEFORE metadata filters
- Metadata: Flexible key-value filtering within namespace
```
### Error 9: V2 Async Mutation Timing (NEW in V2)
```
Problem: Inserted vectors not immediately queryable
Solution: V2 mutations are asynchronous - vectors may take a few seconds to be reflected
- Use mutationId to track mutation status
- Check env.VECTORIZE_INDEX.describe() for processedUpToMutation timestamp
```
### Error 10: V1 returnMetadata Boolean (BREAKING in V2)
```
Problem: "returnMetadata must be 'all', 'indexed', or 'none'"
Solution: V2 changed returnMetadata from boolean to string enum:
- ❌ V1: { returnMetadata: true }
- ✅ V2: { returnMetadata: 'all' }
```
### Error 11: Wrangler --json Output Contains Log Prefix
**Error**: `wrangler vectorize list --json` output starts with log message, breaking JSON parsing
**Source**: [GitHub Issue #11011](https://github.com/cloudflare/workers-sdk/issues/11011)
**Affected Commands**:
- `wrangler vectorize list --json`
- `wrangler vectorize list-metadata-index --json`
**Problem**:
```bash
$ wrangler vectorize list --json
📋 Listing Vectorize indexes...
[
{ "created_on": "2025-10-18T13:28:30.259277Z", ... }
]
```
The log message makes output invalid JSON, breaking piping to `jq` or other tools.
**Solution**: Strip first line before parsing:
```bash
# Using tail
wrangler vectorize list --json | tail -n +2 | jq '.'
# Using sed
wrangler vectorize list --json | sed '1d' | jq '.'
```
---
### Error 12: TypeScript Types Missing Filter Operators
**Error**: `wrangler types` generates incomplete `VectorizeVectorMetadataFilterOp` type
**Source**: [GitHub Issue #10092](https://github.com/cloudflare/workers-sdk/issues/10092)
**Status**: OPEN (tracked internally as VS-461)
**Problem**:
Generated type only includes `$eq` and `$ne`, missing V2 operators: `$in`, `$nin`, `$lt`, `$lte`, `$gt`, `$gte`
**Impact**:
TypeScript shows false errors when using valid V2 metadata filter operators:
```typescript
const vectorizeRes = env.VECTORIZE.queryById(imgId, {
filter: { gender: { $in: genderFilters } }, // ❌ TS error but works!
topK,
returnMetadata: 'indexed',
});
```
**Workaround**: Manual type override until wrangler types is fixed:
```typescript
// Add to your types file
type VectorizeMetadataFilter = Record<string,
| string
| number
| boolean
| {
$eq?: string | number | boolean;
$ne?: string | number | boolean;
$in?: (string | number | boolean)[];
$nin?: (string | number | boolean)[];
$lt?: number | string;
$lte?: number | string;
$gt?: number | string;
$gte?: number | string;
}
>;
```
---
### Error 13: Windows Dev Registry Failure (FIXED)
**Error**: `ENOENT: no such file or directory` when running `wrangler dev` on Windows
**Source**: [GitHub Issue #10383](https://github.com/cloudflare/workers-sdk/issues/10383)
**Status**: FIXED in [email protected]
**Problem**:
Wrangler attempted to create external worker files with colons in the name (invalid on Windows):
```
Error: ENOENT: ... '__WRANGLER_EXTERNAL_VECTORIZE_WORKER:<project>:<binding>'
```
**Solution**:
Update to [email protected] or later:
```bash
npm install -g wrangler@latest
```
---
### Error 14: topK Limit Depends on returnValues/returnMetadata
**Error**: `topK exceeds maximum allowed value`
**Source**: [Vectorize Limits](https://developers.cloudflare.com/vectorize/platform/limits/)
**Problem**: Maximum topK value changes based on query options:
| Configuration | Max topK |
|---------------|----------|
| `returnValues: false`, `returnMetadata: 'none'` | 100 |
| `returnValues: true` OR `returnMetadata: 'all'` | 20 |
| `returnMetadata: 'indexed'` | 100 |
**Common Error**:
```typescript
// ❌ ERROR - topK too high with returnValues
query(embedding, {
topK: 100, // Exceeds limit!
returnValues: true // Max topK=20 when true
});
```
**Solution**:
```typescript
// ✅ OK - respects conditional limit
query(embedding, {
topK: 20,
returnValues: true
});
// ✅ OK - higher topK without values
query(embedding, {
topK: 100,
returnValues: false,
returnMetadata: 'indexed'
});
```
---
## V2 Migration Checklist
**If migrating from V1 to V2**:
1. ✅ Update wrangler to 3.71.0+ (`npm install -g wrangler@latest`)
2. ✅ Create new V2 index (can't upgrade V1 → V2)
3. ✅ Create metadata indexes BEFORE inserting vectors
4. ✅ Update `returnMetadata` boolean → string enum ('all', 'indexed', 'none')
5. ✅ Handle async mutations (expect `mutationId` in responses)
6. ✅ Test with V2 limits (topK up to 100, 5M vectors per index)
7. ✅ Update error handling for async behavior
**V1 Deprecation**:
- After December 2024: Cannot create new V1 indexes
- Existing V1 indexes: Continue to work
- Use `wrangler vectorize --deprecated-v1` for V1 operations
---
## Testing Considerations
### Vitest with Vectorize Bindings
**Issue**: Using `@cloudflare/vitest-pool-workers` with Vectorize or Workers AI bindings causes runtime failure.
**Source**: [GitHub Issue #7434](https://github.com/cloudflare/workers-sdk/issues/7434)
**Error**: `wrapped binding module can't be resolved`
**Workaround**:
1. Create `wrangler-test.jsonc` without Vectorize/AI bindings
2. Point vitest config to test-specific wrangler file
3. Mock bindings in your tests
**Example**:
```typescript
// wrangler-test.jsonc (no Vectorize binding)
{
"name": "my-worker-test",
"main": "src/index.ts",
"compatibility_date": "2025-10-21"
// No vectorize binding
}
// vitest.config.ts
import { defineWorkersProject } from '@cloudflare/vitest-pool-workers/config';
export default defineWorkersProject({
test: {
poolOptions: {
workers: {
wrangler: {
configPath: "./wrangler-test.jsonc"
}
}
}
}
});
// Mock in tests
import { vi } from 'vitest';
const mockVectorize = {
query: vi.fn().mockResolvedValue({
matches: [
{ id: 'test-1', score: 0.95, metadata: { category: 'docs' } }
],
count: 1
}),
insert: vi.fn().mockResolvedValue({ mutationId: "test-mutation-id" }),
upsert: vi.fn().mockResolvedValue({ mutationId: "test-mutation-id" })
};
// Use mock in tests
test('vector search', async () => {
const env = { VECTORIZE_INDEX: mockVectorize };
// ... test logic
});
```
---
## Community Tips
**Note**: These tips come from community discussions and official blog posts. Verify against your Vectorize version.
### Tip 1: Range Queries at Scale May Have Reduced Accuracy (Community-sourced)
**Source**: [Query Best Practices](https://developers.cloudflare.com/vectorize/best-practices/query-vectors/)
**Confidence**: MEDIUM
**Applies to**: Datasets with ~10M+ vectors
Range queries (`$lt`, `$lte`, `$gt`, `$gte`) on large datasets may experience reduced accuracy.
**Optimization Strategy**:
```typescript
// ❌ High-cardinality range at scale
metadata: {
timestamp_ms: 1704067200123
}
filter: { timestamp_ms: { $gte: 1704067200000 } }
// ✅ Bucketed into discrete values
metadata: {
timestamp_bucket: "2025-01-01-00:00", // 1-hour buckets
timestamp_ms: 1704067200123 // Original (non-indexed)
}
filter: {
timestamp_bucket: {
$in: ["2025-01-01-00:00", "2025-01-01-01:00"]
}
}
```
**When This Matters**:
- Time-based filtering over months/years
- User IDs, transaction IDs (UUID ranges)
- Any high-cardinality continuous data
**Alternative**: Use equality filters (`$eq`, `$in`) with bucketed values.
---
### Tip 2: List Vectors Operation (Added August 2025)
**Source**: [Vectorize Changelog](https://developers.cloudflare.com/vectorize/platform/changelog/)
Vectorize V2 added support for the `list-vectors` operation for paginated iteration through vector IDs.
**Use Cases**:
- Auditing vector collections
- Bulk vector operations
- Debugging index contents
**API**:
```typescript
const result = await env.VECTORIZE_INDEX.list({
limit: 1000, // Max 1000 per page
cursor?: string
});
// result.vectors: Array<{ id: string }>
// result.cursor: string | undefined
// result.count: number
// Pagination example
let cursor: string | undefined;
const allVectorIds: string[] = [];
do {
const result = await env.VECTORIZE_INDEX.list({
limit: 1000,
cursor
});
allVectorIds.push(...result.vectors.map(v => v.id));
cursor = result.cursor;
} while (cursor);
```
**Limitations**:
- Returns IDs only (not values or metadata)
- Max 1000 vectors per page
- Use cursor for pagination
---
## Official Documentation
- **Vectorize V2 Docs**: https://developers.cloudflare.com/vectorize/
- **V2 Changelog**: https://developers.cloudflare.com/vectorize/platform/changelog/
- **V1 to V2 Migration**: https://developers.cloudflare.com/vectorize/reference/transition-vectorize-legacy/
- **Metadata Filtering**: https://developers.cloudflare.com/vectorize/reference/metadata-filtering/
- **Workers AI Models**: https://developers.cloudflare.com/workers-ai/models/
---
**Status**: Production Ready ✅ (Vectorize V2 GA - September 2024)
**Last Updated**: 2026-01-21
**Token Savings**: ~70%
**Errors Prevented**: 14 (includes V2 breaking changes, testing setup, TypeScript types)
**Changes**: Added 4 new errors (wrangler --json, TypeScript types, Windows dev, topK limits), batch performance best practices, query accuracy modes, testing setup, community tips on range queries and list-vectors operation.
This skill shows how to build production-grade semantic search and RAG workflows using Cloudflare Vectorize V2 from a TypeScript Workers environment. It documents V2 breaking changes, async mutation behavior, metadata indexing rules, dimension limits, and practical fixes for 14 common errors so you can deploy fast, reliable vector search with minimal friction.
The skill provides index management, vector insert/upsert/query/delete, metadata filtering patterns, batching strategies, and vitest-friendly test bindings. It explains V2-specific APIs (async mutations returning mutationId, returnMetadata enum values), how to check mutation processing with describe(), and how to integrate Workers AI or OpenAI embeddings with correct dimensions and metrics.
Why are my new metadata filters not returning results?
Metadata indexes must exist before vectors are inserted. If created after, delete and re-insert or upsert vectors so metadata is indexed.
What changed for returnMetadata in V2?
returnMetadata is now a required string enum: 'all', 'indexed', or 'none' (the boolean form from V1 is invalid).
How do I handle async mutation timing?
Mutations return a mutationId; check env.VECTORIZE_INDEX.describe() for processedUpToMutation and wait a few seconds for changes to become queryable.