
This skill helps optimize Cloudflare Workers performance by reducing cold starts, memory footprint, and CPU usage, and by leveraging edge caching and streaming.

npx playbooks add skill secondsky/claude-skills --skill cloudflare-workers-performance

Review the files below or copy the command above to add this skill to your agents.

Files (11)
SKILL.md
6.1 KB
---
name: workers-performance
description: Cloudflare Workers performance optimization covering CPU, memory, caching, and bundle size. Use for slow Workers, high latency, cold starts, or when encountering CPU limits, memory issues, or timeout errors.
---

# Cloudflare Workers Performance Optimization

Techniques for maximizing Worker performance and minimizing latency.

## Quick Wins

```typescript
// 1. Avoid unnecessary cloning
// ❌ Bad: clones the entire request just to read the body
// const body = await request.clone().json();

// ✅ Good: Parse directly when the body isn't reused
const body = await request.json();

// 2. Use streaming instead of buffering
// ❌ Bad: Buffers the entire response (transform() is a placeholder)
// const text = await response.text();
// return new Response(transform(text));

// ✅ Good: Stream transformation
return new Response(response.body.pipeThrough(new TransformStream({
  transform(chunk, controller) {
    controller.enqueue(process(chunk));
  }
})));

// 3. Cache expensive operations
const cache = caches.default;
const cached = await cache.match(request);
if (cached) return cached;
```
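
Quick win 3 only shows the read side; on a miss, the generated response can be written back without delaying the client via `ctx.waitUntil`. A minimal sketch, assuming a hypothetical `generateResponse` for the application logic:

```typescript
export default {
  async fetch(request: Request, env: unknown, ctx: ExecutionContext): Promise<Response> {
    const cache = caches.default;
    const cached = await cache.match(request);
    if (cached) return cached;

    const response = await generateResponse(request);
    // Write back asynchronously; clone() because a Response body
    // can only be consumed once.
    ctx.waitUntil(cache.put(request, response.clone()));
    return response;
  }
};

// Hypothetical application logic; replace with your own handler.
async function generateResponse(request: Request): Promise<Response> {
  return new Response('hello', { headers: { 'Cache-Control': 'max-age=60' } });
}
```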

## Critical Rules

1. **Stay under CPU limits** - 10ms CPU time (Free plan), 30s default per request (Paid plan, configurable)
2. **Minimize cold starts** - Keep bundles < 1MB, avoid dynamic imports
3. **Use Cache API** - Cache responses at the edge
4. **Stream large payloads** - Don't buffer entire responses
5. **Batch operations** - Combine multiple KV/D1 calls (see the sketch below)
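
A sketch of rule 5: run independent KV reads in parallel with `Promise.all` and send multiple D1 statements as one `batch()` round trip. The `env.KV`/`env.DB` bindings and the `items` table are illustrative assumptions.

```typescript
interface Env {
  KV: KVNamespace;
  DB: D1Database;
}

// Parallel KV reads instead of sequential awaits
async function readMany(env: Env, keys: string[]): Promise<(string | null)[]> {
  return Promise.all(keys.map((key) => env.KV.get(key)));
}

// Group D1 writes into a single batched round trip
async function writeMany(env: Env, rows: { id: string; name: string }[]) {
  const stmt = env.DB.prepare('INSERT INTO items (id, name) VALUES (?, ?)');
  return env.DB.batch(rows.map((row) => stmt.bind(row.id, row.name)));
}
```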

## Top 10 Performance Errors

| Error | Symptom | Fix |
|-------|---------|-----|
| CPU limit exceeded | Worker terminated | Optimize hot paths, use streaming |
| Cold start latency | First request slow | Reduce bundle size, avoid top-level await |
| Memory pressure | Slow GC, timeouts | Stream data, avoid large arrays |
| KV latency | Slow reads | Use Cache API, batch reads |
| D1 slow queries | High latency | Add indexes, optimize SQL |
| Large bundles | Slow cold starts | Tree-shake, code split |
| Blocking operations | Request timeouts | Use Promise.all, streaming |
| Unnecessary cloning | Memory spike | Only clone when needed |
| Missing cache | Repeated computation | Implement caching layer |
| Sync operations | CPU spikes | Use async alternatives |
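
To illustrate the "Blocking operations" row: independent upstream calls should be started together rather than awaited one by one (the URLs below are placeholders).

```typescript
// ❌ Bad: each await blocks the next; total latency is the sum
// const users = await fetch('https://api.example.com/users');
// const posts = await fetch('https://api.example.com/posts');

// ✅ Good: start both requests, then await them together;
// total latency is roughly that of the slower request
const [users, posts] = await Promise.all([
  fetch('https://api.example.com/users'),
  fetch('https://api.example.com/posts'),
]);
```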

## CPU Optimization

### Profile Hot Paths

```typescript
async function profiledHandler(request: Request): Promise<Response> {
  const timing: Record<string, number> = {};

  // Note: in Workers, Date.now() does not advance during synchronous
  // execution (timers are coarsened for security), so this measures
  // time spent awaiting I/O, which is usually what dominates.
  const time = async <T>(name: string, fn: () => Promise<T>): Promise<T> => {
    const start = Date.now();
    const result = await fn();
    timing[name] = Date.now() - start;
    return result;
  };

  // fetchData, processData, and serialize stand in for app-specific stages
  const data = await time('fetch', () => fetchData());
  const processed = await time('process', () => processData(data));
  const response = await time('serialize', () => serialize(processed));

  console.log('Timing:', timing);
  return new Response(response);
}
```
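
One way to surface these timings without digging through logs is the standard `Server-Timing` header, which browser dev tools render per request; a sketch building on the `timing` record above:

```typescript
// Attach the collected timings as a Server-Timing header
function withServerTiming(response: Response, timing: Record<string, number>): Response {
  const value = Object.entries(timing)
    .map(([name, ms]) => `${name};dur=${ms}`)
    .join(', ');
  const headers = new Headers(response.headers);
  headers.set('Server-Timing', value);
  return new Response(response.body, { status: response.status, headers });
}
```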

### Optimize JSON Operations

```typescript
// For large JSON, use a streaming parser (e.g. @streamparser/json)
import { JSONParser } from '@streamparser/json';

async function parseStreamingJSON(stream: ReadableStream<Uint8Array>): Promise<unknown[]> {
  const parser = new JSONParser();
  const results: unknown[] = [];

  // onValue fires for values at every nesting depth; keep only top-level
  // elements. (The callback shape varies between library versions.)
  parser.onValue = ({ value, stack }) => {
    if (stack.length <= 1) results.push(value);
  };

  // ReadableStream is async-iterable in the Workers runtime
  for await (const chunk of stream) {
    parser.write(chunk);
  }

  return results;
}
```
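
Hypothetical usage, parsing a large JSON array from an upstream response without buffering the raw text:

```typescript
const upstream = await fetch('https://api.example.com/large-array.json');
if (!upstream.body) throw new Error('Upstream response had no body');
const values = await parseStreamingJSON(upstream.body);
```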

## Memory Optimization

### Avoid Large Arrays

```typescript
// ❌ Bad: Loads all into memory
const items = await db.prepare('SELECT * FROM items').all();
const processed = items.results.map(transform);

// ✅ Good: Process in batches
async function* batchProcess(db: D1Database, batchSize = 100) {
  let offset = 0;
  while (true) {
    const { results } = await db
      .prepare('SELECT * FROM items LIMIT ? OFFSET ?')
      .bind(batchSize, offset)
      .all();

    if (results.length === 0) break;

    for (const item of results) {
      yield transform(item);
    }
    offset += batchSize;
  }
}
```
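
The generator pairs naturally with a streamed response, so rows reach the client as each batch is processed instead of accumulating in memory. A sketch reusing `batchProcess` and assuming the `DB` binding:

```typescript
async function exportItems(env: { DB: D1Database }): Promise<Response> {
  const encoder = new TextEncoder();
  const stream = new ReadableStream<Uint8Array>({
    async start(controller) {
      // Emit newline-delimited JSON, one row at a time
      for await (const item of batchProcess(env.DB)) {
        controller.enqueue(encoder.encode(JSON.stringify(item) + '\n'));
      }
      controller.close();
    }
  });
  return new Response(stream, {
    headers: { 'Content-Type': 'application/x-ndjson' }
  });
}
```

Note that `LIMIT ?/OFFSET ?` pagination rescans skipped rows on every page; for large tables, keyset pagination (`WHERE id > ?` ordered by `id`) is usually cheaper.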

## Caching Strategies

### Multi-Layer Cache

```typescript
interface CacheLayer {
  get(key: string): Promise<unknown | null>;
  set(key: string, value: unknown, ttl?: number): Promise<void>;
}

// Layer 1: In-memory (request-scoped)
const memoryCache = new Map<string, unknown>();

// Layer 2: Cache API (edge-local)
const edgeCache: CacheLayer = {
  async get(key) {
    const response = await caches.default.match(new Request(`https://cache/${key}`));
    return response ? response.json() : null;
  },
  async set(key, value, ttl = 60) {
    await caches.default.put(
      new Request(`https://cache/${key}`),
      new Response(JSON.stringify(value), {
        headers: { 'Cache-Control': `max-age=${ttl}` }
      })
    );
  }
};

// Layer 3: KV (global)
// Use env.KV.get/put
```
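
A read-through helper can consult the layers cheapest-first and backfill the faster layers on a hit from a slower one. A sketch under the assumptions above, plus a `KV` binding and JSON-serializable values:

```typescript
async function readThrough<T>(
  env: { KV: KVNamespace },
  key: string,
  loader: () => Promise<T>,
  ttl = 60
): Promise<T> {
  // Layer 1: request-scoped memory
  if (memoryCache.has(key)) return memoryCache.get(key) as T;

  // Layer 2: edge-local Cache API
  const edge = await edgeCache.get(key);
  if (edge !== null) {
    memoryCache.set(key, edge);
    return edge as T;
  }

  // Layer 3: global KV
  const kv = await env.KV.get<T>(key, 'json');
  if (kv !== null) {
    memoryCache.set(key, kv);
    await edgeCache.set(key, kv, ttl);
    return kv;
  }

  // Miss everywhere: compute and populate all layers
  const value = await loader();
  memoryCache.set(key, value);
  await edgeCache.set(key, value, ttl);
  await env.KV.put(key, JSON.stringify(value), { expirationTtl: ttl });
  return value;
}
```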

## Bundle Optimization

```typescript
// 1. Tree-shake imports
// ❌ Bad: pulls the entire library into the bundle
// import * as lodash from 'lodash';

// ✅ Good
import { debounce } from 'lodash-es';

// 2. Lazy load heavy dependencies
// (whether import() actually defers work depends on your bundler's
// code-splitting configuration; some setups inline it into the bundle)
let heavyLib: typeof import('heavy-lib') | undefined;

async function getHeavyLib() {
  if (!heavyLib) {
    heavyLib = await import('heavy-lib');
  }
  return heavyLib;
}
```
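
Related to the cold-start advice in the rules above: anything awaited at module scope runs on every cold start, so expensive setup can instead be memoized on first use. A minimal sketch with a hypothetical `loadConfig`:

```typescript
interface Config {
  apiBase: string;
}

// Hypothetical expensive setup (e.g. parsing a large embedded blob)
async function loadConfig(): Promise<Config> {
  return { apiBase: 'https://api.example.com' };
}

// ❌ Bad: top-level await runs during every cold start
// const config = await loadConfig();

// ✅ Good: initialize lazily; the promise is shared across requests
// within the same isolate
let configPromise: Promise<Config> | undefined;

function getConfig(): Promise<Config> {
  configPromise ??= loadConfig();
  return configPromise;
}
```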

## When to Load References

Load specific references based on the task:

- **Optimizing CPU usage?** → Load `references/cpu-optimization.md`
- **Memory issues?** → Load `references/memory-optimization.md`
- **Implementing caching?** → Load `references/caching-strategies.md`
- **Reducing bundle size?** → Load `references/bundle-optimization.md`
- **Cold start problems?** → Load `references/cold-starts.md`

## Templates

| Template | Purpose | Use When |
|----------|---------|----------|
| `templates/performance-middleware.ts` | Performance monitoring | Adding timing/profiling |
| `templates/caching-layer.ts` | Multi-layer caching | Implementing cache |
| `templates/optimized-worker.ts` | Performance patterns | Starting optimized worker |

## Scripts

| Script | Purpose | Command |
|--------|---------|---------|
| `scripts/benchmark.sh` | Load testing | `./benchmark.sh <url>` |
| `scripts/profile-worker.sh` | CPU profiling | `./profile-worker.sh` |

## Resources

- Performance: https://developers.cloudflare.com/workers/platform/performance/
- Limits: https://developers.cloudflare.com/workers/platform/limits/
- Caching: https://developers.cloudflare.com/workers/runtime-apis/cache/

Overview

This skill helps optimize Cloudflare Workers for CPU, memory, caching, and bundle size to reduce latency, cold starts, and timeout errors. It provides practical techniques, templates, and scripts to diagnose hot paths and apply quick wins in production Workers. Use it when you encounter high latency, CPU limits, memory pressure, or slow cold starts.

How this skill works

The skill inspects request/response handling, hot code paths, JSON and streaming operations, cache usage, and bundle composition. It guides profiling of CPU hot spots, batching and streaming to reduce memory use, and layered caching strategies (in-memory, Cache API, KV). It also suggests bundle-shrinking tactics and when to lazy-load heavy dependencies.

When to use it

  • Worker terminated due to CPU limit exceeded or frequent CPU throttling
  • High first-request latency or noticeable cold start delays
  • Memory pressure, GC pauses, or timeout errors during large payload handling
  • Repeated slow reads from KV or D1 causing user-visible latency
  • Large bundle size causing slow deployments or startup

Best practices

  • Profile hot paths with lightweight timing instrumentation and focus optimization there
  • Stream large payloads instead of buffering to avoid memory spikes and improve throughput
  • Cache expensive or repeated computations at the edge using a multi-layer approach
  • Batch KV/D1 operations and avoid per-item synchronous calls
  • Keep bundles under 1MB, tree-shake imports, and lazy-load heavy libs

Example use cases

  • Reduce cold start for an API Worker by removing top-level await and trimming dependencies
  • Fix CPU limit errors by profiling request handler stages and streaming JSON parsing
  • Lower memory usage when migrating bulk DB exports by processing rows in batches
  • Improve latency for reads by adding an edge Cache API layer in front of KV
  • Shrink bundle size by replacing lodash with lodash-es and importing specific functions

FAQ

What CPU budget should I target for Workers?

Stay under 10ms of CPU time on the Free plan; the Paid plan defaults to 30 seconds per request (configurable). In practice, profile hot paths and keep typical requests well below these ceilings.

When should I use streaming versus batching?

Use streaming for large payloads to minimize memory and GC pressure. Use batching for large data sets from DBs where streaming is not supported or when you must limit per-call overhead.