This skill helps you design resilient queue architectures in NestJS with BullMQ for media processing, including clear patterns and robust error handling.

npx playbooks add skill shipshitdev/library --skill nestjs-queue-architect

---
name: nestjs-queue-architect
version: 1.0.0
technology: BullMQ 5.61.0 with NestJS 11.1.7
description: Queue job management patterns, processors, and async workflows for video/image processing
expertise_level: senior
last_updated: 2025-10-22
---

# NestJS Queue Architect - BullMQ Expert

You are a **senior queue architect** specializing in BullMQ with NestJS. Design resilient, scalable job processing systems for media-heavy workflows.

## Technology Stack

- **BullMQ**: 5.61.0 (Redis-backed job queue)
- **@nestjs/bullmq**: 11.0.4
- **@bull-board/nestjs**: 6.13.1 (Queue monitoring UI)

## Project Context Discovery

Before implementing:

1. Check `.agents/SYSTEM/ARCHITECTURE.md` for queue patterns
2. Review existing queue services and constants
3. Look for a project-specific `[project]-queue-architect` skill

## Core Patterns

### Queue Constants

```typescript
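export const QUEUE_NAMES = {
  VIDEO_PROCESSING: 'video-processing',
  IMAGE_PROCESSING: 'image-processing',
} as const;

// Job names used for routing inside processors.
// The string values are illustrative; only the keys are referenced in this skill.
export const JOB_TYPES = {
  RESIZE: 'resize',
  MERGE: 'merge',
} as const;

// BullMQ treats a lower number as a higher priority.
export const JOB_PRIORITY = {
  HIGH: 1,    // User-facing
  NORMAL: 5,  // Standard
  LOW: 10,    // Background
} as const;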
```

### Queue Service

```typescript
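import { Injectable } from '@nestjs/common';
import { InjectQueue } from '@nestjs/bullmq';
import { Queue } from 'bullmq';

@Injectable()
export class VideoQueueService {
  constructor(
    @InjectQueue(QUEUE_NAMES.VIDEO_PROCESSING) private readonly queue: Queue,
  ) {}

  // Enqueue a resize job; retries back off exponentially from a 2 s base delay.
  async addJob(data: VideoJobData) {
    return this.queue.add(JOB_TYPES.RESIZE, data, {
      priority: JOB_PRIORITY.NORMAL,
      attempts: 3,
      backoff: { type: 'exponential', delay: 2000 },
    });
  }
}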
```

### Processor (WorkerHost)

```typescript
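import { Processor, WorkerHost } from '@nestjs/bullmq';
import { Job } from 'bullmq';

@Processor(QUEUE_NAMES.VIDEO_PROCESSING)
export class VideoProcessor extends WorkerHost {
  // BullMQ calls process() for every job on this queue; route on the job name.
  async process(job: Job<VideoJobData>) {
    switch (job.name) {
      case JOB_TYPES.RESIZE: return this.handleResize(job);
      case JOB_TYPES.MERGE: return this.handleMerge(job);
      // Throwing marks the job as failed instead of silently dropping it.
      default: throw new Error(`Unknown job: ${job.name}`);
    }
  }

  // handleResize() and handleMerge() are omitted here.
}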
```

## Key Principles

1. **One service per queue type** - Encapsulate job options
2. **Switch-based routing** - Route by `job.name`
3. **Structured error handling** - Log, emit a WebSocket event, publish to Redis, then re-throw so BullMQ can retry
4. **Always cleanup** - Temp files in try/finally
5. **Idempotent handlers** - Safe to retry
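
Principles 3 and 4 can be sketched together. This is a minimal sketch, not the skill's own implementation: `withTempDir` and `demo` are hypothetical names, and the WebSocket and Redis notifications are left out so the example stays dependency-free.

```typescript
import { existsSync } from 'node:fs';
import { mkdtemp, rm, writeFile } from 'node:fs/promises';
import { tmpdir } from 'node:os';
import { join } from 'node:path';

// Hypothetical helper: run `work` with a scratch directory that is always removed.
export async function withTempDir<T>(
  prefix: string,
  work: (dir: string) => Promise<T>,
): Promise<T> {
  const dir = await mkdtemp(join(tmpdir(), prefix));
  try {
    return await work(dir);
  } catch (err) {
    // Log with context, then re-throw so the queue can apply its retry policy.
    console.error(`[${prefix}] job failed:`, err);
    throw err;
  } finally {
    // Runs on success and on failure; `force` makes the removal safe to repeat.
    await rm(dir, { recursive: true, force: true });
  }
}

// Usage sketch: a failing handler still leaves no scratch directory behind.
export async function demo(): Promise<boolean> {
  let seen = '';
  try {
    await withTempDir('resize-', async (dir) => {
      seen = dir;
      await writeFile(join(dir, 'frame.tmp'), 'partial output');
      throw new Error('simulated transcoder crash');
    });
  } catch {
    // Expected: the error still reaches the caller (BullMQ would schedule a retry).
  }
  return existsSync(seen); // whether the scratch directory survived cleanup
}
```

Inside a processor, `handleResize` could wrap its body in `withTempDir` so that every retry starts from a clean directory.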

## Queue Configuration

```typescript
BullModule.registerQueue({
  name: QUEUE_NAMES.VIDEO_PROCESSING,
  defaultJobOptions: {
    attempts: 3,
    backoff: { type: 'exponential', delay: 2000 },
    removeOnComplete: 100,  // Keep only the 100 most recent completed jobs
    removeOnFail: 50,       // Keep only the 50 most recent failed jobs
  },
});
```
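
`registerQueue` relies on a Redis connection configured at the module root. The fragment below is a sketch of that wiring, assuming a local Redis at `localhost:6379` and a hypothetical `MediaQueueModule`. It also shows the object form of `removeOnComplete` and `removeOnFail`, which limits retained jobs by age (in seconds) as well as by count; the specific limits are illustrative.

```typescript
import { Module } from '@nestjs/common';
import { BullModule } from '@nestjs/bullmq';

@Module({
  imports: [
    // Shared Redis connection; host and port are assumptions for local development.
    BullModule.forRoot({
      connection: { host: 'localhost', port: 6379 },
    }),
    BullModule.registerQueue({
      name: 'video-processing',
      defaultJobOptions: {
        // Keep completed jobs for at most 1 hour and at most 100 entries.
        removeOnComplete: { age: 3600, count: 100 },
        // Keep failed jobs longer (24 hours) so they can be inspected.
        removeOnFail: { age: 24 * 3600, count: 50 },
      },
    }),
  ],
})
export class MediaQueueModule {}
```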

## Retry Strategy

| Job Type | Attempts | Delay | Reason |
|----------|----------|-------|--------|
| Resize | 3 | 2000ms | Transient failures |
| Merge | 2 | 5000ms | Resource-intensive |
| Metadata | 2 | 1000ms | Fast, fail quickly |
| Cleanup | 5 | 1000ms | Must succeed |
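
The table can be encoded as a lookup of per-job options. This sketch makes two assumptions: every job type uses exponential backoff (the table lists only a delay), and each retry waits `delay * 2^(retry - 1)`, as BullMQ's retry documentation describes for its built-in exponential strategy. `RETRY_POLICIES` and `retryDelays` are hypothetical names.

```typescript
// Shape mirrors the `attempts` and `backoff` job options used elsewhere in this skill.
type RetryPolicy = {
  attempts: number;
  backoff: { type: 'exponential'; delay: number };
};

type JobKind = 'resize' | 'merge' | 'metadata' | 'cleanup';

// Values copied from the retry table.
export const RETRY_POLICIES: Record<JobKind, RetryPolicy> = {
  resize: { attempts: 3, backoff: { type: 'exponential', delay: 2000 } },
  merge: { attempts: 2, backoff: { type: 'exponential', delay: 5000 } },
  metadata: { attempts: 2, backoff: { type: 'exponential', delay: 1000 } },
  cleanup: { attempts: 5, backoff: { type: 'exponential', delay: 1000 } },
};

// Wait (in ms) before each retry; the first attempt itself has no wait.
export function retryDelays({ attempts, backoff }: RetryPolicy): number[] {
  const retries = Math.max(attempts - 1, 0);
  return Array.from({ length: retries }, (_, i) => backoff.delay * 2 ** i);
}
```

Under these assumptions a resize job waits 2000 ms and then 4000 ms between its three attempts, while a cleanup job waits up to 8000 ms before its fifth. A policy can be spread into the options passed to `queue.add`.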

## Common Pitfalls

- **Redis memory growth**: Always set `removeOnComplete` and `removeOnFail`
- **Timeouts**: BullMQ has no per-job `timeout` option, so enforce time limits inside the handler for heavy jobs
- **Race conditions**: Make handlers idempotent
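
One way to support idempotency at enqueue time is a deterministic job id, since BullMQ documents that adding a job whose id already exists in the queue is ignored. This is a sketch under that assumption; `buildJobId` and the payload fields are hypothetical.

```typescript
import { createHash } from 'node:crypto';

// Hypothetical helper: derive a stable job id from the job type and its payload.
export function buildJobId(
  jobType: string,
  payload: Record<string, string | number>,
): string {
  // Sort keys so that property order does not change the resulting id.
  const canonical = Object.keys(payload)
    .sort()
    .map((key) => `${key}=${payload[key]}`)
    .join('&');
  const digest = createHash('sha256').update(canonical).digest('hex').slice(0, 16);
  // ':' is BullMQ's Redis key separator, so use '-' inside custom ids.
  return `${jobType}-${digest}`;
}
```

Passing `{ jobId: buildJobId('resize', { videoId, width }) }` as a job option means a duplicate submission for the same video and width is not queued twice while the first job still exists. Handlers still need to be idempotent: once a job is removed (for example by `removeOnComplete`), the same id can be added again.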

---

**For complete processor examples, testing patterns, Bull Board setup, and Redis pub/sub integration, see:** `references/full-guide.md`

Overview

This skill designs resilient, scalable BullMQ job processing patterns for media-heavy NestJS applications. It provides queue constants, service patterns, processor routing, retry strategies, and operational guidance tuned for video and image workflows. The focus is on predictable retries, cleanup, idempotency, and Redis hygiene to keep media pipelines reliable at scale.

How this skill works

The skill defines one service per queue type that encapsulates job creation and options (priority, attempts, backoff). Processors implement a switch-based router by job name inside a WorkerHost, with structured error handling and guaranteed cleanup in finally blocks. Queues are registered with sane defaultJobOptions (attempts, exponential backoff, removeOnComplete/Fail) and per-job retry strategies to balance resilience and resource usage.

When to use it

  • Building video or image processing pipelines (transcoding, resizing, merging).
  • Coordinating asynchronous media tasks that must be retried safely.
  • Running resource-heavy jobs that require backoff and timeouts.
  • When you need predictable Redis usage and operational visibility.
  • Integrating queue UI monitoring like Bull Board for observability.

Best practices

  • Create one service per queue type to encapsulate job options and APIs.
  • Route work by job.name in processors to keep handlers small and testable.
  • Make all handlers idempotent so retries and race conditions are safe.
  • Always clean up temporary files and resources in finally blocks.
  • Set removeOnComplete and removeOnFail to prevent Redis bloat.
  • Tune timeouts and attempts per job type based on resource needs.

Example use cases

  • Submit a resize job with NORMAL priority and exponential backoff for transient errors.
  • Run a merge job with fewer attempts and longer delay because it’s resource-intensive.
  • Publish metadata extraction jobs that fail fast with low attempts.
  • Use a dedicated cleanup queue with higher attempts to ensure artifacts are removed.
  • Expose Bull Board for operators to inspect job history and replay failures.

FAQ

How do I prevent Redis from growing indefinitely?

Set removeOnComplete and removeOnFail, either in the queue's defaultJobOptions or per job, and periodically prune old job data (for example with queue.clean()).

What retry strategy should I use for heavy jobs?

Use fewer attempts with a longer backoff delay (e.g., 2 attempts, 5000ms). BullMQ has no per-job timeout option, so give heavy jobs a generous time limit inside the handler to avoid premature failures.
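
One way to cap the runtime of a heavy job is to enforce the limit inside the handler itself. A minimal sketch, assuming a hypothetical `withTimeout` helper; it only stops waiting and does not cancel the underlying work, so a real handler would also need to stop any external process it started.

```typescript
// Hypothetical helper: reject if `work` has not settled within `ms` milliseconds.
export function withTimeout<T>(
  work: Promise<T>,
  ms: number,
  label = 'job',
): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`${label} timed out after ${ms}ms`)),
      ms,
    );
  });
  // Whichever settles first wins; clear the timer so it cannot keep the process alive.
  return Promise.race([work, timeout]).finally(() => clearTimeout(timer));
}
```

A merge handler could then return `withTimeout(doMerge(job), 10 * 60_000, 'merge')`, where `doMerge` stands in for the real work; the rejection fails the job and the queue's retry policy takes over.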