media-processing skill

/.claude/skills/media-processing

This skill processes video, audio, and images with FFmpeg, ImageMagick, and RMBG to encode, convert formats, generate thumbnails, and batch optimize media.

npx playbooks add skill jjuidev/jss --skill media-processing

Files (23)
SKILL.md
---
name: media-processing
description: Process media with FFmpeg (video/audio), ImageMagick (images), RMBG (AI background removal). Use for encoding, format conversion, filters, thumbnails, batch processing, HLS/DASH streaming.
license: MIT
---

# Media Processing Skill

Process video, audio, and images using FFmpeg, ImageMagick, and RMBG CLI tools.

## Tool Selection

| Task | Tool | Reason |
|------|------|--------|
| Video encoding/conversion | FFmpeg | Native codec support, streaming |
| Audio extraction/conversion | FFmpeg | Direct stream manipulation |
| Image resize/effects | ImageMagick | Optimized for still images |
| Background removal | RMBG | AI-powered, local processing |
| Batch images | ImageMagick | mogrify for in-place edits |
| Video thumbnails | FFmpeg | Frame extraction built-in |
| GIF creation | FFmpeg/ImageMagick | FFmpeg for video, ImageMagick for images |

## Installation

```bash
# macOS
brew install ffmpeg imagemagick
npm install -g rmbg-cli

# Ubuntu/Debian
sudo apt-get install ffmpeg imagemagick
npm install -g rmbg-cli

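# Verify (ImageMagick 6, still packaged by some distros, has no `magick` binary; use `convert -version` there)
ffmpeg -version && magick -version && rmbg --version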
```
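
A pipeline can fail early when a tool is missing by running a small preflight check. The sketch below relies only on the shell built-in `command -v`; the function name `require_tools` is illustrative and not part of the skill.

```bash
# Hypothetical helper: report every missing tool and return non-zero if any is absent.
require_tools() {
  local missing=0 tool
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage: stop a script before any media is touched.
# require_tools ffmpeg magick rmbg || exit 1
```

Listing all missing tools in one pass, rather than stopping at the first, saves repeated install-and-retry cycles on a fresh machine.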

## Essential Commands

```bash
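# Video: Convert/re-encode
# Remux only (no re-encode); works when the source streams are codecs MP4 can carry
ffmpeg -i input.mkv -c copy output.mp4
# Re-encode to H.264 video and AAC audio
ffmpeg -i input.avi -c:v libx264 -crf 22 -c:a aac output.mp4

# Video: Extract audio (stream copy; assumes the source audio is already AAC)
ffmpeg -i video.mp4 -vn -c:a copy audio.m4a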

# Image: Convert/resize
magick input.png output.jpg
magick input.jpg -resize 800x600 output.jpg

# Image: Batch resize (mogrify overwrites the originals in place)
mogrify -resize 800x -quality 85 *.jpg

# Background removal
rmbg input.jpg                          # Basic (modnet)
rmbg input.jpg -m briaai -o output.png  # High quality
rmbg input.jpg -m u2netp -o output.png  # Fast
```

## Key Parameters

**FFmpeg:**
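- `-c:v libx264` - H.264 video codec (x264 encoder)
- `-crf 22` - Constant Rate Factor (0-51 for 8-bit x264; lower = higher quality and larger files; x264 default is 23)
- `-preset slow` - Encoder speed vs. compression efficiency (slower presets give smaller files at the same CRF)
- `-c:a aac` - AAC audio codec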

**ImageMagick:**
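- `800x600` - Fit within 800x600 (aspect ratio preserved)
- `800x600^` - Fill 800x600 (aspect ratio preserved; the image overflows one dimension and is cropped only if followed by `-gravity center -extent 800x600`)
- `-quality 85` - JPEG quality (1-100)
- `-strip` - Remove metadata (profiles, comments)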

**RMBG:**
- `-m briaai` - High quality model
- `-m u2netp` - Fast model
- `-r 4096` - Max resolution
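
The fill geometry (`^`) only guarantees that the image covers the target box; producing an exact-size thumbnail takes a centered crop as well. The following is a minimal sketch of that combination, assuming ImageMagick 7. The function name `cover_crop` and the `MAGICK` override variable are hypothetical conveniences, not part of ImageMagick.

```bash
# Hypothetical wrapper: resize to cover WxH, then center-crop to exactly WxH.
# Assumes ImageMagick 7 (`magick`); set MAGICK=convert on ImageMagick 6.
cover_crop() {
  local src=$1 dst=$2 size=$3
  ${MAGICK:-magick} "$src" \
    -resize "${size}^" \
    -gravity center -extent "$size" \
    -strip -quality 85 \
    "$dst"
}

# Usage:
# cover_crop photo.jpg thumb.jpg 800x600
```

Setting `MAGICK=echo` prints the argument list instead of running it, which is a cheap way to review the command before processing real files.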

## References

Detailed guides in `references/`:
- `ffmpeg-encoding.md` - Codecs, quality, hardware acceleration
- `ffmpeg-streaming.md` - HLS/DASH, live streaming
- `ffmpeg-filters.md` - Filters, complex filtergraphs
- `imagemagick-editing.md` - Effects, transformations
- `imagemagick-batch.md` - Batch processing, parallel ops
- `rmbg-background-removal.md` - AI models, CLI usage
- `common-workflows.md` - Video optimization, responsive images, GIF creation
- `troubleshooting.md` - Error fixes, performance tips
- `format-compatibility.md` - Format support, codec recommendations

Overview

This skill provides a compact, practical toolkit for processing video, audio, and images using FFmpeg, ImageMagick, and RMBG CLI tools. It focuses on encoding, format conversion, filters, thumbnails, batch edits, background removal, and HLS/DASH streaming workflows. Use it to automate media pipelines and generate optimized outputs for web and broadcast.

How this skill works

The skill wraps common CLI workflows: FFmpeg for video/audio encoding, extraction, streaming and filterchains; ImageMagick for single-image edits and bulk operations via mogrify; and RMBG for AI-powered background removal. It exposes recommended command patterns, codec/quality parameters, and model choices so you can run reproducible conversions and batch jobs from scripts or CI. Examples include re-encoding with presets, extracting frames for thumbnails, resizing image batches, and removing backgrounds with selectable models.

When to use it

  • Re-encode or convert video containers and codecs for compatibility.
  • Extract audio tracks or produce podcast-ready files from video sources.
  • Generate thumbnails, GIFs, or sprite sheets from video frames.
  • Batch-resize or compress large image sets for responsive web delivery.
  • Remove image backgrounds locally for product photos or design assets.
  • Prepare HLS/DASH outputs for adaptive streaming and live/delayed playback.
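
For the HLS case, a single-rendition video-on-demand package could look like the sketch below. It uses FFmpeg's `hls` muxer options (`-hls_time`, `-hls_playlist_type`, `-hls_segment_filename`). The function name `make_hls`, the 6-second segment target, the file names, and the `FFMPEG` override variable are illustrative assumptions. It writes one media playlist only; a multi-bitrate master playlist needs additional renditions and is not covered here.

```bash
# Hypothetical helper: package one file as a single-rendition VOD HLS stream.
# FFMPEG can be overridden (for example FFMPEG=echo) to preview the command.
make_hls() {
  local src=$1 outdir=$2
  mkdir -p "$outdir"
  ${FFMPEG:-ffmpeg} -i "$src" \
    -c:v libx264 -crf 22 -preset slow \
    -c:a aac \
    -f hls -hls_time 6 -hls_playlist_type vod \
    -hls_segment_filename "$outdir/seg_%03d.ts" \
    "$outdir/index.m3u8"
}

# Usage:
# make_hls event.mp4 public/hls/event
```

Segments are cut at keyframes, so `-hls_time` is a target duration rather than an exact one.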

Best practices

  • Prefer stream copy (-c copy) when only the container changes; it avoids re-encoding time and generation loss.
  • Use CRF with a preset (e.g., -crf 22 -preset slow) for consistent visual quality; file size then varies with the content.
  • Batch-process images with mogrify, which overwrites files in place, so work on copies or write results elsewhere with -path to avoid data loss.
  • Strip metadata (-strip) and set JPEG quality (-quality 85) to reduce file size for web images.
  • Use RMBG model flags (-m briaai for quality, -m u2netp for speed) and cap resolution (-r) to manage memory and performance.
  • Test a few representative files before running large-scale batch jobs and automate retries for transient CLI failures.
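
One way to keep originals safe during batch work is mogrify's `-path` option, which writes results to another directory instead of overwriting the inputs. This is a sketch under assumptions: the function name `batch_resize`, the default width of 800, and the `MOGRIFY` override variable are hypothetical, and only `*.jpg` files are matched.

```bash
# Hypothetical helper: write resized copies to a separate directory,
# leaving the originals untouched.
# On ImageMagick 7 without a standalone mogrify, set MOGRIFY="magick mogrify".
batch_resize() {
  local srcdir=$1 outdir=$2 width=${3:-800}
  mkdir -p "$outdir"
  ${MOGRIFY:-mogrify} -path "$outdir" \
    -resize "${width}x" -strip -quality 85 \
    "$srcdir"/*.jpg
}

# Usage:
# batch_resize photos/original photos/web 1200
```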

Example use cases

  • Convert a camera .mov archive to H.264 MP4s with consistent audio codec for CDN upload.
  • Extract a clean audio track from lecture recordings for transcription or podcasting.
  • Bulk resize and compress product images for responsive sites while keeping originals in S3 or an archive.
  • Produce 10 thumbnails per video at fixed intervals for gallery previews or video players.
  • Remove backgrounds from user-submitted photos for marketplace listings, choosing quality vs speed per workload.
  • Create HLS segments and master playlists for adaptive streaming of recorded events.
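
The fixed-interval thumbnail case can be sketched as two small functions: one computes evenly spaced timestamps, the other extracts a frame at each. The names `thumb_times` and `make_thumbs`, the output file pattern, and the choice to skip the first and last instant of the clip are assumptions for illustration. The duration comes from `ffprobe`, which ships with FFmpeg.

```bash
# Hypothetical helper: print COUNT evenly spaced timestamps (in seconds),
# skipping the very start and end of the clip.
thumb_times() {
  awk -v d="$1" -v n="$2" \
    'BEGIN { for (i = 1; i <= n; i++) printf "%.2f\n", d * i / (n + 1) }'
}

# Hypothetical helper: grab one JPEG frame per timestamp.
make_thumbs() {
  local src=$1 outdir=$2 count=${3:-10} duration i=0 t
  mkdir -p "$outdir"
  duration=$(ffprobe -v error -show_entries format=duration \
    -of default=noprint_wrappers=1:nokey=1 "$src") || return 1
  for t in $(thumb_times "$duration" "$count"); do
    i=$((i + 1))
    # -ss before -i seeks the input; -frames:v 1 writes a single frame.
    ffmpeg -nostdin -y -ss "$t" -i "$src" -frames:v 1 -q:v 2 \
      "$outdir/$(printf 'thumb_%02d.jpg' "$i")"
  done
}

# Usage:
# make_thumbs lecture.mp4 thumbs 10
```

Keeping the timestamp arithmetic in its own function makes it easy to check the spacing before any frames are extracted.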

FAQ

Which tool should I pick for video vs images?

Use FFmpeg for all video/audio tasks including frame extraction. Use ImageMagick for high-quality still-image effects and fast bulk edits. Use RMBG specifically for AI background removal on images.

How do I balance quality vs speed when encoding?

Use CRF to target quality and a preset to set encoder speed: a lower CRF gives higher quality and larger files, and at the same CRF a faster preset shortens encoding time but usually produces a larger file. Test several combinations on representative clips.
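
A small comparison run makes that testing concrete. The sketch below encodes the first 20 seconds of a clip at a few CRF/preset pairs and lists the resulting file sizes. The function name `compare_settings`, the chosen CRF values and presets, the 20-second sample length, and the `FFMPEG` override variable are all assumptions for illustration.

```bash
# Hypothetical helper: encode a short sample at several CRF/preset pairs
# and list the output sizes for comparison.
compare_settings() {
  local src=$1 outdir=$2 crf preset out
  mkdir -p "$outdir"
  for crf in 20 23 26; do
    for preset in fast slow; do
      out="$outdir/crf${crf}_${preset}.mp4"
      # -t 20 before -i reads only the first 20 seconds; -an drops audio.
      ${FFMPEG:-ffmpeg} -nostdin -y -t 20 -i "$src" \
        -c:v libx264 -crf "$crf" -preset "$preset" -an "$out"
    done
  done
  ls -l "$outdir"
}

# Usage:
# compare_settings sample.mp4 /tmp/crf-test
```

File size is only half of the comparison; the outputs still need to be viewed side by side to judge quality.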