home / skills / benchflow-ai / skillsbench / ffmpeg-audio-processing

This skill helps you extract, normalize, mix, and filter audio with FFmpeg to enhance and tailor audio tracks.

npx playbooks add skill benchflow-ai/skillsbench --skill ffmpeg-audio-processing

Review the files below or copy the command above to add this skill to your agents.

Files (1)
SKILL.md
3.7 KB
---
name: FFmpeg Audio Processing
description: Extract, normalize, mix, and process audio tracks - audio manipulation and analysis
---

# FFmpeg Audio Processing Skill

Extract, normalize, mix, and process audio tracks from video files.

## When to Use

- Extract audio from video
- Normalize audio levels
- Mix multiple audio tracks
- Convert audio formats
- Extract specific channels
- Adjust audio volume

## Extract Audio

```bash
# Extract as MP3
ffmpeg -i video.mp4 -vn -acodec libmp3lame -q:a 2 audio.mp3

# Extract as AAC (copy, no re-encode)
ffmpeg -i video.mp4 -vn -c:a copy audio.aac

# Extract as WAV (uncompressed)
ffmpeg -i video.mp4 -vn -acodec pcm_s16le audio.wav

# Extract specific audio stream
ffmpeg -i video.mp4 -map 0:a:1 -c:a copy audio2.aac
```

## Normalize Audio

```bash
# Normalize loudness (ITU-R BS.1770-4)
ffmpeg -i input.mp4 -af "loudnorm=I=-23:TP=-1.5:LRA=11" output.mp4

# Simple normalization
ffmpeg -i input.mp4 -af "volume=2.0" output.mp4

# Peak normalization
ffmpeg -i input.mp4 -af "volumedetect" -f null -
# Then use the detected peak to normalize
ffmpeg -i input.mp4 -af "volume=0.5" output.mp4
```

## Volume Adjustment

```bash
# Increase volume by 6dB
ffmpeg -i input.mp4 -af "volume=6dB" output.mp4

# Decrease volume by 3dB
ffmpeg -i input.mp4 -af "volume=-3dB" output.mp4

# Set absolute volume
ffmpeg -i input.mp4 -af "volume=0.5" output.mp4
```

## Channel Operations

```bash
# Extract left channel
ffmpeg -i stereo.mp3 -map_channel 0.0.0 left.mp3

# Extract right channel
ffmpeg -i stereo.mp3 -map_channel 0.0.1 right.mp3

# Convert stereo to mono
ffmpeg -i stereo.mp3 -ac 1 mono.mp3

# Convert mono to stereo
ffmpeg -i mono.mp3 -ac 2 stereo.mp3
```

## Mix Audio Tracks

```bash
# Replace audio track
ffmpeg -i video.mp4 -i audio.mp3 -c:v copy -map 0:v:0 -map 1:a:0 output.mp4

# Mix two audio tracks
ffmpeg -i video.mp4 -i audio2.mp3 \
  -filter_complex "[0:a][1:a]amix=inputs=2:duration=first" \
  -c:v copy output.mp4

# Mix with volume control
ffmpeg -i video.mp4 -i bgm.mp3 \
  -filter_complex "[0:a]volume=1.0[voice];[1:a]volume=0.3[music];[voice][music]amix=inputs=2:duration=first" \
  -c:v copy output.mp4
```

## Audio Delay

```bash
# Delay audio by 0.5 seconds
ffmpeg -i video.mp4 -itsoffset 0.5 -i video.mp4 \
  -map 0:v -map 1:a -c copy output.mp4

# Using adelay filter (milliseconds)
ffmpeg -i input.mp4 -af "adelay=500|500" output.mp4
```

## Sample Rate Conversion

```bash
# Resample to 48kHz
ffmpeg -i input.mp4 -af "aresample=48000" output.mp4

# Resample audio only
ffmpeg -i input.mp4 -vn -af "aresample=48000" -ar 48000 audio.wav
```

## Audio Filters

```bash
# High-pass filter (remove low frequencies)
ffmpeg -i input.mp4 -af "highpass=f=200" output.mp4

# Low-pass filter (remove high frequencies)
ffmpeg -i input.mp4 -af "lowpass=f=3000" output.mp4

# Band-pass filter
ffmpeg -i input.mp4 -af "bandpass=f=1000:width_type=h:w=500" output.mp4

# Fade in/out
ffmpeg -i input.mp4 -af "afade=t=in:st=0:d=2,afade=t=out:st=8:d=2" output.mp4
```

## Audio Analysis

```bash
# Detect volume levels
ffmpeg -i input.mp4 -af "volumedetect" -f null -

# Measure loudness (LUFS)
ffmpeg -i input.mp4 -af "ebur128=peak=true" -f null -

# Get audio statistics
ffprobe -v error -select_streams a:0 \
  -show_entries stream=sample_rate,channels,bit_rate \
  -of json input.mp4
```

## Combine Multiple Audio Files

```bash
# Concatenate audio files
ffmpeg -i "concat:audio1.mp3|audio2.mp3|audio3.mp3" -c copy output.mp3

# Using file list
# file 'audio1.mp3'
# file 'audio2.mp3'
ffmpeg -f concat -safe 0 -i list.txt -c copy output.mp3
```

## Notes

- Use `-vn` to disable video when extracting audio
- `-map` to select specific streams
- `-af` for audio filters
- `-ac` for channel count
- `-ar` for sample rate
- Normalization may require two passes for accurate results

Overview

This skill provides practical FFmpeg commands and patterns to extract, normalize, mix, and process audio tracks from video or audio files. It focuses on common tasks like extraction, loudness normalization, channel operations, mixing, resampling, filtering, and basic analysis. The content is hands-on and command-oriented so you can apply fixes or automation quickly.

How this skill works

The skill demonstrates FFmpeg CLI usage and recommended flags to operate on audio streams: -vn and -map to select streams, -af for audio filters, and -c or -ac/-ar to control codecs, channels, and sample rates. It shows single-pass and two-pass normalization, mixing with filter_complex and amix, channel mapping, delays, resampling, filtering (high/low/band-pass), concatenation, and basic analysis with ffmpeg and ffprobe.

When to use it

  • Extract audio tracks from video files for editing or republishing
  • Normalize loudness to meet broadcast or streaming targets (LUFS/ITU-R BS.1770)
  • Mix voice and background tracks with volume control
  • Convert sample rate or change channel layout (stereo/mono)
  • Delay or align audio to fix lip-sync problems
  • Run quick audio analysis (volumedetect, ebur128) before mastering

Best practices

  • Use -vn to exclude video when creating audio-only files to avoid re-encoding video streams
  • Prefer loudnorm for LUFS-targeted normalization; expect two-pass workflows for accuracy
  • Use -map and -map_channel to explicitly select streams or channels to avoid accidental inclusions
  • When mixing, control relative levels with volume filters before amix to prevent clipping
  • When concatenating, match codec/format or use re-encoding if input formats differ
  • Check results with ffprobe or volumedetect/ebur128 to verify loudness and peaks

Example use cases

  • Extract a clean MP3 or WAV from a recorded webinar for podcast publishing
  • Normalize all episode files to -23 LUFS or a platform-specific LUFS target
  • Replace camera audio with a higher-quality narration track while keeping the original video stream
  • Mix dialog and background music with different volume levels and produce a single output
  • Resample audio to 48 kHz and convert stereo to mono for voice-only platforms
  • Detect loudness and peak statistics before final mastering or submission

FAQ

Do I always need two passes to normalize loudness?

Two-pass loudnorm yields more accurate LUFS and true peak results. For quick fixes you can use single-pass volume adjustments, but expect less precision.

How do I avoid clipping when mixing tracks?

Reduce individual track volumes with the volume filter before amix, and consider adding a limiter or lowering master gain to keep peaks below 0 dBFS.