home / skills / zpankz / mcp-skillset / suno-song

suno-song skill

/suno-song

This skill transforms inputs into optimized Suno V5 prompts for fast, reliable song generation with template-based guidance.

npx playbooks add skill zpankz/mcp-skillset --skill suno-song

Review the files below or copy the command above to add this skill to your agents.

Files (16)
skill.md
28.3 KB
---
name: suno-song
description: Transform diverse inputs (YouTube videos/audio, existing Suno songs, raw lyrics, audio files, or conversational ideas) into highly optimized Suno V5 custom song generation prompts with intelligent character optimization (5000 lyrics, 1000 style limits), proven metatag reliability, V5-enhanced emotion tags, and template-based best practices. Use when user wants to create music with Suno, provides content for song generation, or needs help crafting effective V5 prompts.
---

# Suno Song Generator (V5 Optimized)

Transform any input into optimized, copy-paste ready prompts for suno.com/create using Suno V5's advanced capabilities and proven success patterns.

## QUICK START

**Model**: Suno V5 (chirp-crow) - Released September 2025
**Goal**: Generate copy-paste Suno V5 prompts with 90%+ success rate

**Core Approach**: Template-based generation with quality-focused optimization

**Key Principle**: Generate immediately, refine iteratively (not question-heavy)

**V5 Character Limits**:
- Lyrics: 5000 characters maximum
- Style: 1000 characters maximum
- Title: 100 characters maximum
- **Recommended**: Stay concise (2000-3500 lyrics, 100-300 style) for best audio quality

## When to Use This Skill

Activate when the user:
- Wants to create a song with Suno AI
- Provides a YouTube video/audio URL for song inspiration
- Shares audio files to analyze for style/mood
- Has lyrics that need formatting for Suno
- Describes a song idea in conversation
- Asks "create a [genre] song about [topic]"
- Provides an existing Suno song URL to remix/analyze
- Needs help with Suno prompt engineering

**Keywords**: suno, song, music generation, lyrics, create music, generate song, song prompt

## How It Works

### Core Workflow

```
Input Detection → Template Matching → Prompt Building →
3 Variations → Copy-Paste Ready Output
```

**Time to First Output**: <15 seconds (simple inputs)

---

## STEP 1: INPUT DETECTION & PROCESSING

### 1.1 YouTube URL Processing

When user provides a YouTube URL (music video, audio, tutorial):

**Process**:

1. **Validate URL** and check yt-dlp availability:
   ```bash
   # Check installation
   which yt-dlp || command -v yt-dlp

   # If not installed (macOS):
   brew install yt-dlp

   # If not installed (Linux):
   pip3 install yt-dlp
   ```

2. **Extract transcript**:
   ```bash
   # Get video title for file naming
   VIDEO_TITLE=$(yt-dlp --print "%(title)s" "$URL" | tr '/' '_' | tr ':' '-' | tr '?' '' | tr '"' '')

   # Download auto-generated subtitles
   yt-dlp --write-auto-sub --skip-download --sub-langs en --output "temp_transcript" "$URL"

   # Clean VTT format (remove duplicates, timestamps, metadata)
   python3 -c "
   import re
   seen = set()
   with open('temp_transcript.en.vtt', 'r') as f:
       for line in f:
           line = line.strip()
           if line and not line.startswith('WEBVTT') and not line.startswith('Kind:') and not line.startswith('Language:') and '-->' not in line:
               clean = re.sub('<[^>]*>', '', line)
               clean = clean.replace('&amp;', '&').replace('&gt;', '>').replace('&lt;', '<')
               if clean and clean not in seen:
                   print(clean)
                   seen.add(clean)
   " > transcript_clean.txt

   rm -f temp_transcript.en.vtt
   ```

3. **Detect intent**:
   - **If transcript contains song lyrics** (repeated patterns, verse/chorus structure):
     - Proceed to lyrics formatting
   - **If transcript is conversational/tutorial** (no obvious song structure):
     - Extract themes and concepts
     - Ask user: "This appears to be [type of content]. Should I use these themes to create song lyrics, or treat it differently?"

4. **Handle long transcripts** (>1250 chars):
   - Apply smart compression (see Section 2)
   - Notify user: "Transcript is X characters, compressed to Y characters for Suno limits"

### 1.2 Audio File Processing (Optional)

When user provides audio file (.mp3, .wav, .m4a, etc.):

**Tiered Approach**:

**Tier 1 - Metadata Only** (no extra dependencies):
```bash
# Extract basic info using ffprobe
ffprobe -v quiet -print_format json -show_format "$AUDIO_FILE"
# Gets: duration, bitrate, format
```

**Tier 2 - Simple Analysis** (optional, requires pydub):
```python
# Only if user wants energy/tempo estimates
from pydub import AudioSegment
import numpy as np

audio = AudioSegment.from_file(audio_path)
rms_energy = audio.rms  # Loudness

# Rough energy classification
if rms_energy > threshold_high:
    energy = "High Energy"
elif rms_energy > threshold_medium:
    energy = "Medium Energy"
else:
    energy = "Low Energy"
```

**Tier 3 - Advanced** (opt-in only, requires librosa):
- Only offer if user explicitly wants genre detection
- Show warning: "This requires installing librosa (~200MB). Continue? (y/n)"
- If analysis fails, fall back to asking user

**Fallback Strategy**:
```
If any analysis fails → Ask user directly:
"What genre/mood/style is this audio?"
```

### 1.3 Text/Lyrics Input

When user provides raw lyrics or text:

**Detection Steps**:

1. **Check for existing metatags**:
   - If `[Intro]`, `[Verse]`, `[Chorus]` etc. present → parse structure
   - Validate metatag syntax and placement

2. **Detect structure from patterns**:
   ```python
   # Look for repeated sections (likely chorus)
   # Count line patterns
   # Identify verse-like sections (unique, longer)
   ```

3. **If unstructured**:
   - Treat as concepts/themes
   - Match to appropriate genre template
   - Show user: "I'm treating this as [genre] based on the themes. Adjust?"

### 1.4 Conversational Input

When user provides vague idea ("create a sad piano song"):

**Process**:

1. **Immediate template matching**:
   - Analyze keywords (sad, piano, slow, etc.)
   - Select closest matching template
   - Show template selection: "I'm using our 'Piano Ballad Melancholic' template"

2. **Generate immediately** (don't ask many questions):
   - Create 3 variations from template
   - Show previews within 10 seconds

3. **Refine after seeing results**:
   - User can then say "make it slower" or "add strings"
   - Adjust and regenerate

**Question Limits**:
- Maximum 3 questions before first generation
- Only ask if absolutely necessary:
  - Genre (if not obvious)
  - Mood (if ambiguous for genre)
  - Vocals vs. Instrumental (if unclear)
- Never ask about specific instruments, structures, parameters before first output

---

## STEP 2: CHARACTER LIMIT COMPLIANCE

### V5 Character Limits

```yaml
v5_maximums:
  lyrics: 5000 characters (generous limit)
  style: 1000 characters (generous limit)
  title: 100 characters

v5_recommended_optimal:
  lyrics: 2000-3500 characters (best audio quality)
  style: 100-300 characters (concise is better)
  title: 30-60 characters

principle: "Brevity produces better results despite generous limits"
```

**Quality-Focused Targets**:
- Lyrics optimal: 2500 characters (balance of detail and clarity)
- Style optimal: 150 characters (clear without over-specification)
- Safety margins: 100 chars buffer for lyrics, 50 for style

### Smart Compression Algorithm (V5)

**When to Compress**:
- **Mandatory**: Input > 5000 chars (hard limit violation)
- **Recommended**: Input > 3500 chars (quality optimization)
- **Optional**: Input 2500-3500 chars (offer to user)
- **Unnecessary**: Input < 2500 chars (already optimal)

**Priority Order** (what to keep):
1. **All Choruses** (most important - repetition, catchiness)
2. **Verses 1 & 2** (narrative, essential storytelling)
3. **Bridges** (can compress while keeping essence)
4. **Additional Verses** (3+) - compress or remove
5. **Intro/Outro** (simplify or remove descriptions)

**Compression Techniques**:

```yaml
verbose_metatags:
  before: "[Short Instrumental Intro with ambient pads and soft piano]"
  after: "[Intro]"
  savings: ~45 characters

redundant_content:
  before: "5 verses saying similar things"
  after: "2-3 strongest verses"
  savings: ~500-1000 characters

wordy_lyrics:
  before: "I am currently feeling so incredibly sad about the situation"
  after: "This situation breaks my heart"
  savings: ~25 chars, better impact

instrumental_descriptions:
  before: "[Melodic Instrumental section featuring guitar and strings with emotional depth]"
  after: "[melodic interlude]"
  savings: ~60 characters
```

**Example Compression**:
```
Input: 4200 characters (over optimal range)
Target: 2800 characters (optimal quality range)
After compression: 2800 characters
Method: 3 verses → 2 verses, simplified metatags, condensed bridge
Quality impact: Improved (tighter, more focused, cleaner audio)
```

---

---

## STEP 3: V5-ENHANCED FEATURES

### V5 Improvements to Leverage

```yaml
prompt_adherence: 90% success rate (up from ~75% in V4.5)

metatag_reliability: 10-15% improvement across all tags

emotion_parsing: NEW tags work excellently
  - haunting (85% reliability)
  - joyful (85%)
  - somber (85%)
  - ethereal (80%)
  - bittersweet (75%)
  - triumphant (85%)

vocal_quality: Near-human
  - Natural breaths and pauses
  - Better pronunciation
  - Authentic phrasing
  - Tighter mix integration

genre_fusion: Stable 2-genre combinations
  - Pop + EDM (95% success)
  - Gospel + Trap (90%)
  - Jazz + Hip Hop (90%)

studio_features:
  - Replace Section (fix parts without full regeneration)
  - Extend (less drift than V4.5)
  - Remaster (subtle/normal/high)
  - Stem export (2-12 stems)
  - MIDI export
```

### Optimal Lyric Structure for V5

```yaml
syllables_per_line: 6-12 (V5 vocal engine optimized for this)

example_good:
  "Walking down the memory lane" (8 syllables)
  "Sunshine fades to gentle rain" (8 syllables)
  "Every moment feels so far away" (10 syllables)

example_avoid:
  "Lost" (1 syllable - too short)
  "I am currently in the process of contemplating" (14 syllables - too long)

consistency_matters: Match syllable counts within sections for natural flow
```

---

## STEP 4: TEMPLATE MATCHING

### Template-Based Generation

**Core Templates** (25+ available in assets/templates/):

```yaml
acoustic:
  - indie_folk_melancholic
  - folk_ballad_storytelling
  - acoustic_pop_uplifting

electronic:
  - edm_energetic_drop
  - synthwave_retro_nostalgic
  - ambient_atmospheric_calm

hip_hop:
  - hip_hop_storytelling
  - trap_aggressive_energy
  - lo_fi_hip_hop_chill

rock:
  - indie_rock_anthemic
  - alternative_rock_emotional
  - soft_rock_ballad

pop:
  - pop_uplifting_catchy
  - synth_pop_80s
  - pop_ballad_emotional

jazz_soul:
  - jazz_smooth_sophisticated
  - neo_soul_groovy
  - jazz_hop_fusion

orchestral:
  - orchestral_epic_cinematic
  - classical_piano_peaceful
  - chamber_music_intimate

fusion:
  - gospel_trap_triumphant
  - jazz_electronic_fusion
  - rock_orchestral_hybrid
```

**Template Selection**:
1. Analyze input keywords, themes, mood
2. Match to closest template (cosine similarity or keyword matching)
3. Show user: "Using [Template Name] as foundation"
4. Allow user to change template if desired

**Template Structure**:
- Optimized style field (≤115 chars)
- Proven metatag sequence (Tier 1 reliability)
- Character budget breakdown
- Common pitfalls to avoid

---

## STEP 5: METATAG RELIABILITY SYSTEM (V5 ENHANCED)

### 3-Tier Reliability Matrix for V5

**V5 Improvement**: 10-15% better metatag reliability vs V4.5

#### Tier 1 - RELIABLE (Use Confidently)

```yaml
proven_structure_tags:
  - [Verse] (90% reliability, +10% vs V4.5)
  - [Chorus] (95% reliability, +10% vs V4.5)
  - [Bridge] (85% reliability, +10% vs V4.5)

proven_alternatives:
  - [Short Instrumental Intro] (90% reliability)
  - [Catchy Hook] (90% reliability)
  - [melodic interlude] (85% reliability, lowercase works)
  - [Big Finish] (85% reliability, better than [Outro])
```

#### Tier 2 - MODERATE (Use with Caution)

```yaml
improved_in_v5:
  - [Pre-Chorus] (70% reliability, +5% vs V4.5)
  - [Post-Chorus] (65% reliability, +5%)
  - [Instrumental] (75% reliability, +5%)
  - [Mood: Uplifting] (75% reliability, +15%)
  - [Energy: High] (70% reliability, +10%)

v5_new_emotion_tags (HIGH effectiveness):
  - [Mood: haunting] (85% reliability)
  - [Mood: joyful] (85%)
  - [Mood: somber] (85%)
  - [Mood: ethereal] (80%)
  - [Mood: bittersweet] (75%)
  - [Mood: triumphant] (85%)
```

#### Tier 3 - UNRELIABLE (Warn User)

```yaml
still_problematic_in_v5:
  - [Intro] (40% reliability - slight improvement, still use [Short Instrumental Intro])
  - [Fade Out] (40% reliability)
  - [Outro] (45% reliability - use [Big Finish] instead)

v5_note: "Even with improvements, these tags remain unreliable"
```

**V5 Warning Message**:
```
⚠️ V5 has improved metatag reliability to ~90% for core tags, but AI
may still interpret structure creatively. If not followed:
1. Use Studio's Replace Section feature (V5 advantage!)
2. Regenerate with simpler Tier 1 tags only
3. Simplify structure (fewer sections)
4. Accept and work with AI's interpretation
```

---

## STEP 6: STYLE FIELD OPTIMIZATION (V5)

### Rules for V5's 1000-Character Limit

**V5 Reality**: 1000 chars allowed, but **100-300 chars produces best results**

**Optimal Format**:
```
[Primary Genre], [Secondary Genre/Mood], [Tempo/Energy], [Instrument 1-3], [Vocal Style], [Optional: BPM]
```

**Examples by Length**:

```yaml
minimal (50-80 chars): Works great for simple songs
  - "Pop, Uplifting, Female Vocals, Guitar"  (43 chars)
  - "Indie Folk, Melancholic, Acoustic, Male"  (44 chars)

optimal (80-150 chars): RECOMMENDED for most songs
  - "Indie Pop, Electronic, Uplifting, Bittersweet, Female Vocals, Synth Pads, Acoustic Guitar, 115 BPM"  (100 chars)
  - "Jazz Hip Hop, Smooth, Introspective, Saxophone, 808 Bass, Spoken Word Male"  (77 chars)

detailed (150-300 chars): For complex fusions
  - "Gospel Trap, Triumphant, High Energy, Choir Harmonies, 808 Bass, Violin Strings, Female Lead with Choir Background, 115 BPM, Clear Mix, Vocal Focus"  (154 chars)

excessive (300+ chars): AVOID - causes artifacts
  - Problem: Too many specifications confuse AI
  - Result: Muddy mix, conflicting elements
  - V5 improvement: Better than V4.5 but still problematic
```

**V5 Optimization Rules**:
- **3-5 tags optimal** (despite 1000 char limit, more ≠ better)
- **Most important tag first** (V5 weighs first tag 60% influence)
- **2-genre fusion max** (V5 handles well, 3+ dilutes)
- **No filler words**: Remove "and", "with", "featuring"
- **Comma-separated**: Clean format
- **Avoid conflicts**: Check incompatible combinations

**Tag Conflicts to Avoid**:
```yaml
incompatible_pairs:
  - "Very Slow" + "High Energy"
  - "Minimal" + "Orchestral"
  - "Acoustic" + "Heavy Electronic"
  - "Calm" + "Aggressive"
  - "Lo-fi" + "Crystal Clear Production"
```

---

## STEP 7: VARIATION GENERATION (V5 OPTIMIZED)

### 3 Differentiated Variations

Generate simultaneously to leverage V5's improved reliability:

#### Variation 1: Conservative (Safe, Proven)

```yaml
goal: "Maximum predictability, V5's highest success rate"
style_tags: 3-4 core genre tags only
style_length: 50-80 characters
metatags: ONLY Tier 1 reliable tags
structure: Standard template (no deviations)
lyrics_length: 1500-2500 chars (optimal quality range)
v5_success_rate: ~98% (V5's reliability boost)
user_type: "I want something that definitely works first time"
```

**Example**:
```
Style: "Piano Ballad, Melancholic, Male Vocal"  (42 chars)
Lyrics: ~2000 chars with simple structure
Structure: [Short Instrumental Intro] → [Verse] → [Chorus] → [Verse] → [Chorus] → [Big Finish]
```

#### Variation 2: Balanced (Recommended)

```yaml
goal: "Optimal quality-creativity balance, leverages V5 improvements"
style_tags: 4-5 tags + 1 fusion element + V5 emotion tag
style_length: 100-200 characters
metatags: Tier 1 + V5-enhanced emotion tags + selective Tier 2
structure: Standard + 1-2 unique elements
lyrics_length: 2000-3500 chars (V5 sweet spot)
v5_success_rate: ~92% (improved from 85% in V4.5)
user_type: "I want professional quality with personality"
```

**Example**:
```
Style: "Indie Pop, Electronic, Uplifting, Bittersweet, Female Vocals, Synth Pads, Acoustic Guitar, 115 BPM, Clear Mix"  (118 chars)
Lyrics: ~2500 chars with nuanced structure
Structure: [Short Intro] → [Verse] → [Pre-Chorus] → [Chorus] [triumphant] → [Verse] → [Bridge] [ethereal] → [Chorus] → [Big Finish]
V5 Feature: Inline mood tags ([Chorus] [triumphant]) work 70% of time
```

#### Variation 3: Experimental (Creative, V5 Fusion Power)

```yaml
goal: "Creative exploration, leverages V5 fusion stability"
style_tags: 5-7 tags + unexpected 2-genre fusion + V5 emotion tags
style_length: 150-300 characters
metatags: Tier 1 + Tier 2 + V5 inline style hints
structure: Non-standard, progressive arrangement
lyrics_length: 3000-4500 chars (V5 handles complexity better)
v5_success_rate: ~80% (up from 60% in V4.5)
user_type: "I want something unique that pushes boundaries"
```

**Example**:
```
Style: "Gospel Trap, Triumphant, Bittersweet, 808 Bass, Choir Harmonies, Violin Strings, Female Lead, Chopped Vocals, 115 BPM, Vocal Focus, Ethereal Atmosphere"  (161 chars)
Lyrics: ~3500 chars with complex structure
Structure: [Atmospheric Intro] → [Verse] [intimate, minimal] → [Catchy Hook] → [Verse] [building energy] → [Bridge] [remove drums, choir only] → [Final Chorus] [maximum intensity] → [Big Finish]
V5 Features: Inline style hints, complex fusion, extended length
```

---

## STEP 8: OUTPUT GENERATION (V5 FORMAT)

### Copy-Paste Ready Format (V5)

**Standard V5 Output Template**:

```markdown
# Suno V5 Prompt - [Song Title]

## ✅ RECOMMENDED (Variation 2 - Balanced)

### Copy this into suno.com/create:

**Model:** V5 (chirp-crow)

**Title:**
[Song Title Here]  (X/100 chars)

**Style of Music:**
[Optimized style field, 100-200 chars optimal]  (X/1000 chars)

**Lyrics:**
[Formatted lyrics with V5-optimized metatags, 2000-3500 chars optimal]  (X/5000 chars)

**Custom Mode:** ✓ ON

---

### Quality Assessment (V5)

- **Quality Score:** 9.2/10
- **V5 Success Probability:** 92% (first generation)
- **Reliability:** HIGH (Tier 1 tags + V5 emotion tags)
- **Character Optimization:** Style: 118/1000 (optimal) | Lyrics: 2,347/5000 (optimal)
- **Expected Outcome:** Professional, natural vocals, balanced creativity

### What to Expect with V5

✓ Highly reliable metatag following (~90% core tags)
✓ Natural vocal delivery with proper breaths/phrasing
✓ Clean mix with clear instrument separation
✓ If structure varies slightly, use Studio's Replace Section
⚠️ V5 may interpret creatively - regenerate if needed

---

## Alternative Variations

<details>
<summary>Variation 1 (Conservative) - Click to expand</summary>

**Title:** [Same title]

**Style:** [Simpler, 3-4 tags only]

**Lyrics:** [Same structure, simplified metatags]

**Score:** 8.5/10 | Reliability: VERY HIGH

</details>

<details>
<summary>Variation 3 (Experimental) - Click to expand</summary>

**Title:** [Same or creative variation]

**Style:** [Complex fusion, 5-6 tags]

**Lyrics:** [Non-standard structure, creative metatags]

**Score:** 8.2/10 | Reliability: MEDIUM (may need regeneration)

</details>

---

## V5 Studio Integration & Troubleshooting

**V5 Advantages - Use These Features:**

1. **Replace Section** (10 credits):
   - Fix specific section without regenerating whole song
   - Add prompt: "Make this section more energetic"
   - Smooth crossfades automatically

2. **Extend** (10 credits):
   - Song ended early? Extend with better consistency than V4.5
   - Less drift, maintains voice/style

3. **Remaster** (10 credits):
   - Subtle: Minor cleanup (recommended first try)
   - Normal: Balanced polish
   - High: Significant change (experimental)

**If Results Don't Match Vision:**

1. **Structure ignored?** → Use Studio's Replace Section (V5 advantage!) or Variation 1
2. **Vocals buried?** → Remaster (Normal), or add "Vocal Focus, Clear Mix"
3. **Song ends early?** → Use Extend feature (improved in V5)
4. **Section needs fixing?** → Replace Section instead of full regeneration
5. **Too weird?** → Try Variation 1 or reduce weirdness parameter
6. **Not creative enough?** → Try Variation 3 with higher weirdness (0.6-0.7)

**V5-Specific Issues & Fixes:**

- **Lyric hallucination** (rare in V5):
  - V5 improvement: Better lyric adherence
  - If occurs: Use Replace Section feature for that part

- **Vocals can't be heard** (less common in V5):
  - V5's better mix, but still possible with heavy instrumentation
  - Fix: Remaster (Normal) often solves this automatically

- **Generic auto-lyrics** (still an issue):
  - V5 auto-lyrics unchanged from V4.5
  - Solution: Always provide custom lyrics

- **Early cutoff** (occasional V5 issue):
  - Symptom: Song ends at 2:30 instead of expected 3:30
  - Fix: Use Extend feature (works much better in V5)

---

Generated by Suno Song Skill | Based on proven templates & community best practices
```

---

## ADVANCED FEATURES

### Refinement Loop

After user sees first output:

**Process**:
1. Show all 3 variations
2. User selects one and suggests changes
3. Adjust specific elements:
   - "Make it slower" → adjust tempo tags
   - "Add strings" → add to style field (check char limit)
   - "Remove bridge" → restructure lyrics
4. Regenerate updated variations
5. Repeat max 3 times (then suggest starting fresh)

**Exit Conditions**:
- User says "looks good" or "finalize"
- User explicitly requests to "start over"
- 3 refinement cycles completed

### Playlist/Album Mode (Future Enhancement)

If user requests multiple songs:

**Process**:
1. Define overarching theme and mood arc
2. Ensure consistency (same core style, voice)
3. Add variation for interest (different tempos, structures)
4. Generate all prompts at once

---

## DEPENDENCIES (V5 COMPATIBLE)

### Installation Tiers

**Tier 0 - Always Required** (lightweight, V5 text-based):
```bash
pip install pyyaml requests
# <30 seconds, works for text-based V5 prompt generation
# Sufficient for 80% of use cases
```

**Tier 1 - YouTube Features** (optional):
```bash
brew install yt-dlp  # macOS
# or
pip3 install yt-dlp  # Universal
# Triggered when user provides YouTube URL
```

**Tier 2 - Basic Audio** (optional):
```bash
pip install pydub
# Requires ffmpeg (usually pre-installed)
# Triggered when user provides audio file
```

**Tier 3 - Advanced Audio** (opt-in only):
```bash
pip install librosa scipy
# ~200MB download, 5-10 min install
# Only with explicit user consent
# Triggered by "advanced audio analysis" request
```

### Graceful Degradation

**If libraries not installed**:
- Skill works fine with text-only inputs
- Prompts user: "Enable [feature] by installing [library]?"
- Provides clear install instructions
- Falls back to asking user for genre/style

---

## V5 BEST PRACTICES

### Suno V5-Specific Wisdom

**From V5 Community Testing & Official Guidance**:

1. **Brevity Wins (Despite Generous Limits)**:
   - Style 100-300 chars → cleanest audio (despite 1000 limit)
   - Lyrics 2000-3500 chars → optimal quality (despite 5000 limit)
   - "Very short prompts create cleanest audio quality" - Community consensus

2. **First Tag = 60% Influence**:
   - V5 weighs first tag most heavily in style field
   - Put most important genre/descriptor first
   - Position 2: 25% influence, Position 3: 10%, Position 4+: 5%

3. **V5 Metatag Improvements**:
   - `[Verse]`, `[Chorus]`, `[Bridge]` now 90-95% reliable
   - `[Short Instrumental Intro]` still more reliable than `[Intro]` (40%)
   - `[Big Finish]` still better than `[Outro]` (45%)
   - NEW: Inline style hints work 70%: `[Verse] [intimate, minimal]`

4. **Syllable Consistency** (V5 Vocal Engine):
   - 6-12 syllables per line optimal
   - Match counts within sections (all verses similar, all choruses similar)
   - Vary between verse/chorus for distinction

5. **V5 Studio Workflow**:
   - Generate → Review in Timeline → Replace Section (fix parts) → Extend (if short) → Remaster (polish)
   - Non-destructive: Original always preserved
   - Cost-effective: Fix sections (10 credits) vs full regeneration (10 credits)

6. **2-Genre Fusion Sweet Spot**:
   - V5 excels at 2-genre combinations (95% success for proven pairs)
   - 3+ genres dilute identity (~60% success)
   - Proven: Pop+EDM, Gospel+Trap, Jazz+Hip Hop, Rock+Orchestral

7. **V5 Emotion Tags**:
   - Use new V5-enhanced emotions: haunting, joyful, somber, ethereal, bittersweet, triumphant
   - 85% reliability (much better than generic mood tags)

8. **Regeneration Still Normal**:
   - V5 success rate: ~90% (up from ~75%)
   - 10% still need retry - this is expected
   - Use Studio features to fix instead of full regeneration

---

## EXAMPLES

### Example 1: YouTube Video Input

```
User: "Use this to create a Suno song: youtube.com/watch?v=abc123"

Output:
✓ Detected: YouTube video
✓ Extracted transcript (1,234 words)
✓ Detected song lyrics in transcript
✓ Structure: Verse-Chorus-Verse-Bridge-Chorus
✓ Compressed to 1,147 chars
✓ Generated 3 variations

[Shows complete formatted prompts]
```

### Example 2: Conversational Input

```
User: "Create a sad piano song about lost love"

Output:
Using "Piano Ballad Melancholic" template

[Immediately shows 3 variations within 10 seconds]

Variation 1 (Conservative): "Piano Ballad, Sad, Male Vocal"
Variation 2 (Balanced): "Piano, Strings, Melancholic, Male, Slow"
Variation 3 (Experimental): "Neo-Classical Piano, Bittersweet, Whisper"

[Full prompts ready to copy]
```

### Example 3: Raw Lyrics Input

```
User: [Provides 500-line poem]

Output:
⚠️ Input is 3,200 characters (exceeds 1,250 limit)
Applying smart compression...
✓ Compressed to 1,180 chars
✓ Kept: All choruses, main verses, key bridge
✓ Shortened: Intro, outro, verbose metatags

[Shows compressed lyrics with structure]
```

---

## REFERENCE MATERIALS (V5)

See `/references` directory for detailed V5 guides:

- `suno_v5_features.md` - Complete V5 feature set and capabilities
- `metatag_reliability_guide.md` - V5-updated 3-tier metatag system with reliability improvements
- `suno_v5_best_practices.md` - V5-specific prompting strategies and community wisdom
- `compression_strategy.md` - V5 quality optimization (optional compression)
- `template_library.md` - 25+ V5-optimized genre templates (coming soon)
- `troubleshooting_guide.md` - V5-specific problem solving (coming soon)

---

## QUALITY ASSURANCE

### Validation Checklist

Before outputting any prompt, verify:

```yaml
character_limits:
  - ✓ Lyrics ≤ 1250 chars
  - ✓ Style ≤ 120 chars
  - ✓ Title ≤ 80 chars

metatag_quality:
  - ✓ 80%+ Tier 1 tags used
  - ✓ No Tier 3 tags (unless experimental variation)
  - ✓ Proper metatag syntax: [Tag] format

style_field_quality:
  - ✓ 3-5 core tags
  - ✓ Most important tag first
  - ✓ No conflicting tags
  - ✓ No filler words

structure_logic:
  - ✓ Logical section flow
  - ✓ Verse/chorus differentiation
  - ✓ Appropriate song length
```

---

## SKILL PHILOSOPHY (V5)

**Core Principles**:

1. **Template-Based Reliability**: Start with V5-optimized proven patterns, customize from there
2. **Quality Over Quantity**: Use limits wisely (5000/1000), but stay concise (2500/150) for best audio
3. **Generate-First**: Show options immediately, refine iteratively (not question-heavy)
4. **Leverage V5 Strengths**: 90% prompt adherence, emotion tags, Studio features
5. **User Empowerment**: Provide 3 variations (conservative/balanced/experimental)
6. **Non-Destructive Refinement**: Use V5 Studio features (Replace Section, Extend, Remaster)

**V5 Success Metrics**:
- 90%+ prompts work first generation (V5 reliability)
- Character optimization (not just limits): 2500 lyrics, 150 style targets
- Studio-assisted perfection: 98%+ final satisfaction with Replace Section capability

---

## TROUBLESHOOTING

### Installation Issues

**yt-dlp not found**:
```bash
# macOS with Homebrew
brew install yt-dlp

# Linux apt
sudo apt update && sudo apt install -y yt-dlp

# Universal pip
pip3 install yt-dlp
```

**Audio libraries fail**:
- Fallback to text-only mode (still fully functional)
- Prompt user with genre/style questions instead
- Skill works perfectly without audio analysis

### Usage Issues

**Character limit exceeded**:
- Should never happen (validation prevents this)
- If it does: Report as bug, compression algorithm needs adjustment

**Structure not followed by Suno**:
- Normal AI behavior
- Try Variation 1 (simpler tags)
- Regenerate 2-3 times
- Accept AI interpretation (sometimes it's better!)

**Can't hear vocals**:
- Common issue with dense instrumentals
- Add "Vocal Focus" or "Clear Mix" to style
- Remove heavy instruments from style field
- Regenerate

---

This skill generates Suno prompts using proven community knowledge and strict character limit compliance. All outputs are copy-paste ready for suno.com/create.

For issues or feedback, refer to troubleshooting guides or update templates based on real-world testing.

Overview

This skill converts YouTube videos, audio files, raw lyrics, existing Suno songs, or conversational ideas into highly optimized Suno V5 prompts ready to copy-paste into suno.com/create. It enforces V5 character limits, applies template-based best practices, and includes V5-enhanced emotion tags and reliable metatag patterns for higher generation success rates. Outputs are focused on practical, ready-to-use lyrics, style fields, and title suggestions.

How this skill works

The skill detects input type (YouTube URL, audio file, raw text, or conversation), extracts or summarizes content, and matches it to an appropriate template. It builds a copy-paste V5 prompt with optimized lyrics (up to 5000 chars), a concise style field (up to 1000 chars), and proven metatags, then produces 3 variations for quick review. If inputs exceed recommended lengths, the skill applies priority-aware compression to preserve choruses and core narrative.

When to use it

  • You want a ready-to-use Suno V5 prompt from a YouTube video or audio file
  • You have raw lyrics that need formatting, compression, or structure tagging
  • You want to turn a conversational idea into a song prompt quickly
  • You need help crafting an optimal V5 style field and reliable metatags
  • You want multiple prompt variations to iterate fast in Suno

Best practices

  • Keep lyrics between 2000–3500 characters for best audio quality (V5 allows up to 5000)
  • Keep style fields concise (100–300 chars recommended) and list 3–5 comma-separated tags
  • Prioritize keeping all choruses and first two verses when compressing long lyrics
  • Place the most important genre/mood first—V5 weighs the first tag most heavily
  • Use Tier 1 metatags ([Verse], [Chorus], [Bridge]) and prefer 'Short Instrumental Intro' over generic [Intro]

Example use cases

  • Convert a YouTube music video transcription into a Suno-ready lyric prompt with compressed structure
  • Analyze an uploaded wav/mp3 to extract mood and suggest a matching template and style field
  • Turn a chat description like 'sad piano ballad about loss' into three V5-optimized prompts
  • Remix an existing Suno song URL by extracting structure and generating variations with new emotion tags
  • Compress overly long lyrics to preserve hooks and narrative while meeting V5 quality targets

FAQ

What if my transcript is longer than V5 limits?

The skill applies smart compression prioritizing choruses and core verses, then reports new character counts and rationale.

How many tags should I put in the style field?

Use 3–5 concise comma-separated tags; place the primary genre/mood first for best influence.