home / skills / jeremylongshore / claude-code-plugins-plus-skills / speak-core-workflow-b
This skill guides users through phoneme-level pronunciation training with adaptive drills, phoneme analysis, and targeted feedback to improve accent.
npx playbooks add skill jeremylongshore/claude-code-plugins-plus-skills --skill speak-core-workflow-bReview the files below or copy the command above to add this skill to your agents.
---
name: speak-core-workflow-b
description: |
Execute Speak secondary workflow: Pronunciation Training with detailed phoneme analysis.
Use when implementing pronunciation drills, speech scoring,
or targeted pronunciation improvement features.
Trigger with phrases like "speak pronunciation training",
"speak speech scoring", "secondary speak workflow".
allowed-tools: Read, Write, Edit, Bash(npm:*), Grep
version: 1.0.0
license: MIT
author: Jeremy Longshore <[email protected]>
---
# Speak Core Workflow B: Pronunciation Training
## Overview
Secondary workflow for Speak: Detailed pronunciation training with phoneme-level analysis and targeted practice.
## Prerequisites
- Completed `speak-install-auth` setup
- Familiarity with `speak-core-workflow-a`
- Valid API credentials configured
- High-quality audio input capabilities
## Instructions
### Step 1: Initialize Pronunciation Session
```typescript
// src/workflows/pronunciation-training.ts
import {
SpeakClient,
PronunciationTrainer,
PhonemeAnalyzer,
} from '@speak/language-sdk';
interface PronunciationConfig {
targetLanguage: string;
difficulty: 'beginner' | 'intermediate' | 'advanced';
focusPhonemes?: string[]; // Specific sounds to practice
category?: 'vowels' | 'consonants' | 'tones' | 'all';
}
async function initializePronunciationTraining(
client: SpeakClient,
config: PronunciationConfig
): Promise<PronunciationTrainer> {
const trainer = new PronunciationTrainer(client, {
language: config.targetLanguage,
difficulty: config.difficulty,
adaptiveMode: true, // Adjusts based on user performance
});
// Pre-load phoneme models for target language
await trainer.initialize();
// Set focus areas if specified
if (config.focusPhonemes) {
trainer.setFocusPhonemes(config.focusPhonemes);
}
return trainer;
}
```
### Step 2: Implement Drill Session
```typescript
interface DrillItem {
id: string;
text: string;
romanization?: string; // For non-Latin scripts
translation: string;
audioUrl: string;
targetPhonemes: string[];
difficulty: number;
}
interface DrillResult {
item: DrillItem;
userAudio: ArrayBuffer;
scores: PronunciationScores;
phonemeDetails: PhonemeResult[];
feedback: string[];
}
async function runPronunciationDrill(
trainer: PronunciationTrainer,
drillCount: number = 10
): Promise<DrillSession> {
const session = await trainer.startDrillSession({
itemCount: drillCount,
repeatOnMistake: true,
minScore: 70, // Repeat until 70% or better
});
const results: DrillResult[] = [];
while (!session.isComplete) {
// Get next drill item
const item = await session.getNextItem();
console.log('\n--- Pronunciation Drill ---');
console.log(`Say: "${item.text}"`);
if (item.romanization) {
console.log(`Romanization: ${item.romanization}`);
}
console.log(`Meaning: ${item.translation}`);
console.log('Listen to native pronunciation...');
// Play native audio for reference
await playAudio(item.audioUrl);
// Record user attempt
const userAudio = await recordUserAudio();
// Analyze pronunciation
const result = await session.submitAttempt({
itemId: item.id,
audioData: userAudio,
});
results.push({
item,
userAudio,
scores: result.scores,
phonemeDetails: result.phonemeDetails,
feedback: result.feedback,
});
// Display detailed feedback
displayPronunciationFeedback(result);
// Check if need to repeat
if (result.scores.overall < 70 && session.shouldRepeat) {
console.log('\nLet\'s try that one again...');
}
}
return session.getSummary();
}
```
### Step 3: Phoneme-Level Analysis
```typescript
interface PhonemeResult {
phoneme: string;
expected: string;
actual: string;
score: number;
issues: PhonemeIssue[];
visualGuide?: string; // Mouth position diagram URL
}
interface PhonemeIssue {
type: 'substitution' | 'omission' | 'addition' | 'distortion';
description: string;
tip: string;
}
function displayPronunciationFeedback(result: DrillResult) {
console.log(`\nš Pronunciation Score: ${result.scores.overall}/100`);
console.log(` Accuracy: ${result.scores.accuracy}/100`);
console.log(` Fluency: ${result.scores.fluency}/100`);
console.log(` Intonation: ${result.scores.intonation}/100`);
// Show problem phonemes
const problemPhonemes = result.phonemeDetails.filter(p => p.score < 70);
if (problemPhonemes.length > 0) {
console.log('\nš Phonemes to improve:');
for (const p of problemPhonemes) {
console.log(` [${p.phoneme}] Score: ${p.score}/100`);
for (const issue of p.issues) {
console.log(` ā ļø ${issue.type}: ${issue.description}`);
console.log(` š” Tip: ${issue.tip}`);
}
}
}
// Show overall feedback
if (result.feedback.length > 0) {
console.log('\nš¬ Feedback:');
result.feedback.forEach(f => console.log(` ⢠${f}`));
}
}
```
### Step 4: Adaptive Practice Generation
```typescript
interface WeaknessAnalysis {
phoneme: string;
averageScore: number;
attemptCount: number;
trend: 'improving' | 'stable' | 'declining';
suggestedDrills: DrillItem[];
}
async function generateAdaptivePractice(
trainer: PronunciationTrainer,
userHistory: DrillResult[]
): Promise<AdaptivePracticeSession> {
// Analyze user's weaknesses
const weaknesses = analyzeWeaknesses(userHistory);
// Generate targeted practice
const adaptiveSession = await trainer.createAdaptiveSession({
targetWeaknesses: weaknesses.map(w => w.phoneme),
intensity: 'focused', // or 'mixed' for variety
maxDuration: 15 * 60 * 1000, // 15 minutes
});
console.log('\nšÆ Adaptive Practice Plan:');
console.log(`Focus areas: ${weaknesses.map(w => w.phoneme).join(', ')}`);
return adaptiveSession;
}
function analyzeWeaknesses(results: DrillResult[]): WeaknessAnalysis[] {
const phonemeStats = new Map<string, number[]>();
// Collect scores by phoneme
for (const result of results) {
for (const p of result.phonemeDetails) {
if (!phonemeStats.has(p.phoneme)) {
phonemeStats.set(p.phoneme, []);
}
phonemeStats.get(p.phoneme)!.push(p.score);
}
}
// Identify weaknesses (average < 75)
const weaknesses: WeaknessAnalysis[] = [];
for (const [phoneme, scores] of phonemeStats) {
const avg = scores.reduce((a, b) => a + b, 0) / scores.length;
if (avg < 75) {
weaknesses.push({
phoneme,
averageScore: avg,
attemptCount: scores.length,
trend: calculateTrend(scores),
suggestedDrills: [], // Populated by trainer
});
}
}
return weaknesses.sort((a, b) => a.averageScore - b.averageScore);
}
```
## Complete Workflow Example
```typescript
async function pronunciationTrainingWorkflow() {
const client = getSpeakClient();
// Configure for Korean pronunciation (challenging for English speakers)
const config: PronunciationConfig = {
targetLanguage: 'ko',
difficulty: 'intermediate',
focusPhonemes: ['ć±', 'ć
', 'ć²'], // Korean aspirated/tense consonants
category: 'consonants',
};
console.log('Starting pronunciation training...');
console.log(`Language: ${config.targetLanguage}`);
console.log(`Focus: ${config.focusPhonemes?.join(', ') || 'General'}`);
// Initialize trainer
const trainer = await initializePronunciationTraining(client, config);
// Run initial assessment
console.log('\nš Initial Assessment...');
const assessment = await trainer.runAssessment();
console.log(`Baseline score: ${assessment.overallScore}/100`);
// Run pronunciation drills
const drillResults = await runPronunciationDrill(trainer, 10);
// Generate adaptive practice based on results
const adaptiveSession = await generateAdaptivePractice(
trainer,
drillResults.results
);
// Run adaptive practice
await adaptiveSession.run();
// Final summary
const summary = await trainer.getSessionSummary();
console.log('\n========== Training Complete ==========');
console.log(`Items practiced: ${summary.totalItems}`);
console.log(`Average score: ${summary.averageScore}/100`);
console.log(`Improvement: +${summary.improvement}%`);
return summary;
}
```
## Workflow Comparison
| Aspect | Workflow A (Conversation) | Workflow B (Pronunciation) |
|--------|---------------------------|----------------------------|
| Primary Focus | Communication | Accuracy |
| Feedback Type | Holistic | Phoneme-level |
| Session Style | Free-form dialogue | Structured drills |
| Pacing | User-driven | System-driven |
| Best For | Fluency building | Accent reduction |
## Output
- Detailed phoneme-level scores
- Visual pronunciation guides
- Adaptive practice recommendations
- Progress tracking over time
- Weakness identification
## Error Handling
| Error | Cause | Solution |
|-------|-------|----------|
| Audio Too Short | Brief recording | Minimum 0.5s audio |
| Background Noise | Poor recording conditions | Prompt for quieter environment |
| Phoneme Not Detected | Unclear speech | Slow down and articulate |
| Model Loading Failed | Network issue | Retry with fallback |
## Resources
- [Speak Pronunciation API](https://developer.speak.com/api/pronunciation)
- [Phoneme Reference](https://developer.speak.com/docs/phonemes)
- [Audio Recording Best Practices](https://developer.speak.com/docs/audio-quality)
## Next Steps
For common errors, see `speak-common-errors`.
This skill executes Speak secondary workflow B: Pronunciation Training with detailed phoneme-level analysis and adaptive practice. It guides initialization, drill sessions, phoneme diagnostics, and targeted practice generation for measurable pronunciation improvement. Use it to implement scoring, feedback, and progress tracking in language learning apps.
The skill initializes a PronunciationTrainer, preloads phoneme models, and configures focus areas and difficulty. It runs structured drill sessions where native audio is played, user audio is recorded, and each attempt is scored with phoneme-level breakdowns. Results feed an analyzer that identifies weak phonemes and generates adaptive practice sessions and targeted drills.
What input quality is required?
Use a clear microphone in a quiet room; recordings under 0.5s or heavy background noise can fail analysis.
How does adaptive practice choose drills?
It analyzes phoneme scores across attempts, ranks weaknesses (avg <75), and generates sessions that target the lowest-scoring phonemes with focused drills.