
baseline-quality-assessment skill


This skill helps you plan iteration 0 with a data-driven baseline (V_meta ≥0.40), enabling rapid convergence and clear prioritization.

npx playbooks add skill zpankz/mcp-skillset --skill baseline-quality-assessment

Copy the command above to add this skill to your agents.

---
name: Baseline Quality Assessment
description: Achieve comprehensive baseline (V_meta ≥0.40) in iteration 0 to enable rapid convergence. Use when planning iteration 0 time allocation, domain has established practices to reference, rich historical data exists for immediate quantification, or targeting 3-4 iteration convergence. Provides 4 quality levels (minimal/basic/comprehensive/exceptional), component-by-component V_meta calculation guide, and 3 strategies for comprehensive baseline (leverage prior art, quantify baseline, domain universality analysis). 40-50% iteration reduction when V_meta(s₀) ≥0.40 vs <0.20. Spend 3-4 extra hours in iteration 0, save 3-6 hours overall.
allowed-tools: Read, Grep, Glob, Bash, Edit, Write
---

# Baseline Quality Assessment

**Invest in iteration 0 to save 40-50% of total time.**

> A strong baseline (V_meta ≥0.40) is the foundation of rapid convergence. Spend hours in iteration 0 to save days overall.

---

## When to Use This Skill

Use this skill when:
- 📋 **Planning iteration 0**: Deciding time allocation and priorities
- 🎯 **Targeting rapid convergence**: Want 3-4 iterations (not 5-7)
- 📚 **Prior art exists**: Domain has established practices to reference
- 📊 **Historical data available**: Can quantify baseline immediately
- ⏰ **Time constraints**: Need methodology in 10-15 hours total
- 🔍 **Gap clarity needed**: Want obvious iteration objectives

**Don't use when**:
- ❌ Exploratory domain (no prior art)
- ❌ Greenfield project (no historical data)
- ❌ Time abundant (standard convergence acceptable)
- ❌ Incremental baseline acceptable (build up gradually)

---

## Quick Start (30 minutes)

### Baseline Quality Self-Assessment

Calculate your V_meta(s₀):

**V_meta = (Completeness + Effectiveness + Reusability + Validation) / 4**

**Completeness** (Documentation exists?):
- 0.00: No documentation
- 0.25: Basic notes only
- 0.50: Partial documentation (some categories)
- 0.75: Most documentation complete
- 1.00: Comprehensive documentation

**Effectiveness** (Speedup quantified?):
- 0.00: No baseline measurement
- 0.25: Informal estimates
- 0.50: Some metrics measured
- 0.75: Most metrics quantified
- 1.00: Full quantitative baseline

**Reusability** (Transferable patterns?):
- 0.00: No patterns identified
- 0.25: Ad-hoc solutions only
- 0.50: Some patterns emerging
- 0.75: Most patterns codified
- 1.00: Universal patterns documented

**Validation** (Evidence-based?):
- 0.00: No validation
- 0.25: Anecdotal only
- 0.50: Some data analysis
- 0.75: Systematic analysis
- 1.00: Comprehensive validation

**Example** (Bootstrap-003, V_meta(s₀) = 0.48):
```
Completeness: 0.60 (10-category taxonomy, 79.1% coverage)
Effectiveness: 0.40 (Error rate quantified: 5.78%)
Reusability: 0.40 (5 workflows, 5 patterns, 8 guidelines)
Validation: 0.50 (1,336 errors analyzed)
---
V_meta(s₀) = (0.60 + 0.40 + 0.40 + 0.50) / 4 = 0.475 ≈ 0.48
```

**Target**: V_meta(s₀) ≥ 0.40 for rapid convergence
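A minimal Python sketch of this calculation, reusing the Bootstrap-003 scores from the example above (the function is illustrative, not part of any tooling shipped with this skill):

```python
# Sketch of the V_meta self-assessment: four components scored 0.00-1.00,
# averaged with equal weights, then compared to the 0.40 target.

def v_meta(completeness: float, effectiveness: float,
           reusability: float, validation: float) -> float:
    """Average of the four baseline components."""
    return (completeness + effectiveness + reusability + validation) / 4

# Bootstrap-003 baseline from the example above:
score = v_meta(completeness=0.60, effectiveness=0.40,
               reusability=0.40, validation=0.50)
print(f"V_meta(s0) = {score:.3f}")          # 0.475, reported as 0.48
print("comprehensive baseline" if score >= 0.40 else "below target")
```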

---

## Four Baseline Quality Levels

### Level 1: Minimal (V_meta <0.20)

**Characteristics**:
- No or minimal documentation
- No quantitative metrics
- No pattern identification
- No validation

**Iteration 0 time**: 1-2 hours
**Total iterations**: 6-10 (standard to slow convergence)
**Example**: Starting from scratch in novel domain

**When acceptable**: Exploratory research, no prior art

### Level 2: Basic (V_meta 0.20-0.39)

**Characteristics**:
- Basic documentation (notes, informal structure)
- Some metrics identified (not quantified)
- Ad-hoc patterns (not codified)
- Anecdotal validation

**Iteration 0 time**: 2-3 hours
**Total iterations**: 5-7 (standard convergence)
**Example**: Bootstrap-002 (V_meta(s₀) = 0.04, but quickly built to basic)

**When acceptable**: Standard timelines, incremental approach

### Level 3: Comprehensive (V_meta 0.40-0.60) ⭐ TARGET

**Characteristics**:
- Structured documentation (taxonomy, categories)
- Quantified metrics (baseline measured)
- Codified patterns (initial pattern library)
- Systematic validation (data analysis)

**Iteration 0 time**: 3-5 hours
**Total iterations**: 3-4 (rapid convergence)
**Example**: Bootstrap-003 (V_meta(s₀) = 0.48, converged in 3 iterations)

**When to target**: Time constrained, prior art exists, data available

### Level 4: Exceptional (V_meta >0.60)

**Characteristics**:
- Comprehensive documentation (≥90% coverage)
- Full quantitative baseline (all metrics)
- Extensive pattern library
- Validated methodology (proven in 1+ contexts)

**Iteration 0 time**: 5-8 hours
**Total iterations**: 2-3 (exceptional rapid convergence)
**Example**: Hypothetical (not yet observed in experiments)

**When to target**: Adaptation of proven methodology, domain expertise high
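The level boundaries and iteration estimates above can be encoded as a small lookup; the sketch below only restates the thresholds quoted in this section:

```python
# Sketch: map a V_meta(s0) score to its quality level and the total
# iteration range quoted in the level descriptions above.

def baseline_level(v_meta: float) -> tuple[str, str]:
    """Return (quality level, expected total iterations)."""
    if v_meta < 0.20:
        return "minimal", "6-10 iterations"
    if v_meta < 0.40:
        return "basic", "5-7 iterations"
    if v_meta <= 0.60:
        return "comprehensive", "3-4 iterations"
    return "exceptional", "2-3 iterations"

print(baseline_level(0.48))   # ('comprehensive', '3-4 iterations')
```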

---

## Three Strategies for Comprehensive Baseline

### Strategy 1: Leverage Prior Art (2-3 hours)

**When**: Domain has established practices

**Steps**:

1. **Literature review** (30 min):
   - Industry best practices
   - Existing methodologies
   - Academic research

2. **Extract patterns** (60 min):
   - Common approaches
   - Known anti-patterns
   - Success metrics

3. **Adapt to context** (60 min):
   - What's applicable?
   - What needs modification?
   - What's missing?

**Example** (Bootstrap-003):
```
Prior art: Error handling literature
- Detection: Industry standard (logs, monitoring)
- Diagnosis: Root cause analysis patterns
- Recovery: Retry, fallback patterns
- Prevention: Static analysis, linting

Adaptation:
- Detection: meta-cc MCP queries (novel application)
- Diagnosis: Session history analysis (context-specific)
- Recovery: Generic patterns apply
- Prevention: Pre-tool validation (novel approach)

Result: V_completeness = 0.60 (60% from prior art, 40% novel)
```
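One way to keep the adaptation step honest is to catalogue each prior-art pattern together with how it carries over. The sketch below is illustrative; the `PriorArtPattern` structure and its entries are assumptions loosely based on the Bootstrap-003 example above, not a prescribed format:

```python
# Sketch of the "adapt to context" step: record each prior-art pattern
# and whether it applies as-is, needs adaptation, or is a gap to fill.
from dataclasses import dataclass

@dataclass
class PriorArtPattern:
    name: str
    source: str           # where the pattern comes from
    applicability: str    # "applies as-is" | "needs adaptation" | "gap"
    notes: str = ""

patterns = [
    PriorArtPattern("retry/fallback recovery", "error-handling literature", "applies as-is"),
    PriorArtPattern("root cause analysis", "error-handling literature", "needs adaptation",
                    "diagnose from session history rather than server logs"),
    PriorArtPattern("pre-tool validation", "none found", "gap",
                    "novel approach, to be developed in later iterations"),
]

needs_work = [p.name for p in patterns if p.applicability != "applies as-is"]
print(f"{len(patterns)} patterns catalogued; needs adaptation or new work: {needs_work}")
```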

### Strategy 2: Quantify Baseline (1-2 hours)

**When**: Rich historical data exists

**Steps**:

1. **Identify data sources** (15 min):
   - Logs, session history, metrics
   - Git history, CI/CD logs
   - Issue trackers, user feedback

2. **Extract metrics** (30 min):
   - Volume (total instances)
   - Rate (frequency)
   - Distribution (categories)
   - Impact (cost)

3. **Analyze patterns** (45 min):
   - What's most common?
   - What's most costly?
   - What's preventable?

**Example** (Bootstrap-003):
```
Data source: meta-cc MCP server
Query: meta-cc query-tools --status error

Results:
- Volume: 1,336 errors
- Rate: 5.78% error rate
- Distribution: File-not-found 12.2%, Read-before-write 5.2%, etc.
- Impact: MTTD 15 min, MTTR 30 min

Analysis:
- Top 3 categories account for 23.7% of errors
- File path issues most preventable
- Clear automation opportunities

Result: V_effectiveness = 0.40 (baseline quantified)
```
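A hedged sketch of the extraction step, assuming tool-call records have already been exported to a JSONL file. The field names (`status`, `error_type`) and the file name are assumptions for illustration, not the actual meta-cc output format:

```python
# Sketch: compute volume, rate, and distribution from a JSONL export of
# tool calls. Adapt the field names to whatever your data source emits.
import json
from collections import Counter

def quantify_baseline(path: str) -> dict:
    """Volume, rate, and category distribution of error records."""
    with open(path) as f:
        calls = [json.loads(line) for line in f if line.strip()]
    errors = [c for c in calls if c.get("status") == "error"]
    distribution = Counter(e.get("error_type", "unknown") for e in errors)
    return {
        "volume": len(errors),                                # total instances
        "rate": len(errors) / len(calls) if calls else 0.0,   # frequency
        "distribution": distribution.most_common(5),          # category breakdown
    }

# Usage (hypothetical file): print(quantify_baseline("tool_calls.jsonl"))
```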

### Strategy 3: Domain Universality Analysis (1-2 hours)

**When**: Domain is universal (errors, testing, CI/CD)

**Steps**:

1. **Identify universal patterns** (30 min):
   - What applies to all projects?
   - What's language-agnostic?
   - What's platform-agnostic?

2. **Document transferability** (30 min):
   - What % is reusable?
   - What needs adaptation?
   - What's project-specific?

3. **Create initial taxonomy** (30 min):
   - Categorize patterns
   - Identify gaps
   - Estimate coverage

**Example** (Bootstrap-003):
```
Universal patterns:
- Errors affect all software (100% universal)
- Detection, diagnosis, recovery, prevention (universal workflow)
- File operations, API calls, data validation (universal categories)

Taxonomy (iteration 0):
- 10 categories identified
- 1,058 errors classified (79.1% coverage)
- Gaps: Edge cases, complex interactions

Result: V_reusability = 0.40 (universal patterns identified)
```
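The coverage estimate in step 3 is simply the share of observed instances that fall into the categories defined so far. The sketch below uses illustrative per-category counts; only the 1,336 total comes from the example above:

```python
# Sketch of the taxonomy coverage estimate for iteration 0.

def taxonomy_coverage(classified: dict[str, int], total: int) -> float:
    """Fraction of observed instances covered by the current categories."""
    return sum(classified.values()) / total if total else 0.0

# Counts below are illustrative -- replace with your own classification results.
classified = {"file-not-found": 163, "read-before-write": 69, "other known categories": 826}
print(f"coverage = {taxonomy_coverage(classified, total=1336):.0%}")  # remainder = gaps
```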

---

## Baseline Investment ROI

**Trade-off**: Spend more in iteration 0 to save overall time

**Data** (from experiments):

| Baseline | Iter 0 Time | Total Iterations | Total Time | Savings |
|----------|-------------|------------------|------------|---------|
| Minimal (<0.20) | 1-2h | 6-10 | 24-40h | Baseline |
| Basic (0.20-0.39) | 2-3h | 5-7 | 20-28h | 10-30% |
| Comprehensive (0.40-0.60) | 3-5h | 3-4 | 12-16h | 40-50% |
| Exceptional (>0.60) | 5-8h | 2-3 | 10-15h | 50-60% |

**Example** (Bootstrap-003):
```
Comprehensive baseline:
- Iteration 0: 3 hours (vs 1 hour minimal)
- Total: 10 hours, 3 iterations
- Savings: 15-25 hours vs minimal baseline (60-70%)

ROI: +2 hours investment → 15-25 hours saved
```
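The trade-off in the table can be checked with simple range arithmetic; the sketch below only reuses the hour ranges quoted above and makes no new measurement claims:

```python
# Sketch: best/worst-case ROI of a comprehensive baseline vs a minimal one,
# using the hour ranges from the table above.
MINIMAL = {"iter0": (1, 2), "total": (24, 40)}
COMPREHENSIVE = {"iter0": (3, 5), "total": (12, 16)}

extra_low = COMPREHENSIVE["iter0"][0] - MINIMAL["iter0"][1]   # 1h extra at best
extra_high = COMPREHENSIVE["iter0"][1] - MINIMAL["iter0"][0]  # 4h extra at worst
saved_low = MINIMAL["total"][0] - COMPREHENSIVE["total"][1]   # 8h saved at least
saved_high = MINIMAL["total"][1] - COMPREHENSIVE["total"][0]  # 28h saved at most
print(f"extra iteration 0 investment: {extra_low}-{extra_high}h; "
      f"total time saved: {saved_low}-{saved_high}h")
```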

**Recommendation**: Target comprehensive (V_meta ≥0.40) when:
- Time constrained (need fast convergence)
- Prior art exists (can leverage quickly)
- Data available (can quantify immediately)

---

## Component-by-Component Guide

### Completeness (Documentation)

**0.00**: No documentation

**0.25**: Basic notes
- Informal observations
- Bullet points
- No structure

**0.50**: Partial documentation
- Some categories/patterns
- 40-60% coverage
- Basic structure

**0.75**: Most documentation
- Structured taxonomy
- 70-90% coverage
- Clear organization

**1.00**: Comprehensive
- Complete taxonomy
- 90%+ coverage
- Production-ready

**Target for V_meta ≥0.40**: Completeness ≥0.50

### Effectiveness (Quantification)

**0.00**: No baseline measurement

**0.25**: Informal estimates
- "Errors happen sometimes"
- No numbers

**0.50**: Some metrics
- Volume measured (e.g., 1,336 errors)
- Rate not calculated

**0.75**: Most metrics
- Volume, rate, distribution
- Missing impact (MTTD/MTTR)

**1.00**: Full quantification
- All metrics measured
- Baseline fully quantified

**Target for V_meta ≥0.40**: Effectiveness ≥0.30

### Reusability (Patterns)

**0.00**: No patterns

**0.25**: Ad-hoc solutions
- One-off fixes
- No generalization

**0.50**: Some patterns
- 3-5 patterns identified
- Partial universality

**0.75**: Most patterns
- 5-10 patterns codified
- High transferability

**1.00**: Universal patterns
- Complete pattern library
- 90%+ transferable

**Target for V_meta ≥0.40**: Reusability ≥0.40

### Validation (Evidence)

**0.00**: No validation

**0.25**: Anecdotal
- "Seems to work"
- No data

**0.50**: Some data
- Basic analysis
- Limited scope

**0.75**: Systematic
- Comprehensive analysis
- Clear evidence

**1.00**: Validated
- Multiple contexts
- Statistical confidence

**Target for V_meta ≥0.40**: Validation ≥0.30
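The per-component floors above, together with the overall V_meta ≥0.40 target, can be checked in one place. The thresholds in this sketch are the ones listed in this guide; the function itself is illustrative:

```python
# Sketch: verify each component floor and the overall V_meta >= 0.40 target.
TARGETS = {"completeness": 0.50, "effectiveness": 0.30,
           "reusability": 0.40, "validation": 0.30}

def check_baseline(scores: dict[str, float]) -> dict[str, bool]:
    """Per-component floors from this guide plus the overall target."""
    checks = {name: scores[name] >= floor for name, floor in TARGETS.items()}
    checks["v_meta >= 0.40"] = sum(scores.values()) / len(scores) >= 0.40
    return checks

# Bootstrap-003 baseline scores from the Quick Start example:
print(check_baseline({"completeness": 0.60, "effectiveness": 0.40,
                      "reusability": 0.40, "validation": 0.50}))
```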

---

## Iteration 0 Checklist (for V_meta ≥0.40)

**Documentation** (Target: Completeness ≥0.50):
- [ ] Create initial taxonomy (≥5 categories)
- [ ] Document 3-5 patterns/workflows
- [ ] Achieve 60-80% coverage
- [ ] Structured markdown documentation

**Quantification** (Target: Effectiveness ≥0.30):
- [ ] Measure volume (total instances)
- [ ] Calculate rate (frequency)
- [ ] Analyze distribution (category breakdown)
- [ ] Baseline quantified with numbers

**Patterns** (Target: Reusability ≥0.40):
- [ ] Identify 3-5 universal patterns
- [ ] Document transferability
- [ ] Estimate reusability %
- [ ] Distinguish universal vs domain-specific

**Validation** (Target: Validation ≥0.30):
- [ ] Analyze historical data
- [ ] Sample validation (≥30 instances)
- [ ] Evidence-based claims
- [ ] Data sources documented

**Time Investment**: 3-5 hours

**Expected V_meta(s₀)**: 0.40-0.50

---

## Success Criteria

Baseline quality assessment succeeded when:

1. **V_meta target met**: V_meta(s₀) ≥ 0.40 achieved
2. **Iteration reduction**: 3-4 iterations vs 5-7 (40-50% reduction)
3. **Time savings**: Total time ≤12-16 hours (comprehensive baseline)
4. **Gap clarity**: Clear objectives for iteration 1-2
5. **ROI positive**: Baseline investment is smaller than the total time saved

**Bootstrap-003 Validation**:
- ✅ V_meta(s₀) = 0.48 (target met)
- ✅ 3 iterations (vs 6 for Bootstrap-002 with minimal baseline)
- ✅ 10 hours total (60% reduction)
- ✅ Gaps clear (top 3 automations identified)
- ✅ ROI: +2h investment → 15h saved

---

## Related Skills

**Parent framework**:
- [methodology-bootstrapping](../methodology-bootstrapping/SKILL.md) - Core OCA cycle

**Uses baseline for**:
- [rapid-convergence](../rapid-convergence/SKILL.md) - V_meta ≥0.40 is criterion #1

**Validation**:
- [retrospective-validation](../retrospective-validation/SKILL.md) - Data quantification

---

## References

**Core guide**:
- [Quality Levels](reference/quality-levels.md) - Detailed level definitions
- [Component Guide](reference/components.md) - V_meta calculation
- [Investment ROI](reference/roi.md) - Time savings analysis

**Examples**:
- [Bootstrap-003 Comprehensive](examples/error-recovery-comprehensive-baseline.md) - V_meta=0.48
- [Bootstrap-002 Minimal](examples/testing-strategy-minimal-baseline.md) - V_meta=0.04

---

**Status**: ✅ Validated | 40-50% iteration reduction | Positive ROI

Overview

This skill defines a practical method to establish a strong baseline (V_meta ≥0.40) in iteration 0 to enable rapid convergence. It prescribes component-level scoring, four quality levels, and three concrete strategies to reach a comprehensive baseline. Use it to trade a small extra investment up front for large overall time savings and clearer iteration goals.

How this skill works

The skill inspects four components (Completeness, Effectiveness, Reusability, and Validation) and calculates V_meta as their average. It provides a short self-assessment rubric for each component, an iteration 0 checklist, and three focused strategies (leverage prior art, quantify the baseline, domain universality analysis) to raise V_meta quickly. The output guides time allocation and predicts iteration count and ROI.

When to use it

  • Planning iteration 0 time allocation and priorities
  • Targeting rapid convergence (3–4 iterations)
  • When prior art and rich historical data exist for immediate quantification
  • When time is constrained and clear iteration objectives are needed
  • When you want obvious gap identification and measurable baselines

Best practices

  • Spend 3–5 hours in iteration 0 to aim for V_meta ≥0.40 for best ROI
  • Start with prior-art extraction to fast-track completeness and reusability
  • Quantify baseline metrics from logs and issue history to boost Effectiveness
  • Document universal patterns and an initial taxonomy to improve transferability
  • Use the iteration 0 checklist: taxonomy, metrics, patterns, and sample validation

Example use cases

  • Bootstrap a recovery workflow where historical error logs exist and you need 3 iterations to converge
  • Adapt an established methodology to a new project by extracting and applying prior patterns
  • Measure and prioritize automation candidates by quantifying error volume and impact
  • Create a reusable pattern library for cross-project CI/CD issues using universality analysis

FAQ

What is the target V_meta for rapid convergence?

Target V_meta(s₀) ≥0.40 to achieve 3–4 iteration convergence and ~40–50% iteration reduction.

How much extra time should I invest in iteration 0?

Plan 3–5 hours for a comprehensive baseline; 2–3 hours for a basic baseline; exceptional baselines may take 5–8 hours.