
ab-test-designer skill


This skill helps design statistically valid A/B tests for marketing optimization, guiding hypothesis formation, sample sizing, test duration, and result interpretation.

npx playbooks add skill eddiebe147/claude-settings --skill ab-test-designer


Files (1): SKILL.md (4.3 KB)
---
name: A/B Test Designer
slug: ab-test-designer
description: Design statistically valid A/B tests for marketing optimization
category: marketing
complexity: advanced
version: "1.0.0"
author: "ID8Labs"
triggers:
  - "ab test"
  - "a/b test"
  - "split test"
  - "experiment design"
  - "statistical testing"
tags:
  - ab-testing
  - experimentation
  - optimization
  - statistics
  - conversion-testing
---

# A/B Test Designer

Design rigorous A/B tests that produce actionable, statistically significant results. This skill combines experimentation methodology with marketing intuition to help you test the right things, measure correctly, and make confident decisions based on data.

Most A/B tests fail before they start due to poor design: wrong sample sizes, multiple variable contamination, or testing low-impact elements. This skill provides the scientific framework for hypothesis formation, test design, sample size calculation, and result interpretation that separates real insights from statistical noise.

Essential for growth marketers, product managers, CRO specialists, and data-driven teams optimizing conversion funnels.

## Core Workflows

### Workflow 1: Test Hypothesis Development
1. **Data Analysis** - Review existing performance metrics
2. **Opportunity Identification** - Find high-impact test areas
3. **Hypothesis Formation** - "If we [change], then [outcome] because [rationale]"
4. **Success Metric Definition** - Primary and secondary KPIs
5. **Hypothesis Prioritization** - Rank by potential impact
6. **ICE Scoring** - Impact, Confidence, Ease framework
7. **Test Roadmap** - Sequence tests strategically
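Steps 5-6 above (ICE scoring and prioritization) can be sketched in a few lines. The ideas and ratings below are hypothetical examples, and multiplying the three ratings is one common ICE convention (some teams average them instead):

```python
# Hypothetical ICE scoring sketch: rank test ideas by Impact, Confidence, Ease (1-10 each).
ideas = [
    {"name": "New headline", "impact": 8, "confidence": 6, "ease": 9},
    {"name": "Checkout redesign", "impact": 9, "confidence": 5, "ease": 3},
    {"name": "Button color", "impact": 2, "confidence": 7, "ease": 10},
]

def ice_score(idea):
    # Product of the three ratings; higher means test sooner.
    return idea["impact"] * idea["confidence"] * idea["ease"]

# The sorted order becomes the draft test roadmap.
roadmap = sorted(ideas, key=ice_score, reverse=True)
for idea in roadmap:
    print(f'{idea["name"]}: {ice_score(idea)}')
```

Note how the high-impact headline test outranks the easy but low-impact button-color test, matching the "high-impact areas first" practice below.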

### Workflow 2: Test Design & Setup
1. **Variable Isolation** - Test one thing at a time
2. **Control Definition** - Current version as baseline
3. **Variant Creation** - Design the challenger
4. **Sample Size Calculation** - Required visitors for significance
5. **Test Duration Planning** - Account for traffic and cycles
6. **Segmentation Strategy** - Define audience splits
7. **Technical Implementation** - Testing tool configuration
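The sample-size step above (step 4) can be sketched with the standard normal-approximation formula for a two-proportion test. The function name and default parameters here are illustrative, not part of any testing tool's API:

```python
from math import ceil, sqrt
from statistics import NormalDist

def sample_size_per_variant(baseline, mde, alpha=0.05, power=0.80):
    """Visitors needed per variant for a two-sided two-proportion z-test.

    baseline: control conversion rate (e.g. 0.05 for 5%)
    mde: minimum detectable effect, relative (e.g. 0.10 for a 10% lift)
    """
    p1 = baseline
    p2 = baseline * (1 + mde)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)           # statistical power
    p_bar = (p1 + p2) / 2
    n = ((z_alpha * sqrt(2 * p_bar * (1 - p_bar))
          + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2) / (p2 - p1) ** 2
    return ceil(n)

# 5% baseline, detect a 10% relative lift at 95% confidence / 80% power:
print(sample_size_per_variant(0.05, 0.10))  # roughly 31,000 per variant
```

Small relative lifts on low baseline rates need surprisingly large samples, which is why under-powered tests are the most common design failure.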

### Workflow 3: Statistical Analysis
1. **Significance Threshold** - Set confidence level (95% typical)
2. **Minimum Detectable Effect** - What lift would matter?
3. **Power Analysis** - Reduce false negative risk
4. **P-Value Interpretation** - Understand what it means
5. **Confidence Intervals** - Range of likely outcomes
6. **Segment Analysis** - Performance by audience
7. **Result Documentation** - Clear winner/loser/inconclusive
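Steps 4-5 above (p-value and confidence interval) can be sketched with a two-proportion z-test. The helper name and the example counts are illustrative:

```python
from math import sqrt
from statistics import NormalDist

def analyze_ab(conv_a, n_a, conv_b, n_b, alpha=0.05):
    """Two-proportion z-test; returns (lift, p_value, ci_low, ci_high)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    diff = p_b - p_a
    # Pooled standard error for the hypothesis test (H0: no difference)
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se_pool = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = diff / se_pool
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # Unpooled standard error for the confidence interval on the lift
    se = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return diff, p_value, diff - z_crit * se, diff + z_crit * se

# Control: 500/10,000 conversions; variant: 580/10,000
diff, p, lo, hi = analyze_ab(conv_a=500, n_a=10000, conv_b=580, n_b=10000)
print(f"lift={diff:.4f}, p={p:.4f}, 95% CI=({lo:.4f}, {hi:.4f})")
```

If the confidence interval excludes zero and p is below the threshold, document a winner; if the interval straddles zero, record the result as inconclusive rather than forcing a call.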

### Workflow 4: Multivariate Testing
1. **Element Selection** - Choose factors to test
2. **Combination Matrix** - Map all variations
3. **Traffic Requirements** - Calculate needed sample size
4. **Interaction Effects** - Look for element synergies
5. **Fractional Factorial** - Reduce combinations if needed
6. **Winner Identification** - Best performing combination
7. **Learning Extraction** - Insights beyond the winner
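Steps 2-3 above (combination matrix and traffic requirements) can be sketched with the standard library. The elements, variant copy, and per-cell sample figure below are all hypothetical:

```python
from itertools import product

# Hypothetical elements for a landing page multivariate test.
elements = {
    "headline": ["Benefit-led", "Question"],
    "image": ["Product shot", "Lifestyle"],
    "cta": ["Start free trial", "Get started", "Book a demo"],
}

# Full factorial: every combination of every element.
combinations = list(product(*elements.values()))
print(f"Full factorial: {len(combinations)} variants")  # 2 x 2 x 3 = 12

# Traffic scales with the number of cells: each combination needs
# roughly the sample one A/B variant would, so MVT is traffic-hungry.
per_variant = 5000  # assumed per-cell sample from a prior power calculation
print(f"Total visitors needed: {len(combinations) * per_variant}")
```

When the full factorial exceeds your traffic, drop to a fractional factorial subset or test the elements sequentially, at the cost of missing some interaction effects.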

## Quick Reference

| Action | Command/Trigger |
|--------|-----------------|
| Design test | "Create A/B test for [element/page]" |
| Write hypothesis | "Form hypothesis for testing [change]" |
| Calculate sample size | "How much traffic do I need to test [change]?" |
| Analyze results | "Interpret these A/B test results" |
| Prioritize tests | "Prioritize these test ideas using ICE" |
| MVT design | "Design multivariate test for [elements]" |
| Segment analysis | "Break down results by [segment]" |
| Test roadmap | "Create 90-day testing roadmap" |

## Best Practices

- **Hypothesis first** - No hypothesis, no learning
- **One variable** - Test single changes for clean insights
- **Primary metric** - Choose one main success measure
- **Statistical significance** - 95% confidence minimum
- **Don't peek** - Wait for full sample size
- **Consider business cycles** - Week-over-week variation matters
- **Document everything** - Build testing knowledge base
- **Size for significance** - Under-powered tests waste time
- **High-impact areas first** - Test headlines before button colors
- **Segment thoughtfully** - Look for hidden patterns in data
- **Accept null results** - "No difference" is valid learning
- **Iterate on winners** - Compound gains through sequential tests
- **Avoid HiPPO** - Highest Paid Person's Opinion isn't data
- **Share learnings** - Tests benefit the whole organization
- **Consider externalities** - Seasonality, competition, campaigns

## Overview

This skill helps you design statistically valid A/B tests for marketing optimization, turning hypotheses into reliable, actionable results. It combines experiment methodology, sample size and power calculations, and interpretation guidance so teams can prioritize high-impact changes and avoid common pitfalls.

## How this skill works

I guide you through hypothesis development, opportunity identification, and ICE-based prioritization to build a focused test roadmap. Then I help isolate variables, calculate sample sizes and test durations, and set up segmentation and technical implementation. After the test runs, I walk you through statistical analysis: significance, p-values, confidence intervals, and segment-level insights. For complex pages I support multivariate design, fractional factorial reduction, and interaction detection.

## When to use it

- When you need to validate a marketing or product change before full rollout
- When conversion lifts are small and require proper power and sample sizing
- When multiple ideas must be prioritized and sequenced strategically
- When you want to understand variation across audience segments
- When testing combinations of elements (multivariate scenarios)

## Best practices

- Start with a clear hypothesis and one primary success metric
- Test a single variable, or use proper multivariate design to avoid contamination
- Size tests for statistical power and predefine duration; avoid peeking
- Use 95% confidence as a default threshold and report confidence intervals
- Document setup, variants, segments, and results to build organizational learning
- Prioritize high-impact changes (headlines, funnels) before cosmetic tweaks

## Example use cases

- Create an A/B test to compare a new landing page headline against control with required sample size
- Prioritize a backlog of test ideas using ICE scoring and expected impact
- Design a multivariate test for headline, image, and CTA combinations with fractional factorial sampling
- Interpret ambiguous test results and decide whether to extend, stop, or iterate
- Segment results to see if a variant performs differently across device types or traffic sources

## FAQ

**How do you calculate required sample size?**

I use your baseline conversion rate, the minimum detectable effect you care about, desired power (commonly 80%), and confidence level (commonly 95%) to compute the visitor count per variant, then derive the overall test duration from your traffic.

**Can I test multiple elements at once?**

Yes: use multivariate design or fractional factorials to manage combinations, but ensure you have the traffic to detect interaction effects, or test sequentially to isolate learnings.