home / skills / pluginagentmarketplace / custom-plugin-ai-data-scientist / statistical-analysis
npx playbooks add skill pluginagentmarketplace/custom-plugin-ai-data-scientist --skill statistical-analysisReview the files below or copy the command above to add this skill to your agents.
---
name: statistical-analysis
description: Probability, distributions, hypothesis testing, and statistical inference. Use for A/B testing, experimental design, or statistical validation.
sasmp_version: "1.3.0"
bonded_agent: 02-mathematics-statistics
bond_type: PRIMARY_BOND
---
# Statistical Analysis
Apply statistical methods to understand data and validate findings.
## Quick Start
```python
from scipy import stats
import numpy as np
# Descriptive statistics
data = np.array([1, 2, 3, 4, 5])
print(f"Mean: {np.mean(data)}")
print(f"Std: {np.std(data)}")
# Hypothesis testing
group1 = [23, 25, 27, 29, 31]
group2 = [20, 22, 24, 26, 28]
t_stat, p_value = stats.ttest_ind(group1, group2)
print(f"P-value: {p_value}")
```
## Core Tests
### T-Test (Compare Means)
```python
# One-sample: Compare to population mean
stats.ttest_1samp(data, 100)
# Two-sample: Compare two groups
stats.ttest_ind(group1, group2)
# Paired: Before/after comparison
stats.ttest_rel(before, after)
```
### Chi-Square (Categorical Data)
```python
from scipy.stats import chi2_contingency
observed = np.array([[10, 20], [15, 25]])
chi2, p_value, dof, expected = chi2_contingency(observed)
```
### ANOVA (Multiple Groups)
```python
f_stat, p_value = stats.f_oneway(group1, group2, group3)
```
## Confidence Intervals
```python
from scipy import stats
confidence_level = 0.95
mean = np.mean(data)
se = stats.sem(data)
ci = stats.t.interval(confidence_level, len(data)-1, mean, se)
print(f"95% CI: [{ci[0]:.2f}, {ci[1]:.2f}]")
```
## Correlation
```python
# Pearson (linear)
r, p_value = stats.pearsonr(x, y)
# Spearman (rank-based)
rho, p_value = stats.spearmanr(x, y)
```
## Distributions
```python
# Normal
x = np.linspace(-3, 3, 100)
pdf = stats.norm.pdf(x, loc=0, scale=1)
# Sampling
samples = np.random.normal(0, 1, 1000)
# Test normality
stat, p_value = stats.shapiro(data)
```
## A/B Testing Framework
```python
def ab_test(control, treatment, alpha=0.05):
"""
Run A/B test with statistical significance
Returns: significant (bool), p_value (float)
"""
t_stat, p_value = stats.ttest_ind(control, treatment)
significant = p_value < alpha
improvement = (np.mean(treatment) - np.mean(control)) / np.mean(control) * 100
return {
'significant': significant,
'p_value': p_value,
'improvement': f"{improvement:.2f}%"
}
```
## Interpretation
**P-value < 0.05**: Reject null hypothesis (statistically significant)
**P-value >= 0.05**: Fail to reject null (not significant)
## Common Pitfalls
- Multiple testing without correction
- Small sample sizes
- Ignoring assumptions (normality, independence)
- Confusing correlation with causation
- p-hacking (searching for significance)
## Troubleshooting
### Common Issues
**Problem: Non-normal data for t-test**
```python
# Check normality first
stat, p = stats.shapiro(data)
if p < 0.05:
# Use non-parametric alternative
stat, p = stats.mannwhitneyu(group1, group2) # Instead of ttest_ind
```
**Problem: Multiple comparisons inflating false positives**
```python
from statsmodels.stats.multitest import multipletests
# Apply Bonferroni correction
p_values = [0.01, 0.03, 0.04, 0.02, 0.06]
rejected, p_adjusted, _, _ = multipletests(p_values, method='bonferroni')
```
**Problem: Underpowered study (sample too small)**
```python
from statsmodels.stats.power import TTestIndPower
# Calculate required sample size
power_analysis = TTestIndPower()
sample_size = power_analysis.solve_power(
effect_size=0.5, # Medium effect (Cohen's d)
power=0.8, # 80% power
alpha=0.05 # 5% significance
)
print(f"Required n per group: {sample_size:.0f}")
```
**Problem: Heterogeneous variances**
```python
# Check with Levene's test
stat, p = stats.levene(group1, group2)
if p < 0.05:
# Use Welch's t-test (default in scipy)
t, p = stats.ttest_ind(group1, group2, equal_var=False)
```
**Problem: Outliers affecting results**
```python
from scipy.stats import zscore
# Detect outliers (|z| > 3)
z_scores = np.abs(zscore(data))
clean_data = data[z_scores < 3]
# Or use robust statistics
median = np.median(data)
mad = np.median(np.abs(data - median)) # Median Absolute Deviation
```
### Debug Checklist
- [ ] Check sample size adequacy (power analysis)
- [ ] Test normality assumption (Shapiro-Wilk)
- [ ] Test homogeneity of variance (Levene's)
- [ ] Check for outliers (z-scores, IQR)
- [ ] Apply multiple testing correction if needed
- [ ] Report effect sizes, not just p-values