
This skill helps you detect long-term trends in time series using linear regression and Sen's slope, guiding data-driven decisions.

npx playbooks add skill benchflow-ai/skillsbench --skill trend-analysis

---
name: trend-analysis
description: Detect long-term trends in time series data using parametric and non-parametric methods. Use when determining if a variable shows statistically significant increase or decrease over time.
license: MIT
---

# Trend Analysis Guide

## Overview

Trend analysis determines whether a time series shows a statistically significant long-term increase or decrease. This guide covers both parametric (linear regression) and non-parametric (Sen's slope) methods.

## Parametric Method: Linear Regression

Linear regression fits a straight line to the data and tests if the slope is significantly different from zero.
```python
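from scipy import stats

# `years` and `values` are equal-length numeric sequences (e.g. NumPy arrays).
# p_value is the two-sided p-value for the null hypothesis that the slope is zero.
slope, intercept, r_value, p_value, std_err = stats.linregress(years, values)

print(f"Slope: {slope:.2f} units/year")
print(f"p-value: {p_value:.2f}")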
```

### Assumptions

- Linear relationship between time and variable
- Residuals are independent (no serial correlation)
- Residuals are normally distributed
- Homoscedasticity (constant variance)
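
The constant-variance assumption can be checked roughly before trusting the regression p-value. The following is a minimal sketch using only the standard library. `ols_fit` and `residual_variance_ratio` are hypothetical helper names, the data are synthetic, and comparing the two halves of the record is an illustrative heuristic rather than a formal test.

```python
from statistics import mean, pvariance


def ols_fit(x, y):
    """Ordinary least-squares slope and intercept for y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def residual_variance_ratio(x, y):
    """Residual variance of the later half divided by that of the earlier half.

    Values far from 1 hint at non-constant variance (heuristic only).
    """
    slope, intercept = ols_fit(x, y)
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    half = len(residuals) // 2
    return pvariance(residuals[-half:]) / pvariance(residuals[:half])


# Synthetic example: a linear trend whose scatter grows in the later years
years = list(range(2000, 2010))
noise = [1, -1, 1, -1, 1, -3, 3, -3, 3, -3]
values = [2 * i + e for i, e in enumerate(noise)]

print(f"Late/early residual variance ratio: {residual_variance_ratio(years, values):.2f}")
```

A ratio well above or below 1 suggests that the regression p-value should be read with caution and that the non-parametric method may be the safer choice.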

## Non-Parametric Method: Sen's Slope with Mann-Kendall Test

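The Mann-Kendall test checks whether a monotonic trend is present, and Sen's slope estimates its magnitude as the median of the slopes between all pairs of points. Both are robust to outliers and do not assume normality, which makes them the recommended choice for environmental data.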
```python
import pymannkendall as mk

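# `values` must be in chronological order; the slope is per time step,
# so observations are assumed to be evenly spaced (e.g. one per year).
result = mk.original_test(values)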

print(result.slope)  # Sen's slope (rate of change per time unit)
print(result.p)      # p-value for significance
print(result.trend)  # 'increasing', 'decreasing', or 'no trend'
```
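
To show what the test computes, the sketch below implements the basic Mann-Kendall statistic with the standard library only. `mann_kendall` is a hypothetical helper, not part of pymannkendall, and it omits any correction for tied values, so its p-value may differ slightly from a library result when values repeat.

```python
import math


def mann_kendall(values):
    """Two-sided Mann-Kendall test without tie or autocorrelation correction.

    Returns (S, Z, p): the S statistic, the continuity-corrected normal
    score, and the two-sided p-value.
    """
    n = len(values)
    # S adds +1 for every later value above an earlier one, -1 for every one below
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = values[j] - values[i]
            s += (diff > 0) - (diff < 0)
    # Variance of S under the no-trend hypothesis, assuming no ties
    var_s = n * (n - 1) * (2 * n + 5) / 18
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    # Two-sided p-value from the standard normal distribution
    p = math.erfc(abs(z) / math.sqrt(2))
    return s, z, p


# Synthetic example: ten strictly increasing values
s, z, p = mann_kendall([float(i) for i in range(10)])
print(f"S = {s}, Z = {z:.2f}, p = {p:.4f}")
```

A positive S with a small p-value corresponds to an increasing trend, and a negative S with a small p-value to a decreasing one.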

### Comparison

| Method | Pros | Cons |
|--------|------|------|
| Linear Regression | Easy to interpret, gives R² | Sensitive to outliers |
| Sen's Slope | Robust to outliers, no normality assumption | Slightly less statistical power when regression assumptions hold |
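
The difference in outlier sensitivity can be illustrated with a small synthetic series. This is a sketch under stated assumptions: `ols_slope` and `sens_slope` are hypothetical helpers written with the standard library, and the data are invented (an exact trend of 1 unit/year plus a single spike in the final year).

```python
from statistics import mean, median


def ols_slope(x, y):
    """Least-squares slope of y against x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return sxy / sxx


def sens_slope(x, y):
    """Median of the slopes between all pairs of points (Sen's slope)."""
    slopes = [
        (y[j] - y[i]) / (x[j] - x[i])
        for i in range(len(x) - 1)
        for j in range(i + 1, len(x))
        if x[j] != x[i]
    ]
    return median(slopes)


# Synthetic data: 1 unit/year, with a spike of +50 added to the last year
years = list(range(2000, 2010))
values = [float(i) for i in range(10)]
values[-1] += 50.0

print(f"OLS slope:   {ols_slope(years, values):.2f} units/year")
print(f"Sen's slope: {sens_slope(years, values):.2f} units/year")
```

With this series the single spike pulls the least-squares slope to about 3.73 units/year, while Sen's slope stays at 1.00 because most pairwise slopes are unaffected.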

## Significance Levels

| p-value | Interpretation |
|---------|----------------|
| p < 0.01 | Highly significant trend |
| 0.01 <= p < 0.05 | Significant trend |
| 0.05 <= p < 0.10 | Marginally significant |
| p >= 0.10 | No significant trend |
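
The thresholds in the table can be applied in code as follows. This is a minimal sketch; `interpret_p_value` is a hypothetical helper, and raising an error for NaN or out-of-range input is an assumption of this example rather than something the guide prescribes.

```python
import math


def interpret_p_value(p):
    """Map a p-value to the interpretation labels used in the table."""
    if math.isnan(p) or not 0.0 <= p <= 1.0:
        raise ValueError(f"p-value must be a number between 0 and 1, got {p!r}")
    if p < 0.01:
        return "Highly significant trend"
    if p < 0.05:
        return "Significant trend"
    if p < 0.10:
        return "Marginally significant"
    return "No significant trend"


print(interpret_p_value(0.03))
```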

## Example: Annual Precipitation Trend
```python
import pandas as pd
import pymannkendall as mk

# Load annual precipitation data
df = pd.read_csv('precipitation.csv')
precip = df['Precipitation'].values

# Run Mann-Kendall test
result = mk.original_test(precip)
print(f"Sen's slope: {result.slope:.2f} mm/year")
print(f"p-value: {result.p:.2f}")
print(f"Trend: {result.trend}")
```

## Common Issues

| Issue | Cause | Solution |
|-------|-------|----------|
| p-value = NaN | Too few data points | Use at least 8-10 data points (e.g. 8-10 years of annual data) |
| Conflicting results | Methods have different assumptions | Trust Sen's slope for environmental data |
| Slope near zero but significant | Large sample size | Check practical significance |
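
One way to check practical significance is to express the total change implied by the slope as a percentage of a typical value. The sketch below assumes a hypothetical helper `relative_change`, and the numbers in the example are invented for illustration only.

```python
def relative_change(slope, n_steps, baseline):
    """Total change implied by the slope, as a percentage of a baseline value.

    slope    -- rate of change per time step (e.g. mm/year)
    n_steps  -- number of time steps covered by the record
    baseline -- reference level, such as the long-term mean
    """
    total_change = slope * n_steps
    return 100.0 * total_change / baseline


# Hypothetical numbers: 0.5 mm/year over 40 years against a mean of 800 mm
pct = relative_change(0.5, 40, 800.0)
print(f"Change over the record: {pct:.2f}% of the mean")
```

A statistically significant slope that amounts to only a small percentage of the baseline may have little practical importance.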

## Best Practices

- Use at least 10 data points for reliable results
- Prefer Sen's slope for environmental time series
- Report both slope magnitude and p-value
- Round results to 2 decimal places

Overview

This skill detects long-term trends in time series data using both parametric and non-parametric methods. It helps determine whether a variable shows a statistically significant increase or decrease over time and quantifies the rate of change. The skill is suitable for environmental, economic, and observational datasets where trend detection is needed.

How this skill works

The skill runs a linear regression to estimate a slope and associated p-value under parametric assumptions. It also runs a non-parametric Mann–Kendall test with Sen's slope estimator for robust trend detection that tolerates outliers and non-normal residuals. Results report slope magnitude, p-value, and a categorical trend (increasing, decreasing, no trend).

When to use it

  • Assessing whether a variable shows a long-term increase or decrease
  • Analyzing environmental series (precipitation, temperature, streamflow) where outliers or non-normality are expected
  • Comparing parametric and robust non-parametric results to check assumption sensitivity
  • Reporting both statistical and practical significance of change over time
  • Screening datasets before modeling or policy decision making

Best practices

  • Use at least 8–10 evenly spaced observations; 10 or more are preferred for reliability
  • Report both slope magnitude (units per time) and p-value to convey effect size and significance
  • Prefer Sen's slope/Mann–Kendall for environmental data or when residual assumptions fail
  • Check diagnostics for linear regression (residual normality, homoscedasticity) before trusting parametric results
  • Round reported slope and p-values to two decimal places and note practical significance

Example use cases

  • Testing if annual precipitation shows a significant upward or downward trend over decades
  • Evaluating whether air pollutant concentrations have declined following regulation
  • Monitoring glacier mass or river discharge for long-term climate signals
  • Validating trend direction when linear regression and non-parametric tests disagree
  • Flagging series with marginal p-values for further investigation or longer monitoring

FAQ

Which method should I trust if results conflict?

If the linear regression assumptions are violated or outliers are present, trust the Mann–Kendall test with Sen's slope, particularly for environmental data. Otherwise, report both results and explain the differences.

How many data points do I need?

Aim for at least 10 data points; fewer than 8–10 may produce unreliable or NaN p-values.