stata-regression skill

/_skills/analysis/stata-regression

This skill helps economists run reproducible regression workflows in Stata and produce publication-ready tables, with model diagnostics and robust or clustered standard errors.

npx playbooks add skill meleantonio/awesome-econ-ai-stuff --skill stata-regression

SKILL.md
---
name: stata-regression
description: Run regression analyses in Stata with publication-ready output tables.
workflow_stage: analysis
compatibility:
  - claude-code
  - cursor
  - codex
  - gemini-cli
author: Awesome Econ AI Community
version: 1.0.0
tags:
  - stata
  - regression
  - esttab
  - econometrics
---

# Stata Regression

## Purpose

This skill produces reproducible regression analysis workflows in Stata, including model diagnostics and publication-ready tables using `esttab` or `outreg2`.

## When to Use

- Estimating linear or nonlinear regression models in Stata
- Producing tables for academic papers and reports
- Running robustness checks and alternative specifications

## Instructions

Follow these steps to complete the task:

### Step 1: Understand the Context

Before generating any code, ask the user:

- What is the dependent variable and key regressors?
- What controls and fixed effects are required?
- How should standard errors be clustered?
- What output format is needed (LaTeX, Word, or CSV)?

### Step 2: Generate the Output

Based on the context, generate Stata code that:

1. **Loads and checks the data** - Handle missing values and verify variable types
2. **Runs the requested specification** - Use `regress`, `reghdfe`, or `xtreg` for linear models, or a nonlinear estimator such as `logit`, as appropriate
3. **Adds robust or clustered standard errors** - Match the study design
4. **Exports tables** - Use `esttab` or `outreg2` with clear labels
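
As a rough illustration of the first item in the list above, a data check might look like the sketch below. The file name `data.dta` and the variables `y`, `x1`, `x2`, `x3`, `firm_id`, and `year` are taken from the example later in this skill; `n_miss` and `in_sample` are hypothetical helper names. It also assumes a firm-year panel, which may not match your data.

```stata
* Load data (file name is an assumption; adjust the path)
use "data.dta", clear

* Inspect storage types and variable labels
describe y x1 x2 x3 firm_id year

* Report missing-value counts for the model variables
misstable summarize y x1 x2 x3

* Assumed panel structure: stops with an error if firm_id and year
* do not uniquely identify observations
isid firm_id year

* Flag rows with complete data on all model variables
egen byte n_miss = rowmiss(y x1 x2 x3)
generate byte in_sample = (n_miss == 0)
tabulate in_sample
```

Flagging the complete-case sample, rather than dropping rows, keeps the original data intact and makes it easy to restrict every specification to the same observations with `if in_sample`.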

### Step 3: Verify and Explain

After generating output:

- Explain what each model estimates
- Highlight assumptions and diagnostics
- Suggest robustness checks or alternative models
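
A minimal sketch of the diagnostics step for an OLS model follows, assuming the variable names from the example in this skill. The model is refit with default standard errors because `estat hettest` is not available after robust or clustered estimation. The `4/N` cutoff for Cook's distance is a common rule of thumb, not a requirement of this skill, and `cooks_d` is a hypothetical variable name.

```stata
* Load data (file name is an assumption; adjust the path)
use "data.dta", clear

* Fit OLS with default standard errors so the post-estimation tests apply
regress y x1 x2 x3

* Breusch-Pagan / Cook-Weisberg test for heteroskedasticity
estat hettest

* Variance inflation factors as a multicollinearity check
estat vif

* Cook's distance for influential observations
predict double cooks_d if e(sample), cooksd
summarize cooks_d, detail

* List observations above the rule-of-thumb cutoff
list firm_id year cooks_d if cooks_d > 4/e(N) & !missing(cooks_d)
```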

## Example Prompts

- "Run OLS with firm and year fixed effects, clustering by firm"
- "Estimate a logit model and export results to LaTeX"
- "Create a regression table with three specifications"

## Example Output

```stata
* ============================================
* Regression Analysis with Stata
* ============================================

* Load data
use "data.dta", clear

* Summary stats
summarize y x1 x2 x3

* Clear any estimates stored earlier in the session
eststo clear

* Main regression with clustered SEs
regress y x1 x2 x3, vce(cluster firm_id)
eststo model1

* Alternative specification with fixed effects
reghdfe y x1 x2 x3, absorb(firm_id year) vce(cluster firm_id)
eststo model2

* Create the output folder if it does not exist, then export the table
capture mkdir "results"
esttab model1 model2 using "results/regression_table.tex", replace se label
```

## Requirements

### Software

- Stata 17+

### Packages

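- `estout` (for `esttab`)
- `reghdfe` (optional, for high-dimensional fixed effects; requires `ftools`)
- `outreg2` (optional, alternative table export)

Install with:

```stata
ssc install estout
ssc install ftools
ssc install reghdfe
ssc install outreg2
```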

## Best Practices

1. **Match standard errors to the design** (cluster at the level where treatment varies)
2. **Report all model variants** used in the analysis
3. **Document variable definitions** and transformations

## Common Pitfalls

- Not clustering standard errors at the correct level
- Omitting fixed effects when required by the design
- Exporting tables without clear labels and notes

## References

- [Stata `regress` reference manual entry](https://www.stata.com/manuals/rregress.pdf)
- [reghdfe documentation](https://github.com/sergiocorreia/reghdfe)
- [estout documentation](https://repec.sowi.unibe.ch/stata/estout/)

## Changelog

### v1.0.0

- Initial release

Overview

This skill runs reproducible regression analyses in Stata and produces publication-ready result tables. It automates data checks, model estimation (OLS, panel, or nonlinear), clustered or robust standard errors, and exports using esttab or outreg2. The workflow emphasizes clear labels, diagnostics, and ready-to-use LaTeX/Word/CSV output.

How this skill works

I generate Stata code tailored to your specification: loading and checking the dataset, handling missing values, and verifying variable types. I select the appropriate estimator (regress, xtreg, reghdfe, or logistic) and add robust or clustered standard errors consistent with your design. Finally, I produce formatted tables with esttab or outreg2 and include brief explanations of each model, key assumptions, and suggested diagnostics.

When to use it

  • Estimating linear regressions with or without fixed effects
  • Running panel models or high-dimensional fixed effects (reghdfe)
  • Generating LaTeX, Word, or CSV tables for papers and reports
  • Conducting robustness checks and alternative specifications
  • Clustering standard errors to reflect experimental or hierarchical designs

Best practices

  • Specify dependent variable, main regressors, controls, fixed effects, and clustering before coding
  • Match standard-error clustering to the level where treatment varies
  • Run and report multiple model variants to show robustness
  • Label variables and include concise table notes for publication
  • Check diagnostics: multicollinearity, heteroskedasticity, and influential observations

Example use cases

  • Run OLS with firm and year fixed effects, cluster by firm, and export a LaTeX table
  • Estimate a logit model for a binary outcome and save the results to Word
  • Create a three-column regression table showing baseline, controls, and IV specifications
  • Fit a panel model with reghdfe to absorb many fixed effects and cluster by region
  • Produce a CSV file of regression coefficients and standard errors for replication files
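
As one possible sketch of the logit and CSV use cases above: the binary outcome `d_y` is a hypothetical variable, and the regressors, cluster variable, and file paths are assumptions carried over from the example in SKILL.md. `esttab` picks the output format from the file extension, so `.rtf` gives a table that opens in Word and `.csv` gives a comma-separated file.

```stata
* Load data (file name is an assumption; adjust the path)
use "data.dta", clear

* Logit for a hypothetical binary outcome d_y, clustered by firm
eststo clear
logit d_y x1 x2 x3, vce(cluster firm_id)
eststo logit1

* Create the output folder if it does not exist
capture mkdir "results"

* RTF table that can be opened in Word
esttab logit1 using "results/logit_table.rtf", replace se label

* Plain CSV of coefficients and standard errors for replication files
esttab logit1 using "results/logit_coefs.csv", replace se plain
```

The `plain` option drops stars and parentheses, which tends to make the CSV easier to read back into other software.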

FAQ

What Stata version and packages are required?

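Stata 17+ is recommended. Install estout (which provides esttab) with ssc install estout. If you need high-dimensional fixed effects, install reghdfe with ssc install reghdfe; it also requires ftools (ssc install ftools).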

How should I choose the clustering level?

Cluster at the level at which treatment is assigned or errors are likely to be correlated (e.g., by firm when treatment varies at the firm level). If uncertain, report alternative cluster levels as robustness checks.
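
One way to report alternative cluster levels is sketched below. It assumes the variables from the example in SKILL.md plus a hypothetical coarser grouping variable `industry_id`; the stored-estimate names are also illustrative.

```stata
* Load data (file name is an assumption; adjust the path)
use "data.dta", clear
eststo clear

* Baseline: cluster by firm
reghdfe y x1 x2 x3, absorb(firm_id year) vce(cluster firm_id)
eststo cl_firm

* Coarser clustering by a hypothetical industry identifier
reghdfe y x1 x2 x3, absorb(firm_id year) vce(cluster industry_id)
eststo cl_industry

* Two-way clustering by firm and year
reghdfe y x1 x2 x3, absorb(firm_id year) vce(cluster firm_id year)
eststo cl_twoway

* Compare standard errors side by side in the Results window
esttab cl_firm cl_industry cl_twoway, se label ///
    mtitles("Firm" "Industry" "Firm and year")
```

The point estimates are identical across columns; only the standard errors change, which makes the sensitivity to the clustering choice easy to see.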

Can you produce LaTeX-ready tables with notes and variable labels?

Yes. I format esttab/outreg2 calls to include variable labels, significance stars, and a custom notes line for methods and clustering.
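
For illustration, an `esttab` call along these lines might look like the sketch below. The models and file path repeat the example in SKILL.md; the column titles, table title, and note text are placeholders to replace with your own. The `booktabs` option assumes the LaTeX document loads the booktabs package.

```stata
* Load data and estimate the two models (names follow the SKILL.md example)
use "data.dta", clear
eststo clear
regress y x1 x2 x3, vce(cluster firm_id)
eststo model1
reghdfe y x1 x2 x3, absorb(firm_id year) vce(cluster firm_id)
eststo model2

* Create the output folder if it does not exist
capture mkdir "results"

* LaTeX table with variable labels, stars, column titles, and a notes line
esttab model1 model2 using "results/regression_table.tex", replace ///
    se label booktabs ///
    star(* 0.10 ** 0.05 *** 0.01) ///
    mtitles("OLS" "Firm and year FE") ///
    title("Regression results") ///
    addnotes("Standard errors clustered by firm in parentheses.")
```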