This skill helps translate asks into measurable metrics and experiments, design robust analyses, and communicate assumptions and caveats clearly.
`npx playbooks add skill vadimcomanescu/codex-skills --skill senior-data-scientist`

Review the files below or copy the command above to add this skill to your agents.
---
name: senior-data-scientist
description: "Data science workflow for turning ambiguous questions into measurable metrics, experiments, and models. Use when framing hypotheses, selecting metrics, designing A/B tests, building predictive models, doing error analysis, or writing experiment/model reports with clear assumptions and caveats."
---
# Senior Data Scientist
Be rigorous about what you’re measuring and why.
## Quick Start
1) Translate the ask into a decision: “what will we do differently based on the result?”
2) Define metrics: primary metric, guardrails, and segmentation.
3) Choose method: analysis, A/B test, causal approach, or predictive model.
4) Validate: leakage checks, baseline, error analysis, and robustness.
5) Communicate: limitations, assumptions, and next steps.
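
As one way to make steps 2 and 3 concrete, the sketch below tests a single primary metric (a conversion rate) in a two-arm A/B test with a pooled two-proportion z-test. The function name and the counts in the usage lines are illustrative assumptions, not part of the skill. The test also assumes independent users and samples large enough for the normal approximation.

```python
import math


def two_proportion_ztest(conv_a, n_a, conv_b, n_b):
    """Two-sided pooled z-test for a difference in conversion rates.

    Returns (absolute difference B - A, z statistic, p-value).
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis of no difference.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value: 2 * (1 - Phi(|z|)) == erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p_b - p_a, z, p_value


# Hypothetical counts, for illustration only.
diff, z, p_value = two_proportion_ztest(conv_a=100, n_a=1000, conv_b=130, n_b=1000)
print(f"diff={diff:.3f} z={z:.2f} p={p_value:.4f}")
```

Fixing the primary metric and the significance threshold before looking at the data keeps the decision from step 1 from drifting toward whichever metric happens to look best.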
## Optional tool: quick CSV profiling (no pandas)
```bash
python ~/.codex/skills/senior-data-scientist/scripts/csv_profile.py data.csv --max-rows 50000 --out /tmp/profile.json
```
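
The contents of `csv_profile.py` are not shown here, so the following is only a rough sketch of what a pandas-free profiler could look like using the standard library. `profile_csv` and its output fields are hypothetical. Only the idea of a row cap mirrors the `--max-rows` flag above.

```python
import csv
import io


def profile_csv(handle, max_rows=50000):
    """Count missing, numeric-looking, and distinct values per column.

    Hypothetical helper: reads at most `max_rows` data rows from an open
    text handle and returns a JSON-serializable summary.
    """
    reader = csv.DictReader(handle)
    stats = {
        name: {"missing": 0, "numeric": 0, "distinct": set()}
        for name in (reader.fieldnames or [])
    }
    rows = 0
    for row in reader:
        if rows >= max_rows:
            break
        rows += 1
        for name, col in stats.items():
            value = (row.get(name) or "").strip()
            if not value:
                col["missing"] += 1
                continue
            col["distinct"].add(value)
            try:
                float(value)  # accepts "3", "3.5", "1e3", and also "nan"
                col["numeric"] += 1
            except ValueError:
                pass
    return {
        "rows": rows,
        "columns": {
            name: {
                "missing": col["missing"],
                "numeric": col["numeric"],
                "distinct": len(col["distinct"]),
            }
            for name, col in stats.items()
        },
    }


# Tiny in-memory example; a real file would be opened with newline="".
sample = io.StringIO("a,b\n1,x\n,y\n3,x\n")
print(profile_csv(sample))
```

The row cap bounds both runtime and the memory used by the per-column distinct sets.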
## References
- Experiment report template: `references/experiment-report.md`
This skill codifies a rigorous data science workflow for turning ambiguous or open-ended questions into measurable metrics, experiments, and models. It guides hypothesis framing, metric selection, experiment design, validation, and clear reporting of assumptions and caveats. Use it when an analysis or model needs to end in a concrete decision rather than an open-ended interpretation.
The skill walks you through translating an ask into a decision that will change behavior or product choices, then defining primary metrics, guardrails, and relevant segments. It helps choose the right method (descriptive analysis, A/B test, causal inference, or predictive modeling), perform validation checks (leakage, baseline, error analysis, robustness), and produce concise experiment or model reports with explicit limitations. Optional tooling supports quick CSV profiling to accelerate initial data assessment.
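
The baseline check mentioned above can be as simple as comparing a model against a majority-class predictor on the same held-out labels. This is a minimal sketch for a classification task. The function names and the `min_lift` threshold are illustrative assumptions, not values the skill prescribes.

```python
from collections import Counter


def majority_baseline_accuracy(train_labels, test_labels):
    """Accuracy of always predicting the most common training label."""
    majority, _ = Counter(train_labels).most_common(1)[0]
    return sum(1 for y in test_labels if y == majority) / len(test_labels)


def accuracy(predictions, labels):
    """Fraction of predictions that match the labels."""
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def beats_baseline(model_acc, baseline_acc, min_lift=0.02):
    """True if the model beats the baseline by at least `min_lift` (assumed threshold)."""
    return model_acc - baseline_acc >= min_lift


# Hypothetical labels and predictions, for illustration only.
train_y = [0, 0, 0, 1]
test_y = [0, 1, 0, 0]
model_preds = [0, 1, 0, 1]
baseline_acc = majority_baseline_accuracy(train_y, test_y)
model_acc = accuracy(model_preds, test_y)
print(baseline_acc, model_acc, beats_baseline(model_acc, baseline_acc))
```

On imbalanced data the majority baseline can be surprisingly strong, which is why it is worth reporting next to any model metric.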
How do I pick a primary metric when stakeholders disagree?
Tie the metric to the decision you want to influence. If stakeholders have different priorities, pick one primary metric aligned to the main business objective and include others as guardrails or secondary analyses.
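
One way to encode "one primary metric, the rest as guardrails" is an explicit decision rule agreed before the experiment runs. The sketch below is hypothetical: `MetricResult`, `ship_decision`, and the 1% tolerance are assumptions for illustration, and it assumes every metric is oriented so that higher is better.

```python
from dataclasses import dataclass


@dataclass
class MetricResult:
    name: str
    relative_change: float  # treatment vs control, e.g. 0.03 means +3%
    significant: bool       # result of whatever test the team agreed on


def ship_decision(primary, guardrails, max_guardrail_drop=-0.01):
    """Return (decision, breached guardrail names).

    Conservative by design: a guardrail counts as breached when its point
    estimate falls below the tolerance, whether or not it is significant.
    """
    breached = [g.name for g in guardrails if g.relative_change < max_guardrail_drop]
    if breached:
        return "hold", breached
    if primary.significant and primary.relative_change > 0:
        return "ship", []
    return "inconclusive", []


# Hypothetical results, for illustration only.
primary = MetricResult("conversion", 0.03, True)
guardrails = [MetricResult("retention", -0.002, False)]
print(ship_decision(primary, guardrails))
```

Writing the rule down gives stakeholders something specific to disagree with up front, instead of relitigating the metric after results arrive.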
When should I prefer causal inference over a predictive model?
Use causal methods when you need to estimate the effect of an intervention or policy. Use predictive models when you need accurate forecasts or risk scores to drive automated decisions, and validate them for drift after deployment and for fairness.
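
For the drift part of that validation, one commonly used check is the population stability index (PSI) between a feature's training distribution and its distribution in production. The skill itself does not name a drift metric, so PSI is an assumption here. The equal-width binning, the `eps` floor, and the sample data are also illustrative choices.

```python
import math


def population_stability_index(expected, actual, bins=10, eps=1e-6):
    """PSI between a reference sample and a new sample of one numeric feature.

    Bins are equal-width over the range of `expected`; values outside that
    range are clipped into the first or last bin.
    """
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # avoid zero width for constant features

    def shares(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # Floor empty bins at eps so the log ratio stays finite.
        return [max(c / len(values), eps) for c in counts]

    e, a = shares(expected), shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


# Hypothetical samples: production values shifted upward by 0.5.
train_values = [i / 100 for i in range(100)]
prod_values = [v + 0.5 for v in train_values]
print(population_stability_index(train_values, prod_values))
```

PSI is zero when the binned distributions match and grows as they diverge. Any alert threshold is a convention to agree with the team, not something this sketch fixes.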