
data-scientist skill


This skill helps you build and evaluate ML models through EDA, feature engineering, training, tuning, and experiment tracking.

npx playbooks add skill shaul1991/shaul-agents-plugin --skill data-scientist

Review the files below or copy the command above to add this skill to your agents.

Files (1): SKILL.md (608 B)
---
name: data-scientist
description: Data Scientist Agent. Responsible for ML model development, experimentation, and analysis.
allowed-tools: Read, Write, Edit, Bash, Grep, Glob
---

# Data Scientist Agent

## Role
Responsible for machine learning model development and data analysis.

## Responsibilities
- EDA and feature engineering
- Model development and tuning
- Model evaluation and interpretation
- Experiment management

## Evaluation Metrics
| Problem Type | Metrics |
|----------------|---------|
| Classification | Accuracy, Precision, Recall, F1, AUC |
| Regression | MAE, MSE, RMSE, R² |

## Deliverable Locations
- Notebooks: `notebooks/`
- Models: `models/`

Overview

This skill is a Data Scientist agent that leads machine learning model development, experiments, and data analysis. It focuses on end-to-end workflows from exploratory data analysis and feature engineering to model tuning, evaluation, and interpretation. Deliverables include analysis notebooks and trained models to support reproducible research and production handoff.

How this skill works

The agent inspects datasets to run EDA, generate feature pipelines, and propose candidate models based on problem type (classification or regression). It manages experiments, tracks metrics, and applies model selection and hyperparameter tuning. For each experiment it produces evaluation reports and interpretable artifacts to guide deployment decisions.
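As a rough illustration of that loop, the sketch below proposes two candidate classifiers, wires each into a scikit-learn Pipeline, and tunes it with GridSearchCV. The CSV path, column names, candidate models, and parameter grids are hypothetical placeholders; the skill itself does not prescribe them.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical tabular dataset with a binary "target" column.
df = pd.read_csv("data/train.csv")
X, y = df.drop(columns=["target"]), df["target"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Candidate models for a classification problem, each with a small grid.
candidates = {
    "logreg": (LogisticRegression(max_iter=1000), {"model__C": [0.1, 1.0, 10.0]}),
    "rf": (RandomForestClassifier(random_state=42), {"model__n_estimators": [100, 300]}),
}

for name, (model, grid) in candidates.items():
    pipe = Pipeline([("scale", StandardScaler()), ("model", model)])
    search = GridSearchCV(pipe, grid, scoring="f1", cv=5)
    search.fit(X_train, y_train)
    print(name, search.best_params_,
          f"cv_f1={search.best_score_:.3f}",
          f"test_f1={search.score(X_test, y_test):.3f}")
```

Keeping the scaler and model in one Pipeline means the same transformations are refit inside every cross-validation fold, which is what makes the tuning results reproducible and leakage-free.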

When to use it

  • Building or iterating on supervised ML models for tabular data
  • Running systematic exploratory data analysis and feature engineering
  • Setting up and tracking model experiments and hyperparameter searches
  • Evaluating model performance with appropriate metrics and interpretation
  • Preparing reproducible artifacts for handoff to engineering or product teams

Best practices

  • Start with thorough EDA to surface data quality issues and distribution shifts
  • Automate feature transformations in pipelines for reproducibility
  • Choose evaluation metrics aligned to business objectives (e.g., F1 vs. AUC)
  • Track experiments with clear naming, seeds, and configuration records
  • Validate models with cross-validation and held-out test sets to avoid leakage (see the sketch after this list)
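A minimal sketch of the last two practices, assuming a scikit-learn workflow: a fixed seed, the model configuration, and cross-validated scores are written to a small JSON record. The experiments/ directory, file name, and synthetic data are illustrative only.

```python
import json
from pathlib import Path

from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

SEED = 42
config = {"model": "GradientBoostingClassifier", "n_estimators": 200, "seed": SEED}

# Stand-in data; a real run would load the project's training set here.
X, y = make_classification(n_samples=1000, random_state=SEED)
model = GradientBoostingClassifier(n_estimators=config["n_estimators"], random_state=SEED)

# Cross-validate on training data only; the final test set stays untouched,
# which is what guards against leakage.
scores = cross_val_score(model, X, y, cv=5, scoring="roc_auc")

# Experiment record: configuration, seed, and metrics kept together.
record = {"config": config, "cv_auc_mean": float(scores.mean()), "cv_auc_std": float(scores.std())}
Path("experiments").mkdir(exist_ok=True)
Path("experiments/exp_001.json").write_text(json.dumps(record, indent=2))
```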

Example use cases

  • Developing a classification model and optimizing for recall on an imbalanced dataset (sketched after this list)
  • Conducting feature importance analysis and building interpretable models
  • Running grid or Bayesian hyperparameter tuning and comparing metrics like F1 and AUC
  • Estimating regression performance with MAE, RMSE, and R² for forecasting
  • Packaging notebooks and trained artifacts for review by engineering teams
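For the first use case, a hedged sketch of what "optimize for recall" can look like with scikit-learn: weight the minority class and select hyperparameters by recall. The synthetic data and parameter grid are stand-ins for a real project's features and labels.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic, heavily imbalanced data (roughly 5% positives) as a stand-in.
X, y = make_classification(n_samples=5000, weights=[0.95, 0.05], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, stratify=y, random_state=0)

# Weight the minority class and pick C by recall on the positive class.
search = GridSearchCV(
    LogisticRegression(class_weight="balanced", max_iter=1000),
    {"C": [0.01, 0.1, 1.0, 10.0]},
    scoring="recall",
    cv=5,
)
search.fit(X_tr, y_tr)
print(classification_report(y_te, search.predict(X_te)))
```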

FAQ

Which evaluation metrics should I use?

Select metrics by problem type and business goals: classification uses Accuracy, Precision, Recall, F1, AUC; regression uses MAE, MSE, RMSE, and R². Prefer multiple metrics to capture different failure modes.
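For illustration, the snippet below computes those metrics with scikit-learn on tiny placeholder arrays; in practice y_true, y_pred, and y_score would come from a held-out test set.

```python
from sklearn.metrics import (accuracy_score, f1_score, mean_absolute_error,
                             mean_squared_error, precision_score, r2_score,
                             recall_score, roc_auc_score)

# Classification: hard predictions plus a probability score for AUC.
y_true = [0, 1, 1, 0, 1]
y_pred = [0, 1, 0, 0, 1]
y_score = [0.2, 0.9, 0.4, 0.1, 0.8]
print(accuracy_score(y_true, y_pred), precision_score(y_true, y_pred),
      recall_score(y_true, y_pred), f1_score(y_true, y_pred),
      roc_auc_score(y_true, y_score))

# Regression: errors in target units (MAE, RMSE) and explained variance (R²).
y_true_r = [3.0, 5.0, 2.5]
y_pred_r = [2.8, 5.4, 2.1]
mse = mean_squared_error(y_true_r, y_pred_r)
print(mean_absolute_error(y_true_r, y_pred_r), mse, mse ** 0.5, r2_score(y_true_r, y_pred_r))
```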

Where are artifacts stored?

Analysis notebooks and experiment records are stored in notebooks/, and trained model artifacts are kept in models/ for reproducibility and handoff.