data-ml skill

This skill helps you build data-driven features by analyzing data, integrating machine learning models, and applying analytics insights.

npx playbooks add skill baz-scm/awesome-reviewers --skill data-ml

SKILL.md

---
name: data-ml
description: Competence in data analytics and machine learning, enabling developers to build data-driven features and integrate AI/ML capabilities.
version: '1.0'
---
# Data & Machine Learning Proficiency

Software is increasingly data-driven, and developers who can work with data and ML have a strong advantage. Python’s sustained popularity owes much to its use in data science and machine learning. Analyzing datasets, using ML libraries, and incorporating AI models into applications are sought-after skills. Whether you are integrating an ML API or building a model in-house, understanding how these technologies work is crucial in 2025.

## Examples
- Using Python libraries like **Pandas** and **NumPy** to manipulate and analyze data for an application feature.
- Integrating a pre-trained machine learning model (e.g. image recognition, NLP) into a web service or app.
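
As an illustration of the first example, here is a minimal sketch of per-user aggregation with Pandas and NumPy. The `events` table, its column names, and the `summarize_users` helper are hypothetical and exist only for this example.

```python
import numpy as np
import pandas as pd

# Hypothetical event log: one row per user session (made-up values).
events = pd.DataFrame(
    {
        "user_id": [1, 1, 2, 2, 2, 3],
        "session_seconds": [120.0, 300.0, 60.0, np.nan, 90.0, 600.0],
        "purchases": [0, 1, 0, 0, 2, 1],
    }
)


def summarize_users(df: pd.DataFrame) -> pd.DataFrame:
    """Aggregate per-user metrics and add a standardized engagement score."""
    per_user = df.groupby("user_id").agg(
        sessions=("purchases", "count"),           # rows per user
        mean_seconds=("session_seconds", "mean"),  # missing durations are skipped
        purchases=("purchases", "sum"),
    )
    # Z-score the mean session length across users with NumPy
    # (population standard deviation; assumes the users are not all identical).
    values = per_user["mean_seconds"].to_numpy()
    per_user["engagement_z"] = (values - values.mean()) / values.std()
    return per_user


summary = summarize_users(events)
print(summary)
```

A table like `summary` could then feed an application feature, for example by ranking users on `engagement_z`.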

## Guidelines
- **Learn Data Tools:** Gain proficiency with data-focused languages and libraries. For example, Python paired with libraries such as NumPy and Pandas is widely used for data tasks. Knowing these tools lets you perform analysis or preprocessing as part of your development work.
- **Understand ML Workflows:** Even if you’re not a data scientist, understand the basics of training and using machine learning models. Know how to use ML frameworks or services (TensorFlow, PyTorch, scikit-learn, or cloud ML APIs) to add AI capabilities to applications.
- **Data-Driven Decision Making:** Use data to inform development decisions. This could mean instrumenting your app with analytics (and then querying that data), or A/B testing features. A developer who can derive insights from data and adjust software accordingly will create more effective, user-optimized products.
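
For the A/B testing point above, one common way to compare two variants is a two-proportion z-test. The sketch below uses only the standard library. The function name and the conversion counts are made up for illustration, and a real experiment would also need a pre-registered sample size and a check that the normal approximation is reasonable.

```python
import math


def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int) -> tuple[float, float]:
    """Return (z, two-sided p-value) for the difference in conversion rates B - A."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: variant A converts 200 of 2000 users, variant B 260 of 2000.
z, p_value = two_proportion_z_test(200, 2000, 260, 2000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

A small p-value suggests the difference is unlikely to be noise, but the decision to ship should still weigh effect size and product context.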

Overview

This skill captures practical competence in data analytics and machine learning for developers. It emphasizes using data tools, understanding ML workflows, and integrating AI capabilities to build data-driven features and improve product decisions. The goal is to make ML and analytics accessible to engineers, not just data scientists.

How this skill works

The skill assesses familiarity with core data libraries (for example, Pandas and NumPy) and with practical ML frameworks or services such as scikit-learn, TensorFlow, PyTorch, or cloud ML APIs. It evaluates the ability to preprocess data, run experiments, integrate pre-trained models into apps, and use analytics to guide development decisions.

When to use it

- Building features that require data transformation, aggregation, or analysis.
- Integrating pre-trained ML models for NLP, vision, or recommendation into services.
- Instrumenting applications for analytics and performing A/B tests to optimize UX.
- Prototyping ML models or pipelines when in-house modeling is necessary.
- Evaluating and selecting cloud ML APIs or frameworks for production use.

Best practices

- Learn and apply core data libraries (Pandas, NumPy) for reliable preprocessing and exploration.
- Understand the end-to-end ML workflow: data collection, cleaning, validation, training, and monitoring.
- Prefer reusable, testable preprocessing pipelines to avoid leakage and ensure reproducibility.
- Use pre-trained models or managed APIs for fast delivery, and fall back to custom models when needed.
- Instrument features with analytics and iterate based on measured user behavior and experiment results.
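
The point about leakage-safe, reproducible preprocessing can be sketched with a scikit-learn `Pipeline`. This is one possible approach, not a prescribed one. The synthetic data, the step names, and the choice of `StandardScaler` with `LogisticRegression` are assumptions made for the example.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data for illustration only: three features, a linearly separable label.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

# Split first, so that no test-set statistics can influence preprocessing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Bundling the scaler with the model means fit() learns the scaling
# parameters from the training rows only, and predict()/score() reuse them.
model = Pipeline(
    [
        ("scale", StandardScaler()),
        ("clf", LogisticRegression()),
    ]
)
model.fit(X_train, y_train)

accuracy = model.score(X_test, y_test)
print(f"held-out accuracy: {accuracy:.2f}")
```

Because scaling and the classifier travel together as one object, the same pipeline can be unit-tested, cross-validated, and serialized as a whole, which helps reproducibility.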

Example use cases

- Normalize and aggregate user metrics with Pandas to drive a new personalization feature.
- Embed a pre-trained NLP model into a microservice to auto-classify support tickets.
- Run an A/B experiment and analyze results to decide which UI variant improves retention.
- Prototype an image-classification endpoint with a cloud ML API before committing to a full training pipeline.
- Build a data preprocessing pipeline that feeds features to a recommendation model in production.
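
For the support-ticket use case, one design that may help is to hide the model behind a small interface, so a pre-trained model, a managed API, or an in-house model can be swapped without touching the service code. The sketch below is standard-library only. `TicketClassifier`, `KeywordClassifier`, and `route_ticket` are hypothetical names, and the keyword matcher is a placeholder standing in for a real model, not an NLP technique in itself.

```python
from dataclasses import dataclass
from typing import Protocol


class TicketClassifier(Protocol):
    """Anything with predict(text) -> label can back the service."""

    def predict(self, text: str) -> str: ...


@dataclass
class KeywordClassifier:
    """Placeholder for a real pre-trained model or managed API client."""

    rules: dict[str, tuple[str, ...]]
    default: str = "general"

    def predict(self, text: str) -> str:
        lowered = text.lower()
        for label, keywords in self.rules.items():
            if any(keyword in lowered for keyword in keywords):
                return label
        return self.default


def route_ticket(classifier: TicketClassifier, text: str) -> dict[str, str]:
    """Service-side logic depends only on the interface, not on the model."""
    return {"text": text, "queue": classifier.predict(text)}


stub = KeywordClassifier(
    rules={
        "billing": ("charged", "invoice", "refund"),
        "login": ("password", "sign in"),
    }
)
print(route_ticket(stub, "I was charged twice this month"))
```

Replacing `stub` with a wrapper around an actual model should require no change to `route_ticket`, which also makes the routing logic easy to test in isolation.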

FAQ

Do I need to be a data scientist to use this skill?

No. The focus is on practical developer-level competence: using data tools, integrating models, and making data-driven decisions. Deep research-level modeling is optional.

When should I use a pre-trained model vs build one in-house?

Use pre-trained models or managed APIs for speed and common tasks. Build in-house models when you need custom performance, control over data, or domain-specific behavior.