
This skill helps you apply best practices across PyTorch, TensorFlow, and scikit-learn for efficient training, optimization, and robust model deployment.

npx playbooks add skill doanchienthangdev/omgkit --skill ml-frameworks

---
name: ml-frameworks
description: ML framework best practices for PyTorch, TensorFlow, scikit-learn, and modern ML libraries including training patterns and optimization.
---

# ML Frameworks

Best practices for popular ML frameworks.

## PyTorch

```python
import torch
import torch.nn as nn
from torch.utils.data import Dataset, DataLoader

# Custom Dataset
class CustomDataset(Dataset):
    def __init__(self, X, y, transform=None):
        # torch.as_tensor avoids the legacy typed-tensor constructors
        self.X = torch.as_tensor(X, dtype=torch.float32)
        self.y = torch.as_tensor(y, dtype=torch.long)
        self.transform = transform

    def __len__(self):
        return len(self.y)

    def __getitem__(self, idx):
        x = self.X[idx]
        if self.transform:
            x = self.transform(x)
        return x, self.y[idx]

# Model Definition
class Net(nn.Module):
    def __init__(self, input_dim, hidden_dim, output_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(input_dim, hidden_dim),
            nn.LayerNorm(hidden_dim),
            nn.ReLU(),
            nn.Dropout(0.2),
            nn.Linear(hidden_dim, output_dim)
        )

    def forward(self, x):
        return self.net(x)

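# Training Loop
# Assumes `train_loader` is a DataLoader over CustomDataset, built elsewhere.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
use_amp = device.type == "cuda"  # loss scaling only applies on CUDA
model = Net(100, 256, 10).to(device)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=0.01)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=100)
# Mixed precision; newer PyTorch releases prefer torch.amp.GradScaler("cuda")
scaler = torch.cuda.amp.GradScaler(enabled=use_amp)

for epoch in range(100):
    model.train()
    for x, y in train_loader:
        x, y = x.to(device), y.to(device)
        optimizer.zero_grad(set_to_none=True)
        with torch.autocast(device_type=device.type, enabled=use_amp):
            output = model(x)
            loss = nn.functional.cross_entropy(output, y)
        scaler.scale(loss).backward()
        # Unscale before clipping so the threshold applies to the true gradients
        scaler.unscale_(optimizer)
        torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)  # example threshold
        scaler.step(optimizer)
        scaler.update()
    scheduler.step()  # stepped once per epoch, matching T_max=100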
```

## TensorFlow/Keras

```python
import tensorflow as tf
from tensorflow import keras

# Model Definition
model = keras.Sequential([
    keras.layers.Dense(256, activation='relu', input_shape=(100,)),
    keras.layers.BatchNormalization(),
    keras.layers.Dropout(0.2),
    keras.layers.Dense(128, activation='relu'),
    keras.layers.Dense(10, activation='softmax')
])

model.compile(
    optimizer=keras.optimizers.AdamW(learning_rate=1e-3),
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)

# Callbacks
callbacks = [
    keras.callbacks.EarlyStopping(patience=10, restore_best_weights=True),
    keras.callbacks.ReduceLROnPlateau(factor=0.5, patience=5),
    keras.callbacks.ModelCheckpoint('best_model.h5', save_best_only=True),
    keras.callbacks.TensorBoard(log_dir='./logs')
]

# Training (X_train, y_train, X_val, y_val are assumed to be prepared elsewhere)
history = model.fit(
    X_train, y_train,
    validation_data=(X_val, y_val),
    epochs=100,
    batch_size=32,
    callbacks=callbacks
)
```

## Scikit-learn

```python
from sklearn.pipeline import Pipeline
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from sklearn.ensemble import GradientBoostingClassifier

# Preprocessing Pipeline
numeric_features = ['age', 'income']
categorical_features = ['city', 'occupation']

preprocessor = ColumnTransformer(
    transformers=[
        ('num', StandardScaler(), numeric_features),
        ('cat', OneHotEncoder(handle_unknown='ignore'), categorical_features)
    ]
)

# Full Pipeline
pipeline = Pipeline([
    ('preprocessor', preprocessor),
    ('classifier', GradientBoostingClassifier())
])

# Grid Search
from sklearn.model_selection import GridSearchCV

param_grid = {
    'classifier__n_estimators': [100, 200],
    'classifier__max_depth': [3, 5, 7],
    'classifier__learning_rate': [0.01, 0.1]
}

# X_train is assumed to be a pandas DataFrame containing the columns named above
grid_search = GridSearchCV(pipeline, param_grid, cv=5, scoring='f1_macro')
grid_search.fit(X_train, y_train)
```

## Hugging Face Transformers

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from transformers import Trainer, TrainingArguments

model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

training_args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=64,
    warmup_steps=500,
    weight_decay=0.01,
    eval_strategy="epoch",  # named `evaluation_strategy` in transformers < 4.41
    fp16=True  # requires a CUDA GPU; remove when training on CPU
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset
)

trainer.train()
```

## Commands
- `/omgtrain:train` - Train model

## Best Practices

1. Use mixed precision training
2. Implement proper data loading
3. Use learning rate scheduling
4. Enable gradient clipping
5. Save checkpoints regularly
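
Item 4 is easier to reason about with the arithmetic written out. The following is a minimal, framework-free sketch of clipping by global norm, assuming gradients are represented as plain Python lists; `clip_by_global_norm` and its `eps` parameter are hypothetical names used for illustration, not part of any library. In PyTorch the equivalent step is `torch.nn.utils.clip_grad_norm_`.

```python
import math


def clip_by_global_norm(grads, max_norm, eps=1e-6):
    """Scale all gradients so their combined L2 norm is at most max_norm.

    `grads` is a list of per-parameter gradient lists (a stand-in for tensors).
    Returns the clipped gradients and the norm measured before clipping.
    """
    total_norm = math.sqrt(sum(g * g for grad in grads for g in grad))
    # Never scale up: gradients already inside the threshold are left as-is.
    scale = min(1.0, max_norm / (total_norm + eps))
    clipped = [[g * scale for g in grad] for grad in grads]
    return clipped, total_norm


# Two "parameters" whose gradients have a combined norm of 5.0
grads = [[3.0, 0.0], [4.0]]
clipped, norm_before = clip_by_global_norm(grads, max_norm=1.0)
norm_after = math.sqrt(sum(g * g for grad in clipped for g in grad))
```

All gradients share one scale factor, so the update direction is preserved and only its magnitude is bounded.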

Overview

This skill packages practical best practices for building, training, and optimizing models across PyTorch, TensorFlow/Keras, scikit-learn, and modern transformer libraries. It focuses on reproducible training patterns, efficient data pipelines, and performance optimization techniques. The guidance targets both research and production workflows to reduce iteration time and improve model robustness.

How this skill works

The skill covers the main stages of a training workflow and recommends a concrete pattern for each: dataset and DataLoader setup, model definition conventions, optimizer and scheduler choices, mixed precision and gradient handling, and checkpointing strategies. It highlights framework-specific idioms (PyTorch loops, Keras callbacks, scikit-learn pipelines, Hugging Face Trainer) and maps them to shared best practices. Use the `/omgtrain:train` command to trigger training flows that follow these patterns.

When to use it

  • Starting a new model project and choosing framework-specific conventions
  • Scaling training to GPUs with mixed precision and schedulers
  • Building robust data preprocessing and feature engineering pipelines
  • Hyperparameter tuning and cross-validation for classical ML
  • Fine-tuning pretrained transformer models for NLP tasks

Best practices

  • Use mixed precision (AMP/fp16) to speed up GPU training and reduce memory
  • Implement efficient data loading and dataset transforms with batching and prefetching
  • Apply learning rate scheduling and warmup for stable convergence
  • Regularly checkpoint models and restore best weights using validation metrics
  • Use gradient clipping and weight decay to stabilize training
  • Keep preprocessing in reproducible pipelines (scikit-learn ColumnTransformer or TF data pipelines)
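
The learning rate scheduling and warmup bullet above can be sketched as a single function of the step count. This is an illustrative schedule (linear warmup followed by cosine decay), assuming per-step updates; `lr_at_step` and its default values are hypothetical and should be tuned per project.

```python
import math


def lr_at_step(step, total_steps, base_lr=1e-3, warmup_steps=500, min_lr=0.0):
    """Linear warmup to base_lr, then cosine decay down to min_lr."""
    if step < warmup_steps:
        # Ramp up linearly; the last warmup step reaches base_lr exactly.
        return base_lr * (step + 1) / warmup_steps
    # Fraction of the decay phase completed, clamped to [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(progress, 1.0)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


total_steps = 10_000
schedule = [lr_at_step(s, total_steps) for s in range(total_steps + 1)]
```

The same shape can be plugged into a framework scheduler that accepts a function of the step, such as PyTorch's `LambdaLR`, by returning the ratio to `base_lr` instead of the absolute value.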

Example use cases

  • Train a classification model with PyTorch using AMP, AdamW, cosine annealing, and checkpointing
  • Quickly prototype a Keras model with callbacks: EarlyStopping, ReduceLROnPlateau, and ModelCheckpoint
  • Build a production-ready preprocessing + model pipeline with scikit-learn and GridSearchCV
  • Fine-tune a Hugging Face transformer with proper training args, fp16, and evaluation strategy
  • Integrate the training command to run reproducible experiments and save logs for TensorBoard

FAQ

Should I always use mixed precision?

Use mixed precision when training on modern GPUs to improve throughput and reduce memory use. Before a full run, train for a few batches and confirm that the loss stays finite and tracks the full-precision loss.
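
The stability concern comes from the narrow range of fp16. The sketch below uses only the standard library to show a small gradient underflowing to zero in half precision, and how multiplying by a loss scale before the cast preserves it. The value `2.0 ** 16` is an illustrative scale and `to_fp16` is a hypothetical helper; in PyTorch, `GradScaler` manages the scale dynamically.

```python
import struct


def to_fp16(value):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


small_grad = 1e-8        # below the smallest positive fp16 value (about 6e-8)
loss_scale = 2.0 ** 16   # illustrative scale factor

# Without scaling, the gradient is lost when stored in fp16.
underflowed = to_fp16(small_grad)

# With scaling, the fp16 value is representable; unscaling happens in full precision.
scaled = to_fp16(small_grad * loss_scale)
recovered = scaled / loss_scale

relative_error = abs(recovered - small_grad) / small_grad
```

Here `underflowed` is exactly zero while `recovered` stays within a fraction of a percent of the original gradient, which is why the scaled backward pass is followed by an unscale step before the optimizer update.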

How often should I checkpoint?

Checkpoint frequently enough to avoid losing progress (e.g., end of epoch or after a set number of steps) and keep a best-checkpoint based on validation performance.
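
A minimal sketch of that policy, assuming a periodic "last" checkpoint plus a separate "best" checkpoint selected by validation loss. `CheckpointManager` is a hypothetical helper, and JSON stands in for the real serializer (for example `torch.save` on a `state_dict`).

```python
import json
import os
import tempfile


class CheckpointManager:
    """Writes a periodic checkpoint and keeps the best one by validation loss."""

    def __init__(self, directory, every_n_steps=1000):
        self.directory = directory
        self.every_n_steps = every_n_steps
        self.best_val_loss = float("inf")
        os.makedirs(directory, exist_ok=True)

    def _write(self, name, state):
        path = os.path.join(self.directory, name)
        tmp_path = path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump(state, f)
        # Rename after a complete write so a crash never leaves a partial file.
        os.replace(tmp_path, path)
        return path

    def maybe_save(self, step, state, val_loss=None):
        """Return the list of checkpoint paths written at this step."""
        saved = []
        if step > 0 and step % self.every_n_steps == 0:
            saved.append(self._write("last.json", state))
        if val_loss is not None and val_loss < self.best_val_loss:
            self.best_val_loss = val_loss
            saved.append(self._write("best.json", state))
        return saved


# Simulated run: validation happens every 2 steps with made-up losses.
val_losses = {2: 0.9, 4: 0.7, 6: 0.8}
saved_files = {}
with tempfile.TemporaryDirectory() as tmp_dir:
    manager = CheckpointManager(tmp_dir, every_n_steps=2)
    for step in range(1, 7):
        paths = manager.maybe_save(step, {"step": step}, val_losses.get(step))
        saved_files[step] = [os.path.basename(p) for p in paths]
```

Keeping "last" and "best" in separate files means a run can resume from the most recent state while evaluation and deployment use the best-scoring weights.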