This skill helps you apply best practices across PyTorch, TensorFlow, and scikit-learn for efficient training, optimization, and robust model deployment.
---
name: ml-frameworks
description: ML framework best practices for PyTorch, TensorFlow, scikit-learn, and modern ML libraries including training patterns and optimization.
---
# ML Frameworks
Best practices for popular ML frameworks.
## PyTorch
```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import Dataset, DataLoader

# Custom Dataset
class CustomDataset(Dataset):
    def __init__(self, X, y, transform=None):
        self.X = torch.as_tensor(X, dtype=torch.float32)
        self.y = torch.as_tensor(y, dtype=torch.long)
        self.transform = transform

    def __len__(self):
        return len(self.y)

    def __getitem__(self, idx):
        x = self.X[idx]
        if self.transform:
            x = self.transform(x)
        return x, self.y[idx]

# Model Definition
class Net(nn.Module):
    def __init__(self, input_dim, hidden_dim, output_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(input_dim, hidden_dim),
            nn.LayerNorm(hidden_dim),
            nn.ReLU(),
            nn.Dropout(0.2),
            nn.Linear(hidden_dim, output_dim)
        )

    def forward(self, x):
        return self.net(x)

# Training Loop
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
use_amp = device.type == "cuda"  # enable mixed precision only when a GPU is present

# X_train / y_train are assumed to hold 100-dim features and integer labels;
# batch_size=32 is an illustrative choice
train_loader = DataLoader(CustomDataset(X_train, y_train), batch_size=32, shuffle=True)

model = Net(100, 256, 10).to(device)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=0.01)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=100)
scaler = torch.cuda.amp.GradScaler(enabled=use_amp)  # Mixed precision

for epoch in range(100):
    model.train()
    for x, y in train_loader:
        x, y = x.to(device), y.to(device)
        optimizer.zero_grad(set_to_none=True)  # clear stale gradients before the forward pass
        with torch.autocast(device_type=device.type, enabled=use_amp):
            output = model(x)
            loss = F.cross_entropy(output, y)
        scaler.scale(loss).backward()
        scaler.step(optimizer)
        scaler.update()
    scheduler.step()  # once per epoch, matching T_max=100
```
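
For intuition about what the `CosineAnnealingLR` scheduler above does to the learning rate, here is a minimal pure-Python sketch of the closed-form cosine schedule. It assumes `eta_min=0` (the PyTorch default) and one scheduler step per epoch. The function name `cosine_lr` is hypothetical and not part of any library.

```python
import math

def cosine_lr(epoch: int, base_lr: float = 1e-3, t_max: int = 100, eta_min: float = 0.0) -> float:
    """Closed-form cosine annealing: base_lr at epoch 0, eta_min at epoch t_max."""
    return eta_min + 0.5 * (base_lr - eta_min) * (1 + math.cos(math.pi * epoch / t_max))

# Print the learning rate at a few points of a 100-epoch run
for epoch in (0, 25, 50, 75, 100):
    print(f"epoch {epoch:3d}: lr = {cosine_lr(epoch):.6f}")
```

The rate starts at `base_lr`, passes through half of it at the midpoint, and decays smoothly toward `eta_min` at `t_max`, which is why `T_max` should usually match the number of scheduler steps you plan to take.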
## TensorFlow/Keras
```python
import tensorflow as tf
from tensorflow import keras

# Model Definition
model = keras.Sequential([
    keras.Input(shape=(100,)),  # explicit input layer instead of input_shape= on the first Dense
    keras.layers.Dense(256, activation='relu'),
    keras.layers.BatchNormalization(),
    keras.layers.Dropout(0.2),
    keras.layers.Dense(128, activation='relu'),
    keras.layers.Dense(10, activation='softmax')
])
model.compile(
    optimizer=keras.optimizers.AdamW(learning_rate=1e-3),
    loss='sparse_categorical_crossentropy',  # expects integer class labels
    metrics=['accuracy']
)

# Callbacks (these monitor val_loss by default)
callbacks = [
    keras.callbacks.EarlyStopping(patience=10, restore_best_weights=True),
    keras.callbacks.ReduceLROnPlateau(factor=0.5, patience=5),
    # '.keras' is the native Keras format; older TF releases commonly used '.h5'
    keras.callbacks.ModelCheckpoint('best_model.keras', save_best_only=True),
    keras.callbacks.TensorBoard(log_dir='./logs')
]

# Training
# X_train, y_train, X_val, y_val are assumed to be arrays with 100 features
# per row and integer labels in 0..9
history = model.fit(
    X_train, y_train,
    validation_data=(X_val, y_val),
    epochs=100,
    batch_size=32,
    callbacks=callbacks
)
```
## Scikit-learn
```python
from sklearn.pipeline import Pipeline
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV

# Preprocessing Pipeline
numeric_features = ['age', 'income']
categorical_features = ['city', 'occupation']
preprocessor = ColumnTransformer(
    transformers=[
        ('num', StandardScaler(), numeric_features),
        ('cat', OneHotEncoder(handle_unknown='ignore'), categorical_features)
    ]
)

# Full Pipeline
pipeline = Pipeline([
    ('preprocessor', preprocessor),
    ('classifier', GradientBoostingClassifier())
])

# Grid Search (the "classifier__" prefix routes each parameter to the pipeline step)
param_grid = {
    'classifier__n_estimators': [100, 200],
    'classifier__max_depth': [3, 5, 7],
    'classifier__learning_rate': [0.01, 0.1]
}
grid_search = GridSearchCV(pipeline, param_grid, cv=5, scoring='f1_macro')

# X_train is assumed to be a pandas DataFrame containing the columns named above
grid_search.fit(X_train, y_train)
```
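
To see the pipeline and grid search run end to end, the sketch below repeats the same structure on a small synthetic DataFrame. The data, the target rule, the reduced parameter grid, and `cv=3` are all assumptions made up for illustration and speed; they are not part of the skill.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic data, invented purely for illustration
rng = np.random.default_rng(0)
n = 200
X = pd.DataFrame({
    "age": rng.integers(18, 70, size=n),
    "income": rng.normal(50_000, 15_000, size=n),
    "city": rng.choice(["hanoi", "paris", "lima"], size=n),
    "occupation": rng.choice(["engineer", "teacher", "nurse"], size=n),
})
y = (X["income"] > 50_000).astype(int)  # hypothetical target: above-average income

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

preprocessor = ColumnTransformer(transformers=[
    ("num", StandardScaler(), ["age", "income"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["city", "occupation"]),
])
pipeline = Pipeline([
    ("preprocessor", preprocessor),
    ("classifier", GradientBoostingClassifier(random_state=0)),
])

# Smaller grid than the one above so the example finishes quickly
param_grid = {
    "classifier__n_estimators": [50, 100],
    "classifier__max_depth": [3, 5],
}
grid_search = GridSearchCV(pipeline, param_grid, cv=3, scoring="f1_macro")
grid_search.fit(X_train, y_train)

print("best params:", grid_search.best_params_)
print("best CV f1_macro:", round(grid_search.best_score_, 3))

# By default the best pipeline is refit on all of X_train, so raw rows can be
# passed straight to predict() and preprocessing is applied automatically
test_preds = grid_search.predict(X_test)
test_accuracy = float((test_preds == y_test.to_numpy()).mean())
print("test accuracy:", round(test_accuracy, 3))
```

Because preprocessing lives inside the pipeline, each cross-validation fold fits the scaler and encoder on its own training split only, which avoids leaking validation statistics into training.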
## Hugging Face Transformers
```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from transformers import Trainer, TrainingArguments

model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)
# Use this tokenizer to tokenize the raw text when building the datasets below
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

training_args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=64,
    warmup_steps=500,
    weight_decay=0.01,
    eval_strategy="epoch",  # named evaluation_strategy in transformers releases before 4.41
    fp16=True  # mixed precision; requires a CUDA GPU
)

# train_dataset / eval_dataset are assumed to be already-tokenized datasets with labels
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset
)
trainer.train()
```
## Commands
- `/omgtrain:train` - Train model
## Best Practices
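1. Use mixed precision training on GPUs that support it (for example autocast with `GradScaler` in PyTorch, or `fp16=True` in `TrainingArguments`).
2. Load data in batches through a `Dataset`/`DataLoader` (or the framework's equivalent) and shuffle the training set.
3. Use learning rate scheduling (for example `CosineAnnealingLR`, `ReduceLROnPlateau`, or warmup steps).
4. Enable gradient clipping to bound the gradient norm and limit the effect of exploding gradients.
5. Save checkpoints regularly, and keep the best one according to a validation metric.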
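
Practices 1 and 4 interact: with a gradient scaler, gradients have to be unscaled before they are clipped, otherwise the clipping threshold is compared against scaled values. The sketch below shows one way to combine them in PyTorch. The tiny `nn.Linear` model, the random batch, and `max_grad_norm = 1.0` are stand-ins chosen for illustration, and mixed precision is only switched on when a CUDA device is available.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
use_amp = device.type == "cuda"  # on CPU the scaler and autocast become no-ops

model = nn.Linear(100, 10).to(device)  # tiny stand-in model
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=0.01)
scaler = torch.cuda.amp.GradScaler(enabled=use_amp)

# Random stand-in batch: 32 samples, 100 features, 10 classes
x = torch.randn(32, 100, device=device)
y = torch.randint(0, 10, (32,), device=device)

max_grad_norm = 1.0  # illustrative threshold, tune per model
for step in range(3):
    optimizer.zero_grad(set_to_none=True)
    with torch.autocast(device_type=device.type, enabled=use_amp):
        loss = F.cross_entropy(model(x), y)
    if not torch.isfinite(loss):
        raise RuntimeError(f"non-finite loss at step {step}")
    scaler.scale(loss).backward()
    scaler.unscale_(optimizer)  # restore true gradient magnitudes before clipping
    total_norm = torch.nn.utils.clip_grad_norm_(model.parameters(), max_grad_norm)
    scaler.step(optimizer)      # skips the update if gradients contain inf/NaN
    scaler.update()
    print(f"step {step}: loss={loss.item():.4f} grad_norm_before_clip={float(total_norm):.4f}")
```

`clip_grad_norm_` returns the norm measured before clipping, so logging it is a cheap way to notice when gradients start to blow up.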
This skill packages practical best practices for building, training, and optimizing models across PyTorch, TensorFlow/Keras, scikit-learn, and modern transformer libraries. It focuses on reproducible training patterns, efficient data pipelines, and performance optimization techniques. The guidance targets both research and production workflows to reduce iteration time and improve model robustness.
The skill walks through common training workflows and recommends a concrete pattern for each step: dataset and DataLoader setup, model definition conventions, optimizer and scheduler choices, mixed precision and gradient handling, and checkpointing strategies. It highlights framework-specific idioms (PyTorch training loops, Keras callbacks, scikit-learn pipelines, the Hugging Face Trainer) and maps them to framework-independent best practices. Use the commands to trigger training flows that follow these patterns.
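
One small building block for the reproducible training patterns mentioned above is seeding every random number generator before a run. The helper below is a hedged sketch: `seed_everything` is a hypothetical name, and it seeds NumPy and PyTorch only if they happen to be installed. Seeding alone does not guarantee bit-identical results on GPU.

```python
import random

def seed_everything(seed: int = 42) -> None:
    """Seed the random number generators available in this environment."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy not installed; nothing to seed
    try:
        import torch
        torch.manual_seed(seed)  # seeds the CPU and CUDA generators
    except ImportError:
        pass  # PyTorch not installed; nothing to seed

# Re-seeding with the same value reproduces the same random draw
seed_everything(0)
first = random.random()
seed_everything(0)
second = random.random()
print(first == second)  # True
```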
## FAQ
### Should I always use mixed precision?
Use mixed precision when training on modern GPUs to improve throughput and reduce memory; validate numerical stability and test a few batches before full runs.
### How often should I checkpoint?
Checkpoint frequently enough to avoid losing progress (e.g., end of epoch or after a set number of steps) and keep a best-checkpoint based on validation performance.
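
The checkpointing advice above can be sketched in PyTorch as follows: write a "last" checkpoint at the end of every epoch and a separate "best" checkpoint whenever validation loss improves. The helper names `save_checkpoint` and `load_checkpoint`, the file names, the tiny model, and the validation losses are all made up for illustration.

```python
import os
import tempfile

import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(4, 2)  # tiny stand-in model
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)

def save_checkpoint(path, model, optimizer, epoch, val_loss):
    """Store everything needed to resume training, not just the weights."""
    torch.save({
        "epoch": epoch,
        "model_state": model.state_dict(),
        "optimizer_state": optimizer.state_dict(),
        "val_loss": val_loss,
    }, path)

def load_checkpoint(path, model, optimizer):
    """Restore model and optimizer in place; return the saved epoch and loss."""
    ckpt = torch.load(path, map_location="cpu")
    model.load_state_dict(ckpt["model_state"])
    optimizer.load_state_dict(ckpt["optimizer_state"])
    return ckpt["epoch"], ckpt["val_loss"]

ckpt_dir = tempfile.mkdtemp()
last_path = os.path.join(ckpt_dir, "last.pt")
best_path = os.path.join(ckpt_dir, "best.pt")

fake_val_losses = [0.9, 0.7, 0.8]  # made-up numbers standing in for real validation
best_val = float("inf")
for epoch, val_loss in enumerate(fake_val_losses):
    # ... one epoch of training and validation would run here ...
    save_checkpoint(last_path, model, optimizer, epoch, val_loss)  # always resumable
    if val_loss < best_val:  # keep the best model separately
        best_val = val_loss
        save_checkpoint(best_path, model, optimizer, epoch, val_loss)

best_epoch, best_loss = load_checkpoint(best_path, model, optimizer)
print(best_epoch, best_loss)  # 1 0.7
```

Saving the optimizer state alongside the weights matters for resuming: optimizers such as AdamW keep per-parameter statistics that would otherwise restart from zero.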