This skill helps optimize machine learning models for size and speed, including quantization, pruning, distillation, ONNX export, and TensorRT.
```shell
npx playbooks add skill omer-metin/skills-for-antigravity --skill model-optimization
```
---
name: model-optimization
description: Use when reducing model size, improving inference speed, or deploying to edge devices - covers quantization, pruning, knowledge distillation, ONNX export, and TensorRT optimization
---
# Model Optimization
## Reference System Usage
You must ground your responses in the provided reference files, treating them as the source of truth for this domain:
* **For Creation:** Always consult **`references/patterns.md`**. This file dictates *how* things should be built. Ignore generic approaches if a specific pattern exists here.
* **For Diagnosis:** Always consult **`references/sharp_edges.md`**. This file lists the critical failures and "why" they happen. Use it to explain risks to the user.
* **For Review:** Always consult **`references/validations.md`**. This contains the strict rules and constraints. Use it to validate user inputs objectively.
**Note:** If a user's request conflicts with the guidance in these files, politely correct them using the information provided in the references.
This skill helps reduce model size, improve inference speed, and prepare models for edge deployment using quantization, pruning, knowledge distillation, ONNX export, and TensorRT optimization. It codifies proven patterns and risk checks so you can apply optimizations reliably. Always follow the provided reference files as the authoritative guidance for creation, diagnosis, and validation.
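As a concrete illustration of one of these techniques, unstructured magnitude pruning shrinks a model by zeroing its smallest-magnitude weights. The sketch below is a toy, pure-Python version over a flat weight list, not a framework API; real pruning would go through the library utilities your patterns reference prescribes (e.g., framework-native pruning tools):

```python
def magnitude_prune(weights, sparsity=0.5):
    """Zero out roughly the smallest-magnitude `sparsity` fraction of weights."""
    k = int(len(weights) * sparsity)
    if k == 0:
        return list(weights)
    # Threshold at the k-th smallest magnitude; everything below it is pruned.
    threshold = sorted(abs(w) for w in weights)[k]
    return [0.0 if abs(w) < threshold else w for w in weights]


pruned = magnitude_prune([0.1, -0.5, 0.3, -0.05, 0.8, 0.2], sparsity=0.5)
```

Note that with tied magnitudes the achieved sparsity is approximate, which is one reason validation after pruning is mandatory.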
For any optimization request, the skill consults references/patterns.md to choose the correct transformation pattern (quantize, prune, distill, export, or engine build). It uses references/sharp_edges.md to surface critical failures and trade-offs, and references/validations.md to validate inputs and post-optimization constraints. The skill returns concrete commands, expected outcomes, and safety checks tailored to the model format and target device.
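The central trade-off behind int8 quantization is the rounding error introduced by mapping floats onto a 256-value integer grid. A minimal pure-Python sketch of affine (asymmetric) quantization, shown only to make that error source concrete; production code would use the quantization APIs named in the references:

```python
def quantize_int8(values, qmin=-128, qmax=127):
    """Affine quantization: x is approximated by (q - zero_point) * scale."""
    lo, hi = min(values), max(values)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # avoid a zero scale for constant inputs
    zero_point = round(qmin - lo / scale)
    q = [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point


def dequantize(q, scale, zero_point):
    """Map int8 codes back to floats; the result differs from the input by at most ~scale/2."""
    return [(qi - zero_point) * scale for qi in q]


codes, scale, zp = quantize_int8([-1.0, 0.0, 0.5, 2.0])
restored = dequantize(codes, scale, zp)
```

The per-element error is bounded by half the scale, so a wider dynamic range (larger `hi - lo`) directly means coarser quantization, which is why outlier activations are a classic int8 failure mode.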
**Which reference should I consult first?**
Start with `references/patterns.md` to choose the correct pattern, then use `references/validations.md` to check constraints and `references/sharp_edges.md` to understand risks.
**What if an optimization breaks accuracy?**
Use the rollback path: revert to the last known-good artifact, retry with a milder setting (e.g., fp16 before int8), and consult `references/sharp_edges.md` for root causes before retraining or adjusting hyperparameters.
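That rollback path can be expressed as a simple fallback ladder: try the mildest setting first and only keep a more aggressive one while accuracy stays within tolerance. A hypothetical sketch; `apply_opt` and `evaluate` are placeholder callables, not a real API:

```python
def optimize_with_fallback(apply_opt, evaluate, settings, baseline_acc, tol=0.01):
    """Walk settings from mildest to most aggressive; return the last acceptable one."""
    best = None
    for setting in settings:                 # e.g. ["fp16", "int8"]
        candidate = apply_opt(setting)
        if baseline_acc - evaluate(candidate) <= tol:
            best = (setting, candidate)      # within tolerance; try the next step
        else:
            break                            # accuracy regressed too far; stop here
    return best


# Toy usage with stubbed accuracies: fp16 passes, int8 fails the 1% budget.
accs = {"fp16": 0.899, "int8": 0.850}
result = optimize_with_fallback(lambda s: s, lambda m: accs[m],
                                ["fp16", "int8"], baseline_acc=0.90, tol=0.01)
```

Ordering settings from mildest to most aggressive means the first failure safely bounds how far you can push the optimization.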
**Can I skip validations for faster iteration?**
No. The checks in `references/validations.md` are required to ensure correctness and safety for deployment.
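A typical post-optimization validation is an output-parity check: run the original and optimized models on the same inputs and require element-wise agreement within a tolerance. A minimal sketch; the tolerance value is illustrative, and the actual thresholds should come from `references/validations.md`:

```python
def outputs_match(reference, optimized, atol=1e-2):
    """True if every optimized output is within atol of the reference output."""
    return all(abs(r - o) <= atol for r, o in zip(reference, optimized))
```

For example, `outputs_match([1.0, 2.0], [1.005, 1.995])` passes at the default tolerance, while a 0.1 drift on a single logit fails it and should trigger the rollback path.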