This skill helps adapt large language models to specific tasks through fine-tuning strategies like LoRA, QLoRA, and PEFT.
Install with `npx playbooks add skill omer-metin/skills-for-antigravity --skill llm-fine-tuning`.
---
name: llm-fine-tuning
description: Use when adapting large language models to specific tasks, domains, or behaviors - covers LoRA, QLoRA, PEFT, instruction tuning, and full fine-tuning strategies
---
# LLM Fine-Tuning
## Identity
## Reference System Usage
You must ground your responses in the provided reference files, treating them as the source of truth for this domain:
* **For Creation:** Always consult **`references/patterns.md`**. This file dictates *how* things should be built. Ignore generic approaches if a specific pattern exists here.
* **For Diagnosis:** Always consult **`references/sharp_edges.md`**. This file lists the critical failures and "why" they happen. Use it to explain risks to the user.
* **For Review:** Always consult **`references/validations.md`**. This contains the strict rules and constraints. Use it to validate user inputs objectively.
**Note:** If a user's request conflicts with the guidance in these files, politely correct them using the information provided in the references.
This skill helps adapt large language models to specific tasks, domains, or behaviors using techniques like LoRA, QLoRA, PEFT, instruction tuning, and full fine-tuning. It guides strategy selection, dataset preparation, training recipes, and evaluation checks so you can achieve reliable task-specific performance. The skill emphasizes safety and validation by treating the reference patterns, failure modes, and validation rules as ground truth.
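For instance, instruction-tuning data is usually prepared as one JSON object per line. The minimal sketch below assumes Alpaca-style `instruction`/`input`/`output` field names, which are a widespread convention rather than something this skill's references mandate:

```python
import json

# Hypothetical Alpaca-style records; the field names are a common SFT
# convention, not something mandated by this skill's references.
records = [
    {
        "instruction": "Summarize the support ticket in one sentence.",
        "input": "Customer reports login failures since upgrading to v2.3.",
        "output": "A customer cannot log in after the v2.3 upgrade.",
    },
]

# One JSON object per line (JSONL) is the layout most SFT trainers accept.
with open("train.jsonl", "w") as f:
    for rec in records:
        f.write(json.dumps(rec) + "\n")
```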
The skill inspects your use case, model family, compute budget, and dataset quality to recommend a fitting tuning approach (LoRA/PEFT for low-cost adapters, QLoRA for memory-efficient 4-bit quantized tuning, or full fine-tuning for maximum fidelity). It enforces the creation patterns, diagnoses risks using the sharp-edge failure modes, and validates outputs against the strict validation rules to ensure compliance with constraints. Recommendations include hyperparameters, checkpointing, evaluation metrics, and post-training QA steps.
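As a concrete illustration of the low-cost adapter path, here is a minimal LoRA setup using the Hugging Face `peft` library. The model name, rank, alpha, and target modules are placeholder assumptions, not recommendations from this skill's references:

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Placeholder base model; swap in whatever checkpoint your use case calls for.
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")

# Illustrative starting hyperparameters; tune rank/alpha/targets per task.
config = LoraConfig(
    r=16,                                 # adapter rank
    lora_alpha=32,                        # scaling factor
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, config)
model.print_trainable_parameters()  # typically well under 1% of base params
```

Because the base weights stay frozen, you can train and swap multiple adapters against the same checkpoint.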
**Which method should I pick for limited GPU memory?**
Use QLoRA or PEFT/LoRA adapters; they minimize GPU memory and let you iterate quickly while keeping base checkpoints intact.
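A minimal QLoRA-style sketch, assuming the `transformers` + `bitsandbytes` + `peft` stack: the base weights load in 4-bit NF4 and only the small LoRA adapters receive gradients. The checkpoint name and adapter settings are placeholder assumptions:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# Load the frozen base weights in 4-bit NF4 to cut memory roughly 4x.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",  # placeholder checkpoint
    quantization_config=bnb_config,
    device_map="auto",
)

# Casts norms for stability and enables gradient-checkpointing hooks.
model = prepare_model_for_kbit_training(model)

# Only the LoRA adapters are trained; the 4-bit base stays frozen.
model = get_peft_model(
    model,
    LoraConfig(r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"],
               task_type="CAUSAL_LM"),
)
```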
**How do I avoid safety regressions after tuning?**
Run the sharp-edge diagnostics to detect failure modes, use the validation rules to block unsafe checkpoints, and prefer adapter methods that retain base model safety behaviors.
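One way to wire the validation rules into a release gate is a simple threshold check. Everything below (the baseline value, the tolerance, the injected eval callable) is a hypothetical illustration, not part of this skill's references:

```python
from typing import Callable

# Hypothetical number: measure the real baseline on your base model first.
BASELINE_REFUSAL_RATE = 0.97

def should_release(
    checkpoint_path: str,
    safety_eval: Callable[[str], float],  # your harness: path -> refusal rate
    tolerance: float = 0.02,              # illustrative regression budget
) -> bool:
    """Block any checkpoint whose safety score regresses past tolerance."""
    score = safety_eval(checkpoint_path)
    return score >= BASELINE_REFUSAL_RATE - tolerance
```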