This skill helps you configure large language model generation and interpret foundation model behavior with structured outputs and sampling guidance.
```
npx playbooks add skill doanchienthangdev/omgkit --skill foundation-models
```
---
name: foundation-models
description: Understanding Foundation Models - architecture, sampling parameters, structured outputs, post-training. Use when configuring LLM generation, selecting models, or understanding model behavior.
---
# Foundation Models
Deep understanding of how Foundation Models work.
## Sampling Parameters
```python
# Temperature Guide
TEMPERATURE = {
    "factual_qa": 0.0,        # Deterministic
    "code_generation": 0.2,   # Slightly creative
    "translation": 0.3,       # Mostly deterministic
    "creative_writing": 0.9,  # Creative
    "brainstorming": 1.2,     # Very creative
}

# Key parameters
response = client.chat.completions.create(
    model="gpt-4",
    messages=[...],
    temperature=0.7,  # 0.0-2.0, controls randomness
    top_p=0.9,        # Nucleus sampling (0.0-1.0)
    max_tokens=1000,  # Maximum output length
)
```
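Under the hood, temperature rescales the model's logits before the softmax: dividing by a small temperature sharpens the distribution toward the top token, while a large temperature flattens it. A minimal, provider-independent sketch of that effect (the logit values are illustrative):

```python
import math

def softmax_with_temperature(logits, temperature):
    """Scale logits by 1/temperature, then apply softmax.

    Lower temperature sharpens the distribution (more deterministic);
    higher temperature flattens it (more diverse sampling).
    """
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]  # hypothetical next-token logits
cold = softmax_with_temperature(logits, 0.2)  # top token dominates
hot = softmax_with_temperature(logits, 1.2)   # probability spreads out
```

At temperature 0.2 the top token gets nearly all the probability mass, which is why low temperature behaves almost deterministically; at 1.2 the tail tokens become plausible samples.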
## Structured Outputs
```python
# JSON Mode
response = client.chat.completions.create(
    model="gpt-4",
    messages=[...],
    response_format={"type": "json_object"},
)

# Function Calling
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string"},
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
            },
            "required": ["location"],
        },
    },
}]
```
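When the model invokes a tool, the chat completions API returns the arguments as a JSON *string* that you must decode and validate yourself. A sketch of that parsing step, using a mock tool call (the `id` and argument values are hypothetical):

```python
import json

# Mock of the tool-call shape returned by the chat completions API.
mock_tool_call = {
    "id": "call_123",  # hypothetical id
    "type": "function",
    "function": {
        "name": "get_weather",
        "arguments": '{"location": "Hanoi", "unit": "celsius"}',
    },
}

def parse_tool_call(tool_call):
    """Decode the arguments string and enforce the schema's constraints."""
    args = json.loads(tool_call["function"]["arguments"])
    if "location" not in args:  # schema's "required" field
        raise ValueError("missing required argument: location")
    if args.get("unit") not in (None, "celsius", "fahrenheit"):
        raise ValueError("unit outside schema enum")
    return tool_call["function"]["name"], args

name, args = parse_tool_call(mock_tool_call)
```

Validating before dispatch matters because the schema constrains but does not guarantee the model's output; malformed or out-of-enum arguments should fail loudly rather than reach your tool.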
## Post-Training Stages
| Stage | Purpose | Result |
|-------|---------|--------|
| Pre-training | Learn language patterns | Base model |
| SFT | Instruction following | Chat model |
| RLHF/DPO | Human preference alignment | Aligned model |
## Model Selection Factors
| Factor | Consideration |
|--------|---------------|
| Context length | 4K-128K+ tokens |
| Multilingual | Tokenization costs (up to 10x for non-Latin) |
| Domain | General vs specialized (code, medical, legal) |
| Latency | TTFT, tokens/second |
| Cost | Input/output token pricing |
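Because input and output tokens are usually priced differently, cost comparisons between models need both counts. A small estimator sketch (the rates shown are placeholders; check your provider's current pricing):

```python
def estimate_cost(input_tokens, output_tokens,
                  input_price_per_1k, output_price_per_1k):
    """Estimate request cost in USD from token counts and per-1K-token prices."""
    return (input_tokens / 1000) * input_price_per_1k \
         + (output_tokens / 1000) * output_price_per_1k

# Hypothetical rates: $0.01 / 1K input tokens, $0.03 / 1K output tokens
cost = estimate_cost(2000, 500, 0.01, 0.03)
```

Note that output tokens often cost several times more than input tokens, so verbose generations dominate the bill even when prompts are long.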
## Best Practices
1. Match temperature to task type
2. Use structured outputs when parsing needed
3. Consider context length limits
4. Test sampling parameters systematically
5. Account for knowledge cutoff dates
## Common Pitfalls
- High temperature for factual tasks
- Ignoring tokenization costs for multilingual
- Not accounting for context length limits
- Expecting determinism without temperature=0
## FAQ

**When should I set temperature to 0?**
Use `temperature=0` for deterministic outputs such as factual QA, exact templates, or verification steps where repeatability is required.

**How do I choose between temperature and top_p?**
Temperature scales randomness globally; top_p applies nucleus sampling, restricting candidates to the smallest set of tokens whose cumulative probability reaches the threshold. A common recommendation is to adjust one at a time rather than both: lower temperature for more determinism, or lower top_p to cut off the low-probability tail.
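The nucleus (top_p) cutoff can be sketched directly: rank tokens by probability, keep the smallest prefix whose cumulative mass reaches `top_p`, and renormalize. The token probabilities below are illustrative:

```python
def nucleus_filter(probs, top_p):
    """Keep the smallest set of tokens whose cumulative probability
    reaches top_p, then renormalize the survivors.

    probs: mapping of token -> probability.
    """
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, cumulative = [], 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break  # nucleus is complete; drop the remaining tail
    total = sum(p for _, p in kept)
    return {token: p / total for token, p in kept}

probs = {"the": 0.5, "a": 0.3, "dog": 0.15, "zebra": 0.05}
filtered = nucleus_filter(probs, 0.9)  # the 0.05 tail token is dropped
```

Unlike temperature, top_p adapts to the distribution's shape: a confident distribution keeps few tokens, a flat one keeps many, which is why the two parameters interact and are usually tuned separately.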