---
name: ollama-local
description: Manage and use local Ollama models. Use for model management (list/pull/remove), chat/completions, embeddings, and tool-use with local LLMs. Covers OpenClaw sub-agent integration and model selection guidance.
---
# Ollama Local
Work with local Ollama models for inference, embeddings, and tool use.
## Configuration
Set your Ollama host (defaults to `http://localhost:11434`):
```bash
export OLLAMA_HOST="http://localhost:11434"
# Or for remote server:
export OLLAMA_HOST="http://192.168.1.100:11434"
```
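In a script, the same fallback can be applied when the variable is unset. A minimal sketch (the helper name `resolve_host` is illustrative, not part of the skill's scripts):

```python
import os

def resolve_host() -> str:
    # Resolve the Ollama host, falling back to the local default
    # when OLLAMA_HOST is not set in the environment.
    return os.environ.get("OLLAMA_HOST", "http://localhost:11434").rstrip("/")
```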
## Quick Reference
```bash
# List models
python3 scripts/ollama.py list
# Pull a model
python3 scripts/ollama.py pull llama3.1:8b
# Remove a model
python3 scripts/ollama.py rm modelname
# Show model details
python3 scripts/ollama.py show qwen3:4b
# Chat with a model
python3 scripts/ollama.py chat qwen3:4b "What is the capital of France?"
# Chat with system prompt
python3 scripts/ollama.py chat llama3.1:8b "Review this code" -s "You are a code reviewer"
# Generate completion (non-chat)
python3 scripts/ollama.py generate qwen3:4b "Once upon a time"
# Get embeddings
python3 scripts/ollama.py embed bge-m3 "Text to embed"
```
## Model Selection
See [references/models.md](references/models.md) for full model list and selection guide.
**Quick picks:**
- Fast answers: `qwen3:4b`
- Coding: `qwen2.5-coder:7b`
- General: `llama3.1:8b`
- Reasoning: `deepseek-r1:8b`
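The quick picks above can be encoded as a small lookup helper (a sketch; the mapping simply mirrors the list and is not an official API):

```python
# Task-type to model mapping, taken directly from the quick picks list.
QUICK_PICKS = {
    "fast": "qwen3:4b",
    "coding": "qwen2.5-coder:7b",
    "general": "llama3.1:8b",
    "reasoning": "deepseek-r1:8b",
}

def pick_model(task: str) -> str:
    # Fall back to the general-purpose model for unknown task types.
    return QUICK_PICKS.get(task, QUICK_PICKS["general"])
```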
## Tool Use
Some local models support function calling. Use `ollama_tools.py`:
```bash
# Single request with tools
python3 scripts/ollama_tools.py single qwen2.5-coder:7b "What's the weather in Amsterdam?"
# Full tool loop (model calls tools, gets results, responds)
python3 scripts/ollama_tools.py loop qwen3:4b "Search for Python tutorials and summarize"
# Show available example tools
python3 scripts/ollama_tools.py tools
```
**Tool-capable models:** qwen2.5-coder, qwen3, llama3.1, mistral
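Tools are passed to the chat API as JSON-schema function definitions. A sketch of one definition in that format (the weather tool here is illustrative, not part of the skill's example tools):

```python
# A tool definition in the JSON-schema style Ollama's chat API accepts.
get_weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"},
            },
            "required": ["city"],
        },
    },
}
```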
## OpenClaw Sub-Agents
Spawn local model sub-agents with `sessions_spawn`:
```python
# Example: spawn a coding agent
sessions_spawn(
    task="Review this Python code for bugs",
    model="ollama/qwen2.5-coder:7b",
    label="code-review",
)
```
Model path format: `ollama/<model-name>`
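A malformed model path is one cause of the fallback behavior described under Troubleshooting, so it can be worth validating before spawning. A hypothetical validator for the `ollama/<model-name>[:tag]` format (not part of the OpenClaw API):

```python
import re

# Matches "ollama/" followed by a model name and an optional ":tag";
# names and tags may contain letters, digits, dots, hyphens, underscores.
MODEL_PATH_RE = re.compile(r"^ollama/[\w.\-]+(:[\w.\-]+)?$")

def is_valid_model_path(path: str) -> bool:
    return MODEL_PATH_RE.fullmatch(path) is not None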
### Parallel Agents (Think Tank Pattern)
Spawn multiple local agents for collaborative tasks:
```python
agents = [
    {"label": "architect", "model": "ollama/gemma3:12b", "task": "Design the system architecture"},
    {"label": "coder", "model": "ollama/qwen2.5-coder:7b", "task": "Implement the core logic"},
    {"label": "reviewer", "model": "ollama/llama3.1:8b", "task": "Review for bugs and improvements"},
]
for a in agents:
    sessions_spawn(task=a["task"], model=a["model"], label=a["label"])
```
## Direct API
For custom integrations, use the Ollama API directly:
```bash
# Chat
curl $OLLAMA_HOST/api/chat -d '{
  "model": "qwen3:4b",
  "messages": [{"role": "user", "content": "Hello"}],
  "stream": false
}'
# Generate
curl $OLLAMA_HOST/api/generate -d '{
  "model": "qwen3:4b",
  "prompt": "Why is the sky blue?",
  "stream": false
}'
# List models
curl $OLLAMA_HOST/api/tags
# Pull model
curl $OLLAMA_HOST/api/pull -d '{"name": "phi3:mini"}'
```
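With `"stream": false`, the assistant's reply arrives in the `message.content` field of the chat response. A minimal parsing sketch (the sample response below is illustrative, trimmed to the relevant fields):

```python
def extract_reply(chat_response: dict) -> str:
    """Pull the assistant text out of a non-streaming /api/chat response."""
    return chat_response["message"]["content"]

# Illustrative non-streaming response shape.
sample = {
    "model": "qwen3:4b",
    "message": {"role": "assistant", "content": "Hello! How can I help?"},
    "done": True,
}
```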
## Troubleshooting
**Connection refused?**
- Check that Ollama is running (start it with `ollama serve`)
- Verify `OLLAMA_HOST` points to the right server
- For remote servers, ensure firewall allows port 11434
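The checks above can be automated with a quick TCP probe (a sketch; it only confirms the port is reachable, not that the server is actually Ollama):

```python
import socket
from urllib.parse import urlparse

def can_reach(host_url: str, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the Ollama host succeeds."""
    parsed = urlparse(host_url)
    port = parsed.port or 11434
    try:
        with socket.create_connection((parsed.hostname, port), timeout=timeout):
            return True
    except OSError:
        return False
```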
**Model not loading?**
- Check VRAM: larger models may need CPU offload
- Try a smaller model first
**Slow responses?**
- Model may be running on CPU
- Use a smaller model variant (e.g., `:7b` instead of `:30b`) or a lower-precision quantization
**OpenClaw sub-agent falls back to default model?**
- Ensure `ollama:default` auth profile exists in OpenClaw config
- Check model path format: `ollama/modelname:tag`
## FAQ
**How do I change the Ollama host?**
Set the `OLLAMA_HOST` environment variable to the server URL, for example `export OLLAMA_HOST="http://192.168.1.100:11434"`.
**Which models support tool/function calling?**
Tool-capable models include qwen2.5-coder, qwen3, llama3.1, and mistral; check each model's documentation for specifics.
**What if a model fails to load or is very slow?**
Confirm available VRAM, try a smaller quantized model, or allow CPU offload; large models need more resources.