This skill provides a unified multi-provider LLM interface to compare models, switch providers, and estimate tokens with streaming responses. Install it with `npx playbooks add skill openclaw/skills --skill llm`.
---
name: llm
description: Multi-provider LLM integration. Unified interface for OpenAI, Anthropic, Google, and local models.
metadata: {"clawdbot":{"emoji":"🔮","always":true,"requires":{"bins":["curl","jq"]}}}
---
# LLM 🔮
Multi-provider Large Language Model integration.
## Supported Providers
- OpenAI (GPT-4, GPT-4o)
- Anthropic (Claude)
- Google (Gemini)
- Local models (Ollama, LM Studio)
## Features
- Unified chat interface
- Model comparison
- Token counting
- Cost estimation
- Streaming responses
## Usage Examples
```
"Compare GPT-4 vs Claude on this task"
"Use local Llama model"
"Estimate tokens for this prompt"
```
This skill provides a unified interface to multiple large language model providers including OpenAI, Anthropic, Google, and local models. It simplifies switching between providers, comparing outputs, and managing model-specific details like token usage and costs. The integration focuses on consistent chat behavior and streaming responses for low-latency applications.
The skill routes prompts to the selected provider using provider-specific adapters while exposing a consistent chat API. It can query multiple models in parallel to produce side-by-side comparisons, measure token usage, and estimate cost based on provider pricing. Streaming support forwards incremental model output to your client, and local adapters let you run inference through on-prem runtimes such as Ollama or LM Studio.
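As a sketch of what a local-runtime adapter with streaming could look like, the snippet below targets an Ollama server on its default port; the `build_chat_payload` and `stream_local_chat` helper names are illustrative assumptions, not part of the skill's actual code.

```shell
#!/bin/sh
# Build a minimal chat payload for Ollama's /api/chat endpoint.
# Fields (model, stream, messages) follow Ollama's documented request shape.
build_chat_payload() {
  model="$1"
  prompt="$2"
  printf '{"model": "%s", "stream": true, "messages": [{"role": "user", "content": "%s"}]}' \
    "$model" "$prompt"
}

# Stream incremental chunks from a local Ollama server (assumed at
# localhost:11434) and print each content fragment as it arrives.
# Requires curl and jq, the two binaries this skill declares in metadata.
stream_local_chat() {
  build_chat_payload "$1" "$2" |
    curl -sN -X POST http://localhost:11434/api/chat -d @- |
    jq -rj '.message.content // empty'
}

# Example (needs a running Ollama server):
#   stream_local_chat "llama3" "Say hello in one word."
```

Cloud adapters would follow the same pattern with different endpoints and auth headers, which is what makes the unified chat API possible.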
## FAQ
**Can I run local models and cloud providers together?**
Yes. The skill supports hybrid workflows so you can route some requests to local models and others to cloud providers using the same API.
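One way such hybrid routing could be sketched in shell is a small dispatcher that maps model names to endpoints; the name patterns and default URLs below are illustrative assumptions, not the skill's actual configuration.

```shell
#!/bin/sh
# Route a model name to a chat endpoint: local runtime for llama/mistral
# models, cloud APIs otherwise. Endpoints are illustrative defaults.
route_model() {
  case "$1" in
    llama*|mistral*) echo "http://localhost:11434/api/chat" ;;            # Ollama
    gpt-*)           echo "https://api.openai.com/v1/chat/completions" ;; # OpenAI
    claude*)         echo "https://api.anthropic.com/v1/messages" ;;      # Anthropic
    *)               echo "unknown model: $1" >&2; return 1 ;;
  esac
}

# Example:
#   route_model llama3   # local:  http://localhost:11434/api/chat
#   route_model gpt-4o   # cloud:  https://api.openai.com/v1/chat/completions
```

Because every adapter exposes the same chat API, the caller never needs to know whether a given request was served locally or from the cloud.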
**How does cost estimation work?**
Cost estimation uses model-specific token counting plus configurable pricing rates to provide approximate runtime costs before you send requests.
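A rough sketch of that estimate in shell, assuming the common ~4 characters-per-token heuristic and a caller-supplied price per million input tokens (real counts come from each provider's tokenizer, and prices vary by model):

```shell
#!/bin/sh
# Approximate token count: ~4 characters per token, rounded up.
# This is a rough heuristic; provider tokenizers give exact counts.
estimate_tokens() {
  chars=$(printf '%s' "$1" | wc -c)
  echo $(( (chars + 3) / 4 ))
}

# Estimated cost in USD for a prompt, given a price per 1M input tokens.
estimate_cost() {
  tokens=$(estimate_tokens "$1")
  awk -v t="$tokens" -v p="$2" 'BEGIN { printf "%.6f\n", t * p / 1000000 }'
}

# Example, assuming a hypothetical $5.00 per 1M input tokens:
#   estimate_cost "Summarize this document." 5.00
```

Running the estimate before sending a request lets you compare providers on cost as well as output quality.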