
openrouter-api skill

/openrouter-api

This skill helps you call and route across 400+ models through OpenRouter's unified API, with provider fallbacks.

npx playbooks add skill jrajasekera/claude-skills --skill openrouter-api


SKILL.md
---
name: openrouter-api
description: OpenRouter API integration for unified access to 400+ LLM models from 70+ providers. Use when building applications that need to call OpenRouter's API for chat completions, streaming, tool calling, structured outputs, or model routing. Triggers on OpenRouter, model routing, multi-model, provider fallbacks, or when users need to access multiple LLM providers through a single API.
---

# OpenRouter API

OpenRouter provides a unified API to access 400+ models from 70+ providers (OpenAI, Anthropic, Google, Meta, Mistral, etc.) through a single OpenAI-compatible interface.

## Quick Start

```typescript
// Using fetch
const response = await fetch("https://openrouter.ai/api/v1/chat/completions", {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${OPENROUTER_API_KEY}`,
    "Content-Type": "application/json"
  },
  body: JSON.stringify({
    model: "openai/gpt-5.2",
    messages: [{ role: "user", content: "Hello!" }]
  })
});
```

```python
# Using the OpenAI SDK (Python)
from openai import OpenAI

client = OpenAI(base_url="https://openrouter.ai/api/v1", api_key=OPENROUTER_API_KEY)
response = client.chat.completions.create(
    model="openai/gpt-5.2",
    messages=[{"role": "user", "content": "Hello!"}],
)
```

## Core Concepts

### Model Format

Models use `provider/model-name` format: `openai/gpt-5.2`, `anthropic/claude-sonnet-4.5`, `google/gemini-3-pro`

### Model Variants

Append suffixes to modify behavior:
- `:thinking` - Extended reasoning
- `:free` - Free tier (rate limited)
- `:nitro` - Speed-optimized
- `:extended` - Larger context
- `:online` - Web search enabled
- `:exacto` - Tool-calling optimized

Example: `openai/gpt-5.2:online`, `deepseek/deepseek-r1:thinking`

### Provider Routing

Control which providers serve your requests:

```typescript
{
  model: "anthropic/claude-sonnet-4.5",
  provider: {
    order: ["Anthropic", "Amazon Bedrock"],  // Preference order
    allow_fallbacks: true,                    // Enable backup providers
    sort: "price",                            // "price" | "throughput" | "latency"
    data_collection: "deny",                  // Privacy control
    zdr: true                                 // Zero Data Retention
  }
}
```

### Model Fallbacks

Specify backup models:

```typescript
{
  models: ["anthropic/claude-sonnet-4.5", "openai/gpt-5.2", "google/gemini-3-pro"]
}
```
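When a fallback list is used, the response reports which model actually answered via its top-level `model` field (standard in OpenAI-compatible responses). A minimal sketch for logging fallback behavior; the helper name `usedFallback` is illustrative, not part of any SDK:

```typescript
// Chat completion responses include the model that served the request,
// which is useful for logging when a backup model kicked in.
interface ChatCompletionLike {
  model: string;
  choices: { message: { content: string } }[];
}

// Illustrative helper: true if a backup model (not the first choice) answered.
function usedFallback(requested: string[], response: ChatCompletionLike): boolean {
  return response.model !== requested[0];
}

const requested = ["anthropic/claude-sonnet-4.5", "openai/gpt-5.2", "google/gemini-3-pro"];
const mockResponse: ChatCompletionLike = {
  model: "openai/gpt-5.2",
  choices: [{ message: { content: "Hi!" } }]
};
console.log(usedFallback(requested, mockResponse)); // true: a backup served
```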

## Common Patterns

### Streaming

```typescript
const response = await fetch(url, {
  method: "POST",
  headers,  // Authorization and Content-Type, as in Quick Start
  body: JSON.stringify({ ...params, stream: true })
});

const reader = response.body!.getReader();
const decoder = new TextDecoder();
let buffer = "";

while (true) {
  const { done, value } = await reader.read();
  if (done) break;

  // SSE events arrive as "data: {...}\n\n" and may be split across chunks,
  // so buffer partial lines rather than parsing each chunk directly
  buffer += decoder.decode(value, { stream: true });
  const lines = buffer.split("\n");
  buffer = lines.pop() ?? "";  // keep any incomplete line for the next read

  for (const line of lines) {
    if (!line.startsWith("data: ")) continue;
    const payload = line.slice(6);
    if (payload === "[DONE]") continue;  // stream terminator
    const data = JSON.parse(payload);
    console.log(data.choices[0]?.delta?.content);
  }
}
```

### Tool Calling

```typescript
const response = await fetch(url, {
  body: JSON.stringify({
    model: "openai/gpt-5.2",
    messages: [{ role: "user", content: "What's the weather in NYC?" }],
    tools: [{
      type: "function",
      function: {
        name: "get_weather",
        description: "Get weather for a location",
        parameters: {
          type: "object",
          properties: { location: { type: "string" } },
          required: ["location"]
        }
      }
    }]
  })
});

// Handle tool_calls in response, execute locally, return results
```
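The comment above can be fleshed out into a round trip: run each requested tool locally, then send the results back as `tool` messages. This is a sketch; the local `get_weather` implementation and its canned result are invented for illustration.

```typescript
// Sketch of the tool-call round trip. The local get_weather implementation
// below is a stand-in; a real app would call an actual weather API.
type ToolCall = {
  id: string;
  function: { name: string; arguments: string };
};

function executeTool(call: ToolCall): string {
  const args = JSON.parse(call.function.arguments);
  if (call.function.name === "get_weather") {
    return JSON.stringify({ location: args.location, temp_f: 55 }); // canned result
  }
  throw new Error(`Unknown tool: ${call.function.name}`);
}

// After a response containing tool_calls:
//   1. append the assistant message (with its tool_calls) to `messages`
//   2. append one { role: "tool", tool_call_id, content } message per call
//   3. POST to /chat/completions again so the model can use the results
function toolResultMessages(toolCalls: ToolCall[]) {
  return toolCalls.map((call) => ({
    role: "tool" as const,
    tool_call_id: call.id,
    content: executeTool(call)
  }));
}

const results = toolResultMessages([
  { id: "call_1", function: { name: "get_weather", arguments: '{"location":"NYC"}' } }
]);
```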

### Structured Output

```typescript
{
  response_format: {
    type: "json_schema",
    json_schema: {
      name: "response",
      strict: true,
      schema: {
        type: "object",
        properties: {
          answer: { type: "string" },
          confidence: { type: "number" }
        },
        required: ["answer", "confidence"],
        additionalProperties: false
      }
    }
  }
}
```
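With `strict: true`, the returned `message.content` is a JSON string conforming to the schema, so it can be parsed directly. A sketch against a mocked response (the sample values are invented):

```typescript
// With a strict json_schema response_format, message.content is a JSON
// string matching the schema, so JSON.parse yields a typed object.
interface Answer {
  answer: string;
  confidence: number;
}

function parseStructured(response: {
  choices: { message: { content: string } }[];
}): Answer {
  return JSON.parse(response.choices[0].message.content) as Answer;
}

// Mocked object shaped like a chat completion response (values illustrative):
const mock = {
  choices: [{ message: { content: '{"answer":"42","confidence":0.9}' } }]
};
const parsed = parseStructured(mock);
console.log(parsed.answer, parsed.confidence); // "42" 0.9
```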

### Web Search

```typescript
// Using plugin
{ plugins: [{ id: "web", max_results: 5 }] }

// Or model suffix
{ model: "openai/gpt-5.2:online" }
```

### Reasoning (Thinking Models)

```typescript
{
  model: "deepseek/deepseek-r1:thinking",
  reasoning: {
    effort: "high",       // xhigh, high, medium, low, minimal, none
    summary: "concise"    // auto, concise, detailed
  }
}
```

## Reference Documentation

For detailed API documentation, read the appropriate reference file:

- **[chat-completions.md](references/chat-completions.md)** - Core chat API, request/response formats, code examples
- **[routing-providers.md](references/routing-providers.md)** - Provider routing, model variants, fallbacks, Auto Router
- **[tool-calling.md](references/tool-calling.md)** - Function calling, tool definitions, agentic loops
- **[streaming.md](references/streaming.md)** - SSE streaming, cancellation, error handling
- **[plugins-features.md](references/plugins-features.md)** - Plugins, structured outputs, multimodal, caching, reasoning
- **[responses-api.md](references/responses-api.md)** - Beta stateless Responses API
- **[api-endpoints.md](references/api-endpoints.md)** - All API endpoints reference

## Key Parameters

| Parameter | Type | Description |
|-----------|------|-------------|
| `model` | string | Model ID (e.g., `openai/gpt-5.2`) |
| `messages` | Message[] | Conversation history |
| `stream` | boolean | Enable streaming |
| `max_tokens` | number | Max completion tokens |
| `temperature` | number | Randomness [0-2] |
| `tools` | Tool[] | Function definitions |
| `response_format` | object | Output format control |
| `provider` | object | Routing preferences |
| `plugins` | Plugin[] | Enable plugins |

## Error Handling

```typescript
// Check response status
if (!response.ok) {
  const error = await response.json();
  // error.error.code: 400, 401, 402, 403, 429, 502, 503
  // error.error.message: Human-readable error
}
```

Key status codes:
- `401` - Invalid API key
- `402` - Insufficient credits
- `429` - Rate limited
- `502` - Provider error
- `503` - No available provider
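Of these, 429, 502, and 503 are usually transient, so a retry with exponential backoff is a common pattern, while 401 and 402 indicate configuration problems that retries cannot fix. A sketch; the helper names and delay constants are choices, not OpenRouter requirements:

```typescript
// 401/402 are configuration problems and should not be retried;
// 429/502/503 are transient and worth retrying with backoff.
function isRetryable(status: number): boolean {
  return status === 429 || status === 502 || status === 503;
}

// Delay doubles per attempt: 500ms, 1000ms, 2000ms, ... capped at 8s.
function backoffMs(attempt: number): number {
  return Math.min(500 * 2 ** attempt, 8000);
}

async function fetchWithRetry(
  url: string,
  init: RequestInit,
  maxRetries = 3
): Promise<Response> {
  for (let attempt = 0; ; attempt++) {
    const response = await fetch(url, init);
    if (response.ok || !isRetryable(response.status) || attempt >= maxRetries) {
      return response;
    }
    await new Promise((resolve) => setTimeout(resolve, backoffMs(attempt)));
  }
}
```

Adding jitter to the delay is a common refinement to avoid synchronized retries from many clients.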

Overview

This skill integrates the OpenRouter API to give applications unified access to 400+ LLM models from 70+ providers through a single OpenAI-compatible interface. It simplifies calling chat completions, streaming outputs, tool/function calling, structured outputs, provider routing, and failover. Use it when you need flexible multi-model access, provider preference control, or advanced features like web search and reasoning variants.

How this skill works

The skill issues requests to OpenRouter endpoints using a provider/model identifier like provider/model-name and optional suffixes to adjust behavior (e.g., :online, :thinking). It supports chat completions, streaming SSE reads, tool calling (function definitions and tool_calls), structured JSON schema outputs, provider routing and fallback lists, and plugin-style web search. Responses and errors follow a consistent OpenAI-compatible format so existing clients and SDKs work with minimal changes.

When to use it

  • Building apps that must access multiple LLM providers through one API
  • Needing provider routing, priority/fallback rules, or cost/latency sorting
  • Implementing streaming chat, tool/function calling, or structured outputs
  • Adding web search or online models without bespoke provider integrations
  • Enforcing privacy controls like zero data retention or disallowing collection

Best practices

  • Specify providers and allow_fallbacks to ensure high availability
  • Use model suffixes to tune behavior (e.g., :online for web, :thinking for reasoning)
  • Stream large responses via SSE and handle incremental delta messages
  • Define strict response_format json_schema for reliable downstream parsing
  • Monitor status codes (401, 402, 429, 502, 503) and implement retry/backoff

Example use cases

  • Multi-provider chatbot that falls back from premium to cheaper models on overload
  • Agent that calls local functions/tools and returns structured JSON results
  • Real-time streaming assistant that renders token deltas as they arrive
  • Search-enabled assistant using :online models or web plugin integration
  • Routing requests by price, latency, or zero-data-retention policy

FAQ

How do I pick a model?

Use the provider/model-name format and add suffixes to change behavior. Test candidate models for cost, latency, and quality, then use provider.order or a models fallback list to control selection.

How should I handle streaming responses?

Set stream: true, read the response body with a reader, buffer and parse the SSE "data:" lines to extract delta content, and implement cancellation (e.g., via AbortController) and backpressure handling.