
This skill enables real-time streaming with OpenRouter to reduce latency and make chat interfaces feel more responsive.

npx playbooks add skill jeremylongshore/claude-code-plugins-plus-skills --skill openrouter-streaming-setup

Review the files below or copy the command above to add this skill to your agents.

Files (9)
SKILL.md
---
name: openrouter-streaming-setup
description: |
  Implement streaming responses with OpenRouter. Use when building real-time chat interfaces or reducing time-to-first-token. Trigger with phrases like 'openrouter streaming', 'openrouter sse', 'stream response', 'real-time openrouter'.
allowed-tools: Read, Write, Edit, Grep
version: 1.0.0
license: MIT
author: Jeremy Longshore <[email protected]>
---

# OpenRouter Streaming Setup

## Overview

This skill demonstrates how to implement streaming responses to lower perceived latency and display model output in real time.

## Prerequisites

- An existing OpenRouter integration (API key configured)
- A frontend capable of handling SSE or other streaming transports

## Instructions

Follow these steps to implement this skill:

1. **Verify Prerequisites**: Ensure all prerequisites listed above are met
2. **Review the Implementation**: Study the code examples and patterns below
3. **Adapt to Your Environment**: Modify configuration values for your setup
4. **Test the Integration**: Run the verification steps to confirm functionality
5. **Monitor in Production**: Set up appropriate logging and monitoring
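
As a starting point, the sketch below shows a minimal streamed request. It assumes Node 18+ (global `fetch`), an `OPENROUTER_API_KEY` environment variable, and an illustrative model id; OpenRouter's chat completions endpoint is OpenAI-compatible and emits SSE `data:` lines when `stream: true` is set.

```typescript
// Minimal streaming request against OpenRouter's OpenAI-compatible endpoint.
// OPENROUTER_API_KEY and the model id are placeholders for your own values.
const response = await fetch("https://openrouter.ai/api/v1/chat/completions", {
  method: "POST",
  headers: {
    Authorization: `Bearer ${process.env.OPENROUTER_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "openai/gpt-4o-mini", // any OpenRouter model id
    messages: [{ role: "user", content: "Hello!" }],
    stream: true, // ask for SSE chunks instead of a single JSON body
  }),
});

// The body arrives as SSE lines ("data: {...}"), terminated by "data: [DONE]".
// NOTE: a production parser must buffer lines that split across chunk boundaries.
const decoder = new TextDecoder();
for await (const chunk of response.body!) {
  for (const line of decoder.decode(chunk, { stream: true }).split("\n")) {
    if (!line.startsWith("data: ")) continue;
    const payload = line.slice("data: ".length);
    if (payload === "[DONE]") break;
    const delta = JSON.parse(payload).choices?.[0]?.delta?.content;
    if (delta) process.stdout.write(delta); // render tokens as they arrive
  }
}
```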

## Output

Successful execution produces:
- Working OpenRouter integration
- Verified API connectivity
- Example responses demonstrating functionality

## Error Handling

See `{baseDir}/references/errors.md` for comprehensive error handling.

## Examples

See `{baseDir}/references/examples.md` for detailed examples.

## Resources

- [OpenRouter Documentation](https://openrouter.ai/docs)
- [OpenRouter Models](https://openrouter.ai/models)
- [OpenRouter API Reference](https://openrouter.ai/docs/api-reference)
- [OpenRouter Status](https://status.openrouter.ai)

Overview

This skill implements streaming responses using OpenRouter to reduce time-to-first-token and enable real-time chat interfaces. It provides a clear integration pattern and practical steps for connecting a backend to an SSE-capable frontend. Use it to deliver progressive model output to users while the full response is still being generated.

How this skill works

The skill configures an OpenRouter client to request streamed output and exposes server endpoints that forward chunked events to the frontend via Server-Sent Events (SSE) or similar streaming transports. It includes verification steps to confirm API connectivity, sample request/response flows, and guidance for adapting configuration values to your environment. Error and connection handling patterns are provided so the stream remains robust in production.
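
To make the forwarding pattern concrete, here is a hedged sketch of such a proxy endpoint using Express and Node's global `fetch`. The route, query parameter, and model id are illustrative, not part of this skill's files; since the upstream body is already SSE-formatted, the bytes can pass through unchanged.

```typescript
import express from "express";

const app = express();

// Illustrative proxy route: request a streamed completion from OpenRouter
// and forward the raw SSE bytes to the browser unchanged.
app.get("/api/chat", async (req, res) => {
  res.setHeader("Content-Type", "text/event-stream");
  res.setHeader("Cache-Control", "no-cache");

  const upstream = await fetch("https://openrouter.ai/api/v1/chat/completions", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.OPENROUTER_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model: "openai/gpt-4o-mini", // illustrative model id
      messages: [{ role: "user", content: String(req.query.q ?? "") }],
      stream: true,
    }),
  });

  // Pass upstream SSE bytes straight through to the client.
  for await (const chunk of upstream.body!) {
    if (res.closed) break; // client went away; stop forwarding
    res.write(chunk);
  }
  res.end();
});

app.listen(3000);
```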

When to use it

  • Building a real-time chat UI where users see tokens as they appear
  • Reducing perceived latency by showing partial outputs immediately
  • Implementing long-response scenarios where progressive feedback improves UX
  • Testing OpenRouter models for interactive demonstrations or live coding assistants

Best practices

  • Ensure the frontend can handle SSE, fetch-stream, or WebSocket chunks and gracefully reconnect on network errors
  • Keep streaming sessions authenticated and short-lived; rotate API keys or use scoped tokens
  • Log stream lifecycle events (start, chunk, error, end) and expose metrics for time-to-first-token (see the sketch after this list)
  • Validate and sanitize partial outputs before rendering when downstream actions depend on intermediate text
  • Provide a fallback non-streaming endpoint for clients that can’t use streaming transports
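
One way to implement the logging and time-to-first-token guidance above is to wrap the token stream in an instrumented async generator. This is a minimal sketch; the `log` callback and the `AsyncIterable<string>` token shape are assumptions for illustration.

```typescript
// Sketch: wrap a token stream in an instrumented async generator that logs
// lifecycle events and time-to-first-token without changing the streaming path.
async function* withStreamMetrics(
  tokens: AsyncIterable<string>,
  log: (event: string) => void,
): AsyncGenerator<string> {
  const start = Date.now();
  let seenFirst = false;
  log("stream start");
  try {
    for await (const token of tokens) {
      if (!seenFirst) {
        log(`time-to-first-token: ${Date.now() - start}ms`);
        seenFirst = true;
      }
      yield token;
    }
    log(`stream end after ${Date.now() - start}ms`);
  } catch (err) {
    log(`stream error: ${String(err)}`);
    throw err; // let the caller emit a final error event to the client
  }
}
```

Wrapping an existing token stream, e.g. `for await (const t of withStreamMetrics(stream, console.log))`, leaves the consuming code unchanged while emitting the metrics.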

Example use cases

  • A chat app that displays assistant replies token-by-token to keep users engaged
  • A live coding helper that streams generated code while the model continues to produce the rest
  • Customer support agent interface that shows incremental suggestions to reduce wait times
  • Demo pages and interactive tutorials that showcase model capabilities with immediate output

FAQ

What frontend transports are supported?

Use Server-Sent Events (SSE), fetch streaming with ReadableStream, or WebSockets. Choose the one that matches your frontend stack and connection requirements.
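
For example, a browser client can consume a streamed endpoint with fetch and a ReadableStream reader. This sketch assumes the illustrative `/api/chat` proxy route described earlier and an OpenAI-style delta payload; buffering on blank lines keeps partial SSE events intact across chunk boundaries.

```typescript
// Browser-side sketch: consume the (illustrative) /api/chat proxy route.
const res = await fetch("/api/chat?q=" + encodeURIComponent("Hello!"));
const reader = res.body!.getReader();
const decoder = new TextDecoder();
let buffer = "";

while (true) {
  const { value, done } = await reader.read();
  if (done) break;
  buffer += decoder.decode(value, { stream: true });

  // SSE events are separated by blank lines; keep any partial event buffered.
  const events = buffer.split("\n\n");
  buffer = events.pop() ?? "";

  for (const event of events) {
    if (!event.startsWith("data: ")) continue; // skips SSE comment lines too
    const data = event.slice("data: ".length);
    if (data === "[DONE]") continue;
    const delta = JSON.parse(data).choices?.[0]?.delta?.content;
    if (delta) document.querySelector("#output")!.append(delta);
  }
}
```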

How do I handle model errors mid-stream?

Emit a final error event, close the stream cleanly, and provide a short explanatory message. Log details on the server and offer a retry path for the client.
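
As a server-side sketch of that pattern (Node's `ServerResponse` assumed), a custom-named SSE event avoids colliding with EventSource's built-in transport `error` event; the event name `stream-error` is illustrative.

```typescript
import type { ServerResponse } from "http";

// Sketch: surface a mid-stream failure to the client as a final, named SSE
// event, then close the response cleanly.
function endStreamWithError(res: ServerResponse, message: string): void {
  res.write(`event: stream-error\ndata: ${JSON.stringify({ message })}\n\n`);
  res.end(); // terminate the HTTP response; the client closes its EventSource
}
```

On the client, `source.addEventListener("stream-error", ...)` can display the message, offer a retry, and call `source.close()` so the browser does not auto-reconnect into a broken stream.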