
convex-agents-streaming skill


This skill enables real-time, non-blocking streaming of agent responses to UIs, improving responsiveness and multi-client collaboration.

npx playbooks add skill sstobo/convex-skills --skill convex-agents-streaming

Review the files below or copy the command above to add this skill to your agents.

Files (1)
SKILL.md
3.0 KB
---
name: "Convex Agents Streaming"
description: "Streams agent responses in real-time to clients without blocking. Use this for responsive UIs, long-running generations, and asynchronous streaming to multiple clients."
---

## Purpose

Streaming lets responses appear character by character in real time, improving UX and perceived performance. It supports asynchronous streaming and delivery to multiple clients.

## When to Use This Skill

- Building real-time chat interfaces with live updates
- Generating long responses that benefit from progressive display
- Streaming to multiple clients from a single generation
- Using asynchronous streaming in background actions
- Implementing smooth text animation

## Basic Async Streaming

Stream text and save the deltas to the database:

```typescript
import { v } from "convex/values";
import { action } from "./_generated/server";

export const streamResponse = action({
  args: { threadId: v.string(), prompt: v.string() },
  handler: async (ctx, { threadId, prompt }) => {
    const { thread } = await myAgent.continueThread(ctx, { threadId });

    const result = await thread.streamText(
      { prompt },
      { saveStreamDeltas: true }
    );
    // Consume the stream so the action stays alive until generation completes.
    await result.consumeStream();

    return { success: true };
  },
});
});
```

## Configure Stream Chunking

```typescript
await thread.streamText(
  { prompt },
  {
    saveStreamDeltas: {
      chunking: "line", // "word" | "line" | regex | function
      throttleMs: 500, // Write batched deltas at most once every 500ms
    },
  }
);
```
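To make the chunking strategies concrete, here is a small, self-contained sketch in plain TypeScript (independent of the Convex runtime). The `chunk` helper is hypothetical, for illustration only; it shows roughly what "word", "line", and regex chunking would produce from a buffer of generated text, not the library's internals.

```typescript
// Hypothetical illustration of chunking strategies (not the library's internals).
type Chunking = "word" | "line" | RegExp;

function chunk(text: string, strategy: Chunking): string[] {
  if (strategy === "word") return text.split(/(?<=\s)/); // keep trailing spaces
  if (strategy === "line") return text.split(/(?<=\n)/); // keep trailing newlines
  return text.split(strategy).filter((s) => s.length > 0);
}

const sample = "First line.\nSecond line.\n";

console.log(chunk(sample, "line"));
// [ 'First line.\n', 'Second line.\n' ]
console.log(chunk(sample, "word"));
```

Coarser chunks ("line" rather than "word") mean fewer saved deltas and fewer database writes, at the cost of less granular live updates.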

## Retrieve Stream Deltas

```typescript
import { paginationOptsValidator } from "convex/server";
import { v } from "convex/values";
import { vStreamArgs, syncStreams, listUIMessages } from "@convex-dev/agent";
import { query } from "./_generated/server";
import { components } from "./_generated/api";

export const listMessagesWithStreams = query({
  args: {
    threadId: v.string(),
    paginationOpts: paginationOptsValidator,
    streamArgs: vStreamArgs,
  },
  handler: async (ctx, { threadId, paginationOpts, streamArgs }) => {
    const messages = await listUIMessages(ctx, components.agent, {
      threadId,
      paginationOpts,
    });

    const streams = await syncStreams(ctx, components.agent, {
      threadId,
      streamArgs,
    });

    return { ...messages, streams };
  },
});
```

## Display Streaming in React

```typescript
import { useUIMessages, useSmoothText, type UIMessage } from "@convex-dev/agent/react";
import { api } from "../convex/_generated/api";

function ChatStreaming({ threadId }: { threadId: string }) {
  const { results } = useUIMessages(
    api.streaming.listMessagesWithStreams,
    { threadId },
    { initialNumItems: 20, stream: true }
  );

  return (
    <div>
      {results?.map((message) => (
        <StreamingMessage key={message.key} message={message} />
      ))}
    </div>
  );
}

function StreamingMessage({ message }: { message: UIMessage }) {
  const [visibleText] = useSmoothText(message.text, {
    startStreaming: message.status === "streaming",
  });

  return <div>{visibleText}</div>;
}
```

## Key Principles

- **Asynchronous streaming**: Best for background generations
- **Delta throttling**: Balances responsiveness with write volume
- **Stream status**: Check `message.status === "streaming"`
- **Smooth animation**: Use `useSmoothText` for text updates
- **Persistence**: Deltas survive page reloads

## Next Steps

- See **messages** for message management
- See **fundamentals** for agent setup
- See **context** for streaming-aware context

Overview

This skill streams agent responses in real-time to clients without blocking, enabling responsive UIs and progressive content display. It supports asynchronous background generations, configurable delta chunking, and broadcasting a single generation to multiple clients. Use it to improve perceived performance and to persist streaming progress across reloads.

How this skill works

The skill continues an agent thread and emits text deltas as the model generates, optionally saving those deltas to a database. You can configure chunking (by word, line, regex, or custom function) and throttle saves to balance responsiveness and write volume. Clients subscribe to stream deltas and render updates live; stream status flags (e.g., message.status === "streaming") indicate active streams.
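The save-throttling described above can be sketched as a small batcher. This is a hypothetical illustration of the idea in plain TypeScript, not the library's implementation: deltas accumulate in memory and are flushed as a single combined write at most once per interval. The clock is passed in explicitly so the behavior is deterministic.

```typescript
// Hypothetical sketch of delta throttling: flush batched deltas at most once per interval.
class DeltaBatcher {
  private buffer: string[] = [];
  private lastFlushMs = -Infinity;
  readonly flushes: string[] = []; // stands in for database writes

  constructor(private throttleMs: number) {}

  add(delta: string, nowMs: number): void {
    this.buffer.push(delta);
    if (nowMs - this.lastFlushMs >= this.throttleMs) this.flush(nowMs);
  }

  flush(nowMs: number): void {
    if (this.buffer.length === 0) return;
    this.flushes.push(this.buffer.join("")); // one combined write per flush
    this.buffer = [];
    this.lastFlushMs = nowMs;
  }
}

const batcher = new DeltaBatcher(500);
batcher.add("Hel", 0);   // first delta flushes immediately
batcher.add("lo ", 100); // buffered
batcher.add("wor", 300); // buffered
batcher.add("ld", 600);  // 600ms since last flush >= 500 → flush "lo world"
```

Raising the interval trades live-update granularity for fewer writes, which is the same trade-off the throttleMs option controls.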

When to use it

  • Real-time chat interfaces with live typing updates
  • Generating long responses where progressive reveal improves UX
  • Broadcasting a single generation to multiple clients or devices
  • Background asynchronous generations that shouldn’t block other work
  • Smooth text animations or progressive content rendering in the UI

Best practices

  • Enable saveStreamDeltas to persist progress so reloads resume cleanly
  • Configure chunking and throttleMs to reduce DB writes while keeping UI responsive
  • Check message.status === "streaming" to trigger client-side animations or listeners
  • Use useSmoothText (or equivalent) on the client for visually smooth updates
  • Stream generation in background actions for long-running tasks to avoid blocking requests
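For the smooth-rendering practice above, here is a minimal non-React sketch of the idea behind useSmoothText (hypothetical, not the hook's actual implementation): instead of jumping to the latest full string, reveal it a few characters per tick so text appears to type out evenly.

```typescript
// Hypothetical sketch of smooth text reveal: advance a cursor a few chars per tick.
function* smoothReveal(getText: () => string, charsPerTick = 5): Generator<string> {
  let shown = 0;
  while (true) {
    const text = getText(); // re-read so newly streamed text is picked up
    shown = Math.min(shown + charsPerTick, text.length);
    yield text.slice(0, shown);
    if (shown === text.length) return;
  }
}

const latest = "Hello, world!";
const frames = [...smoothReveal(() => latest)];
// frames grow by up to 5 chars per tick until the full text is shown
```

In a real client each frame would be rendered on an animation tick, and `getText` would return the latest streamed message text.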

Example use cases

  • Chat app showing token-by-token or line-by-line replies for better perceived speed
  • A collaborative session broadcasting a single agent answer to multiple participants
  • Long-form content generation where the user can read early sections while later sections finish
  • Background summary generation that updates the UI when each delta is available
  • Interactive demos with smooth text animations tied to streaming status

FAQ

How do I persist partial outputs so users can refresh and still see progress?

Enable saveStreamDeltas when calling streamText so deltas are stored and can be synchronized later with syncStreams.

How do I avoid excessive database writes from many tiny deltas?

Configure chunking (e.g., "line" or a regex) and set throttleMs to batch deltas at a reasonable interval.