
convex-agents-debugging skill

/convex-agents-debugging

This skill helps debug Convex agents by logging LLM requests, inspecting context, and auditing stored data to diagnose unexpected behavior.

npx playbooks add skill sstobo/convex-skills --skill convex-agents-debugging

Review the files below or copy the command above to add this skill to your agents.

Files (1)
SKILL.md
4.5 KB
---
name: "Convex Agents Debugging"
description: "Troubleshoots agent behavior, logs LLM interactions, and inspects database state. Use this when responses are unexpected, to understand context the LLM receives, or to diagnose data issues."
---

## Purpose

Debugging tools reveal what's happening inside an agent, what the LLM actually receives, and what's stored in the database. They are essential for building reliable agent applications.

## When to Use This Skill

- Agent behavior is unexpected
- LLM responses are off-target
- Investigating why certain context isn't being used
- Understanding message ordering
- Checking file storage and references
- Auditing tool calls and results
- Profiling token usage (a rough estimate sketch appears under context logging below)

## Log Raw LLM Requests and Responses

```typescript
import { Agent } from "@convex-dev/agent";
import { openai } from "@ai-sdk/openai";
import { components, internal } from "./_generated/api";

const myAgent = new Agent(components.agent, {
  name: "My Agent",
  languageModel: openai.chat("gpt-4o-mini"),
  // Called with the raw payload sent to the provider and its raw reply.
  rawRequestResponseHandler: async (ctx, { request, response }) => {
    console.log("LLM Request:", JSON.stringify(request, null, 2));
    console.log("LLM Response:", JSON.stringify(response, null, 2));

    // Persist the call for later analysis (mutation sketched below).
    await ctx.runMutation(internal.logging.saveLLMCall, {
      request,
      response,
      timestamp: Date.now(),
    });
  },
});
```
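
The handler above references an `internal.logging.saveLLMCall` mutation that isn't shown. A minimal sketch of what it could look like, assuming an `llmCalls` table in your schema (the table name and file are illustrative; if the raw payloads aren't valid Convex values, store them with `JSON.stringify` instead):

```typescript
// convex/logging.ts — hypothetical companion to the handler above.
import { internalMutation } from "./_generated/server";
import { v } from "convex/values";

export const saveLLMCall = internalMutation({
  args: {
    request: v.any(),
    response: v.any(),
    timestamp: v.number(),
  },
  handler: async (ctx, args) => {
    // Assumes an `llmCalls` table declared in convex/schema.ts.
    await ctx.db.insert("llmCalls", args);
  },
});
```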

## Log Context Messages

See exactly what context the LLM receives:

```typescript
import { Agent } from "@convex-dev/agent";
import { openai } from "@ai-sdk/openai";
import { components } from "./_generated/api";

const myAgent = new Agent(components.agent, {
  name: "My Agent",
  languageModel: openai.chat("gpt-4o-mini"),
  // Receives the assembled context before it is sent to the model.
  contextHandler: async (ctx, args) => {
    console.log("Context Messages:", {
      recent: args.recent.length,
      search: args.search.length,
      input: args.inputMessages.length,
    });

    args.allMessages.forEach((msg, i) => {
      console.log(`Message ${i}:`, {
        role: msg.role,
        contentLength: typeof msg.content === "string"
          ? msg.content.length
          : JSON.stringify(msg.content).length,
      });
    });

    // Return the messages unchanged (or filter/reorder them here).
    return args.allMessages;
  },
});
```
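
The same handler is a convenient place to profile token usage (one of the cases listed above). A minimal sketch, assuming the rough ~4 characters per token heuristic in place of a real tokenizer:

```typescript
// Heuristic token estimate — good enough for spotting outliers, not for
// billing. Assumes ~4 characters per token.
const estimateTokens = (content: unknown): number => {
  const text = typeof content === "string" ? content : JSON.stringify(content);
  return Math.ceil(text.length / 4);
};

// Inside contextHandler:
const totalTokens = args.allMessages.reduce(
  (sum, msg) => sum + estimateTokens(msg.content),
  0,
);
console.log(`Estimated context tokens: ~${totalTokens}`);
```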

## Inspect Database Tables

Agent data lives in the component's own tables, so it isn't reachable with `ctx.db` from your app. Query it through the component's helpers instead (or browse it in the dashboard, which can show component data):

```typescript
import { query } from "./_generated/server";
import { v } from "convex/values";
import { listMessages } from "@convex-dev/agent";
import { components } from "./_generated/api";

export const getThreadMessages = query({
  args: { threadId: v.string() },
  handler: async (ctx, { threadId }) => {
    const page = await listMessages(ctx, components.agent, {
      threadId,
      paginationOpts: { cursor: null, numItems: 100 },
    });
    return page.results;
  },
});
```

## Fetch Context Manually

Inspect what context would be used:

```typescript
import { action } from "./_generated/server";
import { v } from "convex/values";
import { fetchContextWithPrompt } from "@convex-dev/agent";
import { components } from "./_generated/api";

export const inspectContext = action({
  args: { threadId: v.string(), prompt: v.string() },
  handler: async (ctx, { threadId, prompt }) => {
    // Assembles the same context the agent would use, without calling the LLM.
    const { messages } = await fetchContextWithPrompt(ctx, components.agent, {
      threadId,
      prompt,
    });

    return {
      contextMessages: messages.length,
      messages: messages.map((msg) => ({
        role: msg.role,
        contentType: typeof msg.content,
      })),
    };
  },
});
```
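
Once deployed, you can call this ad hoc from the CLI, e.g. `npx convex run debug:inspectContext '{"threadId": "...", "prompt": "..."}'` (substitute the module name for wherever the action lives), which lets you check context without triggering a generation.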

## Trace Tool Calls

Log all tool invocations:

```typescript
import { createTool } from "@convex-dev/agent";
import { z } from "zod";

export const myTool = createTool({
  description: "My tool",
  args: z.object({ query: z.string() }),
  handler: async (ctx, { query }): Promise<string> => {
    console.log("[TOOL] myTool called with:", query);
    const result = await someOperation(query); // someOperation: your tool's real logic
    console.log("[TOOL] myTool returned:", result);
    return result;
  },
});
```
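
With many tools, a small wrapper avoids repeating the logging. A minimal sketch (the `traced` helper is illustrative, not part of @convex-dev/agent) that also captures failures, often the most informative part of a trace:

```typescript
// Hypothetical helper: wraps a tool handler with entry/exit/error logging.
function traced<Ctx, Args, R>(
  name: string,
  fn: (ctx: Ctx, args: Args) => Promise<R>,
): (ctx: Ctx, args: Args) => Promise<R> {
  return async (ctx, args) => {
    console.log(`[TOOL] ${name} called with:`, args);
    try {
      const result = await fn(ctx, args);
      console.log(`[TOOL] ${name} returned:`, result);
      return result;
    } catch (err) {
      console.error(`[TOOL] ${name} threw:`, err);
      throw err; // rethrow so the agent still sees the failure
    }
  };
}

// Usage: handler: traced("myTool", async (ctx, { query }) => { ... })
```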

## Fix Type Errors

A common circular-reference issue: when a handler's return type is left to inference and the file references its own functions through `api.*` or `internal.*`, TypeScript's inference can become circular. Declare the return type explicitly to break the cycle:

```typescript
// WRONG - no return type
export const myFunction = action({
  args: { prompt: v.string() },
  handler: async (ctx, { prompt }) => {
    return await someLogic();
  },
});

// CORRECT - explicit return type
export const myFunction = action({
  args: { prompt: v.string() },
  returns: v.string(),
  handler: async (ctx, { prompt }): Promise<string> => {
    return await someLogic();
  },
});
```

## Analyze Message Structure

Debug message ordering:

```typescript
import { query } from "./_generated/server";
import { v } from "convex/values";
import { listMessages } from "@convex-dev/agent";
import { components } from "./_generated/api";

export const analyzeMessages = query({
  args: { threadId: v.string() },
  handler: async (ctx, { threadId }) => {
    const messages = await listMessages(ctx, components.agent, {
      threadId,
      paginationOpts: { cursor: null, numItems: 100 },
    });

    // order groups messages by turn; stepOrder sequences the steps
    // (tool calls, tool results, assistant text) within that turn.
    return messages.results.map((msg) => ({
      order: msg.order,
      stepOrder: msg.stepOrder,
      role: msg.message.role,
      status: msg.status,
    }));
  },
});
```
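
To spot problems programmatically, scan the result for duplicates or stuck messages. A minimal sketch over the shape returned above (the helper name is illustrative):

```typescript
// Hypothetical check: flag duplicate (order, stepOrder) positions and
// messages that never reached a success status.
function findOrderingIssues(
  rows: { order: number; stepOrder: number; status?: string }[],
): string[] {
  const seen = new Set<string>();
  const issues: string[] = [];
  for (const row of rows) {
    const key = `${row.order}:${row.stepOrder}`;
    if (seen.has(key)) issues.push(`duplicate position ${key}`);
    seen.add(key);
    if (row.status && row.status !== "success") {
      issues.push(`message at ${key} has status "${row.status}"`);
    }
  }
  return issues;
}
```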

## Key Principles

- **Log early**: Capture data while developing
- **Use console for quick checks**: Fast iteration
- **Save important events**: Archive LLM calls for analysis
- **Explicit return types**: Prevents circular references
- **Dashboard inspection**: Easiest way to see database state
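
Note that the agent component's tables appear under the component, not among your app's own tables, in the dashboard's data view.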

## Next Steps

- See **playground** for interactive debugging
- See **fundamentals** for agent setup
- See **context** for context-aware debugging

Overview

This skill helps troubleshoot agent behavior by capturing LLM interactions, inspecting the context sent to models, and examining agent database state. It provides concrete hooks to log raw requests/responses, trace tool calls, and fetch the exact context used for a prompt. Use it to quickly pinpoint why an agent produced unexpected or off-target results.

How this skill works

You enable handlers that capture LLM requests/responses and context construction, then persist or print those artifacts for analysis. The skill also includes examples for querying agent message tables, tracing tool invocations, and fetching the assembled context for a given thread and prompt. Together these techniques reveal message ordering, token usage, storage references, and tool outputs that influence behavior.

When to use it

  • Agent produces unexpected or off-target responses
  • You need to confirm exactly what context the LLM received
  • Investigating message ordering or missing context items
  • Auditing tool calls, file references, or external operations
  • Profiling token usage and preserving LLM call records

Best practices

  • Log early and often during development to capture transient issues
  • Use raw request/response handlers to save full LLM interactions for replay
  • Inspect context messages (recent/search/input) to see what was included
  • Persist important events to a database for long-term analysis
  • Declare explicit return types in server handlers to avoid circular type issues

Example use cases

  • Record full LLM requests/responses to debug hallucinations or prompt engineering changes
  • Fetch assembled context for a thread to verify missing documents or truncated history
  • Query message tables to audit ordering, status, and step metadata
  • Add logging inside tools to surface unexpected inputs or faulty tool results
  • Analyze token consumption across messages to optimize prompts and reduce cost

FAQ

How do I see the exact messages the model received?

Enable a contextHandler to log allMessages or use fetchContextWithPrompt to retrieve the assembled context for a thread and prompt.

Where should I store LLM calls for later analysis?

Persist raw request/response objects using your app database or a logging table so you can replay, filter, and audit calls over time.