home / skills / phrazzld / claude-config / reconciliation-patterns

reconciliation-patterns skill

safe

This skill helps you keep external service data in sync with your database using reconciliation patterns for webhooks, missed events, and drift recovery.

npx playbooks add skill phrazzld/claude-config --skill reconciliation-patterns

Review the files below or copy the command above to add this skill to your agents.

Files (3)

SKILL.md

7.4 KB

---
name: reconciliation-patterns
description: "Patterns for syncing state between external services (Stripe, Clerk) and local database. Invoke for: webhook failures, data sync issues, eventual consistency, recovery from missed events, subscription state management."
effort: high
---

# Reconciliation Patterns

Patterns for maintaining data consistency between external services and your database when webhooks fail or events are missed.

## The Problem

External services (Stripe, Clerk, etc.) notify your app via webhooks. But webhooks can:
- Fail silently (wrong URL, network issues)
- Be delivered out of order
- Be duplicated
- Miss events entirely

**Result**: Your database state diverges from source of truth.

## Core Principle

**Webhooks for speed, reconciliation for correctness.**

1. Process webhooks for real-time updates (optimistic)
2. Run periodic reconciliation to catch and fix drift (defensive)

## Pattern 1: Scheduled Reconciliation

Run cron job to compare local state with external service.

```typescript
// Convex scheduled function
export const reconcileSubscriptions = internalAction({
  handler: async (ctx) => {
    const stripe = new Stripe(process.env.STRIPE_SECRET_KEY!)

    // Get all active subscriptions from Stripe
    const stripeSubscriptions = await stripe.subscriptions.list({
      status: 'all',
      limit: 100,
    })

    // Get all users from our database
    const users = await ctx.runQuery(internal.users.listWithStripeId)

    for (const user of users) {
      const stripeSub = stripeSubscriptions.data.find(
        (s) => s.customer === user.stripeCustomerId
      )

      const expectedStatus = stripeSub?.status ?? 'none'

      if (user.subscriptionStatus !== expectedStatus) {
        console.log(`Drift detected: user ${user._id}`, {
          local: user.subscriptionStatus,
          stripe: expectedStatus,
        })

        await ctx.runMutation(internal.users.updateSubscriptionStatus, {
          userId: user._id,
          status: expectedStatus,
          subscriptionId: stripeSub?.id,
        })
      }
    }
  },
})

// Schedule: Run every hour
// crons.ts
export default {
  reconcileSubscriptions: {
    schedule: "0 * * * *",  // Every hour
    handler: internal.reconciliation.reconcileSubscriptions,
  },
}
```

## Pattern 2: On-Demand Reconciliation

Reconcile specific user when they report issues.

```typescript
export const reconcileUser = action({
  args: { userId: v.id("users") },
  handler: async (ctx, args) => {
    const user = await ctx.runQuery(internal.users.get, { id: args.userId })
    if (!user?.stripeCustomerId) return { status: "no_stripe_customer" }

    const stripe = new Stripe(process.env.STRIPE_SECRET_KEY!)

    // Fetch current state from Stripe
    const customer = await stripe.customers.retrieve(user.stripeCustomerId, {
      expand: ['subscriptions'],
    })

    if (customer.deleted) {
      await ctx.runMutation(internal.users.clearSubscription, { userId: args.userId })
      return { status: "customer_deleted" }
    }

    const subscription = customer.subscriptions?.data[0]
    const stripeStatus = subscription?.status ?? 'none'

    if (user.subscriptionStatus !== stripeStatus) {
      await ctx.runMutation(internal.users.updateSubscriptionStatus, {
        userId: args.userId,
        status: stripeStatus,
        subscriptionId: subscription?.id,
      })
      return {
        status: "fixed",
        was: user.subscriptionStatus,
        now: stripeStatus,
      }
    }

    return { status: "already_synced" }
  },
})
```

## Pattern 3: Event Replay

Fetch and replay missed events from Stripe.

```typescript
export const replayMissedEvents = internalAction({
  args: {
    since: v.number(),  // Unix timestamp
    eventTypes: v.array(v.string()),
  },
  handler: async (ctx, args) => {
    const stripe = new Stripe(process.env.STRIPE_SECRET_KEY!)

    const events = await stripe.events.list({
      created: { gte: args.since },
      types: args.eventTypes,
      limit: 100,
    })

    for (const event of events.data) {
      // Check if we already processed this event
      const existing = await ctx.runQuery(internal.events.findByStripeId, {
        stripeEventId: event.id,
      })

      if (existing) {
        console.log(`Event ${event.id} already processed, skipping`)
        continue
      }

      // Process the event
      await ctx.runAction(internal.webhooks.processStripeEvent, {
        event: event,
      })
    }

    return { processed: events.data.length }
  },
})
```

## Pattern 4: Idempotent Webhook Handler

Ensure webhooks can be safely replayed.

```typescript
export const handleStripeWebhook = action({
  args: { event: v.any() },
  handler: async (ctx, args) => {
    const event = args.event

    // Check idempotency
    const existing = await ctx.runQuery(internal.events.findByStripeId, {
      stripeEventId: event.id,
    })

    if (existing) {
      console.log(`Duplicate event ${event.id}, returning early`)
      return { status: "duplicate" }
    }

    // Record event before processing
    await ctx.runMutation(internal.events.record, {
      stripeEventId: event.id,
      type: event.type,
      processedAt: Date.now(),
    })

    // Process based on event type
    switch (event.type) {
      case 'customer.subscription.updated':
        await handleSubscriptionUpdate(ctx, event.data.object)
        break
      case 'invoice.paid':
        await handleInvoicePaid(ctx, event.data.object)
        break
      // ... other events
    }

    return { status: "processed" }
  },
})
```

## When to Reconcile

### Scheduled (Cron)
- **Hourly**: High-value data (subscriptions, payments)
- **Daily**: User profiles, preferences
- **Weekly**: Historical data, analytics

### On-Demand
- User reports "my subscription isn't showing"
- Support escalation
- After incident recovery

### Event-Triggered
- After webhook failure alert
- After deployment (reconcile during quiet period)
- When dashboard shows `pending_webhooks > 0`

## Best Practices

### Do
- **Log all drift detected** with before/after values
- **Store event IDs** for idempotency
- **Paginate** when fetching from external APIs
- **Rate limit** reconciliation to avoid API limits
- **Alert on significant drift** (e.g., >5% mismatch)

### Don't
- **Don't trust local state** as source of truth for external service data
- **Don't skip idempotency** checks
- **Don't reconcile too frequently** (API rate limits)
- **Don't ignore failed reconciliations** (alert and investigate)

## Debugging Drift

```typescript
// Diagnostic query: Find users with stale subscription data
export const findDriftedUsers = internalQuery({
  handler: async (ctx) => {
    const users = await ctx.db.query("users").collect()

    return users.filter((u) => {
      // Users with subscription but no Stripe ID
      if (u.subscriptionStatus === 'active' && !u.stripeSubscriptionId) {
        return true
      }
      // Users with lastSyncedAt > 24 hours ago
      if (u.lastSyncedAt && Date.now() - u.lastSyncedAt > 86400000) {
        return true
      }
      return false
    })
  },
})
```

## References

- `references/stripe-reconciliation.md` — Stripe-specific patterns
- `references/clerk-reconciliation.md` — Clerk user sync patterns
- `references/monitoring.md` — Alerting on drift

## Related Skills

- `stripe-best-practices` — Stripe integration patterns
- `clerk-auth` — Clerk authentication integration
- `verify-fix` — Incident verification protocol

Overview

This skill provides practical patterns for syncing state between external services (Stripe, Clerk) and your local database to prevent and recover from drift. It focuses on combining real-time webhook processing with defensive reconciliation: scheduled jobs, on-demand checks, event replay, and idempotent handlers. Use it to detect, log, and repair mismatches so your app reflects the external source of truth.

How this skill works

It processes webhooks optimistically for low-latency updates, while running reconciliation flows to detect and fix divergence. Scheduled reconciliation compares local records to the external API and updates local state where necessary. On-demand and event-replay patterns let support or incident responders resync specific users or replay missed events. Idempotency and event recording prevent double-processing.

When to use it

Webhooks fail, are duplicated, delivered out-of-order, or missed
Users report incorrect subscription or billing state
After incidents or deployments to ensure consistency
Regular audits for high-value data like subscriptions and payments
When dashboard or monitoring shows pending or failed webhooks

Best practices

Treat external services as the source of truth; do not trust local state alone
Record external event IDs and check idempotency before processing
Schedule reconciliation by data value: hourly for payments, daily for profiles, weekly for analytics
Paginate and rate-limit API calls; alert if reconciliation error rates or drift exceed thresholds
Log before/after values for any drift and create alerts for significant mismatches

Example use cases

Hourly cron that lists Stripe subscriptions and repairs users whose subscriptionStatus diverges
Support-triggered on-demand reconciliation for a single user who reports a billing mismatch
Replay events from Stripe since a given timestamp to process missed webhook events safely
Idempotent webhook handler that stores event IDs, skips duplicates, and processes only new events
Diagnostic query that finds users with stale sync timestamps or subscriptions without external IDs

FAQ

How often should I run scheduled reconciliation?

Run hourly for high-value transactional data (subscriptions/payments), daily for user profile data, and weekly for noncritical historical or analytics data.

How do I avoid hitting API rate limits during reconciliation?

Page through results, batch work, add backoff and rate limiting, and limit concurrent reconciliation jobs. Prioritize users with recent activity or suspected drift.

What if reconciliation finds conflicting states that require human review?

Log the discrepancy, surface it to a support queue, and optionally mark the record for manual review instead of auto-fixing when uncertainty exists.