
confluent-architect skill

/plugins/specweave-confluent/skills/confluent-architect

This skill guides Confluent Cloud architecture decisions: eCKU sizing, cluster linking, multi-region strategies, and Schema Registry HA for resilient streams.

This is most likely a fork of the sw-confluent-architect skill from openclaw.
npx playbooks add skill anton-abyzov/specweave --skill confluent-architect

Review the files below or copy the command above to add this skill to your agents.

Files (1)
SKILL.md
---
name: confluent-architect
description: Confluent Cloud architecture - eCKU sizing, cluster linking, multi-region strategies, Schema Registry HA, ksqlDB, Stream Governance.
model: opus
context: fork
---

Overview

This skill helps design and size Confluent Cloud architectures for production workloads, covering eCKU sizing, cluster linking, multi-region strategies, Schema Registry HA, ksqlDB, and stream governance. It provides prescriptive recommendations and patterns to balance cost, latency, availability, and operational complexity. Outcomes include validated sizing guidance, topology options, and concrete deployment patterns for resilient streaming platforms.

How this skill works

The skill analyzes workload characteristics (throughput, retention, consumer patterns) and maps them to eCKU sizing and cluster topologies. It evaluates trade-offs for cluster linking and multi-region replication, proposes high-availability patterns for Schema Registry and ksqlDB, and recommends governance guardrails for topic, schema, and connector lifecycle. Outputs include configuration suggestions, failure-domain diagrams, and migration/operational runbooks.
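The mapping from throughput to eCKU count can be sketched as a back-of-the-envelope calculation. The per-eCKU capacity figures below are illustrative assumptions, not official limits; check the current Confluent Cloud documentation for the real numbers for your cluster type.

```python
import math

# Assumed per-eCKU capacity -- illustrative placeholders only, not
# official Confluent Cloud limits.
ECKU_INGRESS_MBPS = 50    # assumed ingress capacity per eCKU (MB/s)
ECKU_EGRESS_MBPS = 150    # assumed egress capacity per eCKU (MB/s)

def estimate_eckus(peak_ingress_mbps: float,
                   peak_egress_mbps: float,
                   headroom: float = 0.3) -> int:
    """Estimate the eCKU count for the given peak sustained throughput,
    padded with fractional headroom for growth and spikes."""
    ingress_eckus = peak_ingress_mbps * (1 + headroom) / ECKU_INGRESS_MBPS
    egress_eckus = peak_egress_mbps * (1 + headroom) / ECKU_EGRESS_MBPS
    # Size to whichever dimension is the bottleneck, minimum one eCKU.
    return max(1, math.ceil(max(ingress_eckus, egress_eckus)))

# 120 MB/s ingress, 300 MB/s egress, 30% headroom:
# ingress -> 120 * 1.3 / 50 = 3.12, egress -> 300 * 1.3 / 150 = 2.6
print(estimate_eckus(120, 300))  # -> 4
```

In practice you would validate such an estimate against observed broker metrics and load tests rather than rely on the formula alone.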

When to use it

  • Planning a new Confluent Cloud deployment with production SLAs
  • Migrating from self-managed Kafka to Confluent Cloud or between regions
  • Designing multi-region, active-active or active-passive streaming topologies
  • Sizing eCKUs to match variable traffic patterns and retention needs
  • Hardening Schema Registry and ksqlDB for HA and automated failover

Best practices

  • Base eCKU sizing on peak sustained throughput and retention, then add buffer for growth and spikes
  • Use cluster linking for asynchronous replication; prefer it for regional isolation and controlled failover
  • Adopt multi-region patterns (active-active for low-latency read, active-passive for simplicity) with clear failover runbooks
  • Run Schema Registry in HA mode with redundant replicas and separate storage; replicate schemas across regions
  • Deploy ksqlDB clusters per region with stateful storage backed by durable, replicated storage and clear application restart strategies
  • Implement stream governance: topic naming, quotas, schema evolution rules, and automated CI/CD for connectors and ksqlDB apps
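The first practice above (size from peak sustained throughput and retention, then buffer) implies a storage component as well. A minimal sketch, assuming a replication factor of 3 and steady average ingress:

```python
def retention_storage_gb(avg_ingress_mbps: float,
                         retention_hours: float,
                         replication_factor: int = 3,
                         growth_buffer: float = 0.0) -> float:
    """Rough retained-storage estimate in GB: average ingress rate times
    the retention window, multiplied by the replication factor, with an
    optional fractional buffer for growth."""
    retained_mb = avg_ingress_mbps * retention_hours * 3600 * replication_factor
    return retained_mb * (1 + growth_buffer) / 1024

# 20 MB/s average ingress retained for 24 hours at RF=3:
# 20 * 86400 * 3 / 1024 = 5062.5 GB
print(retention_storage_gb(20, 24))  # -> 5062.5
```

This deliberately ignores compression, compaction, and tiered storage, all of which can shrink the effective footprint substantially; treat the result as an upper bound for budgeting.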

Example use cases

  • Building a global event mesh: cluster linking across three regions with read replicas and automated consumer failover
  • Sizing eCKUs for a financial workload with high retention and bursty throughput while keeping cost under control
  • Designing Schema Registry HA with cross-region replication to ensure schema availability during regional outages
  • Migrating ksqlDB apps from single-region to multi-region with state migration and replay strategies
  • Implementing stream governance to prevent schema-breaking changes and to automate topic provisioning

FAQ

How do I choose between active-active and active-passive multi-region patterns?

Active-active reduces cross-region read latency but increases complexity and conflict handling; choose it when low-latency global reads are critical. Active-passive is simpler and safer for strict consistency or cost control.
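One way to ground this choice is to estimate the recovery point objective (RPO) implied by asynchronous replication: with cluster linking, data produced during the mirror lag window may not reach the passive region before a failover. The helper below is a hypothetical back-of-the-envelope calculation, not part of any Confluent API:

```python
def estimated_rpo_mb(peak_ingress_mbps: float,
                     mirror_lag_seconds: float) -> float:
    """Upper-bound estimate of data (MB) not yet replicated to the
    passive region if the active region fails right now: everything
    produced within the observed mirror lag window."""
    return peak_ingress_mbps * mirror_lag_seconds

# 50 MB/s ingress with a 5-second observed mirror lag:
print(estimated_rpo_mb(50, 5))  # -> 250.0 MB at risk on failover
```

If that number is unacceptable for the workload, the trade-off shifts toward active-active (with its conflict-handling cost) or toward tighter lag monitoring and alerting on the link.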

What’s the simplest way to ensure Schema Registry availability across regions?

Run redundant Schema Registry instances per region and replicate schemas via the Registry API or tooling. Store schema metadata in durable storage and automate failover of producer/consumer config.