This skill helps you design, implement, and optimize Apache Kafka architectures and event-driven streaming pipelines for reliable messaging.
```
npx playbooks add skill personamanagmentlayer/pcl --skill kafka-expert
```

Review the files below or copy the command above to add this skill to your agents.
---
name: kafka-expert
version: 1.0.0
description: Expert-level Apache Kafka, event streaming, Kafka Streams, and distributed messaging
category: data
tags: [kafka, streaming, messaging, event-driven, kafka-streams]
allowed-tools:
- Read
- Write
- Edit
- Bash(kafka:*)
---
# Apache Kafka Expert
Expert guidance for Apache Kafka, event streaming, Kafka Streams, and building event-driven architectures.
## Core Concepts
- Topics, partitions, and offsets
- Producers and consumers
- Consumer groups
- Kafka Streams
- Kafka Connect
- Exactly-once semantics
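To make topics and partitions concrete, here is a minimal sketch that creates a topic with kafka-python's admin client; the topic name, partition count, replication factor, and retention value are illustrative.

```python
from kafka.admin import KafkaAdminClient, NewTopic

# Connect to the cluster with the admin client
admin = KafkaAdminClient(bootstrap_servers=['localhost:9092'])

# Six partitions let up to six consumers in one group read in parallel;
# replication factor 3 tolerates the loss of two brokers.
admin.create_topics([
    NewTopic(
        name='user-events',
        num_partitions=6,
        replication_factor=3,
        topic_configs={'retention.ms': str(7 * 24 * 60 * 60 * 1000)},  # 7 days
    )
])
admin.close()
```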
## Producer
```python
from kafka import KafkaProducer
import json
producer = KafkaProducer(
    bootstrap_servers=['localhost:9092'],
    value_serializer=lambda v: json.dumps(v).encode('utf-8'),
    acks='all',  # Wait for acknowledgment from all in-sync replicas
    retries=3
)

# Send a message asynchronously
future = producer.send('user-events', {
    'user_id': '123',
    'event': 'login',
    'timestamp': '2024-01-01T00:00:00Z'
})

# Block until the broker acknowledges (raises on failure)
record_metadata = future.get(timeout=10)
print(f"Topic: {record_metadata.topic}, Partition: {record_metadata.partition}")

producer.flush()
producer.close()
```
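Blocking on `future.get()` for every record limits throughput. A common alternative is to send asynchronously and attach callbacks to the returned future; below is a minimal sketch with kafka-python (handler names are illustrative).

```python
import json
from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers=['localhost:9092'],
    value_serializer=lambda v: json.dumps(v).encode('utf-8'),
)

def on_send_success(record_metadata):
    # Runs once the broker acknowledges the write
    print(f"Delivered to {record_metadata.topic}[{record_metadata.partition}] "
          f"@ offset {record_metadata.offset}")

def on_send_error(exc):
    # Log, alert, or route the payload to a retry / dead-letter path
    print(f"Send failed: {exc!r}")

producer.send('user-events', {'user_id': '123', 'event': 'logout'}) \
    .add_callback(on_send_success) \
    .add_errback(on_send_error)

producer.flush()  # Make sure in-flight sends complete before shutdown
```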
## Consumer
```python
import json
from kafka import KafkaConsumer

consumer = KafkaConsumer(
    'user-events',
    bootstrap_servers=['localhost:9092'],
    group_id='my-group',
    auto_offset_reset='earliest',
    enable_auto_commit=False,
    value_deserializer=lambda m: json.loads(m.decode('utf-8'))
)

for message in consumer:
    print(f"Received: {message.value}")
    # Process the message (process_event is your application logic)
    process_event(message.value)
    # Commit the offset manually only after successful processing
    consumer.commit()
```
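With manual commits, offsets can be lost or duplicated during a group rebalance unless they are committed before partitions are revoked. Here is a minimal sketch using kafka-python's rebalance listener; the class name is illustrative.

```python
import json
from kafka import KafkaConsumer, ConsumerRebalanceListener

class CommitOnRevoke(ConsumerRebalanceListener):
    """Commit processed offsets before partition ownership moves to another consumer."""

    def __init__(self, consumer):
        self.consumer = consumer

    def on_partitions_revoked(self, revoked):
        # Flush offsets for work already processed so the new owner does not reprocess it
        self.consumer.commit()

    def on_partitions_assigned(self, assigned):
        # Consumption resumes from the committed offsets; nothing extra needed here
        pass

consumer = KafkaConsumer(
    bootstrap_servers=['localhost:9092'],
    group_id='my-group',
    enable_auto_commit=False,
    value_deserializer=lambda m: json.loads(m.decode('utf-8')),
)
consumer.subscribe(['user-events'], listener=CommitOnRevoke(consumer))

for message in consumer:
    process_event(message.value)  # your application logic, as above
    consumer.commit()
```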
## Kafka Streams
```java
Properties props = new Properties();
props.put(StreamsConfig.APPLICATION_ID_CONFIG, "streams-app");
props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
// Default serdes are required unless every operator supplies its own
props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

StreamsBuilder builder = new StreamsBuilder();
KStream<String, String> source = builder.stream("input-topic");

// Drop short records and upper-case the rest
KStream<String, String> transformed = source
    .filter((key, value) -> value.length() > 10)
    .mapValues(value -> value.toUpperCase());

transformed.to("output-topic");

KafkaStreams streams = new KafkaStreams(builder.build(), props);
streams.start();
```
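For teams not on the JVM, the same stateless filter/map pipeline can be sketched with a plain consumer/producer loop. This is not a substitute for Kafka Streams' state stores, windowing, or exactly-once guarantees; it only mirrors the topology above (topic names as above).

```python
from kafka import KafkaConsumer, KafkaProducer

consumer = KafkaConsumer(
    'input-topic',
    bootstrap_servers=['localhost:9092'],
    group_id='uppercase-transformer',
    value_deserializer=lambda m: m.decode('utf-8'),
)
producer = KafkaProducer(
    bootstrap_servers=['localhost:9092'],
    value_serializer=lambda v: v.encode('utf-8'),
)

for message in consumer:
    # Same logic as the Streams topology: drop short records, upper-case the rest
    if len(message.value) > 10:
        producer.send('output-topic', message.value.upper())
```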
## Best Practices
- Use appropriate partition keys
- Monitor consumer lag (a lag-check sketch follows this list)
- Implement idempotent producers
- Use consumer groups for scaling
- Set proper retention policies
- Handle rebalancing gracefully
- Monitor cluster metrics
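To make the consumer-lag recommendation concrete, here is a minimal sketch that compares a group's committed offsets with the log end offsets using kafka-python (group and topic names as in the examples above). Production setups usually rely on tooling such as `kafka-consumer-groups.sh`, Burrow, or exporter metrics instead.

```python
from kafka import KafkaConsumer, TopicPartition

# Standalone lag check: reads the group's committed offsets without consuming
consumer = KafkaConsumer(
    bootstrap_servers=['localhost:9092'],
    group_id='my-group',
    enable_auto_commit=False,
)

partitions = [TopicPartition('user-events', p)
              for p in consumer.partitions_for_topic('user-events')]
end_offsets = consumer.end_offsets(partitions)

for tp in partitions:
    committed = consumer.committed(tp) or 0
    lag = end_offsets[tp] - committed
    print(f"{tp.topic}[{tp.partition}] lag={lag}")

consumer.close()
```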
## Anti-Patterns
❌ Single partition topics
❌ No error handling
❌ Ignoring consumer lag
❌ Producing to wrong partitions
❌ Not using consumer groups
❌ Synchronous processing
❌ No monitoring
## Resources
- Apache Kafka: https://kafka.apache.org/
- Confluent Platform: https://www.confluent.io/
This skill delivers expert-level guidance for Apache Kafka, event streaming, Kafka Streams, and distributed messaging patterns. It focuses on practical design, operational best practices, and code-level patterns to build reliable, scalable event-driven systems. Use it to design producers, consumers, stream processors, and operational runbooks.
The skill inspects common Kafka components: topics, partitions, offsets, producers, consumers, consumer groups, Kafka Streams topologies, and Connectors. It provides actionable recommendations for configuration, partitioning, idempotency, exactly-once semantics, error handling, and monitoring. It also highlights anti-patterns and recovery strategies for rebalances, lag, and broker failures.
**How do I guarantee no duplicates when consuming?**
Enable idempotent producers on the write side; on the read side, combine at-least-once consumption with consumer-side deduplication, or use Kafka Streams / transactions with exactly-once processing where appropriate. A minimal deduplication sketch follows.
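The sketch below skips records whose ID has already been processed. It assumes producers attach a unique `event_id` field, and a real system would use a bounded cache or external store rather than an in-memory set.

```python
import json
from kafka import KafkaConsumer

consumer = KafkaConsumer(
    'user-events',
    bootstrap_servers=['localhost:9092'],
    group_id='my-group',
    enable_auto_commit=False,
    value_deserializer=lambda m: json.loads(m.decode('utf-8')),
)

seen_ids = set()  # in-memory for illustration only; use Redis/RocksDB/etc. in production

for message in consumer:
    event = message.value
    event_id = event.get('event_id')   # assumed unique id set by the producer
    if event_id in seen_ids:
        consumer.commit()              # already handled, just advance the offset
        continue
    process_event(event)               # your application logic
    seen_ids.add(event_id)
    consumer.commit()
```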
**When should I use a single partition versus many partitions?**
Use a single partition only when strict global ordering is mandatory and throughput is low. Most high-throughput applications need many partitions so consumers can work in parallel, preserving ordering per key rather than globally; a keyed-producer sketch follows.
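To preserve per-key ordering across many partitions, send records with a key: Kafka's default partitioner hashes the key, so all events for one user land on the same partition. A minimal sketch (field values are illustrative):

```python
import json
from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers=['localhost:9092'],
    key_serializer=lambda k: k.encode('utf-8'),
    value_serializer=lambda v: json.dumps(v).encode('utf-8'),
)

# All events keyed by user '123' go to the same partition and stay ordered
# relative to each other, while other users spread across the remaining partitions.
producer.send('user-events', key='123', value={'event': 'login'})
producer.send('user-events', key='123', value={'event': 'logout'})
producer.flush()
```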
**How do I handle long processing times without blocking offsets?**
Process messages asynchronously and commit offsets only after successful processing. Use a separate worker pool or external task queue, and consider routing failures to a dead-letter topic; a minimal sketch follows.
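In the sketch below, failed records are parked on a separate topic (the `.dlq` suffix is just a naming convention assumed here) so one bad message does not block the partition.

```python
import json
from kafka import KafkaConsumer, KafkaProducer

consumer = KafkaConsumer(
    'user-events',
    bootstrap_servers=['localhost:9092'],
    group_id='my-group',
    enable_auto_commit=False,
    value_deserializer=lambda m: json.loads(m.decode('utf-8')),
)
dlq_producer = KafkaProducer(
    bootstrap_servers=['localhost:9092'],
    value_serializer=lambda v: json.dumps(v).encode('utf-8'),
)

for message in consumer:
    try:
        process_event(message.value)            # your application logic
    except Exception:
        # Park the failing record for later inspection or replay
        dlq_producer.send('user-events.dlq', message.value)
    consumer.commit()  # commit either way so the partition keeps moving
```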