This skill helps you design, implement, and optimize Apache Kafka architectures and event-driven streaming pipelines for reliable messaging.
```
npx playbooks add skill personamanagmentlayer/pcl --skill kafka-expert
```

Review the files below or copy the command above to add this skill to your agents.
---
name: kafka-expert
version: 1.0.0
description: Expert-level Apache Kafka, event streaming, Kafka Streams, and distributed messaging
category: data
tags: [kafka, streaming, messaging, event-driven, kafka-streams]
allowed-tools:
- Read
- Write
- Edit
- Bash(kafka:*)
---
# Apache Kafka Expert
Expert guidance for Apache Kafka, event streaming, Kafka Streams, and building event-driven architectures.
## Core Concepts
- Topics, partitions, and offsets
- Producers and consumers
- Consumer groups
- Kafka Streams
- Kafka Connect
- Exactly-once semantics
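To make topics and partitions concrete, here is a minimal sketch that creates a topic with kafka-python's admin client; the topic name, partition count, replication factor, and retention value are illustrative.

```python
from kafka.admin import KafkaAdminClient, NewTopic

# Connect to the cluster with the admin client
admin = KafkaAdminClient(bootstrap_servers=['localhost:9092'])

# Six partitions let up to six consumers in one group read in parallel;
# replication factor 3 tolerates the loss of two brokers.
admin.create_topics([
    NewTopic(
        name='user-events',
        num_partitions=6,
        replication_factor=3,
        topic_configs={'retention.ms': str(7 * 24 * 60 * 60 * 1000)},  # 7 days
    )
])
admin.close()
```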
## Producer
```python
from kafka import KafkaProducer
import json
producer = KafkaProducer(
    bootstrap_servers=['localhost:9092'],
    value_serializer=lambda v: json.dumps(v).encode('utf-8'),
    acks='all',  # Wait for acknowledgment from all in-sync replicas
    retries=3
)

# Send a message asynchronously
future = producer.send('user-events', {
    'user_id': '123',
    'event': 'login',
    'timestamp': '2024-01-01T00:00:00Z'
})

# Block until the broker acknowledges (raises on failure)
record_metadata = future.get(timeout=10)
print(f"Topic: {record_metadata.topic}, Partition: {record_metadata.partition}")

producer.flush()
producer.close()
```
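Blocking on `future.get()` for every record limits throughput. A common alternative is to send asynchronously and attach callbacks to the returned future; below is a minimal sketch with kafka-python (handler names are illustrative).

```python
import json
from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers=['localhost:9092'],
    value_serializer=lambda v: json.dumps(v).encode('utf-8'),
)

def on_send_success(record_metadata):
    # Runs once the broker acknowledges the write
    print(f"Delivered to {record_metadata.topic}[{record_metadata.partition}] "
          f"@ offset {record_metadata.offset}")

def on_send_error(exc):
    # Log, alert, or route the payload to a retry / dead-letter path
    print(f"Send failed: {exc!r}")

producer.send('user-events', {'user_id': '123', 'event': 'logout'}) \
    .add_callback(on_send_success) \
    .add_errback(on_send_error)

producer.flush()  # Make sure in-flight sends complete before shutdown
```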
## Consumer
```python
import json
from kafka import KafkaConsumer

consumer = KafkaConsumer(
    'user-events',
    bootstrap_servers=['localhost:9092'],
    group_id='my-group',
    auto_offset_reset='earliest',
    enable_auto_commit=False,
    value_deserializer=lambda m: json.loads(m.decode('utf-8'))
)

for message in consumer:
    print(f"Received: {message.value}")
    # Process the message (process_event is your application logic)
    process_event(message.value)
    # Commit the offset manually only after successful processing
    consumer.commit()
```
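With manual commits, offsets can be lost or duplicated during a group rebalance unless they are committed before partitions are revoked. Here is a minimal sketch using kafka-python's rebalance listener; the class name is illustrative.

```python
import json
from kafka import KafkaConsumer, ConsumerRebalanceListener

class CommitOnRevoke(ConsumerRebalanceListener):
    """Commit processed offsets before partition ownership moves to another consumer."""

    def __init__(self, consumer):
        self.consumer = consumer

    def on_partitions_revoked(self, revoked):
        # Flush offsets for work already processed so the new owner does not reprocess it
        self.consumer.commit()

    def on_partitions_assigned(self, assigned):
        # Consumption resumes from the committed offsets; nothing extra needed here
        pass

consumer = KafkaConsumer(
    bootstrap_servers=['localhost:9092'],
    group_id='my-group',
    enable_auto_commit=False,
    value_deserializer=lambda m: json.loads(m.decode('utf-8')),
)
consumer.subscribe(['user-events'], listener=CommitOnRevoke(consumer))

for message in consumer:
    process_event(message.value)  # your application logic, as above
    consumer.commit()
```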
## Kafka Streams
```java
Properties props = new Properties();
props.put(StreamsConfig.APPLICATION_ID_CONFIG, "streams-app");
props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
// Default serdes are required unless every operator supplies its own
props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

StreamsBuilder builder = new StreamsBuilder();
KStream<String, String> source = builder.stream("input-topic");

// Drop short records and upper-case the rest
KStream<String, String> transformed = source
    .filter((key, value) -> value.length() > 10)
    .mapValues(value -> value.toUpperCase());

transformed.to("output-topic");

KafkaStreams streams = new KafkaStreams(builder.build(), props);
streams.start();
```
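For teams not on the JVM, the same stateless filter/map pipeline can be sketched with a plain consumer/producer loop. This is not a substitute for Kafka Streams' state stores, windowing, or exactly-once guarantees; it only mirrors the topology above (topic names as above).

```python
from kafka import KafkaConsumer, KafkaProducer

consumer = KafkaConsumer(
    'input-topic',
    bootstrap_servers=['localhost:9092'],
    group_id='uppercase-transformer',
    value_deserializer=lambda m: m.decode('utf-8'),
)
producer = KafkaProducer(
    bootstrap_servers=['localhost:9092'],
    value_serializer=lambda v: v.encode('utf-8'),
)

for message in consumer:
    # Same logic as the Streams topology: drop short records, upper-case the rest
    if len(message.value) > 10:
        producer.send('output-topic', message.value.upper())
```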
## Best Practices
- Use appropriate partition keys
- Monitor consumer lag (a lag-check sketch follows this list)
- Implement idempotent producers
- Use consumer groups for scaling
- Set proper retention policies
- Handle rebalancing gracefully
- Monitor cluster metrics
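To make the consumer-lag recommendation concrete, here is a minimal sketch that compares a group's committed offsets with the log end offsets using kafka-python (group and topic names as in the examples above). Production setups usually rely on tooling such as `kafka-consumer-groups.sh`, Burrow, or exporter metrics instead.

```python
from kafka import KafkaConsumer, TopicPartition

# Standalone lag check: reads the group's committed offsets without consuming
consumer = KafkaConsumer(
    bootstrap_servers=['localhost:9092'],
    group_id='my-group',
    enable_auto_commit=False,
)

partitions = [TopicPartition('user-events', p)
              for p in consumer.partitions_for_topic('user-events')]
end_offsets = consumer.end_offsets(partitions)

for tp in partitions:
    committed = consumer.committed(tp) or 0
    lag = end_offsets[tp] - committed
    print(f"{tp.topic}[{tp.partition}] lag={lag}")

consumer.close()
```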
## Anti-Patterns
❌ Single partition topics
❌ No error handling
❌ Ignoring consumer lag
❌ Producing to wrong partitions
❌ Not using consumer groups
❌ Synchronous processing
❌ No monitoring
## Resources
- Apache Kafka: https://kafka.apache.org/
- Confluent Platform: https://www.confluent.io/
This skill delivers expert-level guidance for Apache Kafka, event streaming, Kafka Streams, and distributed messaging patterns. It focuses on practical design, operational best practices, and code-level patterns to build reliable, scalable event-driven systems. Use it to design producers, consumers, stream processors, and operational runbooks.
The skill inspects common Kafka components: topics, partitions, offsets, producers, consumers, consumer groups, Kafka Streams topologies, and Connectors. It provides actionable recommendations for configuration, partitioning, idempotency, exactly-once semantics, error handling, and monitoring. It also highlights anti-patterns and recovery strategies for rebalances, lag, and broker failures.
**How do I guarantee no duplicates when consuming?**
Enable idempotent producers on the write side; on the read side, combine at-least-once consumption with consumer-side deduplication, or use Kafka Streams / transactions with exactly-once processing where appropriate. A minimal deduplication sketch follows.
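The sketch below skips records whose ID has already been processed. It assumes producers attach a unique `event_id` field, and a real system would use a bounded cache or external store rather than an in-memory set.

```python
import json
from kafka import KafkaConsumer

consumer = KafkaConsumer(
    'user-events',
    bootstrap_servers=['localhost:9092'],
    group_id='my-group',
    enable_auto_commit=False,
    value_deserializer=lambda m: json.loads(m.decode('utf-8')),
)

seen_ids = set()  # in-memory for illustration only; use Redis/RocksDB/etc. in production

for message in consumer:
    event = message.value
    event_id = event.get('event_id')   # assumed unique id set by the producer
    if event_id in seen_ids:
        consumer.commit()              # already handled, just advance the offset
        continue
    process_event(event)               # your application logic
    seen_ids.add(event_id)
    consumer.commit()
```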
**When should I use a single partition versus many partitions?**
Use a single partition only when strict global ordering is mandatory and throughput is low. Most high-throughput applications need many partitions so consumers can work in parallel, preserving ordering per key rather than globally; a keyed-producer sketch follows.
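To preserve per-key ordering across many partitions, send records with a key: Kafka's default partitioner hashes the key, so all events for one user land on the same partition. A minimal sketch (field values are illustrative):

```python
import json
from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers=['localhost:9092'],
    key_serializer=lambda k: k.encode('utf-8'),
    value_serializer=lambda v: json.dumps(v).encode('utf-8'),
)

# All events keyed by user '123' go to the same partition and stay ordered
# relative to each other, while other users spread across the remaining partitions.
producer.send('user-events', key='123', value={'event': 'login'})
producer.send('user-events', key='123', value={'event': 'logout'})
producer.flush()
```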
**How do I handle long processing times without blocking offsets?**
Process messages asynchronously and commit offsets only after successful processing. Use a separate worker pool or external task queue, and consider routing failures to a dead-letter topic; a minimal sketch follows.
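In the sketch below, failed records are parked on a separate topic (the `.dlq` suffix is just a naming convention assumed here) so one bad message does not block the partition.

```python
import json
from kafka import KafkaConsumer, KafkaProducer

consumer = KafkaConsumer(
    'user-events',
    bootstrap_servers=['localhost:9092'],
    group_id='my-group',
    enable_auto_commit=False,
    value_deserializer=lambda m: json.loads(m.decode('utf-8')),
)
dlq_producer = KafkaProducer(
    bootstrap_servers=['localhost:9092'],
    value_serializer=lambda v: json.dumps(v).encode('utf-8'),
)

for message in consumer:
    try:
        process_event(message.value)            # your application logic
    except Exception:
        # Park the failing record for later inspection or replay
        dlq_producer.send('user-events.dlq', message.value)
    consumer.commit()  # commit either way so the partition keeps moving
```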