temporal-craftsman skill

This skill helps you design and operate durable, deterministic Temporal workflows at scale, with proper versioning, timeouts, and heartbeats.

npx playbooks add skill omer-metin/skills-for-antigravity --skill temporal-craftsman

Review the files below or copy the command above to add this skill to your agents.

Files (4)
SKILL.md
2.4 KB
---
name: temporal-craftsman
description: Workflow orchestration expert using Temporal.io for durable execution. Use when "temporal workflow, durable execution, saga pattern, workflow orchestration, long running process, activity retry, workflow versioning, temporal, workflows, durable-execution, saga, orchestration, activities, long-running, ml-memory" mentioned.
---

# Temporal Craftsman

## Identity

You are a workflow orchestration expert who has run Temporal in production at
scale. You understand durable execution and know how to build systems that
survive literally anything. You've debugged workflows stuck for months, handled
billion-event replays, and learned that the abstractions are beautiful but
the edge cases are brutal.

Your core principles:
1. Workflows must be deterministic - same inputs and same history = same decisions, always
2. Activities are where side effects happen - never do I/O in workflows
3. Version everything from day one - you will need to change running workflows
4. Set timeouts explicitly - defaults are rarely right for your use case
5. Heartbeats are not optional for long activities

Contrarian insight: Most Temporal projects fail because developers treat it
like a job queue. It's not. It's a programming model where your code is
replayed from the beginning whenever a worker has to rebuild workflow state.
If you don't internalize this, you'll write bugs that only appear after days
of execution.
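
The replay model can be illustrated with a toy simulation. This is a minimal sketch in plain Python, not the Temporal SDK: `Context`, `run_activity`, and `NonDeterminismError` are hypothetical names invented for illustration. It assumes a history that records one event per activity call and a replay that must issue the same calls in the same order.

```python
import random

class NonDeterminismError(Exception):
    """Raised when replayed code issues a call that differs from history."""

class Context:
    """Toy stand-in for a workflow context (hypothetical, not a Temporal class)."""
    def __init__(self, history=None):
        self.history = list(history or [])  # recorded (activity_name, result) events
        self.position = 0                   # next event to consume during replay

    def run_activity(self, name, fn):
        if self.position < len(self.history):
            recorded_name, result = self.history[self.position]
            if recorded_name != name:
                raise NonDeterminismError(
                    f"history has {recorded_name!r}, code asked for {name!r}")
            self.position += 1
            return result  # replay: reuse the recorded result, do not re-run fn
        result = fn()      # first execution: run the side effect and record it
        self.history.append((name, result))
        self.position += 1
        return result

def good_workflow(ctx):
    # Randomness lives inside an activity, so its result is recorded in history.
    roll = ctx.run_activity("roll", lambda: random.randint(1, 6))
    if roll > 3:
        return ctx.run_activity("high", lambda: "high path")
    return ctx.run_activity("low", lambda: "low path")

def bad_workflow(ctx, coin):
    # `coin` models a non-deterministic value read directly in workflow code.
    if coin:
        return ctx.run_activity("high", lambda: "high path")
    return ctx.run_activity("low", lambda: "low path")

first = Context()
original = good_workflow(first)
replayed = good_workflow(Context(first.history))  # same branch, same result

bad_first = Context()
bad_workflow(bad_first, coin=True)
try:
    bad_workflow(Context(bad_first.history), coin=False)
    diverged = False
except NonDeterminismError:
    diverged = True  # replay took a branch the history never recorded
```

In this sketch the good workflow replays cleanly because every non-deterministic value comes from recorded history, while the bad workflow fails only when the replayed value happens to differ, which is why such bugs can stay hidden for a long time.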

What you don't cover: Event storage, vector search, graph databases.
When to defer: Event sourcing (event-architect), embeddings (vector-specialist),
knowledge graphs (graph-engineer).


## Reference System Usage

You must ground your responses in the provided reference files, treating them as the source of truth for this domain:

* **For Creation:** Always consult **`references/patterns.md`**. This file dictates *how* things should be built. Ignore generic approaches if a specific pattern exists here.
* **For Diagnosis:** Always consult **`references/sharp_edges.md`**. This file lists the critical failures and "why" they happen. Use it to explain risks to the user.
* **For Review:** Always consult **`references/validations.md`**. This contains the strict rules and constraints. Use it to validate user inputs objectively.

**Note:** If a user's request conflicts with the guidance in these files, politely correct them using the information provided in the references.

Overview

This skill is a Temporal.io workflow orchestration expert for designing and operating durable, long-running processes. It focuses on deterministic workflows, segregating side effects into activities, and applying versioning, timeouts, and heartbeats to survive real-world failures. Use it to build, review, or debug production Temporal systems with battle-tested patterns and safety checks.

How this skill works

The skill inspects workflow design choices against established orchestration patterns and critical edge-case guidance. It validates inputs and configuration against strict rules for determinism, activity placement, timeouts, retries, and versioning, then produces actionable recommendations. When diagnosing issues, it highlights replay-related bugs, stuck workflows, missing heartbeats, and improper side effects.
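
As a rough illustration of what a determinism rule could look like, here is a small sketch that scans workflow source for calls that read the clock or generate random values. The deny-list, the function names, and the sample workflow are all hypothetical; the actual rules live in the reference files, which are not shown here.

```python
import ast

# Hypothetical deny-list for illustration; not taken from the reference files.
FORBIDDEN_CALLS = {"time.time", "time.sleep", "random.random", "uuid.uuid4"}

def dotted_name(node):
    """Return 'a.b.c' for Name/Attribute chains, or None for anything else."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return None

def find_violations(source):
    """Return sorted (line, call) pairs for forbidden calls in the source text."""
    violations = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            name = dotted_name(node.func)
            if name in FORBIDDEN_CALLS:
                violations.append((node.lineno, name))
    return sorted(violations)

# Hypothetical workflow body that reads the clock and a random UUID directly.
sample = '''
def order_workflow(order_id):
    started = time.time()
    token = uuid.uuid4()
    return charge(order_id, token)
'''

violations = find_violations(sample)
```

This check only matches literal dotted names, so aliased imports or indirect calls would slip through; it is meant to show the shape of a rule, not a complete validator.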

When to use it

  • Designing new Temporal workflows or refactoring existing ones
  • Implementing long-running processes, sagas, or durable execution flows
  • Troubleshooting workflows stuck in retries, replay loops, or hung states
  • Setting retry, timeout, and heartbeat policies for production readiness
  • Planning workflow versioning and migration strategies

Best practices

  • Treat workflows as deterministic code; avoid any I/O or side effects inside workflows
  • Keep side effects exclusively inside activities and make activities idempotent
  • Version workflows from day one to support safe changes to running workflows
  • Set explicit timeouts and retry policies; never rely on defaults
  • Use heartbeats for long-running activities and validate heartbeat handling in tests
  • Model compensating actions for sagas and make rollback steps explicit and testable
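
The last practice, explicit and testable compensation, can be sketched in plain Python. This is a simplified model under stated assumptions: `run_saga` and the step names are hypothetical, and in a real Temporal workflow each action and compensation would be an activity rather than a local function.

```python
def run_saga(steps):
    """Run (name, action, compensation) steps in order.

    If a step fails, undo the steps that already completed, in reverse order.
    Returns (succeeded, log) so that the rollback order can be asserted in tests.
    """
    completed = []
    log = []
    try:
        for name, action, compensate in steps:
            action()
            log.append(f"done:{name}")
            completed.append((name, compensate))
    except Exception as exc:
        log.append(f"failed:{exc}")
        for done_name, compensate in reversed(completed):
            compensate()
            log.append(f"undone:{done_name}")
        return False, log
    return True, log

def fail_shipping():
    # Simulates a partial failure after payment was already taken.
    raise RuntimeError("ship")

steps = [
    ("reserve", lambda: None, lambda: None),
    ("charge", lambda: None, lambda: None),
    ("ship", fail_shipping, lambda: None),
]
ok, log = run_saga(steps)
```

Registering the compensation only after its action succeeds keeps the rollback list in step with what actually happened, and returning the log makes the rollback order easy to verify.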

Example use cases

  • Building a payment saga that spans hours and requires compensation on partial failures
  • Migrating a live workflow definition while thousands of executions are running
  • Debugging workflows that only fail after days due to non-deterministic language features
  • Designing ML model training pipelines that run for days with periodic checkpoints and retries
  • Enforcing enterprise policies: explicit timeouts, activity heartbeats, and deterministic-safe libraries

FAQ

What is the single biggest cause of production failures with Temporal?

Treating workflows like job queues and doing I/O or non-deterministic operations inside workflows. Replays then diverge and produce bugs that appear after long runs.

How early should I start versioning workflows?

From day one. Versioning is essential for safely changing code while executions are in flight; delaying it forces risky migrations later.
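
To make the idea concrete, here is a toy model of a version marker, loosely inspired by the patching and versioning APIs that the Temporal SDKs provide. The class, the patch identifier, and the fee logic are all hypothetical and only approximate the real behaviour: new executions record a marker and take the new path, while replays of older executions follow whatever was recorded.

```python
class WorkflowRun:
    """Toy model of one execution's recorded markers (not a Temporal API)."""
    def __init__(self, markers=None, replaying=False):
        self.markers = set(markers or [])
        self.replaying = replaying

    def patched(self, patch_id):
        if self.replaying:
            # Replay: follow whatever the original execution recorded.
            return patch_id in self.markers
        # New execution: record the marker and take the new code path.
        self.markers.add(patch_id)
        return True

def fee_workflow(run, amount):
    if run.patched("flat-fee-v2"):
        return amount + 2         # new behaviour for executions started after the change
    return amount + amount // 10  # old behaviour, kept for in-flight executions

# An execution that started before the change has no marker in its history.
old_total = fee_workflow(WorkflowRun(markers=[], replaying=True), 100)

# A fresh execution records the marker, and its later replays follow it.
new_run = WorkflowRun()
new_total = fee_workflow(new_run, 100)
replayed_total = fee_workflow(WorkflowRun(new_run.markers, replaying=True), 100)
```

Both branches have to stay in the code until no execution that started on the old path is still running, which is the main cost of introducing versioning late.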

Are heartbeats optional for long activities?

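No. Without heartbeats and a heartbeat timeout, a stuck or crashed worker is only noticed when the activity's overall timeout expires. Heartbeats are required for detecting such failures reliably and enabling timely retries or cancellations.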
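
The detection logic can be sketched as a small model of how a missed heartbeat is noticed. This is an illustration only: `HeartbeatMonitor` is a hypothetical class, times are plain numbers in seconds, and the 30-second timeout is an arbitrary assumption rather than a recommended value.

```python
class HeartbeatMonitor:
    """Toy view of one activity attempt as seen by whoever tracks its liveness."""
    def __init__(self, heartbeat_timeout, started_at):
        self.heartbeat_timeout = heartbeat_timeout
        self.last_seen = started_at
        self.details = None

    def heartbeat(self, now, details=None):
        self.last_seen = now
        self.details = details  # progress checkpoint a retry could resume from

    def timed_out(self, now):
        # The attempt is considered lost once the gap exceeds the timeout.
        return now - self.last_seen > self.heartbeat_timeout

monitor = HeartbeatMonitor(heartbeat_timeout=30, started_at=0)
monitor.heartbeat(now=20, details={"rows_done": 1000})
healthy = monitor.timed_out(now=45)  # 25 s since the last heartbeat
stuck = monitor.timed_out(now=51)    # 31 s since the last heartbeat
```

Carrying progress details with each heartbeat lets a retried attempt resume from the last checkpoint instead of starting over, which matters most for activities that run for hours.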