
This skill helps design reliable data systems using Jim Gray's ACID principles, WAL, and fault-tolerant recovery to ensure data integrity.

npx playbooks add skill copyleftdev/sk1llz --skill gray

---
name: gray-transaction-systems
description: Design data systems using Jim Gray's principles of transaction processing and fault tolerance. Emphasizes ACID properties, recovery mechanisms, and the science of making systems reliable. Use when building databases, payment systems, or any system where data integrity is paramount.
---

# Jim Gray Style Guide

## Overview

Jim Gray (1944–2007) was the father of transaction processing. He formalized the ACID properties, invented key recovery algorithms, pioneered database benchmarking (the TPC benchmarks), and advanced the science of fault tolerance. He won the Turing Award in 1998. His work underpins virtually every reliable database and financial system in use today.

## Core Philosophy

> "A transaction is a transformation of state that has the properties of atomicity, consistency, isolation, and durability."

> "Simplicity does not precede complexity, but follows it."

> "The key to performance is elegance, not battalions of special cases."

## Design Principles

1. **ACID is Non-Negotiable**: For critical data, atomicity, consistency, isolation, and durability are requirements, not optimizations.

2. **Failures are Certain**: Hardware fails, software has bugs, operators make mistakes. Design for recovery, not just operation.

3. **Measure Everything**: You can't improve what you can't measure. Benchmarks reveal truth.

4. **Modularity Enables Reliability**: Separate concerns cleanly. Each component should be independently testable and replaceable.

5. **The Log is Truth**: The write-ahead log is the foundation of durability and recovery.

## The ACID Properties

```text
ATOMICITY:    All or nothing. A transaction either completes entirely or has no effect.
CONSISTENCY:  Transactions transform the database from one valid state to another.
ISOLATION:    Concurrent transactions appear to execute serially.
DURABILITY:   Once committed, data survives any subsequent failure.
```
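
Atomicity is the easiest of these to demonstrate in code. Below is a minimal sketch using Python's built-in `sqlite3` module (the `accounts` table and the amounts are hypothetical): the transfer either commits entirely or leaves no trace.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
con.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 0)")
con.commit()

try:
    con.execute("UPDATE accounts SET balance = balance - 150 WHERE name = 'alice'")
    con.execute("UPDATE accounts SET balance = balance + 150 WHERE name = 'bob'")
    (balance,) = con.execute(
        "SELECT balance FROM accounts WHERE name = 'alice'").fetchone()
    if balance < 0:
        raise ValueError("insufficient funds")  # consistency check failed
    con.commit()
except ValueError:
    con.rollback()  # atomicity: neither UPDATE takes effect
```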

### Isolation Levels (from weakest to strongest)

```text
READ UNCOMMITTED:  See uncommitted changes (dirty reads possible)
READ COMMITTED:    Only see committed changes (non-repeatable reads possible)
REPEATABLE READ:   Same query returns same rows (phantom reads possible)
SERIALIZABLE:      Full isolation (no anomalies, but limits concurrency)
```

## When Designing Systems

### Always

- Use write-ahead logging (WAL) for durability
- Design idempotent operations where possible
- Plan for crash recovery from the start
- Test failure scenarios explicitly
- Measure transaction throughput AND latency
- Document consistency guarantees clearly
- Use timeouts on all distributed operations

### Never

- Assume commits are durable without fsync
- Mix transaction boundaries with business logic haphazardly
- Ignore the difference between isolation levels
- Design recovery as an afterthought
- Trust in-memory state without persistence guarantees
- Assume network operations will succeed

### Prefer

- Pessimistic locking over optimistic when conflicts are common (see the sketch after this list)
- Shorter transactions over longer ones
- Explicit transaction boundaries over implicit
- Simple recovery mechanisms over clever optimizations
- Proven algorithms over novel approaches for critical paths
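
To ground the first preference above, here is a minimal optimistic-concurrency sketch (all names hypothetical): writers validate the version they read before installing an update. When conflicts are common, the retry loops this forces on callers waste more work than simply holding a pessimistic lock across the whole read-modify-write.

```python
import threading

class VersionedRecord:
    """Optimistic concurrency: read a version, validate it at commit time."""

    def __init__(self, value):
        self.value = value
        self.version = 0
        self._lock = threading.Lock()  # guards only the brief commit point

    def read(self):
        with self._lock:
            return self.value, self.version

    def try_update(self, new_value, expected_version) -> bool:
        with self._lock:
            if self.version != expected_version:
                return False  # conflict: caller must re-read and retry
            self.value = new_value
            self.version += 1
            return True
```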

## Key Concepts

### Write-Ahead Logging (WAL)

```text
The fundamental rule: WRITE THE LOG BEFORE THE DATA

1. Before modifying data, write the intended change to the log
2. Ensure the log record is durable (fsync)
3. Only then apply the change to the data pages
4. Periodically checkpoint (flush dirty pages, truncate log)

Recovery:
1. Read the log from last checkpoint
2. REDO all committed transactions
3. UNDO all uncommitted transactions
```
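
A minimal sketch of that recovery loop, assuming the `LogRecord` and `LogRecordType` types defined under Code Patterns below and a plain dict standing in for the data pages:

```python
def recover(records, pages):
    """REDO/UNDO recovery over the records since the last checkpoint,
    in LSN order. Uses LogRecord/LogRecordType from Code Patterns below;
    `pages` is a dict standing in for the data pages."""
    committed = {r.txn_id for r in records
                 if r.record_type is LogRecordType.COMMIT}
    # REDO: reapply after-images of every committed transaction.
    for r in records:
        if r.record_type is LogRecordType.UPDATE and r.txn_id in committed:
            pages[(r.table, r.key)] = r.after_value
    # UNDO: walk backwards, restoring before-images of uncommitted "losers".
    for r in reversed(records):
        if r.record_type is LogRecordType.UPDATE and r.txn_id not in committed:
            pages[(r.table, r.key)] = r.before_value
```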

### The Five-Minute Rule (1987, updated over time)

```text
Original insight: There's a break-even point for caching

If data is accessed more frequently than once per break-even interval,
keep it in memory. Otherwise, fetch from disk.

The rule: break-even interval ≈ (price per disk drive ÷ accesses/sec per drive)
                               ÷ (price per MB of RAM ÷ pages per MB of RAM)

In 1987: ~5 minutes
A decade later: still ~5 minutes (prices fell roughly in proportion)
Today: SSDs change the constants, but the break-even principle remains
```
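
A back-of-envelope version of the calculation, with illustrative prices chosen to land near the original 1987 result; substitute current numbers to re-derive the rule for your hardware.

```python
# Illustrative figures, not actual quotes from the 1987 paper.
dollars_per_access_per_sec = 1500.0  # drive price / random accesses per sec
dollars_per_cached_page = 5.0        # RAM price for one 1 KB page

break_even_seconds = dollars_per_access_per_sec / dollars_per_cached_page
print(f"Cache pages accessed more often than every "
      f"{break_even_seconds:.0f} s (~{break_even_seconds / 60:.0f} min)")
```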

### Transaction States

```text
    ┌──────────┐     ┌───────────┐
    │  ACTIVE  │────▶│ PARTIALLY │
    └──────────┘     │ COMMITTED │
         │           └───────────┘
         │             │       │
         ▼             │       ▼
    ┌──────────┐       │  ┌───────────┐
    │  FAILED  │◀──────┘  │ COMMITTED │
    └──────────┘          └───────────┘
         │
         ▼
    ┌──────────┐
    │ ABORTED  │
    └──────────┘

A failure before the commit record is durable moves PARTIALLY COMMITTED
to FAILED; FAILED transactions are rolled back and end in ABORTED.
```

### Two-Phase Commit (2PC)

```text
Coordinator                    Participants
     │                              │
     │──── PREPARE ────────────────▶│
     │                              │ (write to log, lock resources)
     │◀─── VOTE (YES/NO) ──────────│
     │                              │
     │ (if all YES)                 │
     │──── COMMIT ─────────────────▶│
     │                              │ (commit, release locks)
     │◀─── ACK ────────────────────│
     │                              │
     
If any participant votes NO, or timeout: ABORT all.
```
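
A minimal coordinator sketch of this flow. The participant interface (`prepare`/`commit`/`abort`) and the timeout handling are assumptions for illustration, not a specific library's API.

```python
from enum import Enum

class Vote(Enum):
    YES = "YES"
    NO = "NO"

def two_phase_commit(coordinator_log, participants) -> bool:
    # Phase 1: collect votes; a timeout counts as NO.
    votes = []
    for p in participants:
        try:
            votes.append(p.prepare())  # participant logs and locks here
        except TimeoutError:
            votes.append(Vote.NO)

    decision_commit = all(v is Vote.YES for v in votes)
    coordinator_log.append("COMMIT" if decision_commit else "ABORT")
    # The decision must be durable before phase 2: from here the
    # coordinator must drive the outcome to completion, even across restarts.

    # Phase 2: broadcast the decision.
    for p in participants:
        p.commit() if decision_commit else p.abort()
    return decision_commit
```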

## Code Patterns

### Implementing WAL in Principle

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any
import os
import pickle

class LogRecordType(Enum):
    BEGIN = "BEGIN"
    UPDATE = "UPDATE"
    COMMIT = "COMMIT"
    ABORT = "ABORT"
    CHECKPOINT = "CHECKPOINT"

@dataclass
class LogRecord:
    lsn: int              # Log Sequence Number
    txn_id: int
    record_type: LogRecordType
    table: str = ""
    key: Any = None
    before_value: Any = None  # For UNDO
    after_value: Any = None   # For REDO

class WriteAheadLog:
    """
    Jim Gray's WAL protocol: Log before data.
    """
    
    def __init__(self, log_path: str):
        self.log_path = log_path
        self.lsn = 0
        self.log_file = open(log_path, 'a+b')
    
    def append(self, record: LogRecord) -> int:
        """Append record to log and force to disk."""
        self.lsn += 1
        record.lsn = self.lsn
        
        # Serialize and write
        data = self._serialize(record)
        self.log_file.write(data)
        
        # CRITICAL: Force to stable storage
        self.log_file.flush()
        os.fsync(self.log_file.fileno())
        
        return self.lsn
    
    def _serialize(self, record: LogRecord) -> bytes:
        # Minimal sketch: length-prefixed pickle. A production WAL would use
        # a compact binary format with a per-record checksum.
        payload = pickle.dumps(record)
        return len(payload).to_bytes(4, "big") + payload
```

### Transaction Manager Pattern

```python
from contextlib import contextmanager
from threading import Lock
from typing import Generator

class Storage:
    """Stand-in for the data-page layer; details are out of scope here."""

class Transaction:
    """Per-transaction state (minimal stub). A full version would write
    UPDATE records with before/after images and roll back from them."""

    def __init__(self, txn_id: int, wal: WriteAheadLog, storage: Storage):
        self.txn_id = txn_id
        self.wal = wal
        self.storage = storage

    def rollback(self) -> None:
        ...  # UNDO: apply logged before-images in reverse order

    def release_locks(self) -> None:
        ...  # release all locks held by this transaction

class TransactionManager:
    """
    Manages transaction lifecycle with ACID guarantees.
    """
    
    def __init__(self, wal: WriteAheadLog, storage: Storage):
        self.wal = wal
        self.storage = storage
        self.active_txns: dict[int, Transaction] = {}
        self.lock = Lock()
        self.next_txn_id = 0
    
    @contextmanager
    def transaction(self) -> Generator[Transaction, None, None]:
        """
        Context manager for transaction scope.
        
        Usage:
            with tm.transaction() as txn:
                txn.update('accounts', 'alice', balance=100)
                txn.update('accounts', 'bob', balance=200)
            # Auto-commit on success, auto-abort on exception
        """
        txn = self._begin()
        try:
            yield txn
            self._commit(txn)
        except Exception:
            self._abort(txn)
            raise
    
    def _begin(self) -> Transaction:
        with self.lock:
            txn_id = self.next_txn_id
            self.next_txn_id += 1
        
        # Log the begin FIRST
        self.wal.append(LogRecord(
            lsn=0, txn_id=txn_id, record_type=LogRecordType.BEGIN
        ))
        
        txn = Transaction(txn_id, self.wal, self.storage)
        self.active_txns[txn_id] = txn
        return txn
    
    def _commit(self, txn: Transaction) -> None:
        # Log commit record
        self.wal.append(LogRecord(
            lsn=0, txn_id=txn.txn_id, record_type=LogRecordType.COMMIT
        ))
        # Release locks, clean up
        txn.release_locks()
        del self.active_txns[txn.txn_id]
    
    def _abort(self, txn: Transaction) -> None:
        # UNDO all changes using log records
        txn.rollback()
        self.wal.append(LogRecord(
            lsn=0, txn_id=txn.txn_id, record_type=LogRecordType.ABORT
        ))
        txn.release_locks()
        del self.active_txns[txn.txn_id]
```

### Idempotent Operations

```python
from datetime import datetime, timedelta
from typing import Any, Callable

class IdempotentExecutor:
    """
    Ensure operations execute exactly once, even with retries.
    
    Gray's insight: Idempotency transforms "at-least-once" 
    into "exactly-once" semantics.
    """
    
    def __init__(self, storage):
        self.storage = storage  # durable backing store (unused in this in-memory sketch)
        self.executed_ops: dict[str, tuple[datetime, Any]] = {}
    
    def execute(
        self, 
        idempotency_key: str, 
        operation: Callable[[], Any],
        ttl: timedelta = timedelta(hours=24)
    ) -> Any:
        """
        Execute operation exactly once for given key.
        """
        # Check if already executed
        if idempotency_key in self.executed_ops:
            timestamp, result = self.executed_ops[idempotency_key]
            if datetime.now() - timestamp < ttl:
                return result  # Return cached result
        
        # Execute and store result
        result = operation()
        self.executed_ops[idempotency_key] = (datetime.now(), result)
        
        return result
```
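
Hypothetical usage on a payment path, where the client supplies the idempotency key so a retried request returns the cached result instead of charging twice:

```python
executor = IdempotentExecutor(storage=None)  # in-memory sketch, no durable store

def charge_card() -> dict:
    return {"charged": True, "amount_cents": 5000}  # hypothetical side effect

first = executor.execute("charge:order-1234", charge_card)
retry = executor.execute("charge:order-1234", charge_card)  # cached result
assert first is retry
```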

## Mental Model

Jim Gray approached systems as a scientist:

1. **Define the invariants**: What must always be true?
2. **Identify failure modes**: What can go wrong?
3. **Design recovery**: How do we get back to a valid state?
4. **Measure and benchmark**: Quantify performance precisely
5. **Simplify**: Remove complexity until it breaks, then add back only what's needed

### The Failure Model

```text
Types of failures (increasingly severe):
1. Transaction failure  → Abort and undo
2. System failure       → Restart and recover from log
3. Media failure        → Restore from backup + log
4. Disaster             → Failover to remote site

Design for all of them.
```

## Warning Signs

You're violating Gray's principles if:

- You don't know your system's durability guarantees
- Transactions span user think time (long-held locks)
- Recovery is "we'll figure it out if it happens"
- You can't explain your isolation level choice
- You assume fsync is called when it isn't
- Your benchmarks don't include failure scenarios

## Additional Resources

- For detailed philosophy, see [philosophy.md](philosophy.md)
- For references (papers, books), see [references.md](references.md)

Overview

This skill encodes Jim Gray's transaction-processing and fault-tolerance principles into practical design guidance and Python patterns. It focuses on delivering ACID guarantees, robust recovery, and measurable reliability for critical data systems. Use it to design databases, payment processors, or any system where data integrity and predictable failure behavior are mandatory.

How this skill works

The skill inspects system designs and suggests concrete mechanisms: write-ahead logging (WAL), explicit transaction boundaries, idempotency, and recovery procedures like REDO/UNDO from log checkpoints. It provides patterns for a transaction manager, WAL append/flush semantics, two-phase commit coordination, and idempotent executors with TTL-based caching. It emphasizes measurement, failure testing, and clear documentation of consistency guarantees.

When to use it

  • Building or auditing databases, storage engines, or transaction coordinators
  • Designing financial systems, payment rails, or ledgers that require strict durability
  • Implementing distributed transactions or two-phase commit across services
  • Hardening systems against crashes, media failure, and disaster scenarios
  • Defining SLAs and documenting isolation/consistency guarantees for clients

Best practices

  • Treat ACID as a requirement for critical paths; document which properties apply where
  • Write the log before data: WAL records must be fsync'd before payload persistence
  • Design idempotent operations to turn retries into exactly-once semantics
  • Keep transactions short and explicit; avoid spanning user think time
  • Plan and test recovery from day one; include failure scenarios in benchmarks
  • Measure both latency and throughput and record metrics under fault conditions

Example use cases

  • Implementing a write-ahead log and recovery loop for a key-value store
  • Building a transaction manager with BEGIN/COMMIT/ABORT semantics and lock management
  • Adding idempotency keys and executor caches to payment endpoints to prevent double charging
  • Designing a 2PC coordinator for multi-service commits with clear timeout and abort paths
  • Defining cache-vs-disk policies using the five-minute rule as a guiding heuristic

FAQ

Do I always need WAL and fsync for durability?

For critical durability guarantees, yes: write-ahead logging plus an explicit fsync (or an equivalent flush to stable storage) is required. Relax this only after documenting the risk.

When should I prefer pessimistic locking over optimistic?

Prefer pessimistic locking when conflicts are common or the cost of retries is high. Use optimistic concurrency for low-conflict workloads to improve throughput.