home / skills / phrazzld / claude-config / schema-design

schema-design skill

safe

This skill guides schema designers to apply normalization, keys, constraints, and anti-pattern checks for robust, scalable database architectures.

npx playbooks add skill phrazzld/claude-config --skill schema-design

Review the files below or copy the command above to add this skill to your agents.

Files (6)

skill.md

3.4 KB

---
name: schema-design
description: "Universal database schema design principles: normalization strategy, data types, primary keys, constraints, anti-patterns, and red flags. Apply when designing schemas, reviewing database architecture, or refactoring data models."
effort: high
---

# Database Schema Design Principles

**"Schema design debt compounds faster than code debt. Fix it now or pay 10x later."**

## Foundation Principles

### Primary Keys
- Every table MUST have a primary key
- Prefer surrogate keys (auto-increment, UUIDv7) over composite natural keys
- UUIDv7 for distributed systems, auto-increment BIGINT for monoliths

### Foreign Keys
- Use FK constraints unless specific reason not to (high-volume logging, sharded DBs)
- ON DELETE: RESTRICT (safest), CASCADE (use sparingly), SET NULL (breaks audit)

### Data Types
- Choose smallest sufficient type (BIGINT vs INT = 4 bytes * rows)
- Money: DECIMAL (never FLOAT/DOUBLE)
- Dates without time: DATE not DATETIME
- Small sets: ENUM not VARCHAR

### Constraints
- NOT NULL on required columns
- UNIQUE on natural keys
- CHECK for business rules
- DEFAULT where appropriate

## Quality Checklist

### Structural Integrity
- [ ] Every table has primary key
- [ ] Foreign key constraints defined
- [ ] Appropriate data types (smallest sufficient)
- [ ] NOT NULL, UNIQUE, CHECK constraints

### Anti-Pattern Scan
- [ ] No EAV (entity-attribute-value) patterns
- [ ] No god tables (> 50 columns)
- [ ] No multi-valued fields (CSV in columns)
- [ ] No DATETIME for date-only data

### Performance
- [ ] Indexes match query patterns
- [ ] Foreign keys indexed
- [ ] Composite index column order optimized

## Decision Trees

### "Should I denormalize this?"
```
Have evidence of query performance problem?
├─ NO → DON'T denormalize (premature optimization)
└─ YES → Tried indexes, query optimization, caching?
   ├─ NO → Try those first
   └─ YES → Read-heavy (> 100:1)?
      ├─ NO → Normalize, optimize queries
      └─ YES → Denormalize specific fields
```

### "UUID or auto-increment?"
```
Distributed system (multiple write nodes)?
├─ YES → UUIDv7 (time-ordered, better than v4)
└─ NO → Exposed to users (issue-123)?
   ├─ YES → Auto-increment (better UX)
   └─ NO → Auto-increment (better performance)
```

### "Soft or hard delete?"
```
GDPR "right to erasure" applies?
├─ YES → Hard delete or audit table
└─ NO → Need audit trail?
   ├─ YES → Audit table pattern (recommended)
   └─ NO → High deletion rate (> 20%)?
      ├─ YES → Hard delete
      └─ NO → Soft delete acceptable
```

## References

Detailed patterns and examples:
- `references/anti-patterns.md` — EAV, god tables, multi-valued fields, red flags
- `references/normalization.md` — 1NF/2NF/3NF, when to denormalize, OLTP vs OLAP
- `references/advanced-patterns.md` — Soft delete, temporal data, JSON columns
- `references/naming-conventions.md` — Tables, columns, indexes, constraints
- `references/performance-patterns.md` — Indexing strategy, partitioning, data types

## Remember

**"The best schema is one you can understand in 6 months and modify with confidence."**

Design schemas that:
1. **Enforce integrity** — Constraints, foreign keys, data types
2. **Optimize for common patterns** — Indexes, denormalization where proven
3. **Enable evolution** — Proper normalization, migration strategy
4. **Prevent known anti-patterns** — No EAV, god tables, multi-valued fields

Overview

This skill packages universal database schema design rules and practical checks to build reliable, performant data models. It focuses on keys, data types, constraints, anti-pattern detection, and decision trees for denormalization, deletes, and ID strategies. Use it to guide new designs, architecture reviews, and refactors to avoid costly schema debt.

How this skill works

It inspects schema elements and design decisions against a compact set of principles: every table must have a primary key, foreign key usage, minimal adequate data types, and appropriate constraints (NOT NULL, UNIQUE, CHECK, DEFAULT). It scans for anti-patterns (EAV, god tables, multi-valued fields), evaluates indexing and FK indexing, and applies decision trees for denormalization, UUID vs auto-increment, and soft vs hard deletes. The skill produces concrete recommendations and actionable checklist items to enforce integrity, performance, and evolvability.

When to use it

Designing a new database schema for an application or microservice
Reviewing or auditing existing database architecture before deployment
Refactoring or migrating legacy schemas to reduce technical debt
Choosing ID strategies and delete policies for distributed vs monolithic systems
Deciding whether to denormalize for performance after profiling

Best practices

Always define a primary key; prefer surrogate keys (UUIDv7 for distributed, auto-increment BIGINT for monoliths)
Use FK constraints by default; prefer ON DELETE RESTRICT, use CASCADE sparingly
Pick the smallest sufficient data type; use DECIMAL for money and DATE for date-only values
Apply NOT NULL, UNIQUE, CHECK constraints and sensible DEFAULTs to encode business rules
Index to match query patterns and ensure foreign keys are indexed; optimize composite index column order
Avoid anti-patterns: no EAV, no god tables (>50 columns), no multi-valued fields (CSV in a column)

Example use cases

Designing transactional OLTP schemas with clear normalization and constraints
Auditing a legacy database to find and prioritize schema debt fixes
Choosing between UUIDv7 and auto-increment for a new service in a distributed environment
Guiding a performance optimization workflow: index first, then denormalize only with evidence
Establishing delete strategy and audit patterns to meet GDPR and retention requirements

FAQ

When should I denormalize a schema?

Only after you have evidence of a query performance problem and after trying indexes, query optimization, and caching; denormalize specific fields for heavily read-biased workloads (e.g., >100:1).

Which key type should I use for distributed systems?

Use UUIDv7 for distributed write nodes because it is time-ordered and performs better than v4; prefer auto-increment BIGINT for single-node monoliths for performance and UX.