home / skills / plurigrid / asi / duckdb-timetravel

duckdb-timetravel skill

/skills/duckdb-timetravel

This skill enables temporal versioning and ACSet-driven schema generation for DuckDB, empowering historical queries and schema automation.

npx playbooks add skill plurigrid/asi --skill duckdb-timetravel

Review the files below or copy the command above to add this skill to your agents.

Files (1)
SKILL.md
5.6 KB
---
name: duckdb-timetravel
description: "' Layer 3: Temporal Versioning and ACSet Schema Generation for DuckDB'"
metadata:
  trit: 0
---

# duckdb-timetravel

> Layer 3: Temporal Versioning and ACSet Schema Generation for DuckDB

**Version**: 1.0.0  
**Trit**: 0 (Ergodic - coordinates data flow)  
**Bundle**: database  

## Overview

DuckDB-timetravel provides temporal versioning for interaction data, enabling queries across historical states. It integrates with ACSets for schema generation and supports DuckLake-style snapshots.

## Capabilities

### 1. temporal-query

Query data at specific points in time.

```sql
-- DuckLake-style time travel
SELECT * FROM interactions 
AT (VERSION => 3);

-- Snapshot at specific timestamp
ATTACH 'ducklake:interactions.db' 
  (SNAPSHOT_TIME '2024-11-30 00:00:00');

-- Query historical state
SELECT * FROM interactions
WHERE created_at < '2024-11-30'
AS OF TIMESTAMP '2024-11-15';
```

### 2. version-management

Create and manage temporal versions.

```python
from duckdb_timetravel import VersionManager

vm = VersionManager("interactions.duckdb")

# Create checkpoint
version_id = vm.checkpoint(
    message="Before pattern training",
    seed=0xf061ebbc2ca74d78
)

# List versions
versions = vm.list_versions()
# [
#   {id: 1, timestamp: "2024-11-01", message: "Initial import"},
#   {id: 2, timestamp: "2024-11-15", message: "Added network data"},
#   {id: 3, timestamp: "2024-11-30", message: "Before pattern training"}
# ]

# Restore to version
vm.restore(version_id=2)
```

### 3. acset-schema-gen

Generate DuckDB schemas from ACSet definitions.

```python
from duckdb_timetravel import ACSsetSchemaGenerator

gen = ACSsetSchemaGenerator()

# From ACSet category definition
schema = gen.from_acset("""
@acset ThreadOperad begin
    Thread::Ob
    Concept::Ob
    touches::Hom(Thread, Concept)
    parent::Hom(Thread, Thread)
    trit::Attr(Thread, Int)
    color_h::Attr(Thread, Float)
end
""")

# Generates:
# CREATE TABLE threads (
#     id VARCHAR PRIMARY KEY,
#     parent_id VARCHAR REFERENCES threads(id),
#     trit INT CHECK (trit IN (-1, 0, 1)),
#     color_h FLOAT CHECK (color_h >= 0 AND color_h < 360)
# );
# CREATE TABLE concepts (id VARCHAR PRIMARY KEY, name VARCHAR);
# CREATE TABLE thread_concepts (thread_id VARCHAR, concept_id VARCHAR);
```

### 4. diff-versions

Compare data between versions.

```python
diff = vm.diff(from_version=2, to_version=3)

# Returns:
# {
#   added: [{table: "interactions", count: 150}],
#   modified: [{table: "users", count: 12}],
#   deleted: [{table: "temp_cache", count: 1}],
#   schema_changes: []
# }
```

### 5. color-stream-history

Track GF(3) color stream evolution over time.

```sql
CREATE TABLE color_stream_history (
    stream VARCHAR,           -- 'LIVE', 'VERIFY', 'BACKFILL'
    index_position INT,
    thread_id VARCHAR,
    h FLOAT,                  -- Hue
    trit INT,                 -- -1, 0, +1
    timestamp TIMESTAMP,
    version_id INT,
    PRIMARY KEY (stream, index_position, version_id)
);

-- Query color evolution
SELECT stream, index_position, h, trit, timestamp
FROM color_stream_history
WHERE thread_id LIKE '%T-019b44%'
ORDER BY stream, index_position;
```

## DuckDB Schema Templates

### Thread Operad Schema

```sql
-- Core tables
CREATE TABLE thread_nodes (
    thread_id VARCHAR PRIMARY KEY,
    title VARCHAR,
    message_count INT,
    created_at TIMESTAMP,
    updated_at TIMESTAMP,
    trit INT CHECK (trit IN (-1, 0, 1)),
    color_h FLOAT,
    color_s FLOAT DEFAULT 0.7,
    color_l FLOAT DEFAULT 0.5,
    parent_thread_id VARCHAR,
    depth INT DEFAULT 0
);

-- GF(3) sibling verification view
CREATE VIEW gf3_sibling_triplets AS
SELECT 
    t1.thread_id as thread_1,
    t2.thread_id as thread_2,
    t3.thread_id as thread_3,
    (t1.trit + t2.trit + t3.trit) % 3 as sum_mod_3,
    CASE WHEN (t1.trit + t2.trit + t3.trit) % 3 = 0 
         THEN 'CONSERVED' ELSE 'VIOLATION' END as status
FROM thread_nodes t1
JOIN thread_nodes t2 ON t1.parent_thread_id = t2.parent_thread_id
JOIN thread_nodes t3 ON t2.parent_thread_id = t3.parent_thread_id
WHERE t1.thread_id < t2.thread_id 
  AND t2.thread_id < t3.thread_id;
```

## GF(3) Triad Integration

| Trit | Skill | Role |
|------|-------|------|
| -1 | clj-kondo-3color | Validates schema/queries |
| 0 | **duckdb-timetravel** | Coordinates temporal data |
| +1 | rama-gay-clojure | Generates data streams |

**Conservation**: (-1) + (0) + (+1) = 0 ✓

## Configuration

```yaml
# duckdb-timetravel.yaml
database:
  path: "interactions.duckdb"
  wal_mode: true
  checkpoint_interval: 100

versioning:
  auto_checkpoint: true
  max_versions: 50
  compression: zstd

acset:
  schema_prefix: "acset_"
  generate_views: true
  gf3_verification: true

reproducibility:
  seed: 0xf061ebbc2ca74d78
```

## Justfile Recipes

```makefile
# Initialize database with schema
discohy-init-db:
    duckdb thread_operad.duckdb -c ".read db/thread_operad_schema.sql"

# Query thread tree
discohy-tree:
    duckdb thread_operad.duckdb -c "SELECT * FROM thread_tree ORDER BY depth, path"

# Check GF(3) conservation
discohy-gf3:
    duckdb thread_operad.duckdb -c "SELECT * FROM gf3_sibling_triplets"

# Time travel to version
discohy-restore version="1":
    duckdb thread_operad.duckdb -c "CALL restore_version({{version}})"

# View color stream history
discohy-stream-history thread_id:
    duckdb thread_operad.duckdb -c "SELECT stream, index_position, h, trit, timestamp FROM color_stream_history WHERE thread_id LIKE '%{{thread_id}}%' ORDER BY stream, index_position"
```

## Related Skills

- `acsets` - ACSet category definitions
- `clj-kondo-3color` - Schema validation
- `gay-mcp` - Color stream generation
- `entropy-sequencer` - Temporal arrangement

Overview

This skill implements temporal versioning and ACSet-driven schema generation for DuckDB, enabling reproducible time-travel queries and lightweight snapshot management. It combines version checkpoints, diffs, and DuckLake-style snapshots with ACSet-to-SQL schema generation and GF(3) color-stream tracking. The result is a practical layer for auditing, experimentation, and historical analysis on interaction-style datasets.

How this skill works

It records named checkpoints and timestamps as immutable versions and exposes SQL/SDK primitives to query tables 'AS OF' a version or timestamp. A VersionManager API creates, lists, restores, and diffs versions while storage can use WAL/checkpoint policies and optional compression. An ACSet schema generator translates category definitions into normalized DuckDB DDL, including referential constraints and attribute checks, and can emit views for GF(3) verification and color-stream history tables.

When to use it

  • When you need reproducible snapshots for model training, backtesting, or debugging.
  • To audit changes across ingest pipelines and detect schema drift over time.
  • When migrating domain category definitions (ACSets) into a DuckDB schema quickly.
  • To compare dataset states, find added/modified/deleted records between versions.
  • When tracking temporal evolution of GF(3) color streams or other indexed histories.

Best practices

  • Enable WAL and periodic checkpoints; set checkpoint_interval to balance latency and storage.
  • Use descriptive checkpoint messages and consistent seeds for reproducibility.
  • Limit max_versions and enable compression to control disk usage for long histories.
  • Generate ACSet schemas before ingest and enable gf3_verification for data integrity checks.
  • Use diffs to drive targeted restores instead of full rollbacks when possible.

Example use cases

  • Create a checkpoint before running an experiment and restore it to reproduce results exactly.
  • Generate a DuckDB schema from an ACSet category and instantiate normalized tables with constraints.
  • Run a time-travel query to inspect the interactions table at a specific version or timestamp.
  • Compute diffs between two versions to produce change summaries for auditing or ETL pipelines.
  • Track color_stream_history to analyze GF(3) trit conservation across thread evolutions.

FAQ

Can I query a table at an arbitrary timestamp?

Yes. The skill supports DuckLake-style snapshots and 'AS OF TIMESTAMP' queries when snapshots include timestamp mappings.

How large should max_versions be?

Choose based on retention needs and storage; use compression and periodic consolidation if you need long-term history with limited disk.