home / skills / tursodatabase / turso / storage-format

storage-format skill

/.claude/skills/storage-format

This skill helps you understand and navigate SQLite-like on-disk storage structures, including pages, B-trees, and overflow handling.

npx playbooks add skill tursodatabase/turso --skill storage-format

Review the files below or copy the command above to add this skill to your agents.

Files (1)
SKILL.md
3.7 KB
---
name: storage-format
description: SQLite file format, B-trees, pages, cells, overflow, freelist that is used in tursodb
---
# Storage Format Guide

## Database File Structure

```
┌─────────────────────────────┐
│ Page 1: Header + Schema     │  ← First 100 bytes = DB header
├─────────────────────────────┤
│ Page 2..N: B-tree pages     │  ← Tables and indexes
│            Overflow pages   │
│            Freelist pages   │
└─────────────────────────────┘
```

Page size: power of 2, 512-65536 bytes. Default 4096.

## Database Header (First 100 Bytes)

| Offset | Size | Field |
|--------|------|-------|
| 0 | 16 | Magic: `"SQLite format 3\0"` |
| 16 | 2 | Page size (big-endian) |
| 18 | 1 | Write format version (1=rollback, 2=WAL) |
| 19 | 1 | Read format version |
| 24 | 4 | Change counter |
| 28 | 4 | Database size in pages |
| 32 | 4 | First freelist trunk page |
| 36 | 4 | Total freelist pages |
| 40 | 4 | Schema cookie |
| 56 | 4 | Text encoding (1=UTF8, 2=UTF16LE, 3=UTF16BE) |

All multi-byte integers: **big-endian**.

## Page Types

| Flag | Type | Purpose |
|------|------|---------|
| 0x02 | Interior index | Index B-tree internal node |
| 0x05 | Interior table | Table B-tree internal node |
| 0x0a | Leaf index | Index B-tree leaf |
| 0x0d | Leaf table | Table B-tree leaf |
| - | Overflow | Payload exceeding cell capacity |
| - | Freelist | Unused pages (trunk or leaf) |

## B-tree Structure

Two B-tree types:
- **Table B-tree**: 64-bit rowid keys, stores row data
- **Index B-tree**: Arbitrary keys (index columns + rowid)

```
Interior page:  [ptr0] key1 [ptr1] key2 [ptr2] ...
                   │         │         │
                   ▼         ▼         ▼
               child     child     child
               pages     pages     pages

Leaf page:     key1:data  key2:data  key3:data ...
```

Page 1 always root of `sqlite_schema` table.

## Cell Format

### Table Leaf Cell
```
[payload_size: varint] [rowid: varint] [payload] [overflow_ptr: u32?]
```

### Table Interior Cell
```
[left_child_page: u32] [rowid: varint]
```

### Index Cells
Similar but key is arbitrary (columns + rowid), not just rowid.

## Record Format (Payload)

```
[header_size: varint] [type1: varint] [type2: varint] ... [data1] [data2] ...
```

Serial types:
| Type | Meaning |
|------|---------|
| 0 | NULL |
| 1-4 | 1/2/3/4 byte signed int |
| 5 | 6 byte signed int |
| 6 | 8 byte signed int |
| 7 | IEEE 754 float |
| 8 | Integer 0 |
| 9 | Integer 1 |
| ≥12 even | BLOB, length=(N-12)/2 |
| ≥13 odd | Text, length=(N-13)/2 |

## Overflow Pages

When payload exceeds threshold, excess stored in overflow chain:
```
[next_page: u32] [data...]
```
Last page has next_page=0.

## Freelist

Linked list of trunk pages, each containing leaf page numbers:
```
Trunk: [next_trunk: u32] [leaf_count: u32] [leaf_pages: u32...]
```

## Turso Implementation

Key files:
- `core/storage/sqlite3_ondisk.rs` - On-disk format, `PageType` enum
- `core/storage/btree.rs` - B-tree operations (large file)
- `core/storage/pager.rs` - Page management
- `core/storage/buffer_pool.rs` - Page caching

## Debugging Storage

```bash
# Integrity check
cargo run --bin tursodb test.db "PRAGMA integrity_check;"

# Page count
cargo run --bin tursodb test.db "PRAGMA page_count;"

# Freelist info
cargo run --bin tursodb test.db "PRAGMA freelist_count;"
```

## References

- [SQLite File Format](https://sqlite.org/fileformat.html)
- [SQLite B-Tree Module](https://sqlite.org/btreemodule.html)
- [SQLite Internals: Pages & B-trees](https://fly.io/blog/sqlite-internals-btree/)

Overview

This skill documents the on-disk storage format used in TursoDB, compatible with SQLite. It explains database file layout, page types, B-tree structures, record and cell formats, overflow handling, and the freelist. It also points to the Rust modules that implement these structures in Turso.

How this skill works

It inspects the first 100-byte database header and parses big-endian fields such as page size, page count, freelist pointers, and encoding. It walks B-tree pages (interior and leaf) to interpret cells and row payloads, follows overflow chains for large payloads, and reads freelist trunk/leaf lists to identify unused pages. The skill ties these concepts to the Turso Rust modules that implement pager, buffer pool, and B-tree logic.

When to use it

  • When you need to debug or inspect a Turso/SQLite file at the page level.
  • When implementing or modifying on-disk storage behavior in Turso.
  • When diagnosing corruption, freelist issues, or unexpected page counts.
  • When tracing how a row is stored across leaf/interior and overflow pages.
  • When optimizing page size or cache behavior for embedded deployments.

Best practices

  • Always treat multi-byte integers in the header and pages as big-endian.
  • Start analysis from page 1 (schema root) — it contains the database header and sqlite_schema root.
  • Follow overflow chains by reading the 4-byte next_page pointer at the start of each overflow page.
  • Use PRAGMA integrity_check and page_count to verify file consistency before manual edits.
  • Prefer reading Turso’s core/storage modules (ont-disk, btree, pager, buffer_pool) to see canonical parsing and invariants.

Example use cases

  • Locate and extract a row whose payload spans overflow pages by following overflow pointers and reassembling data segments.
  • Identify freed pages via freelist trunk entries to estimate reclaimable space or detect freelist corruption.
  • Confirm page size and encoding from the header to ensure client compatibility with database file.
  • Trace why a specific rowid is stored on a particular leaf by walking interior page pointers to the leaf page.
  • Validate an on-disk change by running PRAGMA integrity_check and comparing page counts before/after.

FAQ

How do I tell if the file uses WAL or rollback journaling?

Check byte 18 of the header: 1 means rollback, 2 means WAL write format version.

Where are large payloads stored?

When a cell payload exceeds the per-page threshold, excess bytes are stored on a chain of overflow pages; each overflow page begins with a 4-byte next_page pointer (0 = end).