home / skills / tursodatabase / turso / storage-format
This skill helps you understand and navigate SQLite-like on-disk storage structures, including pages, B-trees, and overflow handling.
npx playbooks add skill tursodatabase/turso --skill storage-formatReview the files below or copy the command above to add this skill to your agents.
---
name: storage-format
description: SQLite file format, B-trees, pages, cells, overflow, freelist that is used in tursodb
---
# Storage Format Guide
## Database File Structure
```
┌─────────────────────────────┐
│ Page 1: Header + Schema │ ← First 100 bytes = DB header
├─────────────────────────────┤
│ Page 2..N: B-tree pages │ ← Tables and indexes
│ Overflow pages │
│ Freelist pages │
└─────────────────────────────┘
```
Page size: power of 2, 512-65536 bytes. Default 4096.
## Database Header (First 100 Bytes)
| Offset | Size | Field |
|--------|------|-------|
| 0 | 16 | Magic: `"SQLite format 3\0"` |
| 16 | 2 | Page size (big-endian) |
| 18 | 1 | Write format version (1=rollback, 2=WAL) |
| 19 | 1 | Read format version |
| 24 | 4 | Change counter |
| 28 | 4 | Database size in pages |
| 32 | 4 | First freelist trunk page |
| 36 | 4 | Total freelist pages |
| 40 | 4 | Schema cookie |
| 56 | 4 | Text encoding (1=UTF8, 2=UTF16LE, 3=UTF16BE) |
All multi-byte integers: **big-endian**.
## Page Types
| Flag | Type | Purpose |
|------|------|---------|
| 0x02 | Interior index | Index B-tree internal node |
| 0x05 | Interior table | Table B-tree internal node |
| 0x0a | Leaf index | Index B-tree leaf |
| 0x0d | Leaf table | Table B-tree leaf |
| - | Overflow | Payload exceeding cell capacity |
| - | Freelist | Unused pages (trunk or leaf) |
## B-tree Structure
Two B-tree types:
- **Table B-tree**: 64-bit rowid keys, stores row data
- **Index B-tree**: Arbitrary keys (index columns + rowid)
```
Interior page: [ptr0] key1 [ptr1] key2 [ptr2] ...
│ │ │
▼ ▼ ▼
child child child
pages pages pages
Leaf page: key1:data key2:data key3:data ...
```
Page 1 always root of `sqlite_schema` table.
## Cell Format
### Table Leaf Cell
```
[payload_size: varint] [rowid: varint] [payload] [overflow_ptr: u32?]
```
### Table Interior Cell
```
[left_child_page: u32] [rowid: varint]
```
### Index Cells
Similar but key is arbitrary (columns + rowid), not just rowid.
## Record Format (Payload)
```
[header_size: varint] [type1: varint] [type2: varint] ... [data1] [data2] ...
```
Serial types:
| Type | Meaning |
|------|---------|
| 0 | NULL |
| 1-4 | 1/2/3/4 byte signed int |
| 5 | 6 byte signed int |
| 6 | 8 byte signed int |
| 7 | IEEE 754 float |
| 8 | Integer 0 |
| 9 | Integer 1 |
| ≥12 even | BLOB, length=(N-12)/2 |
| ≥13 odd | Text, length=(N-13)/2 |
## Overflow Pages
When payload exceeds threshold, excess stored in overflow chain:
```
[next_page: u32] [data...]
```
Last page has next_page=0.
## Freelist
Linked list of trunk pages, each containing leaf page numbers:
```
Trunk: [next_trunk: u32] [leaf_count: u32] [leaf_pages: u32...]
```
## Turso Implementation
Key files:
- `core/storage/sqlite3_ondisk.rs` - On-disk format, `PageType` enum
- `core/storage/btree.rs` - B-tree operations (large file)
- `core/storage/pager.rs` - Page management
- `core/storage/buffer_pool.rs` - Page caching
## Debugging Storage
```bash
# Integrity check
cargo run --bin tursodb test.db "PRAGMA integrity_check;"
# Page count
cargo run --bin tursodb test.db "PRAGMA page_count;"
# Freelist info
cargo run --bin tursodb test.db "PRAGMA freelist_count;"
```
## References
- [SQLite File Format](https://sqlite.org/fileformat.html)
- [SQLite B-Tree Module](https://sqlite.org/btreemodule.html)
- [SQLite Internals: Pages & B-trees](https://fly.io/blog/sqlite-internals-btree/)
This skill documents the on-disk storage format used in TursoDB, compatible with SQLite. It explains database file layout, page types, B-tree structures, record and cell formats, overflow handling, and the freelist. It also points to the Rust modules that implement these structures in Turso.
It inspects the first 100-byte database header and parses big-endian fields such as page size, page count, freelist pointers, and encoding. It walks B-tree pages (interior and leaf) to interpret cells and row payloads, follows overflow chains for large payloads, and reads freelist trunk/leaf lists to identify unused pages. The skill ties these concepts to the Turso Rust modules that implement pager, buffer pool, and B-tree logic.
How do I tell if the file uses WAL or rollback journaling?
Check byte 18 of the header: 1 means rollback, 2 means WAL write format version.
Where are large payloads stored?
When a cell payload exceeds the per-page threshold, excess bytes are stored on a chain of overflow pages; each overflow page begins with a 4-byte next_page pointer (0 = end).