
data-flow skill


This skill helps you model data pipelines as rooms and objects, enabling modular processing with inbox and outbox queues.

npx playbooks add skill simhacker/moollm --skill data-flow

Review the files below or copy the command above to add this skill to your agents.

Files (5)
SKILL.md
3.1 KB
---
name: data-flow
description: Rooms as pipeline nodes, exits as edges, objects as messages
allowed-tools:
  - read_file
  - write_file
  - run_terminal_cmd
tier: 2
protocol: DATA-FLOW
tags: [moollm, pipeline, processing, kilroy, messaging]
related: [room, card, sister-script, coherence-engine]
---

# Data Flow

> *"Rooms are nodes. Exits are edges. Thrown objects are messages."*

MOOLLM's approach to building processing pipelines using rooms and objects. The filesystem IS the data flow network.

## The Pattern

- **Rooms** are processing stages (nodes)
- **Exits** connect stages (edges)
- **Objects** flow through as messages
- **THROW** sends objects through exits
- **INBOX** receives incoming objects
- **OUTBOX** stages outgoing objects
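As a rough mental model, the pattern above can be sketched in a few lines of illustrative Python (`Room` and `throw` are hypothetical names for this sketch, not the skill's API):

```python
from collections import deque
from dataclasses import dataclass, field

@dataclass
class Room:
    name: str
    inbox: deque = field(default_factory=deque)  # incoming messages, FIFO
    outbox: list = field(default_factory=list)   # staged for a batch throw
    exits: dict = field(default_factory=dict)    # exit name -> destination Room

def throw(room: Room, obj: str, exit_name: str) -> None:
    """THROW: send an object through an exit into the destination's inbox."""
    room.exits[exit_name].inbox.append(obj)

parser = Room("parser")
analyzer = Room("analyzer")
parser.exits["door-analyzer"] = analyzer
throw(parser, "doc-001.txt", "door-analyzer")
```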

## Commands

| Command | Effect |
|---------|--------|
| `THROW obj exit` | Send object through exit to destination |
| `INBOX` | List items waiting to be processed |
| `NEXT` | Get next item from inbox (FIFO) |
| `PEEK` | Look at next item without removing |
| `STAGE obj exit` | Add object to outbox for later throw |
| `FLUSH` | Throw all staged objects |
| `FLUSH exit` | Throw staged objects for specific exit |
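Because the filesystem is the transport, each command reduces to a file move. A minimal sketch, assuming rooms are directories with `inbox/` and `outbox/` subdirectories; all function names are illustrative:

```python
import os
import shutil
import tempfile

def inbox(room):
    """INBOX: list waiting items, oldest-named first (FIFO by name)."""
    return sorted(os.listdir(os.path.join(room, "inbox")))

def next_item(room):
    """NEXT: pull the next item out of the inbox for processing."""
    items = inbox(room)
    if not items:
        return None
    dst = os.path.join(room, items[0])
    shutil.move(os.path.join(room, "inbox", items[0]), dst)
    return dst

def stage(room, path, exit_name):
    """STAGE: park an object in the outbox, tagged with its exit."""
    slot = os.path.join(room, "outbox", exit_name)
    os.makedirs(slot, exist_ok=True)
    shutil.move(path, os.path.join(slot, os.path.basename(path)))

def flush(room, dests):
    """FLUSH: throw every staged object; dests maps exit name -> room dir."""
    outbox = os.path.join(room, "outbox")
    for exit_name in os.listdir(outbox):
        slot = os.path.join(outbox, exit_name)
        for item in os.listdir(slot):
            shutil.move(os.path.join(slot, item),
                        os.path.join(dests[exit_name], "inbox", item))

# Demo: one item travels parser -> output.
root = tempfile.mkdtemp()
parser, output = (os.path.join(root, r) for r in ("parser", "output"))
for r in (parser, output):
    os.makedirs(os.path.join(r, "inbox"))
open(os.path.join(parser, "inbox", "doc-001.txt"), "w").close()

stage(parser, next_item(parser), "door-output")
flush(parser, {"door-output": output})
```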

## Room Structure

```
stage/
├── ROOM.yml       # Config and processor definition
├── inbox/         # Incoming queue (FIFO)
├── outbox/        # Staged for batch throwing
└── door-next/     # Exit to next stage
```

## Processor Types

### Script (Deterministic)

```yaml
processor:
  type: script
  command: "python parse.py ${input}"
```

### LLM (Semantic)

```yaml
processor:
  type: llm
  prompt: |
    Analyze this document:
    - Extract key entities
    - Summarize in 3 sentences
```

### Hybrid

```yaml
processor:
  type: hybrid
  pre_process: "extract.py ${input}"
  llm_prompt: "Analyze extracted data"
  post_process: "format.py ${output}"
```

**Mix and match.** LLM for reasoning, scripts for transformation.
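One way the three processor types could be dispatched, sketched under the assumption that `${input}`/`${output}` are plain string substitutions and with `llm` as a stand-in callable for a real model call:

```python
import subprocess

def run_processor(processor, input_path, llm=None):
    """Dispatch on processor type; `llm` is a placeholder callable
    standing in for an actual LLM invocation."""
    def sh(cmd):
        # Run a shell command and capture its stdout.
        return subprocess.run(cmd, shell=True, capture_output=True,
                              text=True).stdout
    kind = processor["type"]
    if kind == "script":
        return sh(processor["command"].replace("${input}", input_path))
    if kind == "llm":
        return llm(processor["prompt"], input_path)
    if kind == "hybrid":
        extracted = sh(processor["pre_process"].replace("${input}", input_path))
        enriched = llm(processor["llm_prompt"], extracted)
        return sh(processor["post_process"].replace("${output}", enriched))
    raise ValueError(f"unknown processor type: {kind}")
```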

## Example Pipeline

```
uploads/              # Raw files land here
├── inbox/
│   ├── doc-001.pdf
│   └── doc-002.pdf
└── door-parser/

parser/               # Extract text
├── script: parse.py
└── door-analyzer/

analyzer/             # LLM analyzes
├── prompt: "Summarize..."
├── door-output/
└── door-errors/

output/               # Final results
└── inbox/
    ├── doc-001-summary.yml
    └── doc-002-summary.yml
```

## Processing Loop

```
> ENTER parser
Inbox: 2 items waiting.

> NEXT
Processing doc-001.pdf...
Text extracted.

> STAGE doc-001.txt door-analyzer
Staged.

> FLUSH
Throwing 2 items through door-analyzer...
```
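The same loop can be scripted end-to-end: NEXT each item in FIFO order, process it, then FLUSH the results in one batch. A sketch with illustrative names (`drain`, `parse`), assuming rooms are directories with `inbox/` subdirectories:

```python
import os
import shutil
import tempfile

def drain(room, process, dests):
    """One processing pass: NEXT each inbox item, collect the result,
    then FLUSH everything to the destination rooms."""
    inbox_dir = os.path.join(room, "inbox")
    outbox = []
    for name in sorted(os.listdir(inbox_dir)):   # FIFO: oldest name first
        src = os.path.join(inbox_dir, name)
        outbox.append(process(src))              # -> (result_path, exit_name)
        os.remove(src)                           # item consumed
    for result_path, exit_name in outbox:        # batch throw
        shutil.move(result_path,
                    os.path.join(dests[exit_name], "inbox",
                                 os.path.basename(result_path)))

# Demo: "parse" two PDFs into .txt files and throw them onward.
root = tempfile.mkdtemp()
for r in ("parser", "analyzer"):
    os.makedirs(os.path.join(root, r, "inbox"))
for doc in ("doc-001.pdf", "doc-002.pdf"):
    open(os.path.join(root, "parser", "inbox", doc), "w").close()

def parse(src):
    # Stand-in for real text extraction: write an empty .txt next door.
    txt = src.replace(".pdf", ".txt").replace(os.sep + "inbox", "")
    open(txt, "w").close()
    return txt, "door-analyzer"

drain(os.path.join(root, "parser"), parse,
      {"door-analyzer": os.path.join(root, "analyzer")})
```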

## Fan-Out (one-to-many)

```yaml
routing_rules:
  - if: "priority == 'high'"
    throw_to: door-fast-track
  - if: "type == 'archive'"
    throw_to: door-archive
  - default: door-standard
```
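The rules evaluate first-match-wins, falling back to the default. A sketch that deliberately avoids `eval` and assumes conditions are simple `field == 'value'` comparisons (an assumption about the rule syntax):

```python
def route(message, rules):
    """Pick an exit for a message dict by the first matching rule."""
    for rule in rules:
        if "default" in rule:
            continue
        field, _, value = rule["if"].partition("==")
        if str(message.get(field.strip())) == value.strip().strip("'\""):
            return rule["throw_to"]
    return next(r["default"] for r in rules if "default" in r)

rules = [
    {"if": "priority == 'high'", "throw_to": "door-fast-track"},
    {"if": "type == 'archive'", "throw_to": "door-archive"},
    {"default": "door-standard"},
]
```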

## Fan-In (many-to-one)

```yaml
batch_size: 10
on_batch_complete: |
  Combine all results
  Generate summary report
  THROW report.yml door-output
```
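Fan-in amounts to buffering results until a batch fills, then firing the completion step. An in-memory sketch (`BatchCollector` is an illustrative name, not the skill's API):

```python
class BatchCollector:
    """Buffer incoming results; invoke a callback once per full batch
    (batch_size mirrors the YAML above)."""
    def __init__(self, batch_size, on_batch_complete):
        self.batch_size = batch_size
        self.on_batch_complete = on_batch_complete
        self.pending = []

    def receive(self, result):
        self.pending.append(result)
        if len(self.pending) >= self.batch_size:
            batch, self.pending = self.pending, []
            self.on_batch_complete(batch)   # e.g. combine and THROW a report

reports = []
collector = BatchCollector(10, lambda batch: reports.append(len(batch)))
for i in range(25):
    collector.receive(f"result-{i}")        # fires twice, 5 left pending
```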

## Kilroy Mapping

| MOOLLM | Kilroy |
|--------|--------|
| Room | Node |
| Exit | Edge |
| THROW | Message passing |
| inbox/ | Input queue |
| Script processor | Deterministic module |
| LLM processor | LLM node |

Overview

This skill models data-processing pipelines as rooms, exits, and objects so the filesystem becomes the network. It treats rooms as nodes, exits as edges, and thrown objects as messages, making pipelines explicit and inspectable. Use it to compose deterministic scripts and LLM stages, route messages, and handle batching and fan-in/fan-out flows.

How this skill works

Each room contains an inbox (incoming messages), an outbox (staged messages), and doors that act as exits to other rooms. Processors attached to rooms run as scripts, LLM prompts, or hybrids that combine extraction, LLM reasoning, and post-processing. Commands such as THROW, NEXT, STAGE, and FLUSH move or inspect objects, so you can orchestrate synchronous and asynchronous flows with the filesystem as the transport.

When to use it

  • Building file-driven pipelines where visibility into queues and messages is required.
  • Combining deterministic preprocessing scripts with LLM reasoning stages.
  • Implementing fan-out routing rules or fan-in batch aggregation behaviors.
  • Debugging or iterating on pipeline stages locally using simple commands.
  • Orchestrating workflows without a separate message broker or complex infra.

Best practices

  • Design rooms with single responsibility: one processing concern per node.
  • Keep inbox/outbox file formats consistent (YAML/JSON) for interoperability.
  • Use STAGE and FLUSH to batch sends and reduce noisy file operations.
  • Define clear routing rules for fan-out and batch_size/on_batch_complete for fan-in.
  • Mix LLM nodes for semantic reasoning and script nodes for deterministic transforms.

Example use cases

  • Ingest PDFs into an uploads room, parse text in a parser room, then send to an analyzer LLM room to produce summaries.
  • Route high-priority messages to a fast-track door while archiving low-priority items using routing_rules.
  • Aggregate results from many analyzer rooms into batches of 10 and emit combined reports when a batch completes.
  • Use a hybrid processor to extract structured data, pass it to an LLM for enrichment, then format with a script.
  • Locally debug a stage: ENTER a room, NEXT an item, run the processor, STAGE the outputs, and FLUSH to continue the flow.

FAQ

How do I ensure FIFO ordering?

Use the inbox directory as a FIFO queue and rely on NEXT to pull the next item; avoid out-of-band file manipulations that break ordering.
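If inbox names are not already sequenced (like `doc-001`, `doc-002`), modification time gives the same oldest-first ordering. A small sketch (`next_fifo` is an illustrative name):

```python
import os
import tempfile
import time

def next_fifo(inbox_dir):
    """Return the oldest inbox item by modification time, or None."""
    items = os.listdir(inbox_dir)
    if not items:
        return None
    return min(items,
               key=lambda n: os.path.getmtime(os.path.join(inbox_dir, n)))

inbox_dir = tempfile.mkdtemp()
for name in ("b.yml", "a.yml"):   # arrival order, not name order
    open(os.path.join(inbox_dir, name), "w").close()
    time.sleep(0.01)              # ensure distinct mtimes
```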

Can I mix LLM and script processors in one pipeline?

Yes. Define rooms with script, llm, or hybrid processors to combine deterministic transforms and semantic reasoning at different stages.