assembly-index skill

This skill assesses molecular complexity using assembly index to indicate biosignature potential and guide synthesis decisions.

npx playbooks add skill plurigrid/asi --skill assembly-index

SKILL.md
---
name: assembly-index
description: Lee Cronin's Assembly Theory for molecular complexity measurement and
version: 1.0.0
---

# Assembly Index Skill: Molecular Complexity Validation

**Status**: ✅ Production Ready
**Trit**: -1 (MINUS - validator/constraint)
**Color**: #2626D8 (Blue)
**Principle**: Complexity threshold → Life signature
**Frame**: Assembly pathways with minimal step counting

---

## Overview

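**Assembly Index** measures molecular complexity by counting the minimum number of joining operations needed to construct a molecule from basic building blocks, where any fragment already built may be reused. Molecules with an assembly index above 15 that also occur in more than one copy are treated as candidate biosignatures: they are considered too complex to arise repeatedly from undirected chemistry.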

1. **Assembly pathway**: Shortest construction sequence
2. **Copy number threshold**: Abundance × complexity = life signal
3. **Molecular DAG**: Directed acyclic graph of substructures
4. **Mass spectrometry integration**: MA(m/z) measurement

## Core Formula

```
MA(molecule) = min |steps| to construct from primitives
Life threshold: MA > 15 with copy_number > 1
```

```python
def assembly_index(molecule: Molecule) -> int:
    """Compute minimum assembly steps via dynamic programming."""
    substructures = enumerate_substructures(molecule)
    dag = build_assembly_dag(substructures)
    return shortest_path_length(dag, source="primitives", target=molecule)
```
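The helpers in the block above are not defined in this file. As a self-contained illustration of the same minimum-step idea, the sketch below computes the assembly index of a short string, where the primitives are single characters and a join is a concatenation. The name `string_assembly_index` and the brute-force iterative-deepening search are assumptions made for illustration, not this skill's implementation; molecular assembly index works on bond graphs and is far more expensive.

```python
from itertools import product


def string_assembly_index(target: str) -> int:
    """Minimum number of concatenations needed to build `target`
    from its single characters, reusing anything already built."""
    if len(target) <= 1:
        return 0

    basics = frozenset(target)
    # Only substrings of the target can appear in a minimal pathway.
    substrings = {
        target[i:j]
        for i in range(len(target))
        for j in range(i + 2, len(target) + 1)
    }

    def search(pool: frozenset, steps_left: int) -> bool:
        if target in pool:
            return True
        if steps_left == 0:
            return False
        # Each join can at most double the longest object built so far.
        longest = max(len(s) for s in pool)
        if longest * (2 ** steps_left) < len(target):
            return False
        for a, b in product(pool, repeat=2):
            joined = a + b
            if joined in substrings and joined not in pool:
                if search(pool | {joined}, steps_left - 1):
                    return True
        return False

    # Iterative deepening: the first depth that succeeds is the minimum.
    depth = 1
    while not search(basics, depth):
        depth += 1
    return depth


print(string_assembly_index("ABAB"))  # 2: A+B -> AB, then AB+AB -> ABAB
print(string_assembly_index("ABCD"))  # 3: no fragment can be reused
```

Reuse is what separates the two examples: `ABAB` and `ABCD` have the same length, but the repeated `AB` saves one join.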

## Key Concepts

### 1. Assembly Pathway Enumeration

```python
class AssemblyPathway:
    def __init__(self, molecule):
        self.mol = molecule
        self.fragments = self.decompose()
    
    def decompose(self) -> list[Fragment]:
        """Find all valid bond-breaking decompositions."""
        return [split for split in self.mol.bonds 
                if split.yields_valid_fragments()]
    
    def minimal_pathway(self) -> list[JoinOperation]:
        """DP over fragment DAG for minimum steps."""
        memo = {}
        return self._dp_assemble(self.mol, memo)
```

### 2. Copy Number Amplification

```python
def is_biosignature(molecule, sample) -> bool:
    ma = assembly_index(molecule)
    copies = sample.count(molecule)
    # Life creates copies of complex molecules
    return ma > 15 and copies > 1
```
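A runnable sketch of the same criterion, assuming the assembly index of each molecule is already known and the sample is a plain list of molecule identifiers. The function name `flag_biosignatures`, the identifiers, and the MA values in the usage lines are placeholders for illustration, not measured data.

```python
from collections import Counter

MA_THRESHOLD = 15  # "MA > 15" from the criterion above
MIN_COPIES = 2     # "copy number > 1"


def flag_biosignatures(observations, ma_by_molecule):
    """Return the identifiers that pass both the complexity and copy-number tests.

    observations: iterable of molecule identifiers, one entry per detection.
    ma_by_molecule: dict mapping identifier -> precomputed assembly index.
    Molecules with no known assembly index are never flagged.
    """
    counts = Counter(observations)
    return sorted(
        mol
        for mol, copies in counts.items()
        if ma_by_molecule.get(mol, 0) > MA_THRESHOLD and copies >= MIN_COPIES
    )


# Placeholder identifiers and MA values, chosen only to exercise each branch.
sample = ["mol_a", "mol_a", "mol_b", "mol_c", "mol_c", "mol_c"]
ma_values = {"mol_a": 18, "mol_b": 22, "mol_c": 7}
print(flag_biosignatures(sample, ma_values))  # ['mol_a']
```

Here `mol_b` is complex but seen only once, and `mol_c` is abundant but simple, so neither is flagged.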

### 3. Tandem Mass Spectrometry Integration

```python
def ma_from_ms2(spectrum: MS2Spectrum) -> float:
    """Estimate assembly index from fragmentation pattern."""
    fragments = spectrum.peaks
    dag = reconstruct_assembly_dag(fragments)
    return dag.longest_path()
```

---

## End-of-Skill Interface

## Commands

```bash
# Compute assembly index
just assembly-index molecule.sdf

# Validate biosignature threshold
just assembly-validate sample.ms2

# Compare assembly pathways
just assembly-compare mol1.sdf mol2.sdf
```

## Integration with GF(3) Triads

```
assembly-index (-1) ⊗ turing-chemputer (0) ⊗ crn-topology (+1) = 0 ✓  [Molecular Complexity]
```
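The balance condition can be checked mechanically: the trits of a triad must sum to zero modulo 3. The helper name `is_balanced` is hypothetical; the trit values are the ones listed in the triad above.

```python
def is_balanced(trits) -> bool:
    """True when the trits sum to 0 in GF(3), i.e. modulo 3."""
    return sum(trits) % 3 == 0


triad = {"assembly-index": -1, "turing-chemputer": 0, "crn-topology": 1}
print(is_balanced(triad.values()))  # True
```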

## Related Skills

- **turing-chemputer** (0): Execute chemical synthesis programs
- **crn-topology** (+1): Generate reaction network topologies
- **kolmogorov-compression** (-1): Algorithmic complexity baseline

## r2con Speaker Resources

| Speaker | Relevance | Repository/Talk |
|---------|-----------|-----------------|
| **oddcoder** | RAIR assembly analysis | [rair-core](https://github.com/rair-project/rair-core) |
| **mr_phrazer** | MBA complexity (msynth) | [msynth](https://github.com/mrphrazer/msynth) |
| **pancake** | Core r2 assembly | [radare2](https://github.com/radareorg/radare2) |

---

**Skill Name**: assembly-index
**Type**: Complexity Validator
**Trit**: -1 (MINUS)
**Color**: #2626D8 (Blue)

## SDF Interleaving

This skill connects to **Software Design for Flexibility** (Hanson & Sussman, 2021):

### Primary Chapter: 1. Flexibility through Abstraction

**Concepts**: combinators, compose, parallel-combine, spread-combine, arity

### GF(3) Balanced Triad

```
assembly-index (−) + SDF.Ch1 (+) + [balancer] (○) = 0
```

**Skill Trit**: -1 (MINUS - validator/constraint)

### Secondary Chapters

- Ch4: Pattern Matching
- Ch7: Propagators

### Connection Pattern

Combinators compose operations. This skill provides composable abstractions.

Overview

This skill implements Lee Cronin’s Assembly Theory to measure molecular complexity by counting the minimal joining operations required to build a molecule from primitive building blocks. It flags candidate biosignatures by combining an assembly index threshold with observed copy number. The tool integrates structure decomposition, directed acyclic assembly graphs, and mass-spectrometry-based estimation for practical validation workflows.

How this skill works

The skill enumerates valid bond-breaking decompositions to generate substructures and builds a molecular DAG where nodes are fragments and edges are join operations. A dynamic programming shortest-path search yields the minimal number of assembly steps (the assembly index). For samples, the assembly index is combined with measured copy number to mark likely biosignatures. It can also estimate assembly index from MS2 fragmentation patterns by reconstructing fragment connectivity.
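The first step, enumerating bond-breaking decompositions, can be sketched on a toy molecular graph. This is a minimal illustration under stated assumptions: atoms are string labels, bonds are pairs of labels, and a split counts as "valid" only if removing a single bond leaves exactly two connected fragments. The function names `components` and `single_bond_splits` are hypothetical, and ring bonds (which need more than one cut) are deliberately not handled.

```python
def components(atoms, bonds):
    """Connected components of an undirected graph, as a list of frozensets."""
    adjacency = {atom: set() for atom in atoms}
    for u, v in bonds:
        adjacency[u].add(v)
        adjacency[v].add(u)

    seen, parts = set(), []
    for start in atoms:
        if start in seen:
            continue
        stack, part = [start], set()
        while stack:
            node = stack.pop()
            if node in part:
                continue
            part.add(node)
            stack.extend(adjacency[node] - part)
        seen |= part
        parts.append(frozenset(part))
    return parts


def single_bond_splits(atoms, bonds):
    """For each bond whose removal disconnects the graph, return
    (bond, [fragment_1, fragment_2])."""
    splits = []
    for i, bond in enumerate(bonds):
        remaining = bonds[:i] + bonds[i + 1:]
        parts = components(atoms, remaining)
        if len(parts) == 2:
            splits.append((bond, parts))
    return splits


# Heavy-atom skeleton of ethanol: C1-C2-O3
atoms = ["C1", "C2", "O3"]
bonds = [("C1", "C2"), ("C2", "O3")]
for bond, fragments in single_bond_splits(atoms, bonds):
    print(bond, [sorted(f) for f in fragments])
```

Each fragment returned here would become a node in the assembly DAG, with the broken bond read in reverse as a join operation.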

When to use it

  • Screen molecules from environmental or planetary samples for complex, non-random chemistry signatures
  • Prioritize targets for synthesis or follow-up analysis based on structural complexity
  • Interpret tandem mass spectrometry (MS2) datasets to infer assembly-like fragment relationships
  • Compare relative complexity of candidate molecules across samples or experiments
  • Validate whether observed abundances of complex molecules indicate biological processes

Best practices

  • Use high-quality structural representations (SDF/MOL) to ensure accurate fragment enumeration
  • Pre-filter trivial small molecules that cannot reach the complexity threshold, so they add neither runtime nor noise to the results
  • Combine assembly index with copy-number or abundance metadata before declaring biosignatures
  • When using MS2, apply noise filtering and peak annotation to improve fragment-to-DAG reconstruction
  • Cache substructure decompositions and reuse DAG fragments for batch processing to reduce runtime
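The MS2 noise-filtering step can be as simple as a relative-intensity cutoff. The sketch below assumes peaks are `(m/z, intensity)` tuples; the function name `filter_peaks` and the 5% default cutoff are illustrative assumptions, not values prescribed by this skill.

```python
def filter_peaks(peaks, min_relative_intensity=0.05):
    """Keep peaks whose intensity is at least a given fraction of the base peak.

    peaks: list of (mz, intensity) tuples.
    min_relative_intensity: fraction of the most intense peak (assumed default).
    """
    if not peaks:
        return []
    base_intensity = max(intensity for _, intensity in peaks)
    if base_intensity <= 0:
        return []
    cutoff = base_intensity * min_relative_intensity
    return [(mz, intensity) for mz, intensity in peaks if intensity >= cutoff]


# Made-up spectrum for illustration only.
spectrum = [(100.0, 1000.0), (150.0, 40.0), (200.0, 60.0)]
print(filter_peaks(spectrum))  # [(100.0, 1000.0), (200.0, 60.0)]
```

A stricter cutoff removes more noise but also discards low-abundance fragments, which can leave gaps in the reconstructed DAG.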

Example use cases

  • Compute assembly index for a library of synthesized compounds to rank synthetic difficulty
  • Analyze planetary MS2 data to flag molecules with MA > 15 and nontrivial copy number as biosignature candidates
  • Compare the minimal assembly pathways of two isomeric products to see whether they share intermediates and reach similar step counts
  • Validate outputs of generative chemistry pipelines by checking assembly complexity against expected distributions
  • Integrate with reaction-network tools to explore plausible assembly routes and experimental synthesis plans

FAQ

What numeric threshold indicates a biosignature?

An assembly index greater than 15 combined with a copy number above 1 is the operational biosignature criterion used here.

Can I estimate assembly index from MS2 without full structures?

Yes. The skill can reconstruct an assembly-like DAG from annotated MS2 fragments and provide an estimated MA, but results depend on fragmentation coverage and peak annotation quality.

Is the assembly index sensitive to representation details?

Yes. Different tautomeric, protonation, or stereochemical representations of the same molecule can change which fragments are enumerated, and therefore the computed index. Standardize input formats and preprocessing before comparing results.