assembly-index skill

This skill evaluates molecular complexity using assembly index to detect biosignatures by computing minimal assembly steps and applying a life threshold.

npx playbooks add skill plurigrid/asi --skill assembly-index

---
name: assembly-index
description: Lee Cronin's Assembly Theory for molecular complexity measurement and
  life detection via assembly index computation.
license: UNLICENSED
metadata:
  trit: -1
  source: local
---

# Assembly Index Skill: Molecular Complexity Validation

**Status**: ✅ Production Ready
**Trit**: -1 (MINUS - validator/constraint)
**Color**: #2626D8 (Blue)
**Principle**: Complexity threshold → Life signature
**Frame**: Assembly pathways with minimal step counting

---

## Overview

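**Assembly Index** measures molecular complexity by counting the minimum number of joining operations needed to construct a molecule from basic building blocks. Molecules with assembly index > 15 that also appear in more than one copy are treated as biosignatures, because random chemistry is unlikely to produce such complex molecules repeatedly. The skill rests on four ideas: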

1. **Assembly pathway**: Shortest construction sequence
2. **Copy number threshold**: Abundance × complexity = life signal
3. **Molecular DAG**: Directed acyclic graph of substructures
4. **Mass spectrometry integration**: MA(m/z) measurement

## Core Formula

```
MA(molecule) = min |steps| to construct from primitives
Life threshold: MA > 15 with copy_number > 1
```

```python
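def assembly_index(molecule: Molecule) -> int:
    """Compute minimum assembly steps via dynamic programming."""
    # Enumerate every substructure that can appear as an intermediate.
    substructures = enumerate_substructures(molecule)
    # Link substructures by the join operations that build one from another.
    dag = build_assembly_dag(substructures)
    # The index is the shortest path from the primitives to the target.
    return shortest_path_length(dag, source="primitives", target=molecule)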
```
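
The helpers in the block above are not defined in this skill file. As a minimal, runnable illustration of the same idea, the sketch below computes the assembly index of a string instead of a molecule: single characters are the primitives, and a join concatenates two objects already in the pool. The function name `string_assembly_index` and the string model are assumptions made for illustration, not this skill's implementation, and the exhaustive search only scales to short inputs.

```python
from itertools import product


def string_assembly_index(target: str) -> int:
    """Minimum number of joins needed to build `target` from its characters.

    Toy model (an assumption for illustration): objects are strings, a join
    is a concatenation, and every object built so far can be reused freely.
    """
    if len(target) <= 1:
        return 0

    # Only substrings of the target can contribute to building it.
    substrings = {
        target[i:j]
        for i in range(len(target))
        for j in range(i + 1, len(target) + 1)
    }
    primitives = frozenset(target)

    def reachable(pool: frozenset, joins_left: int) -> bool:
        """Depth-limited search: can `target` be built with `joins_left` joins?"""
        if target in pool:
            return True
        if joins_left == 0:
            return False
        for left, right in product(pool, repeat=2):
            joined = left + right
            if joined in substrings and joined not in pool:
                if reachable(pool | {joined}, joins_left - 1):
                    return True
        return False

    # Iterative deepening: the first limit that succeeds is the minimum.
    # Appending one character at a time always works in len(target) - 1 joins.
    for limit in range(len(target)):
        if reachable(primitives, limit):
            return limit
    raise AssertionError("unreachable: len(target) - 1 joins always suffice")


print(string_assembly_index("ABAB"))  # AB, then AB + AB
print(string_assembly_index("ABCD"))  # no reusable part
```

Reuse is what keeps the index low: `"ABAB"` needs two joins because `AB` is built once and used twice, while `"ABCD"` needs three because nothing repeats.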

## Key Concepts

### 1. Assembly Pathway Enumeration

```python
class AssemblyPathway:
    def __init__(self, molecule):
        self.mol = molecule
        self.fragments = self.decompose()
    
    def decompose(self) -> list[Fragment]:
        """Find all valid bond-breaking decompositions."""
        return [split for split in self.mol.bonds 
                if split.yields_valid_fragments()]
    
    def minimal_pathway(self) -> list[JoinOperation]:
        """DP over fragment DAG for minimum steps."""
        memo = {}
        return self._dp_assemble(self.mol, memo)
```
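
The `decompose` step above depends on knowing which bonds can be broken to yield two separate fragments. One way to sketch that check, assuming the molecule is given as a plain list of atom indices and a list of bonds (a hypothetical representation, not the `Molecule` type used by this skill), is to remove each bond in turn and test whether the graph falls apart. Ring bonds do not disconnect the molecule, so they are skipped here.

```python
def single_bond_splits(atoms, bonds):
    """List every bond whose removal splits the molecule into two fragments.

    `atoms` is an iterable of hashable atom ids and `bonds` a list of
    (atom, atom) pairs; both are hypothetical inputs for this sketch.
    Returns tuples of (bond, fragment_a, fragment_b).
    """
    atoms = list(atoms)
    splits = []
    for cut in bonds:
        # Rebuild the adjacency map without the bond being tested.
        adjacency = {atom: set() for atom in atoms}
        for u, v in bonds:
            if (u, v) != cut:
                adjacency[u].add(v)
                adjacency[v].add(u)

        # Flood fill from one end of the removed bond.
        seen = {cut[0]}
        stack = [cut[0]]
        while stack:
            node = stack.pop()
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)

        # If the other end is unreachable, the cut produced two fragments.
        if cut[1] not in seen:
            splits.append((cut, frozenset(seen), frozenset(atoms) - seen))
    return splits


# Methylcyclopropane heavy atoms: a three-membered ring (0, 1, 2) plus atom 3.
ring_with_methyl = [(0, 1), (1, 2), (2, 0), (0, 3)]
for bond, frag_a, frag_b in single_bond_splits(range(4), ring_with_methyl):
    print(bond, sorted(frag_a), sorted(frag_b))
```

Only the bond to the methyl carbon is reported, since breaking any single ring bond leaves the ring atoms connected.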

### 2. Copy Number Amplification

```python
def is_biosignature(molecule, sample) -> bool:
    ma = assembly_index(molecule)
    copies = sample.count(molecule)
    # Life creates copies of complex molecules
    return ma > 15 and copies > 1
```
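
The rule above can be applied to a whole sample once each molecule's index is known. The sketch below assumes the sample is a list of molecule identifiers, one entry per detection, and that indices have already been computed and stored in a dictionary. The name `biosignature_candidates` and the example identifiers are hypothetical.

```python
from collections import Counter


def biosignature_candidates(detections, ma_by_molecule,
                            ma_threshold=15, min_copies=2):
    """Return molecule ids that pass both the complexity and copy-number tests.

    `detections` holds one id per observed copy; `ma_by_molecule` maps an id
    to its precomputed assembly index. Both inputs are assumptions of this
    sketch. Molecules with no known index are never flagged.
    """
    copies = Counter(detections)
    return sorted(
        mol
        for mol, count in copies.items()
        if mol in ma_by_molecule
        and ma_by_molecule[mol] > ma_threshold
        and count >= min_copies
    )


detections = ["mol_a", "mol_a", "mol_b", "mol_c", "mol_c"]
ma_by_molecule = {"mol_a": 18, "mol_b": 20, "mol_c": 9}
print(biosignature_candidates(detections, ma_by_molecule))
```

Here `mol_b` is complex but seen only once, and `mol_c` is abundant but simple, so only `mol_a` satisfies both conditions.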

### 3. Tandem Mass Spectrometry Integration

```python
def ma_from_ms2(spectrum: MS2Spectrum) -> float:
    """Estimate assembly index from fragmentation pattern."""
    fragments = spectrum.peaks
    dag = reconstruct_assembly_dag(fragments)
    return dag.longest_path()
```

## Commands

```bash
# Compute assembly index
just assembly-index molecule.sdf

# Validate biosignature threshold
just assembly-validate sample.ms2

# Compare assembly pathways
just assembly-compare mol1.sdf mol2.sdf
```

## Integration with GF(3) Triads

```
assembly-index (-1) ⊗ turing-chemputer (0) ⊗ crn-topology (+1) = 0 ✓  [Molecular Complexity]
```

## Related Skills

- **turing-chemputer** (0): Execute chemical synthesis programs
- **crn-topology** (+1): Generate reaction network topologies
- **kolmogorov-compression** (-1): Algorithmic complexity baseline

---

**Skill Name**: assembly-index
**Type**: Complexity Validator
**Trit**: -1 (MINUS)
**Color**: #2626D8 (Blue)

Overview

This skill computes the Assembly Index, a topological measure of molecular complexity derived from minimal construction steps. It flags high-complexity molecules as candidate biosignatures when complexity and copy-number thresholds are both met. The implementation uses fragment enumeration and directed acyclic assembly graphs, and it can ingest mass-spectrometry fragmentation data.

How this skill works

The skill enumerates valid substructure decompositions and builds a molecular DAG linking primitives to the target molecule. Dynamic programming finds the shortest join-path (minimal number of joining operations) to produce the molecule, returning the Assembly Index (MA). It can also reconstruct assembly information from MS2 fragmentation patterns to estimate MA for molecules observed by tandem mass spectrometry. A simple life-detection rule marks molecules with MA > 15 and observed copy number > 1 as biosignatures.

When to use it

  • Assess whether a detected molecule is plausibly produced by random chemistry or requires biological processes
  • Compare structural complexity between molecules or monitor complexity changes across samples
  • Integrate with MS2 workflows to estimate complexity directly from fragmentation spectra
  • Filter large molecular datasets to prioritize high-complexity candidates for follow-up
  • Validate synthetic routes by comparing assembly pathways and minimal join steps

Best practices

  • Provide canonical molecular representations (e.g., standardized SDF/SMILES) to ensure consistent substructure enumeration
  • Pre-filter noisy MS2 spectra to improve fragment reconstruction accuracy before estimating MA
  • Use copy-number or abundance measurements alongside MA; complexity alone is not a definitive biosignature
  • Report assembly pathways and intermediate fragments for auditability and experimental planning
  • Calibrate MA thresholds on known standards for the instrument and sample matrix before deployment
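
As one possible form of the MS2 pre-filtering mentioned above, the sketch below drops peaks that fall below a fraction of the base (most intense) peak. The peak representation as `(m/z, intensity)` tuples, the function name `filter_peaks`, and the 1% default cutoff are all assumptions chosen for illustration; a suitable cutoff depends on the instrument and sample matrix.

```python
def filter_peaks(peaks, min_relative_intensity=0.01):
    """Keep peaks whose intensity is at least a fraction of the base peak.

    `peaks` is a list of (mz, intensity) tuples, a hypothetical format for
    this sketch. The 1% default is illustrative, not a calibrated value.
    """
    if not peaks:
        return []
    base_intensity = max(intensity for _, intensity in peaks)
    if base_intensity <= 0:
        return []
    cutoff = base_intensity * min_relative_intensity
    return [(mz, intensity) for mz, intensity in peaks if intensity >= cutoff]


spectrum = [(100.0, 1000.0), (150.0, 5.0), (200.0, 50.0)]
print(filter_peaks(spectrum))
```

With a base peak of 1000.0 the cutoff is 10.0, so the peak at m/z 150.0 is discarded and the other two are kept.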

Example use cases

  • Screen environmental MS2 datasets to find molecules with MA > 15 and nontrivial abundance as life-detection candidates
  • Compare assembly pathways of an unknown compound and a synthesized standard to validate identification
  • Rank molecules in a chemical inventory by topological construction cost to prioritize synthetic targets
  • Integrate into an automated chemputer pipeline to choose routes that minimize assembly steps and resource use

FAQ

What exactly does the Assembly Index measure?

It measures the minimum number of joining operations required to assemble a molecule from defined primitive building blocks, computed via the shortest path on an assembly DAG.

Why use MA > 15 as a threshold?

Empirical and theoretical analyses indicate molecules with MA above ~15 are extremely unlikely to arise from random, abiotic chemistry at appreciable copy numbers; combined with abundance, this forms a practical biosignature rule.