cantordust-viz skill

This skill visualizes binary data to reveal human-perceptible patterns, helping analysts compare binaries, verify embeddings, and detect obfuscation.

npx playbooks add skill plurigrid/asi --skill cantordust-viz

---
name: cantordust-viz
description: Binary visualization for human pattern recognition - Ghidra plugin by Chris Domas (xoreaxeaxeax)
version: 1.0.0
---

# Cantordust Binary Visualization

> **Use when embeddings fail: humans see patterns algorithms miss.**

Visual binary analysis tool for Ghidra. Converts binary data to bitmaps/visualizations where structural patterns become visible to human pattern recognition.

## GF(3) Triad

```
cantordust-viz (-1) ⊗ skill-embedding-vss (0) ⊗ radare2-hatchery (+1) = 0 ✓
```
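
The balance check above can be reproduced in a few lines of Python. This is a minimal sketch that assumes `⊗` is read as addition in GF(3) (sum modulo 3); the names `TRIAD` and `is_balanced` are illustrative and not part of any skill API.

```python
# Trit assignments copied from the triad line above.
TRIAD = {
    "cantordust-viz": -1,
    "skill-embedding-vss": 0,
    "radare2-hatchery": +1,
}


def is_balanced(trits):
    """Return True when the trits sum to 0 in GF(3), i.e. modulo 3."""
    return sum(trits) % 3 == 0


print(is_balanced(TRIAD.values()))  # prints True
```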

## Lineage: 2020 Binary Analysis

| Tool | Approach | Strength |
|------|----------|----------|
| **Cantordust** | Visual/human | Sees patterns ML misses |
| **Zignatures** | Soft signatures | Fuzzy matching + keyspace reduction |
| **skill-embedding-vss** | MLX embeddings | Fast similarity search at scale |

## Installation

```bash
git clone https://github.com/Battelle/cantordust.git
# Then, in Ghidra's Script Manager, add the cloned directory to the script directories
```

## Key Insight

From xoreaxeaxeax's work:
- **movfuscator**: Any program can be compiled to x86 MOV instructions alone (MOV is Turing-complete)
- **sandsifter**: Fuzzing reveals undocumented CPU instructions
- **Cantordust**: Binary structure visible in 2D projections

## When to Use

1. **Embedding similarity unclear** → visualize both binaries
2. **Obfuscation suspected** → visual patterns often survive obfuscation
3. **Cross-architecture comparison** → structural similarity visible
4. **Malware family classification** → visual fingerprinting

## xoreaxeaxeax Ecosystem (19K+ stars)

| Repo | Stars | Category |
|------|-------|----------|
| movfuscator | 10,075 | obfuscation |
| sandsifter | 4,998 | hardware security |
| rosenbridge | 2,380 | hardware backdoors |
| REpsych | 1,031 | anti-RE |

## Integration with skill-embedding-vss

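```python
# When embeddings show high similarity but you want visual confirmation.
# Note: Cantordust itself is a Ghidra plugin; `cantordust.visualize_binary`
# is assumed here to be a local Python wrapper around it.
from cantordust import visualize_binary
from skill_embedding_vss import SkillEmbeddingVSS

vss = SkillEmbeddingVSS('/path/to/skills')
similar = vss.find_nearest('target', k=5)  # list of (name, distance) pairs

# Visually confirm the top three matches
for name, dist in similar[:3]:
    print(f'{name}: distance {dist}')
    visualize_binary(f'/path/to/{name}')  # Human reviews the rendering
```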

## References

- [Cantordust GitHub](https://github.com/Battelle/cantordust)
- [Battelle Blog Post](https://inside.battelle.org/blog-details/battelle-publishes-open-source-binary-visualization-tool)
- [DEF CON talks by xoreaxeaxeax](https://www.youtube.com/results?search_query=xoreaxeaxeax+defcon)

## Cantordust ↔ Gay.jl Bridge

```julia
# cantordust_gay_bridge.jl connects:
# 1. Cantordust 2-tuple byte pair visualization
# 2. CJ Carr spectral features (diffusion transformers)  
# 3. Gay.jl deterministic coloring (SPI)

result = analyze_binary_with_gay("target.bin")
# Returns: matrix, diagonal_score, ascii_score, trit_sum, sample_colors
```
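
The bridge source is not reproduced here, so the exact definitions of its outputs are unknown. As a rough Python illustration of what scores with these names could measure, the sketch below assumes that `diagonal_score` is the share of consecutive byte pairs with equal bytes (the main diagonal of the 256×256 plot) and that `ascii_score` is the share of pairs in which both bytes are printable ASCII. Both definitions, and the helper name `pair_scores`, are assumptions.

```python
def pair_scores(data: bytes) -> dict:
    """Compute two illustrative byte-pair scores (hypothetical definitions)."""
    pairs = list(zip(data, data[1:]))  # consecutive byte pairs (a, b)
    if not pairs:
        return {"diagonal_score": 0.0, "ascii_score": 0.0}

    # Pairs on the main diagonal: both bytes are identical.
    diagonal = sum(1 for a, b in pairs if a == b)
    # Pairs inside the printable-ASCII square 0x20..0x7E on both axes.
    printable = sum(
        1 for a, b in pairs if 0x20 <= a <= 0x7E and 0x20 <= b <= 0x7E
    )
    return {
        "diagonal_score": diagonal / len(pairs),
        "ascii_score": printable / len(pairs),
    }


print(pair_scores(b"hello world"))
```

Under these definitions, long runs of padding push the diagonal share up, while text-heavy regions push the printable-ASCII share up.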

## Pattern Theory

| Domain | Representation | Gay.jl Mapping |
|--------|----------------|----------------|
| Binary (Cantordust) | 2-tuple → 256×256 | entropy → trit → color |
| Audio (CJ Carr) | Mel spectrogram | centroid/flatness → HSL |
| Color (Gay.jl) | SplitMix64 + golden angle | SPI deterministic |
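
The `entropy → trit` step in the first row can be illustrated as follows. This is a minimal sketch and not Gay.jl's actual mapping: the thresholds of 3.0 and 6.5 bits per byte and the function names are assumptions chosen for illustration, and the final trit-to-color step is left out.

```python
import math
from collections import Counter

# Hypothetical cut-offs in bits per byte; not taken from Gay.jl.
LOW, HIGH = 3.0, 6.5


def shannon_entropy(block: bytes) -> float:
    """Shannon entropy of a byte block, in bits per byte (0.0 to 8.0)."""
    if not block:
        return 0.0
    total = len(block)
    counts = Counter(block)
    return abs(-sum((n / total) * math.log2(n / total) for n in counts.values()))


def entropy_to_trit(bits: float) -> int:
    """Bucket an entropy value into a balanced trit: -1, 0, or +1."""
    if bits < LOW:
        return -1  # low entropy, e.g. padding or repeated values
    if bits > HIGH:
        return 1   # high entropy, e.g. compressed or encrypted data
    return 0       # mid-range entropy


for block in (b"\x00" * 64, bytes(range(256))):
    print(entropy_to_trit(shannon_entropy(block)))
```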

## SDF Interleaving

This skill connects to **Software Design for Flexibility** (Hanson & Sussman, 2021):

### Primary Chapter: 4. Pattern Matching

**Concepts**: unification, match, segment variables, pattern

### GF(3) Balanced Triad

```
cantordust-viz (−) + SDF.Ch4 (+) + [balancer] (○) = 0
```

**Skill Trit**: -1 (MINUS - verification)


### Connection Pattern

Pattern matching extracts structure. This skill recognizes and transforms patterns.

Overview

This skill provides binary visualization for human pattern recognition inside Ghidra. It converts raw binary bytes into 2D visual projections where structural and repetitive patterns become immediately visible. Use it to complement automated embeddings and signature systems when human intuition can reveal anomalies or families.

How this skill works

The plugin maps each pair of consecutive bytes to a point in a fixed 2D bitmap (commonly 256×256, one axis per byte value) and applies deterministic coloring to expose entropy, repetition, and structural motifs. Visual outputs include bitmap images and spectral summaries that highlight control flow, data regions, and obfuscation artifacts. It fits into analysis workflows so you can visualize candidates returned by embeddings or search tools for manual confirmation.
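
The byte-pair mapping described above can be sketched in plain Python. This is an illustration of the general 2-tuple technique, not the plugin's own code: the function names are hypothetical, and the rendering is a simple on/off grayscale PGM image with no coloring.

```python
def byte_pair_matrix(data: bytes) -> list:
    """Count consecutive byte pairs (a, b) into a 256x256 grid."""
    grid = [[0] * 256 for _ in range(256)]
    for a, b in zip(data, data[1:]):
        grid[a][b] += 1  # first byte selects the row, second the column
    return grid


def to_pgm(grid: list) -> bytes:
    """Render the grid as a binary PGM (P5) image; any non-zero cell is lit."""
    pixels = bytes(255 if count else 0 for row in grid for count in row)
    return b"P5\n256 256\n255\n" + pixels


sample = b"ABAB" + bytes(range(256))
grid = byte_pair_matrix(sample)
image = to_pgm(grid)  # write these bytes to a .pgm file to view the plot
```

Plain text tends to cluster in the printable-ASCII square of such a plot, while compressed or encrypted data spreads across the whole grid.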

When to use it

  • When embedding or similarity scores are ambiguous and you need a visual confirmation
  • When you suspect obfuscation and want patterns that survive syntactic transformations
  • When comparing binaries across architectures for structural likenesses
  • When classifying malware families via visual fingerprinting
  • When triaging large result sets quickly by eye

Best practices

  • Run visualizations on both target and top embedding matches to spot genuine similarity vs false positives
  • Combine bitmap views with ASCII/spectral summaries to separate code from data regions
  • Use consistent coloring and scaling across comparisons to avoid visual bias
  • Inspect multiple projections (byte-pair, diagonal scores, trit sums) to reveal different structural features
  • Treat visualization as a verifier, not a definitive classifier — follow up with conventional static analysis

Example use cases

  • Confirm whether two high-similarity embeddings actually share implementation patterns
  • Detect packed or metamorphic code where textual signatures fail
  • Quickly triage candidate samples from a bulk search to prioritize deeper analysis
  • Compare firmware images across CPU architectures for reuse or shared modules
  • Visualize mutation or obfuscation effects to evaluate robustness of automated detectors

FAQ

Does visualization replace automated analysis?

No. Visualization is a human-centric verifier that highlights patterns algorithms can miss. It complements, rather than replaces, disassembly and automated similarity methods.

Can visual fingerprints be compared automatically?

Yes — outputs can be converted to feature vectors or spectral metrics for automated comparison, but human review remains valuable for ambiguous cases.
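
As one possible way to do this, the sketch below treats each binary's byte-pair histogram as a sparse feature vector and compares two of them with cosine similarity. The choice of cosine similarity and the function names are assumptions for illustration, not a method the skill prescribes.

```python
import math
from collections import Counter


def pair_histogram(data: bytes) -> Counter:
    """Sparse feature vector: occurrence count of each consecutive byte pair."""
    return Counter(zip(data, data[1:]))


def cosine_similarity(h1: Counter, h2: Counter) -> float:
    """Cosine similarity between two sparse histograms (0.0 if either is empty)."""
    dot = sum(count * h2[pair] for pair, count in h1.items())
    norm1 = math.sqrt(sum(c * c for c in h1.values()))
    norm2 = math.sqrt(sum(c * c for c in h2.values()))
    if norm1 == 0 or norm2 == 0:
        return 0.0
    return dot / (norm1 * norm2)


a = pair_histogram(b"push ebp; mov ebp, esp")
b = pair_histogram(b"push ebx; mov ebx, esp")
print(cosine_similarity(a, b))
```

A score near 1.0 means the two byte-pair distributions are nearly identical. Borderline scores are the cases where a side-by-side look at the bitmaps is most useful.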