triage skill

/binary-re/skills/triage

This skill performs fast binary triage to identify architecture, ABI, and dependencies for quick initial assessment.

npx playbooks add skill 2389-research/claude-plugins --skill triage

---
name: binary-re:triage
description: Use when first encountering an unknown binary, ELF file, executable, or firmware blob. Fast fingerprinting via rabin2 - architecture detection (ARM, x86, MIPS), ABI identification, dependency mapping, string extraction. Keywords - "what is this binary", "identify architecture", "check file type", "rabin2", "file analysis", "quick scan"
---

# Binary Triage (Phase 1)

## Purpose

Quick fingerprinting to establish baseline facts before deeper analysis. Runs in seconds, not minutes.

## When to Use

- First contact with an unknown binary
- Need architecture/ABI info for tool selection
- Quick capability assessment
- Before committing to expensive analysis

## Key Principle

**Gather facts fast, defer analysis.**

This phase identifies WHAT the binary is, not HOW it works.

## Triage Sequence

### Step 1: File Identification

```bash
# Basic identification
file binary

# Expected output patterns:
# ELF 32-bit LSB executable, ARM, EABI5 version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux-armhf.so.3
# ELF 64-bit LSB pie executable, ARM aarch64, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux-aarch64.so.1
```

**Extract:**
- Architecture (ARM, ARM64, x86_64, MIPS)
- Bit width (32/64)
- Endianness (LSB/MSB)
- Link type (static/dynamic)
- Interpreter path (libc indicator)
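
The fields above can be pulled out of a `file` line mechanically. The following is a minimal sketch, not part of the original skill: `parse_file_output` is a hypothetical helper, and its patterns only cover the architectures listed above and the output shapes shown in the examples.

```bash
# Hypothetical helper: extract triage fields from one line of `file` output.
# Runs in a subshell body so its variables do not leak into the caller.
parse_file_output() (
  desc=$1
  case $desc in
    *"ARM aarch64"*) arch=arm64 ;;
    *", ARM,"*)      arch=arm ;;
    *"x86-64"*)      arch=x86_64 ;;
    *MIPS*)          arch=mips ;;
    *)               arch=unknown ;;
  esac
  case $desc in *64-bit*) bits=64 ;; *32-bit*) bits=32 ;; *) bits=unknown ;; esac
  case $desc in *" LSB "*) endian=little ;; *" MSB "*) endian=big ;; *) endian=unknown ;; esac
  case $desc in
    *"statically linked"*)  link=static ;;
    *"dynamically linked"*) link=dynamic ;;
    *)                      link=unknown ;;
  esac
  # Interpreter path: text after "interpreter ", up to the next comma.
  case $desc in
    *"interpreter "*) interp=${desc#*interpreter }; interp=${interp%%,*} ;;
    *)                interp=none ;;
  esac
  printf 'arch=%s bits=%s endian=%s link=%s interp=%s\n' \
    "$arch" "$bits" "$endian" "$link" "$interp"
)
```

Typical use would be `parse_file_output "$(file -b binary)"`; treat the result as a first guess and confirm it with the rabin2 metadata in Step 2.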

### Step 2: Structured Metadata (rabin2)

```bash
# All metadata as JSON
rabin2 -q -j -I binary | jq .

# Key fields (recent radare2 releases may nest these under .info, e.g. .info.arch):
# .arch     - "arm", "x86", "mips"
# .bits     - 32 or 64
# .endian   - "little" or "big"
# .os       - "linux", "none"
# .machine  - "ARM", "AARCH64"
# .stripped - true/false
# .static   - true/false
```

### Step 3: ABI Detection

```bash
# Interpreter detection
readelf -p .interp binary 2>/dev/null

# Or via rabin2 (the field is spelled "intrp" in rabin2 -I output)
rabin2 -I binary | grep intrp

# ARM-specific: float ABI
readelf -A binary | grep "Tag_ABI_VFP_args"
# hard-float: "VFP registers"
# soft-float: missing or "compatible"
```
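
The float-ABI rule in the comments above can be wrapped in a small filter. This is a sketch under the same simplification the comments make (anything other than "VFP registers" is treated as soft-float); `float_abi_from_attrs` is a hypothetical name.

```bash
# Hypothetical helper: classify the ARM float ABI from `readelf -A` text on stdin.
float_abi_from_attrs() (
  # Keep only the Tag_ABI_VFP_args line, if present.
  tag=$(grep 'Tag_ABI_VFP_args' || true)
  case $tag in
    *"VFP registers"*) echo hard ;;
    *)                 echo soft ;;  # tag missing or "compatible"
  esac
)
```

Usage would look like `readelf -A binary | float_abi_from_attrs`. The result only makes sense for 32-bit ARM binaries.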

**Interpreter → Libc mapping:**

| Interpreter | Libc | Notes |
|-------------|------|-------|
| `/lib/ld-linux-armhf.so.3` | glibc | ARM hard-float |
| `/lib/ld-linux.so.3` | glibc | ARM soft-float |
| `/lib/ld-musl-arm.so.1` | musl | ARM 32-bit soft-float |
| `/lib/ld-musl-armhf.so.1` | musl | ARM 32-bit hard-float |
| `/lib/ld-musl-aarch64.so.1` | musl | ARM 64-bit |
| `/lib/ld-uClibc.so.0` | uClibc | Embedded |
| `/lib64/ld-linux-x86-64.so.2` | glibc | x86_64 |
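
The table reduces to a filename-prefix match on the interpreter path. A minimal sketch, assuming only the three libc families listed here; `libc_from_interp` is a hypothetical helper and anything it does not recognise is reported as `unknown`.

```bash
# Hypothetical helper: map an ELF interpreter path to a libc family.
libc_from_interp() {
  case $1 in
    */ld-musl-*)  echo musl ;;
    */ld-uClibc*) echo uclibc ;;
    */ld-linux*)  echo glibc ;;
    "")           echo none ;;     # no interpreter, e.g. a static binary
    *)            echo unknown ;;
  esac
}
```

The output values match the `{glibc|musl|uclibc}` vocabulary used by the journal template later in this document.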

### Step 4: Dependencies

```bash
# Library dependencies
rabin2 -q -j -l binary | jq '.libs[]'

# Common patterns:
# libcurl.so.* → HTTP client
# libssl.so.* → TLS/crypto
# libpthread.so.* → Threading
# libz.so.* → Compression
# libsqlite3.so.* → Local database
```
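
To annotate a dependency list with the hints from the comments above, a lookup like the following could be used. It is a sketch: `lib_hint` is a hypothetical helper and the table is limited to the five libraries named here.

```bash
# Hypothetical helper: map a shared-library soname to a capability hint.
lib_hint() {
  case $1 in
    libcurl.so*)    echo "HTTP client" ;;
    libssl.so*)     echo "TLS/crypto" ;;
    libpthread.so*) echo "Threading" ;;
    libz.so*)       echo "Compression" ;;
    libsqlite3.so*) echo "Local database" ;;
    *)              echo "unclassified" ;;
  esac
}

# Example wiring (assumes rabin2 and jq are installed):
#   rabin2 -q -j -l binary | jq -r '.libs[]' |
#     while IFS= read -r lib; do printf '%s\t%s\n' "$lib" "$(lib_hint "$lib")"; done
```

A hint is evidence of what the binary could do, not proof that the code path is ever reached.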

### Step 5: Entry Points & Exports

```bash
# Entry points
rabin2 -q -j -e binary | jq .

# Exports (for shared libraries)
rabin2 -q -j -E binary | jq '.exports[] | {name, vaddr}'
```

### Step 6: Quick String Scan

```bash
# All strings with metadata
rabin2 -q -j -zz binary | jq '.strings | length'  # Count first

# Filter interesting strings (URLs, paths, errors)
rabin2 -q -j -zz binary | jq '
  .strings[] |
  select(.length > 8) |
  select(.string | test("http|ftp|/etc|/var|error|fail|pass|key|token"; "i"))
'
```
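
Strings that survive the filter can be tagged with a coarse type before they are recorded. This sketch assumes only the `url` and `path` types that appear in the output format below, plus a catch-all; `string_type` is a hypothetical helper.

```bash
# Hypothetical helper: assign a coarse type to a string of interest.
string_type() {
  case $1 in
    http://*|https://*|ftp://*) echo url ;;
    /*)                         echo path ;;
    *)                          echo other ;;
  esac
}
```

For example, `string_type "/etc/config.json"` yields `path`, which can go straight into a `strings_of_interest` entry.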

### Step 7: Import Analysis

```bash
# All imports (the library field is "libname"; it may be null for ELF imports)
rabin2 -q -j -i binary | jq '.imports[] | {name, libname}'

# Group by library
rabin2 -q -j -i binary | jq '
  .imports | group_by(.libname) |
  map({lib: .[0].libname, functions: [.[].name]})
'
```

## Capability Mapping

| Import Pattern | Capability |
|----------------|------------|
| `socket`, `connect`, `send` | Network client |
| `bind`, `listen`, `accept` | Network server |
| `open`, `read`, `write` | File I/O |
| `fork`, `exec*`, `system` | Process spawning |
| `pthread_*` | Multi-threading |
| `SSL_*`, `EVP_*` | Cryptography |
| `dlopen`, `dlsym` | Dynamic loading |
| `mmap`, `mprotect` | Memory manipulation |
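
The table can be applied to an import list with a pair of small functions. This is a sketch that mirrors the rows above one-to-one; the function names and the snake_case capability labels are assumptions made for illustration.

```bash
# Hypothetical helper: map one import name to a capability bucket.
# Prints nothing for imports that match no row of the table.
capability_for_import() {
  case $1 in
    socket|connect|send) echo network_client ;;
    bind|listen|accept)  echo network_server ;;
    open|read|write)     echo file_io ;;
    fork|exec*|system)   echo process_spawning ;;
    pthread_*)           echo multi_threading ;;
    SSL_*|EVP_*)         echo cryptography ;;
    dlopen|dlsym)        echo dynamic_loading ;;
    mmap|mprotect)       echo memory_manipulation ;;
  esac
}

# Read import names on stdin (one per line), print each capability once.
capabilities_from_imports() {
  while IFS= read -r name; do
    capability_for_import "$name"
  done | sort -u
}
```

With rabin2 and jq available, `rabin2 -q -j -i binary | jq -r '.imports[].name' | capabilities_from_imports` would produce the list for `capabilities_inferred`.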

## Output Format

After triage, record structured facts:

```json
{
  "artifact": {
    "path": "/path/to/binary",
    "sha256": "abc123...",
    "size_bytes": 245760
  },
  "identification": {
    "arch": "arm",
    "bits": 32,
    "endian": "little",
    "os": "linux",
    "stripped": true,
    "static": false
  },
  "abi": {
    "interpreter": "/lib/ld-musl-arm.so.1",
    "libc": "musl",
    "float_abi": "soft"
  },
  "dependencies": [
    "libcurl.so.4",
    "libssl.so.1.1",
    "libz.so.1"
  ],
  "capabilities_inferred": [
    "network_client",
    "tls_encryption",
    "compression"
  ],
  "strings_of_interest": [
    {"value": "https://api.vendor.com/telemetry", "type": "url"},
    {"value": "/etc/config.json", "type": "path"}
  ],
  "complexity_estimate": {
    "functions": "unknown (stripped)",
    "strings": 847,
    "imports": 156
  }
}
```
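
The `artifact` object can be filled in mechanically. A minimal sketch, assuming GNU coreutils (`sha256sum`, `wc`) are available; `artifact_facts` is a hypothetical helper, and it does no JSON escaping, so paths containing quotes or backslashes would need extra handling.

```bash
# Hypothetical helper: emit the "artifact" object for a file as one JSON line.
artifact_facts() (
  path=$1
  sha256=$(sha256sum "$path" | cut -d' ' -f1)   # first field is the hex digest
  size=$(wc -c < "$path" | tr -d ' ')           # byte count, padding stripped
  printf '{"path": "%s", "sha256": "%s", "size_bytes": %s}\n' \
    "$path" "$sha256" "$size"
)
```

Hashing before any other step also gives a stable identifier for the journal entry, independent of the filename.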

## Knowledge Journaling

After triage completes, record findings for episodic memory:

```
[BINARY-RE:triage] {filename} (sha256: {hash})

Identification:
  Architecture: {arch} {bits}-bit {endian}
  Libc: {glibc|musl|uclibc} ({interpreter_path})
  Stripped: {yes|no}
  Size: {bytes}

FACT: Links against {library} (source: rabin2 -l)
FACT: Contains {N} strings of interest (source: rabin2 -zz)
FACT: Imports {function} from {library} (source: rabin2 -i)

Capabilities inferred:
  - {capability_1} (evidence: {import/string})
  - {capability_2} (evidence: {import/string})

HYPOTHESIS: {what binary likely does} (confidence: {0.0-1.0})

QUESTION: {open unknown that needs investigation}

Next phase: {static-analysis|dynamic-analysis}
Sysroot needed: {path or "extract from device"}
```

### Example Journal Entry

```
[BINARY-RE:triage] thermostat_daemon (sha256: a1b2c3d4...)

Identification:
  Architecture: ARM 32-bit LE
  Libc: musl (/lib/ld-musl-arm.so.1)
  Stripped: yes
  Size: 153,600 bytes

FACT: Links against libcurl.so.4 (source: rabin2 -l)
FACT: Links against libssl.so.1.1 (source: rabin2 -l)
FACT: Contains string "api.thermco.com" (source: rabin2 -zz)
FACT: Imports curl_easy_perform (source: rabin2 -i)

Capabilities inferred:
  - HTTP client (evidence: curl_easy_perform import)
  - TLS encryption (evidence: libssl.so.1.1 dependency)
  - Network communication (evidence: hostname string)

HYPOTHESIS: Telemetry client that reports to api.thermco.com (confidence: 0.6)

QUESTION: What data does it collect and transmit?

Next phase: static-analysis
Sysroot needed: musl ARM (extract from device or Alpine)
```

## Decision Points

After triage, determine:

1. **Sysroot selection** - Based on arch + libc
2. **Analysis tool chain** - r2 vs Ghidra vs both
3. **Dynamic analysis feasibility** - QEMU viability based on arch
4. **Initial hypotheses** - What does this binary likely do?
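
For the dynamic-analysis decision, the triage facts map fairly directly onto a QEMU user-mode emulator name. The sketch below assumes rabin2-style values for arch, bits, and endianness and covers only the architectures discussed in this document; `qemu_user_for` is a hypothetical helper.

```bash
# Hypothetical helper: pick a QEMU user-mode binary from triage facts.
# Arguments: $1=arch ("arm", "mips", "x86"), $2=bits, $3=endian ("little"/"big").
qemu_user_for() {
  case "$1/$2/$3" in
    arm/32/little)  echo qemu-arm ;;
    arm/32/big)     echo qemu-armeb ;;
    arm/64/little)  echo qemu-aarch64 ;;
    mips/32/big)    echo qemu-mips ;;
    mips/32/little) echo qemu-mipsel ;;
    x86/32/little)  echo qemu-i386 ;;
    x86/64/little)  echo qemu-x86_64 ;;
    *)              echo unsupported; return 1 ;;
  esac
}
```

A dynamically linked target additionally needs a matching sysroot, typically passed to the emulator with `-L /path/to/sysroot`, which is why the libc identified in Step 3 matters here.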

## Next Steps

→ Proceed to `binary-re-static-analysis` for function enumeration
→ Or `binary-re-dynamic-analysis` if behavior observation is priority

Overview

This skill provides a fast, practical triage workflow for unknown binaries, ELF files, executables, and firmware blobs. It fingerprints architecture, ABI, dependencies, strings, imports, and basic capabilities in seconds to establish facts before deeper analysis. Use it to choose tools, sysroots, and next-phase actions quickly.

How this skill works

The skill runs quick file identification and structured metadata extraction (rabin2/readelf) to report arch, bit width, endianness, link type, interpreter, and stripped/static flags. It enumerates dependencies, imports, entry points, exports, and performs a targeted string scan to surface URLs, paths, errors, and keys. Outputs are mapped to inferred capabilities and a concise JSON fact record for journaling.

When to use it

  • First contact with an unknown binary or firmware blob
  • When you need architecture/ABI info to pick analysis tools
  • Before committing time to static or dynamic analysis
  • To rapidly assess networking, crypto, or file I/O capabilities
  • When preparing a sysroot or QEMU setup for emulation

Best practices

  • Run file and rabin2 -j -I first to capture arch, bits, endianness, and the stripped/static flags
  • Check the interpreter to infer libc implementation and float ABI for ARM
  • Map imports to capability buckets (network, crypto, threading) rather than assuming intent
  • Record structured facts (artifact metadata, identification, ABI, deps, strings) for reproducible journaling
  • Defer deep code analysis until triage yields a clear hypothesis and required sysroot/toolchain

Example use cases

  • Identify that a firmware blob is ARM 32-bit LE, musl-linked, and uses libcurl/libssl, which suggests a telemetry client
  • Detect a stripped x86_64 binary that is dynamically linked to sqlite3 and zlib to prioritize data storage analysis
  • Confirm MIPS big-endian and uClibc for selecting cross-toolchain and QEMU configuration
  • Surface hard-coded URLs or API endpoints quickly to drive threat hunting or network blocking rules
  • Decide whether dynamic analysis is feasible in QEMU based on architecture and interpreter presence

FAQ

What outputs should I trust from rabin2 vs file?

Use file for a quick human-readable summary and rabin2 -j for structured, machine-readable metadata. For ABI details, cross-check the interpreter that rabin2 reports against the .interp section shown by readelf -p .interp.

How do I infer libc and float ABI on ARM?

The interpreter path reveals the libc (glibc, musl, or uClibc). For the float ABI, run readelf -A and check Tag_ABI_VFP_args: "VFP registers" means hard-float, while a missing tag or "compatible" means soft-float.