
system2-attention skill

/skills/system2-attention

This skill validates and refines transformer attention by regenerating context and performing a deliberate second pass to improve reasoning and factual grounding.

npx playbooks add skill plurigrid/asi --skill system2-attention

Review the files below or copy the command above to add this skill to your agents.

Files (1)
SKILL.md
4.2 KB
---
name: system2-attention
description: System 2 attention mechanisms for deliberate, slow reasoning in transformer
  architectures.
license: UNLICENSED
metadata:
  trit: -1
  source: local
---

# System 2 Attention Skill: Deliberate Reasoning Validation

**Status**: ✅ Production Ready
**Trit**: -1 (MINUS - validator/constraint)
**Color**: #2626D8 (Blue)
**Principle**: Filter noise via deliberate re-attention
**Frame**: Two-stage attention with explicit reasoning

---

## Overview

**System 2 Attention** (S2A) validates and filters transformer attention by deliberately regenerating context. Standard attention (System 1) is fast but susceptible to sycophancy and irrelevant context; S2A adds a second, deliberate pass that re-attends over the cleaned context. Its key mechanisms:

1. **Context regeneration**: LLM rewrites context removing irrelevant info
2. **Two-pass attention**: Fast then deliberate
3. **Sycophancy reduction**: Filter opinion-seeking noise
4. **Factual grounding**: Anchor to verified facts

## Core Pattern

```
S2A(x, context):
  # Stage 1: deliberately regenerate the context, keeping only relevant facts
  context_filtered = LLM("Extract only relevant facts from: {context}")

  # Stage 2: answer the query over the cleaned context
  return LLM(x, context=context_filtered)
```

```python
def system2_attention(query: str, context: str, model) -> str:
    # Stage 1: Regenerate context (remove sycophantic/irrelevant)
    filter_prompt = f"""Given the context below, extract only the 
    objective facts relevant to answering questions. Remove opinions,
    leading questions, and irrelevant details.
    
    Context: {context}
    
    Relevant facts only:"""
    
    clean_context = model.generate(filter_prompt)
    
    # Stage 2: Answer with filtered context
    return model.generate(query, context=clean_context)
```
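
For clarity of the interface, a usage sketch (the `model.generate(prompt, context=...)` signature and `noisy_document` are assumptions, not a fixed API):

```python
# Usage sketch: filter a noisy document, then answer over the cleaned version.
answer = system2_attention(
    query="When was the facility commissioned?",  # illustrative query
    context=noisy_document,                       # placeholder raw context
    model=model,                                  # any object with .generate()
)
```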

## Key Concepts

### 1. Context Filtering

```python
class S2AFilter:
    def __init__(self, model):
        self.model = model
    
    def filter_sycophancy(self, context: str) -> str:
        """Remove opinion-seeking and leading content."""
        return self.model.generate(
            f"Rewrite removing any opinions or leading questions:\n{context}"
        )
    
    def filter_irrelevant(self, context: str, query: str) -> str:
        """Keep only query-relevant facts."""
        return self.model.generate(
            f"Extract facts from context relevant to: {query}\n\n{context}"
        )
```
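
Chaining both filters before answering might look like this (a sketch under the same assumed `model.generate` interface):

```python
# Sketch: remove sycophantic framing first, then keep only query-relevant facts.
s2a = S2AFilter(model)
clean = s2a.filter_sycophancy(raw_context)   # strip opinions and leading questions
clean = s2a.filter_irrelevant(clean, query)  # keep only facts relevant to the query
answer = model.generate(query, context=clean)
```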

### 2. Two-Pass Architecture

```python
import torch

class System2AttentionLayer:
    def __init__(self, base_attention, filter_model, threshold: float = 2.0):
        self.attn = base_attention
        self.filter = filter_model
        self.threshold = threshold  # entropy cutoff for triggering pass 2 (tune per task)

    def forward(self, q, k, v, context_mask=None):
        # Pass 1: standard attention (System 1)
        attn_weights = self.attn(q, k, v)

        # Identify high-entropy (uncertain) positions
        entropy = -torch.sum(attn_weights * torch.log(attn_weights + 1e-9), dim=-1)
        uncertain = entropy > self.threshold

        # Pass 2: deliberate re-attention over filtered keys/values
        if uncertain.any():
            k_filt, v_filt = self.filter(k, v, uncertain)
            attn_weights[uncertain] = self.attn(q[uncertain], k_filt, v_filt)

        return attn_weights
```
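
The layer above assumes `base_attention` is a callable that returns attention weights rather than attended outputs. A minimal compatible sketch using scaled dot-product attention (names are illustrative):

```python
import torch
import torch.nn.functional as F

def scaled_dot_product_weights(q, k, v):
    """Return attention weights only, matching the interface assumed above.

    `v` is accepted for signature compatibility but unused here, since the
    layer stores and re-assigns raw weights rather than attended outputs.
    """
    scores = q @ k.transpose(-2, -1) / (q.size(-1) ** 0.5)
    return F.softmax(scores, dim=-1)
```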

### 3. Factual Grounding Validator

```python
def validate_factual_grounding(response: str, facts: list[str]) -> float:
    """Score how well a response's claims are grounded in verified facts.

    `extract_claims` and `entails` are helper functions (claim extraction
    and entailment checking); minimal stand-ins are sketched below.
    """
    claims = extract_claims(response)
    grounded = sum(1 for c in claims if any(entails(f, c) for f in facts))
    return grounded / len(claims) if claims else 1.0
```
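
`extract_claims` and `entails` are left abstract above. Minimal stand-ins for illustration (a production pipeline would use a proper claim extractor and an NLI entailment model):

```python
def extract_claims(response: str) -> list[str]:
    # Naive heuristic: treat each sentence as a single claim.
    return [s.strip() for s in response.split(".") if s.strip()]

def entails(fact: str, claim: str) -> bool:
    # Crude lexical-overlap proxy for entailment; swap in an NLI model in practice.
    fact_tokens = set(fact.lower().split())
    claim_tokens = set(claim.lower().split())
    return len(claim_tokens & fact_tokens) / max(len(claim_tokens), 1) > 0.5
```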

## Commands

```bash
# Apply S2A filtering
just s2a-filter context.txt query.txt

# Measure sycophancy reduction
just s2a-sycophancy-test model responses/

# Validate factual grounding
just s2a-grounding response.txt facts.txt
```

## Integration with GF(3) Triads

```
system2-attention (-1) ⊗ causal-inference (0) ⊗ gflownet (+1) = 0 ✓  [Deliberate Search]
system2-attention (-1) ⊗ cognitive-superposition (0) ⊗ forward-forward-learning (+1) = 0 ✓  [Local Validation]
```
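
The checkmarks indicate that each triad's trits sum to zero in GF(3); a one-line sanity check:

```python
# A valid triad balances: (-1) + 0 + (+1) ≡ 0 (mod 3).
assert ((-1) + 0 + 1) % 3 == 0
```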

## Related Skills

- **causal-inference** (0): Coordinate causal reasoning
- **forward-forward-learning** (+1): Generate local learning signals
- **proofgeneral-narya** (-1): Formal verification baseline

---

**Skill Name**: system2-attention
**Type**: Deliberate Reasoning Validator
**Trit**: -1 (MINUS)
**Color**: #2626D8 (Blue)

Overview

This skill implements System 2 attention: a two-stage attention mechanism that deliberately re-attends using a regenerated, filtered context. It reduces noise, sycophancy, and irrelevant context from standard fast attention (System 1) and anchors outputs to verified facts. The design targets safer, more reliable reasoning in transformer architectures.

How this skill works

First, a lightweight System 1 pass performs fast pattern-matching attention to produce candidate outputs and identify uncertain or high-entropy positions. For those uncertain positions, a System 2 pass regenerates the context by extracting objective, query-relevant facts and removing opinions or leading language, then re-attends over filtered key/value representations. An optional factual-grounding validator scores how well claims are supported by verified facts.

When to use it

  • When model outputs are sensitive to irrelevant context or leading language
  • When you need to reduce sycophantic or opinion-amplifying behavior
  • When higher factual grounding and verifiability are required
  • When uncertain/high-entropy attention weights indicate unstable reasoning
  • In safety-critical pipelines where a second-pass validation is warranted

Best practices

  • Use a lightweight filter model to minimize latency overhead on the second pass
  • Define clear heuristics for uncertainty (e.g., entropy thresholds) and tune per task; see the calibration sketch after this list
  • Filter context to objective, query-relevant facts rather than full summarization
  • Combine S2A with factual grounding checks to quantify claim support
  • Benchmark sycophancy reduction and grounding scores during development
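
As a sketch of the entropy-threshold calibration mentioned above (the function name and percentile default are illustrative assumptions):

```python
import torch

def calibrate_entropy_threshold(attn_weights: torch.Tensor,
                                percentile: float = 0.9) -> float:
    """Choose a threshold as a high percentile of attention entropy measured
    on a calibration batch; positions above it trigger the deliberate pass."""
    entropy = -torch.sum(attn_weights * torch.log(attn_weights + 1e-9), dim=-1)
    return torch.quantile(entropy.flatten(), percentile).item()
```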

Example use cases

  • Question answering over long, noisy documents where irrelevant sections mislead attention
  • Dialogue systems that must avoid repeating user-leading prompts or flattery
  • Summarization pipelines that require claims to be anchored to source facts
  • Hybrid retrieval+generation stacks where retrieved context needs sanitization
  • Model auditing to detect and mitigate hallucinations in ambiguous queries

FAQ

Does System 2 attention always increase latency?

It typically adds cost only for positions flagged as uncertain; using a lightweight filter model and selective re-attention keeps overhead minimal.

How are uncertain positions identified?

Uncertainty is commonly computed via attention entropy or variance of attention weights; thresholds should be tuned per model and task.