slopwatch skill

This skill analyzes .NET changes with Slopwatch to prevent reward hacking by detecting disabled tests, suppressed warnings, and empty catch blocks.

npx playbooks add skill aaronontheweb/dotnet-skills --skill slopwatch

---
name: dotnet-slopwatch
description: Use Slopwatch to detect LLM reward hacking in .NET code changes. Run after every code modification to catch disabled tests, suppressed warnings, empty catch blocks, and other shortcuts that mask real problems.
invocable: true
---

# Slopwatch: LLM Anti-Cheat for .NET

## When to Use This Skill

**Use this skill constantly**, every time an LLM (including Claude) changes any of the following:
- C# source files (.cs)
- Project files (.csproj)
- Props files (Directory.Build.props, Directory.Packages.props)
- Test files

Run slopwatch to validate the changes don't introduce "slop."
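
As a minimal sketch of the scope described above, the helper below decides whether a changed path is one of the listed file types. The function name `needs_slopwatch` is hypothetical and is not part of Slopwatch; test files are covered by the `.cs` match.

```bash
# Hypothetical helper (not part of Slopwatch): returns 0 when the path
# is one of the file types listed above, 1 otherwise.
needs_slopwatch() {
  case "$(basename "$1")" in
    *.cs|*.csproj|Directory.Build.props|Directory.Packages.props) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage: run the analysis only when a changed file is in scope.
# if needs_slopwatch "$changed_file"; then slopwatch analyze; fi
```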

## What is Slop?

"Slop" refers to shortcuts LLMs take that make tests pass or builds succeed without actually solving the underlying problem. These are reward hacking behaviors - the LLM optimizes for apparent success rather than real fixes.

### Common Slop Patterns

| Pattern | Example | Why It's Bad |
|---------|---------|--------------|
| Disabled tests | `[Fact(Skip="flaky")]` | Hides failures instead of fixing them |
| Warning suppression | `#pragma warning disable CS8618` | Silences compiler without fixing issue |
| Empty catch blocks | `catch (Exception) { }` | Swallows errors, hides bugs |
| Arbitrary delays | `await Task.Delay(1000);` | Masks race conditions, makes tests slow |
| Project-level suppression | `<NoWarn>CS1591</NoWarn>` | Disables warnings project-wide |
| CPM bypass | `Version="1.0.0"` inline | Undermines central package management |

**Never accept these patterns.** If an LLM introduces slop, reject the change and require a proper fix.

---

## Installation

### As a Local Tool (Recommended)

Add to `.config/dotnet-tools.json`:

```json
{
  "version": 1,
  "isRoot": true,
  "tools": {
    "slopwatch.cmd": {
      "version": "0.2.0",
      "commands": ["slopwatch"],
      "rollForward": false
    }
  }
}
```

Then restore:
```bash
dotnet tool restore
```

### As a Global Tool

```bash
dotnet tool install --global Slopwatch.Cmd
```

---

## First-Time Setup: Establish a Baseline

Before using slopwatch on an existing project, create a baseline of current issues:

```bash
# Initialize baseline from existing code
slopwatch init

# This creates .slopwatch/baseline.json
git add .slopwatch/baseline.json
git commit -m "Add slopwatch baseline"
```

**Why baseline?** Legacy code may have existing issues. The baseline ensures slopwatch only catches **new** slop being introduced, not pre-existing technical debt.
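
The baseline idea can be sketched as a set difference: report only findings that are absent from the recorded baseline. The one-finding-per-line text format and the `new_findings` name below are assumptions made for illustration; the real layout of `.slopwatch/baseline.json` is not described here.

```bash
# Hypothetical finding format: one "RULE path:line" entry per line.
# Prints findings present in the current scan but absent from the baseline.
new_findings() {
  baseline_file=$1
  current_file=$2
  sorted_baseline=$(mktemp)
  sorted_current=$(mktemp)
  LC_ALL=C sort -u "$baseline_file" > "$sorted_baseline"
  LC_ALL=C sort -u "$current_file" > "$sorted_current"
  # comm -13 drops lines unique to the baseline and lines common to both,
  # leaving only lines unique to the current scan.
  LC_ALL=C comm -13 "$sorted_baseline" "$sorted_current"
  rm -f "$sorted_baseline" "$sorted_current"
}
```

Matching on exact line numbers is a simplification: a real tool would need a more stable key, since unrelated edits shift line numbers.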

---

## Usage During LLM Sessions

### After Every Code Change

Run slopwatch after any LLM-generated code modification:

```bash
# Analyze for new issues (uses baseline)
slopwatch analyze

# Use strict mode - fail on warnings too
slopwatch analyze --fail-on warning
```

### When Slopwatch Flags an Issue

**Do not ignore it.** Instead:

1. **Understand why** the LLM took the shortcut
2. **Request a proper fix** - be specific about what's wrong
3. **Verify the fix** doesn't introduce different slop

```
# Example: LLM disabled a test
❌ SW001 [Error]: Disabled test detected
   File: tests/MyApp.Tests/OrderTests.cs:45
   Pattern: [Fact(Skip="Test is flaky")]

# Correct response: Ask for actual fix
"This test was disabled instead of fixed. Please investigate why
it's flaky and fix the underlying timing/race condition issue."
```
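
A small wrapper can make the "do not ignore it" rule harder to skip in scripted workflows. This is a sketch: `slop_gate` is a hypothetical name, and it assumes `slopwatch analyze --fail-on warning` exits non-zero when it reports findings, which its use as a quality gate implies but which is worth confirming against your installed version.

```bash
# Hypothetical wrapper: runs the strict analysis and propagates failure.
# Assumption: slopwatch exits non-zero when findings reach the --fail-on level.
slop_gate() {
  if slopwatch analyze --fail-on warning; then
    echo "slopwatch: no new slop detected"
  else
    rc=$?  # exit status of the slopwatch command above
    echo "slopwatch: new slop detected; request a real fix instead of updating the baseline" >&2
    return "$rc"
  fi
}

# Usage: slop_gate && git commit -m "..."
```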

### Updating the Baseline (Rare)

Only update the baseline when slop is **truly justified** and documented:

```bash
# Add current detections to baseline (use sparingly!)
slopwatch analyze --update-baseline
```

**Justification examples:**
- Third-party library forces a pattern (e.g., must suppress specific warning)
- Intentional delay for rate limiting (not test flakiness)
- Generated code that can't be modified

Whenever you update the baseline, document the reason in a code comment next to the affected code.

---

## Claude Code Hook Integration

Add slopwatch as a hook to automatically validate every edit. Create or update `.claude/settings.json`:

```json
{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Write|Edit|MultiEdit",
        "hooks": [
          {
            "type": "command",
            "command": "slopwatch analyze -d . --hook",
            "timeout": 60000
          }
        ]
      }
    ]
  }
}
```

The `--hook` flag:
- Only analyzes **git dirty files** (fast, even on large repos)
- Outputs errors to stderr in readable format
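- Exits with code 2 on warnings/errors, which Claude Code treats as a blocking error; because `PostToolUse` runs after the file is written, the edit is flagged for correction rather than prevented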
- Claude sees the error and can fix it immediately
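
To illustrate what "git dirty files" means in practice, the sketch below filters `git status --porcelain` output down to the .NET file types this skill covers. It is only an approximation for intuition: how `--hook` actually selects files is not described here, and `dirty_dotnet_files` is a hypothetical name.

```bash
# Reads `git status --porcelain` output on stdin and prints the changed
# paths with .cs, .csproj, or .props extensions.
# Simplification: renamed entries ("old -> new") and quoted paths are
# not handled.
dirty_dotnet_files() {
  # Porcelain lines are "XY path": two status characters, a space, the path.
  cut -c4- | grep -E '\.(cs|csproj|props)$' || true
}

# Usage inside a repository:
# git status --porcelain | dirty_dotnet_files
```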

---

## CI/CD Integration

Add slopwatch to your CI pipeline as a quality gate:

### GitHub Actions

```yaml
jobs:
  slopwatch:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Setup .NET
        uses: actions/setup-dotnet@v4
        with:
          dotnet-version: '9.0.x'

      - name: Install Slopwatch
        run: dotnet tool install --global Slopwatch.Cmd

      - name: Run Slopwatch
        run: slopwatch analyze -d . --fail-on warning
```

### Azure Pipelines

```yaml
- task: DotNetCoreCLI@2
  displayName: 'Install Slopwatch'
  inputs:
    command: 'custom'
    custom: 'tool'
    arguments: 'install --global Slopwatch.Cmd'

- script: slopwatch analyze -d . --fail-on warning
  displayName: 'Slopwatch Analysis'
```

---

## Detection Rules

| Rule | Severity | What It Catches |
|------|----------|-----------------|
| SW001 | Error | Disabled tests (`Skip=`, `Ignore`, `#if false`) |
| SW002 | Warning | Warning suppression (`#pragma warning disable`, `SuppressMessage`) |
| SW003 | Error | Empty catch blocks that swallow exceptions |
| SW004 | Warning | Arbitrary delays in tests (`Task.Delay`, `Thread.Sleep`) |
| SW005 | Warning | Project file slop (`NoWarn`, `TreatWarningsAsErrors=false`) |
| SW006 | Warning | CPM bypass (`VersionOverride`, inline `Version` attributes) |
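
For intuition about what these rules look for, here is a rough text-matching approximation of three of them. The patterns are illustrative guesses, not Slopwatch's detection logic, and they will produce false positives and misses (for example, a match inside a comment or string); `scan_for_slop` is a hypothetical name.

```bash
# Rough approximations of SW001, SW002, and SW005 from the table above.
# Prints "RULE file:line: matching text" for each hit in the given files.
scan_for_slop() {
  awk '
    /Skip *=/                 { print "SW001 " FILENAME ":" FNR ": " $0 }
    /#pragma warning disable/ { print "SW002 " FILENAME ":" FNR ": " $0 }
    /<NoWarn>/                { print "SW005 " FILENAME ":" FNR ": " $0 }
  ' "$@"
}

# Usage: scan_for_slop tests/MyApp.Tests/OrderTests.cs
```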

---

## Configuration

Create `.slopwatch/slopwatch.json` to customize:

```json
{
  "minSeverity": "warning",
  "rules": {
    "SW001": { "enabled": true, "severity": "error" },
    "SW002": { "enabled": true, "severity": "warning" },
    "SW003": { "enabled": true, "severity": "error" },
    "SW004": { "enabled": true, "severity": "warning" },
    "SW005": { "enabled": true, "severity": "warning" },
    "SW006": { "enabled": true, "severity": "warning" }
  },
  "exclude": [
    "**/Generated/**",
    "**/obj/**",
    "**/bin/**"
  ]
}
```

### Strict Mode (Recommended for LLM Sessions)

For maximum protection during LLM coding sessions, elevate all rules to errors:

```json
{
  "minSeverity": "warning",
  "rules": {
    "SW001": { "enabled": true, "severity": "error" },
    "SW002": { "enabled": true, "severity": "error" },
    "SW003": { "enabled": true, "severity": "error" },
    "SW004": { "enabled": true, "severity": "error" },
    "SW005": { "enabled": true, "severity": "error" },
    "SW006": { "enabled": true, "severity": "error" }
  }
}
```
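
If you already have the default configuration, the strict variant can be derived mechanically instead of retyped. The sketch below assumes the exact `"severity": "warning"` spacing shown in the examples above; `to_strict` and the output file name are hypothetical.

```bash
# Rewrites rule severities from "warning" to "error" on stdin.
# "minSeverity" is left untouched because the pattern requires a quote
# immediately before the lowercase word severity.
to_strict() {
  sed 's/"severity": "warning"/"severity": "error"/g'
}

# Usage (writes to a separate, hypothetical file so the result can be reviewed):
# to_strict < .slopwatch/slopwatch.json > .slopwatch/slopwatch.strict.json
```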

---

## The Philosophy: Zero Tolerance for New Slop

1. **Baseline captures legacy** - Existing issues are acknowledged but isolated
2. **New slop is blocked** - Any new shortcut fails the build/edit
3. **Exceptions require justification** - If you must update baseline, document why
4. **LLMs are not special** - The same rules apply to human and AI-generated code

The goal is to prevent the gradual accumulation of technical debt that occurs when LLMs optimize for "make the test pass" rather than "fix the actual problem."

---

## Quick Reference

```bash
# First time setup
slopwatch init
git add .slopwatch/baseline.json

# After every LLM code change
slopwatch analyze

# Strict mode (recommended)
slopwatch analyze --fail-on warning

# With stats (performance debugging)
slopwatch analyze --stats

# Update baseline (rare, document why)
slopwatch analyze --update-baseline

# JSON output for tooling
slopwatch analyze --output json
```

---

## When to Override (Almost Never)

The only valid reasons to update baseline or disable a rule:

| Scenario | Action | Required |
|----------|--------|----------|
| Third-party forces pattern | Update baseline | Code comment explaining why |
| Generated code (not editable) | Add to exclude list | Document in config |
| Intentional rate-limiting delay | Update baseline | Code comment; the delay must not be in test code |
| Legacy code cleanup | One-time baseline update | PR description |

**Invalid reasons:**
- "The test is flaky" → Fix the flakiness
- "The warning is annoying" → Fix the code
- "It works on my machine" → Fix the race condition
- "We'll fix it later" → Fix it now

Overview

This skill integrates Slopwatch into .NET workflows to detect LLM reward-hacking shortcuts introduced by code changes. It runs checks after edits to catch disabled tests, suppressed warnings, empty catch blocks, and other patterns that mask real problems. Use it to block and correct superficial fixes so code quality and test integrity remain intact.

How this skill works

Slopwatch checks changed C# source files, project and props files, and test files against a set of detection rules. It compares current findings to a stored baseline so only newly introduced issues are flagged. It can run locally, as a Claude post-edit hook, or in CI to fail edits or builds when disallowed patterns are detected.

When to use it

  • After every LLM-generated code change to C#, projects, props, or tests
  • As a post-edit hook in interactive LLM sessions for immediate feedback
  • In CI/CD pipelines as a quality gate before merging
  • When validating PRs that touch tests or project configuration
  • Before accepting fixes that silence warnings or remove failing tests

Best practices

  • Initialize a baseline once to ignore legacy technical debt, then treat all new findings as blockers
  • Run slopwatch analyze after each edit and use --fail-on warning for strict LLM sessions
  • Do not accept changes that disable tests, suppress warnings broadly, or swallow exceptions; require a proper fix instead
  • Only update baseline or disable a rule with clear justification documented in code/comments and PRs
  • Exclude generated, obj, and bin folders and keep rule config in .slopwatch/slopwatch.json

Example use cases

  • During a Claude-assisted refactor, automatically block edits that add [Fact(Skip=...)], forcing a real test fix
  • CI job that installs Slopwatch and fails the run when new warning-level slop appears
  • Pre-commit hook that runs slopwatch analyze --hook to evaluate only git dirty files
  • Repository onboarding: run slopwatch init to create a baseline and prevent regressions
  • Audit a PR that modifies Directory.Build.props to catch NoWarn or version overrides

FAQ

Why create a baseline?

The baseline captures existing legacy issues so Slopwatch only flags new problems introduced after adoption.

When is it acceptable to update the baseline?

Only when there is a documented, unavoidable reason (third-party constraint, uneditable generated code, or legitimate rate-limiting) and the change is recorded in code comments and PR notes.

Can Slopwatch run fast on large repos?

Yes. Use the --hook flag to analyze only git dirty files for quick feedback during editing sessions.