
debug-environment skill

/dotclaude/skills/debug-environment

This skill helps diagnose failures caused by environment and configuration issues by checking settings, permissions, and platform differences across environments.

npx playbooks add skill shotaiuchi/dotclaude --skill debug-environment

Review the files below or copy the command above to add this skill to your agents.

SKILL.md
---
name: debug-environment
description: >-
  Environment and configuration investigation. Apply when debugging
  environment-specific failures, missing configuration, permission errors,
  platform differences, and deployment issues.
user-invocable: false
---

# Environment Checker Investigation

Investigate environment and configuration issues that cause failures in specific contexts.

## Investigation Checklist

### Configuration Verification
- Compare actual configuration values against expected defaults
- Check for missing required environment variables or config keys
- Identify configuration override precedence conflicts
- Verify that the config file's format and encoding are parsed correctly
- Look for environment-specific config that was not applied
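
The first two checks in this list can be sketched in a few lines of Python. This is a minimal illustration, not part of the skill itself: the helper names `missing_env_vars` and `diff_config` and the `MISSING` sentinel are hypothetical, and the config is assumed to be already loaded into flat dictionaries.

```python
import os

# Hypothetical sentinel marking a key that is absent on one side.
MISSING = "<missing>"


def missing_env_vars(required, environ=None):
    """Return the required variable names that are unset or empty."""
    environ = os.environ if environ is None else environ
    return [name for name in required if not environ.get(name)]


def diff_config(expected, actual):
    """Map each differing key to an (expected, actual) pair."""
    differences = {}
    for key in sorted(set(expected) | set(actual)):
        want = expected.get(key, MISSING)
        have = actual.get(key, MISSING)
        if want != have:
            differences[key] = (want, have)
    return differences
```

For example, `diff_config({"port": 8080}, {"port": 9090})` reports `port` with both values, which makes an unexpected override visible without changing anything on the host.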

### Platform Differences
- Identify OS-specific behavior differences causing the failure
- Check for path separator, line ending, or case sensitivity issues
- Verify runtime version compatibility across environments
- Look for locale or timezone settings that affect data processing
- Check for architecture-specific issues in native dependencies
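
One way to compare platforms is to capture the same snapshot in each environment and diff the results. The sketch below uses only the Python standard library; the function names and the choice of fields are assumptions made for illustration, and the case-sensitivity probe creates a short-lived temporary file in the directory it is given.

```python
import locale
import os
import platform
import sys
import tempfile
import time


def environment_snapshot():
    """Collect platform facts that commonly differ between environments."""
    return {
        "os": platform.system(),
        "os_release": platform.release(),
        "architecture": platform.machine(),
        "python_version": platform.python_version(),
        "path_separator": os.sep,
        "line_ending": repr(os.linesep),
        "preferred_encoding": locale.getpreferredencoding(False),
        "filesystem_encoding": sys.getfilesystemencoding(),
        "timezone": time.tzname,
    }


def is_case_sensitive(directory):
    """Probe whether the filesystem holding `directory` is case-sensitive."""
    with tempfile.NamedTemporaryFile(prefix="casecheck_", dir=directory) as handle:
        folder, name = os.path.split(handle.name)
        # On a case-insensitive filesystem the upper-cased name resolves
        # to the same temporary file, so it appears to exist.
        return not os.path.exists(os.path.join(folder, name.upper()))
```

Saving the snapshot from a working and a failing environment and comparing the two dictionaries narrows the search to the fields that actually differ.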

### File System & Permissions
- Verify file and directory existence at expected paths
- Check read, write, and execute permissions for the running process
- Identify symlink resolution failures or broken links
- Look for disk space or inode exhaustion conditions
- Check for file locking conflicts with other processes
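
A read-only probe for the existence, permission, symlink, and disk-space checks might look like the following. The names `inspect_path` and `free_bytes` are hypothetical. Note that `os.access` tests the real user ID by default, which may differ from the effective ID of a setuid process, and that inode exhaustion and file locks are not covered here.

```python
import os
import shutil


def inspect_path(path):
    """Report existence, symlink status, and access rights without modifying anything."""
    report = {
        "exists": os.path.exists(path),  # False for a broken symlink
        "is_symlink": os.path.islink(path),
        "readable": os.access(path, os.R_OK),
        "writable": os.access(path, os.W_OK),
        "executable": os.access(path, os.X_OK),
    }
    # A symlink whose target does not resolve is reported as broken.
    report["broken_symlink"] = report["is_symlink"] and not report["exists"]
    return report


def free_bytes(path):
    """Return the free space, in bytes, on the filesystem containing `path`."""
    return shutil.disk_usage(path).free
```

Run the probe as the same user as the failing process; results gathered from an interactive shell under a different account can be misleading.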

### Network & Connectivity
- Verify DNS resolution and endpoint reachability
- Check for proxy, firewall, or VPN interference
- Identify timeout values that are too aggressive for the environment
- Verify TLS certificate validity and trust chain
- Look for port conflicts or binding failures on the host
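
The DNS, reachability, and TLS checks can be approximated with the standard `socket` and `ssl` modules. This is a sketch under stated assumptions: the function names and the three-second default timeout are illustrative, and proxy or VPN interference has to be inferred by comparing results from different networks.

```python
import socket
import ssl


def resolve(host):
    """Return the sorted unique addresses for `host`, or [] if resolution fails."""
    try:
        infos = socket.getaddrinfo(host, None)
    except socket.gaierror:
        return []
    return sorted({info[4][0] for info in infos})


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def tls_expiry(host, port=443, timeout=3.0):
    """Return the certificate's notAfter string.

    Raises ssl.SSLCertVerificationError if the chain or hostname is not trusted.
    """
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]
```

Calling `can_connect` with a generous timeout and then with the application's configured timeout helps separate an unreachable endpoint from a timeout that is merely too aggressive.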

## Output Format

Report findings with confidence ratings:

| Confidence | Description |
|------------|-------------|
| High | Root cause clearly identified with supporting evidence |
| Medium | Probable cause identified but needs verification |
| Low | Hypothesis formed but insufficient evidence |
| Inconclusive | Unable to determine from available information |
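
As an illustration of how findings could be recorded against these ratings, the sketch below models a finding and renders a report table ordered from High to Inconclusive. The `Finding` fields and the `Finding` and `Evidence` columns are assumptions; the skill itself only defines the four confidence levels.

```python
from dataclasses import dataclass
from enum import Enum


class Confidence(Enum):
    HIGH = "High"
    MEDIUM = "Medium"
    LOW = "Low"
    INCONCLUSIVE = "Inconclusive"


@dataclass
class Finding:
    summary: str
    confidence: Confidence
    evidence: str


def render_report(findings):
    """Render findings as a Markdown table, highest confidence first."""
    order = list(Confidence)  # Enum members iterate in definition order
    lines = [
        "| Confidence | Finding | Evidence |",
        "|------------|---------|----------|",
    ]
    for item in sorted(findings, key=lambda f: order.index(f.confidence)):
        lines.append(f"| {item.confidence.value} | {item.summary} | {item.evidence} |")
    return "\n".join(lines)
```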

Overview

This skill inspects environment and configuration issues that cause failures in specific contexts. It helps pinpoint missing variables, permission problems, platform differences, and deployment mismatches that lead to runtime errors. The output is a concise investigative report with confidence ratings and actionable next steps.

How this skill works

The skill runs a systematic checklist across configuration, platform, filesystem, and network layers to collect evidence. It compares actual values against expected defaults, validates file formats and permissions, checks runtime and OS compatibility, and tests network reachability and TLS. Findings are categorized with confidence levels (High, Medium, Low, Inconclusive) and include suggested verification steps and remediation actions.

When to use it

  • Debugging failures that only occur in a specific environment (staging, prod, CI).
  • Investigating missing or incorrect environment variables or config keys.
  • Resolving permission, file-access, or symlink-related errors.
  • Diagnosing platform-specific crashes or dependency issues.
  • Troubleshooting network connectivity, DNS, proxy, or TLS problems.

Best practices

  • Start with reproducible, minimal steps and capture environment metadata (OS, runtime, architecture).
  • Document expected config defaults and the precedence rules for overrides before testing.
  • Gather evidence: logs, exit codes, file listings, permission bits, and sample config files.
  • Avoid destructive fixes: apply read-only checks first, and change live config only after confirming a hypothesis.
  • Use confidence ratings to guide escalation.
  • Automate repetitive checks in CI to detect environment drift early.

Example use cases

  • CI pipeline failing only on macOS because the code assumes a case-sensitive filesystem (macOS volumes are case-insensitive by default).
  • Production deployment with permission errors preventing service startup due to incorrect user or group.
  • Service timeouts in one region caused by a corporate proxy or firewall rule.
  • Data parsing differences because of unexpected locale or timezone settings on the host.
  • Native dependency failing on ARM architecture in a new cloud instance.

FAQ

What does each confidence rating mean?

High: root cause identified with clear evidence. Medium: probable cause with supporting signs that need verification. Low: hypothesis with minimal evidence. Inconclusive: insufficient information to form a hypothesis.

Can this skill modify my environment?

No. The skill focuses on non-destructive inspections and evidence gathering. Remediation steps are suggested but must be applied manually or through separate automation.

What inputs are required?

Provide the failing command or scenario, logs, expected config values, and the target environment metadata (OS, runtime versions, architecture). Additional access to the host or CI logs improves accuracy.