---
name: testing-quality-standards
description: 'Cross-plugin testing quality metrics and standards. Referenced by pensive:test-review
  and parseltongue:python-testing.
  Keywords: testing standards, quality metrics, coverage thresholds, anti-patterns.
  Use when: test quality evaluation, coverage thresholds, quality standards.
  DO NOT use when: simple scripts without quality requirements.'
category: infrastructure
tags:
- testing
- quality
- standards
- metrics
dependencies: []
estimated_tokens: 400
provides:
patterns:
- coverage-thresholds
- quality-metrics
- anti-patterns
---
# Testing Quality Standards
Shared quality standards and metrics for testing across all plugins in the Claude Night Market ecosystem.
## When To Use
- Establishing test quality gates and coverage targets
- Validating test suite against quality standards
## When NOT To Use
- Exploratory testing or spike work
- Projects with established quality gates that meet requirements
## Table of Contents
1. [Coverage Thresholds](#coverage-thresholds)
2. [Quality Metrics](#quality-metrics)
3. [Detailed Topics](#detailed-topics)
## Coverage Thresholds
| Level | Coverage | Use Case |
|-------|----------|----------|
| Minimum | 60% | Legacy code |
| Standard | 80% | Normal development |
| High | 90% | Critical systems |
| Maximum | 95%+ | Safety-critical |
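These thresholds can be enforced mechanically. As a minimal sketch (the levels mirror the table above, but the helper itself is illustrative and not part of any plugin):

```python
# Map each level from the table above to its minimum coverage percentage.
# Illustrative only; in practice a coverage gate in the test runner
# enforces the same idea.
THRESHOLDS = {"minimum": 60, "standard": 80, "high": 90, "maximum": 95}

def meets_threshold(measured: float, level: str = "standard") -> bool:
    """Return True if measured coverage satisfies the chosen level."""
    return measured >= THRESHOLDS[level]

print(meets_threshold(82.5))          # standard gate (80%): True
print(meets_threshold(82.5, "high"))  # high gate (90%): False
```

With pytest-cov, the equivalent gate is typically configured with `--cov-fail-under=80` on the command line or `fail_under` in `.coveragerc`.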
## Quality Metrics
### Structure
- [ ] Clear test organization
- [ ] Meaningful test names
- [ ] Proper setup/teardown
- [ ] Isolated test cases
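The structure items above can be sketched in pytest as follows; the `Cart` class is hypothetical, defined inline so the example is self-contained:

```python
import pytest

# Grouped tests, descriptive names, fixture-based setup, isolated cases.
class Cart:
    def __init__(self):
        self.items = []

    def add(self, item):
        self.items.append(item)

    def total(self):
        return sum(price for _, price in self.items)

@pytest.fixture
def cart():
    # A fresh instance per test keeps cases isolated; pytest handles
    # teardown automatically when the fixture goes out of scope.
    return Cart()

class TestCartTotals:
    def test_empty_cart_totals_zero(self, cart):
        assert cart.total() == 0

    def test_total_sums_item_prices(self, cart):
        cart.add(("apple", 2))
        cart.add(("pear", 3))
        assert cart.total() == 5
```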
### Coverage
- [ ] Critical paths covered
- [ ] Edge cases tested
- [ ] Error conditions handled
- [ ] Integration points verified
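Covering error conditions means testing the failure path, not just the happy path. A hedged sketch (`parse_port` is a hypothetical helper defined inline for illustration):

```python
import pytest

def parse_port(value: str) -> int:
    """Parse a TCP port, rejecting values outside 1-65535."""
    port = int(value)  # may raise ValueError for non-numeric input
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return port

def test_valid_port():
    assert parse_port("8080") == 8080

def test_out_of_range_port_raises():
    # The error condition is asserted explicitly, not left untested.
    with pytest.raises(ValueError):
        parse_port("70000")
```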
### Maintainability
- [ ] DRY test code
- [ ] Reusable fixtures
- [ ] Clear assertions
- [ ] Minimal mocking
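Parametrization is one common way to keep test code DRY with clear assertions; a sketch, where `slugify` is a hypothetical function under test:

```python
import pytest

def slugify(text: str) -> str:
    """Lowercase and join whitespace-separated words with hyphens."""
    return "-".join(text.lower().split())

# One test body, many cases: no copy-pasted tests, and each failing
# case is reported individually by pytest.
@pytest.mark.parametrize(
    ("raw", "expected"),
    [
        ("Hello World", "hello-world"),
        ("  spaced   out  ", "spaced-out"),
        ("already-a-slug", "already-a-slug"),
    ],
)
def test_slugify(raw, expected):
    assert slugify(raw) == expected
```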
### Reliability
- [ ] No flaky tests
- [ ] Deterministic execution
- [ ] No order dependencies
- [ ] Fast feedback loop
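A common source of flakiness is hidden nondeterminism; seeding or injecting it makes the test repeatable. A minimal sketch using only the standard library (`shuffle_deck` is a hypothetical function under test):

```python
import random

def shuffle_deck(deck, rng):
    """Return a shuffled copy of deck using an injected RNG."""
    deck = list(deck)
    rng.shuffle(deck)
    return deck

def test_shuffle_is_deterministic_with_seeded_rng():
    # A fixed seed makes the order identical on every run: no flakiness,
    # and no dependence on global random state shared between tests.
    first = shuffle_deck(range(5), random.Random(42))
    second = shuffle_deck(range(5), random.Random(42))
    assert first == second
```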
## Detailed Topics
For implementation patterns and examples:
- **[Anti-Patterns](modules/anti-patterns.md)** - Common testing mistakes with before/after examples
- **[Best Practices](modules/best-practices.md)** - Core testing principles and exit criteria
## Integration with Plugin Testing
This skill provides foundational standards referenced by:
- `pensive:test-review` - Uses coverage thresholds and quality metrics
- `parseltongue:python-testing` - Uses anti-patterns and best practices
- `sanctum:test-*` - Uses quality checklist for test validation
Reference in your skill's frontmatter:
```yaml
dependencies: [leyline:testing-quality-standards]
```
**Verification:** Run `pytest -v` to verify tests pass.
## Troubleshooting
### Common Issues
**Tests not discovered**
Ensure test files match pattern `test_*.py` or `*_test.py`. Run `pytest --collect-only` to verify.
**Import errors**
Check that the module under test is importable (on `PYTHONPATH`), or install the package in editable mode with `pip install -e .`.
**Async tests failing**
Install `pytest-asyncio` and decorate async test functions with `@pytest.mark.asyncio`.
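The async pattern looks like this; `fetch_value` is a hypothetical coroutine under test, and the marker requires `pip install pytest-asyncio` to actually run under pytest:

```python
import asyncio
import pytest

async def fetch_value():
    await asyncio.sleep(0)  # stand-in for real async I/O
    return 42

# Without this marker (and the pytest-asyncio plugin), pytest collects
# the coroutine but never awaits it, so the test silently does nothing.
@pytest.mark.asyncio
async def test_fetch_value_returns_expected():
    assert await fetch_value() == 42
```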
## FAQ
**What coverage level should I pick for my project?**
Choose by risk: 60% for legacy code you cannot fully refactor, 80% for normal development, 90% for critical systems, and 95%+ for safety-critical components.

**How do I verify tests meet the standards?**
Run your test runner (for example, `pytest -v`) and compare the results against the checklist and coverage thresholds; integrate the checks into CI to enforce gates automatically.