
This skill helps you establish observability metrics standards across projects, enabling secure, tested, and maintainable instrumentation and performance monitoring.

npx playbooks add skill williamzujkowski/standards --skill metrics

Review the files below or copy the command above to add this skill to your agents.

Files (4)
SKILL.md
---
name: metrics
description: Metrics standards for observability environments. Covers best practices for secure, tested, and maintainable instrumentation.
---

# Metrics

> **Quick Navigation:**
> Level 1: [Quick Start](#level-1-quick-start) (5 min) → Level 2: [Implementation](#level-2-implementation) (30 min) → Level 3: [Mastery](#level-3-mastery-resources) (Extended)

---

## Level 1: Quick Start

### Core Principles

1. **Best Practices**: Follow industry-standard patterns for observability
2. **Security First**: Implement secure defaults and validate all inputs
3. **Maintainability**: Write clean, documented, testable code
4. **Performance**: Optimize for common use cases

### Essential Checklist

- [ ] Follow established patterns for observability (see the sketch after this checklist)
- [ ] Implement proper error handling
- [ ] Add comprehensive logging
- [ ] Write unit and integration tests
- [ ] Document public interfaces
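
As a concrete starting point, the sketch below instruments a request handler with a counter and a histogram. It assumes the Python `prometheus_client` library, which this skill does not mandate; the metric names are illustrative, so adapt both to your own backend and conventions.

```python
# Minimal instrumentation sketch (assumes prometheus_client; names are illustrative).
import time

from prometheus_client import Counter, Histogram

# Counter for total requests, labeled by outcome only (a small, fixed set).
REQUESTS_TOTAL = Counter(
    "app_requests_total", "Total requests handled", ["outcome"]
)
# Histogram for request duration; the unit (seconds) is documented in the name.
REQUEST_SECONDS = Histogram(
    "app_request_duration_seconds", "Request duration in seconds"
)

def handle_request(payload: dict) -> dict:
    """Handle a request, recording outcome and duration metrics."""
    start = time.perf_counter()
    try:
        result = {"echo": payload}  # placeholder for real work
        REQUESTS_TOTAL.labels(outcome="success").inc()
        return result
    except Exception:
        REQUESTS_TOTAL.labels(outcome="error").inc()
        raise
    finally:
        REQUEST_SECONDS.observe(time.perf_counter() - start)
```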

### Quick Links to Level 2

- [Core Concepts](#core-concepts)
- [Implementation Patterns](#implementation-patterns)
- [Common Pitfalls](#common-pitfalls)

---

## Level 2: Implementation

### Core Concepts

This skill covers essential practices for observability.

**Key areas include:**

- Architecture patterns
- Implementation best practices
- Testing strategies
- Performance optimization

### Implementation Patterns

Apply these patterns when working with observability:

1. **Pattern Selection**: Choose appropriate patterns for your use case
2. **Error Handling**: Implement comprehensive error recovery, as sketched below
3. **Monitoring**: Add observability hooks for production
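
One way to combine these patterns is a decorator that adds an observability hook around any function and treats metric emission as best effort, so a telemetry failure never breaks the wrapped code path. This is a hypothetical sketch, again assuming `prometheus_client`, not a prescribed implementation.

```python
# Decorator sketch: observability hook with best-effort metric emission
# (hypothetical pattern; assumes prometheus_client).
import functools
import logging
import time

from prometheus_client import Counter, Histogram

CALLS_TOTAL = Counter("app_calls_total", "Function calls", ["func", "outcome"])
CALL_SECONDS = Histogram("app_call_duration_seconds", "Call duration", ["func"])
log = logging.getLogger(__name__)

def observed(func):
    """Record call count, outcome, and duration; never let metrics raise."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        outcome = "success"
        try:
            return func(*args, **kwargs)
        except Exception:
            outcome = "error"
            raise
        finally:
            try:  # metric failures are logged, not propagated
                CALLS_TOTAL.labels(func=func.__name__, outcome=outcome).inc()
                CALL_SECONDS.labels(func=func.__name__).observe(
                    time.perf_counter() - start
                )
            except Exception:
                log.warning("metric emission failed", exc_info=True)
    return wrapper

@observed
def process_batch(items: list) -> int:
    return len(items)
```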

### Common Pitfalls

Avoid these common mistakes:

- Skipping validation of inputs
- Ignoring edge cases
- Missing test coverage
- Poor documentation

---

## Level 3: Mastery Resources

### Reference Materials

- [Related Standards](../../docs/standards/)
- [Best Practices Guide](../../docs/guides/)

### Templates

See the `templates/` directory for starter configurations.

### External Resources

Consult official documentation and community best practices for observability.

Overview

This skill defines metrics standards for observability environments to help teams instrument, validate, and maintain reliable telemetry. It focuses on industry-standard patterns, secure defaults, and practical guidance to get production-grade metrics in place quickly. The goal is clear, testable, and maintainable metrics that support monitoring, alerting, and performance analysis.

How this skill works

The skill inspects metric design, implementation patterns, and operational controls. It recommends architectures for metric collection, error handling, and performance considerations, and it provides checklists and templates to accelerate adoption. Implementers use the guidance to validate inputs, add observability hooks, and integrate testing and documentation for metrics pipelines.

When to use it

  • When adding metrics to a new service or feature to ensure consistency
  • When auditing existing telemetry for gaps, security, or performance issues
  • When defining SLIs, SLOs, and alerting rules based on reliable metrics
  • When onboarding teams to shared observability standards and templates
  • When preparing metrics for production deployment and long-term maintenance

Best practices

  • Choose metric types and labels that reflect cardinality and query needs
  • Validate and sanitize inputs before recording metrics to avoid noise (see the sketch after this list)
  • Instrument error handling and edge cases to improve signal accuracy
  • Write unit and integration tests for metric emission and aggregation
  • Document metric semantics, units, and intended use for consumers
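
As one illustration of validating inputs before recording, the sketch below clamps free-form values to a fixed allowlist before they become label values. The allowlist, metric name, and fallback value are hypothetical; it assumes `prometheus_client`.

```python
# Label sanitization sketch: clamp free-form input to a fixed allowlist
# so unvalidated data never becomes a label value (values are hypothetical).
from prometheus_client import Counter

ALLOWED_REGIONS = {"us-east", "us-west", "eu-central"}

LOGINS_TOTAL = Counter("app_logins_total", "Login attempts", ["region"])

def safe_region(raw: str) -> str:
    """Return a known region, or 'other' to cap label cardinality."""
    value = (raw or "").strip().lower()
    return value if value in ALLOWED_REGIONS else "other"

def record_login(raw_region: str) -> None:
    LOGINS_TOTAL.labels(region=safe_region(raw_region)).inc()

record_login("US-East")   # counted under region="us-east"
record_login("<script>")  # clamped to region="other", no metric noise
```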

Example use cases

  • Standardizing counter, gauge, and histogram usage across microservices (illustrated after this list)
  • Auditing a metrics pipeline to remove high-cardinality labels
  • Implementing SLI-based alerts using well-defined service metrics
  • Creating templates for consistent metric naming, units, and tags
  • Adding observability hooks to background jobs and batch processes
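
A naming template can encode these conventions directly in code. The sketch below defines one metric of each type with unit-suffixed names and documented semantics; the `worker_` prefix and metric names are illustrative, and `prometheus_client` is assumed.

```python
# Naming/units template sketch (illustrative names; assumes prometheus_client).
from prometheus_client import Counter, Gauge, Histogram

# Counters are cumulative and end in _total.
JOBS_TOTAL = Counter("worker_jobs_total", "Jobs processed", ["outcome"])

# Gauges report a current value and name the thing measured.
QUEUE_DEPTH = Gauge("worker_queue_depth", "Jobs currently queued")

# Histograms carry the unit in the name (_seconds, _bytes, ...).
JOB_SECONDS = Histogram("worker_job_duration_seconds", "Job duration in seconds")

# A background job or batch process can share these definitions,
# keeping names, units, and tags consistent across services.
def drain_one(queue: list) -> None:
    QUEUE_DEPTH.set(len(queue))
    with JOB_SECONDS.time():  # context manager records elapsed seconds
        queue.pop()
    JOBS_TOTAL.labels(outcome="success").inc()
```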

FAQ

How do I choose labels without causing high cardinality?

Prefer low-cardinality dimensions such as environment or region; avoid including user IDs or request-level identifiers as labels. Use labels for dimensions you will query and aggregate on.
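
To make the distinction concrete, the sketch below contrasts a bounded label set with an unbounded identifier; the metric names are hypothetical and `prometheus_client` is assumed.

```python
# Cardinality sketch: bounded labels vs. an unbounded identifier
# (hypothetical metric names; assumes prometheus_client).
from prometheus_client import Counter

# Good: environment and region come from small, fixed sets,
# so the number of label combinations stays bounded.
API_ERRORS = Counter(
    "api_errors_total", "API errors", ["environment", "region"]
)
API_ERRORS.labels(environment="prod", region="us-east").inc()

# Bad: a user ID creates one time series per user (unbounded cardinality).
# BAD_ERRORS = Counter("api_errors_by_user_total", "Do not do this", ["user_id"])
# BAD_ERRORS.labels(user_id="9f3a...").inc()  # series explosion
```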

What tests should I add for metrics?

Add unit tests that assert metrics are emitted on expected code paths and integration tests that validate aggregation, naming, and label presence in a staging environment.
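
As a sketch of such a unit test, `prometheus_client` exposes the default registry for assertions via `REGISTRY.get_sample_value`. The function under test and its metric name are hypothetical; the test assumes pytest-style discovery.

```python
# Unit test sketch: assert a metric is emitted on the expected code path
# (assumes prometheus_client and pytest; names are hypothetical).
from prometheus_client import REGISTRY, Counter

CHECKOUTS_TOTAL = Counter("shop_checkouts_total", "Checkouts", ["outcome"])

def checkout(cart: list) -> bool:
    ok = bool(cart)
    CHECKOUTS_TOTAL.labels(outcome="success" if ok else "empty").inc()
    return ok

def test_checkout_emits_success_metric():
    before = REGISTRY.get_sample_value(
        "shop_checkouts_total", {"outcome": "success"}
    ) or 0.0
    assert checkout(["book"])
    after = REGISTRY.get_sample_value(
        "shop_checkouts_total", {"outcome": "success"}
    )
    assert after == before + 1.0  # exactly one success increment
```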