home / skills / 404kidwiz / claude-supercode-skills / database-administrator-skill

database-administrator-skill skill

/database-administrator-skill

This skill provides senior database administration expertise for production systems, enabling high availability, performance tuning, backup strategies, and

npx playbooks add skill 404kidwiz/claude-supercode-skills --skill database-administrator-skill

Review the files below or copy the command above to add this skill to your agents.

Files (3)
SKILL.md
5.3 KB
---
name: database-administrator
description: "Senior Database Administrator with expertise in PostgreSQL, MySQL, MongoDB, and enterprise database systems. Specializes in high availability architectures, performance tuning, backup strategies, and database security for production environments."
---

# Database Administrator

## Purpose

Provides senior-level database administration expertise for production database systems including PostgreSQL, MySQL, MongoDB, and enterprise databases. Specializes in high availability architectures, performance tuning, backup strategies, disaster recovery, and database security for mission-critical environments.

## When to Use

- Setting up production databases with high availability and disaster recovery
- Optimizing database performance (slow queries, indexing, configuration tuning)
- Implementing backup and recovery strategies (PITR, cross-region backups)
- Migrating databases (PostgreSQL, MySQL, MongoDB) to cloud or between versions
- Hardening database security (encryption, access control, audit logging)
- Troubleshooting database issues (locks, replication lag, corruption)
- Designing database architectures for scalability and reliability

## Quick Start

**Invoke this skill when:**
- Setting up production databases with high availability and disaster recovery
- Optimizing database performance (slow queries, indexing, configuration tuning)
- Implementing backup and recovery strategies (PITR, cross-region backups)
- Migrating databases (PostgreSQL, MySQL, MongoDB) to cloud or between versions
- Hardening database security (encryption, access control, audit logging)
- Troubleshooting database issues (locks, replication lag, corruption)

**Do NOT invoke when:**
- Only application-level ORM queries need optimization (use backend-developer)
- Data pipeline development (use data-engineer for ETL/ELT)
- Data modeling and schema design for analytics (use data-engineer)
- Database selection for new projects (use cloud-architect for strategy)
- Simple SQL queries or data analysis (use data-analyst)

## Decision Framework

### Database Selection

| Use Case | Database | Why |
|----------|----------|-----|
| **Transactional (OLTP)** | PostgreSQL | ACID, extensions, JSON support |
| **High-read web apps** | MySQL/MariaDB | Fast reads, mature replication |
| **Flexible schema** | MongoDB | Document model, horizontal scale |
| **Key-value cache** | Redis | Sub-ms latency, data structures |
| **Time-series data** | TimescaleDB/InfluxDB | Optimized for time-based queries |
| **Analytics (OLAP)** | Snowflake/BigQuery | Columnar, massive scale |

### High Availability Architecture

```
├─ Single Region HA?
│   ├─ Managed service → RDS Multi-AZ / Cloud SQL HA
│   │   Pros: Automatic failover, managed backups
│   │   Cost: 2x compute (standby instance)
│   │
│   └─ Self-managed → Patroni + etcd (PostgreSQL)
│       Pros: Full control, no vendor lock-in
│       Cost: Operational overhead
│
├─ Multi-Region HA?
│   ├─ Active-Passive → Cross-region read replicas
│   │   Pros: Simple, low cost
│   │   Cons: Manual failover, data lag
│   │
│   └─ Active-Active → CockroachDB / Spanner
│       Pros: True global distribution
│       Cons: Complexity, cost
│
└─ Horizontal Scaling?
    ├─ Read scaling → Read replicas
    ├─ Write scaling → Sharding (MongoDB, Vitess)
    └─ Both → Distributed SQL (CockroachDB, TiDB)
```

### Backup Strategy Matrix

| RPO Requirement | Strategy | Implementation |
|-----------------|----------|----------------|
| **< 1 minute** | Synchronous replication | Patroni sync mode |
| **< 5 minutes** | Continuous WAL archiving | pg_basebackup + WAL-G |
| **< 1 hour** | Automated snapshots | RDS automated backups |
| **< 24 hours** | Daily backups | pg_dump + S3 |

### Performance Tuning Priorities

1. **Query optimization** (biggest impact, lowest cost)
2. **Indexing strategy** (moderate effort, high impact)
3. **Configuration tuning** (one-time, moderate impact)
4. **Hardware upgrades** (high cost, last resort)

## Quality Checklist

### Production Readiness
- [ ] High availability configured (multi-AZ or multi-region)
- [ ] Automated backups enabled (daily + continuous WAL)
- [ ] Backup restoration tested (monthly disaster recovery drill)
- [ ] Connection pooling configured (PgBouncer/ProxySQL)
- [ ] Monitoring and alerting active (slow queries, replication lag)

### Performance
- [ ] Indexes created for all query patterns
- [ ] Table statistics up-to-date (autovacuum tuned)
- [ ] Query plans reviewed (no full table scans on large tables)
- [ ] Connection pooling optimized (min/max pool size)
- [ ] Database configuration tuned (shared_buffers, work_mem)

### Security
- [ ] Encryption at rest enabled
- [ ] Encryption in transit (SSL/TLS) enforced
- [ ] Least privilege access (no superuser for applications)
- [ ] Audit logging enabled (failed logins, DDL changes)
- [ ] Regular security patching scheduled

### Disaster Recovery
- [ ] RTO/RPO documented and tested
- [ ] Cross-region backups enabled
- [ ] Failover procedure documented and tested
- [ ] Data retention policy enforced
- [ ] Point-in-time recovery validated

## Additional Resources

- **Detailed Technical Reference**: See [REFERENCE.md](REFERENCE.md)
- **Code Examples & Patterns**: See [EXAMPLES.md](EXAMPLES.md)

Overview

This skill provides senior database administration expertise for production systems including PostgreSQL, MySQL, MongoDB, and enterprise databases. It focuses on high availability architectures, performance tuning, backup and recovery, and security hardening for mission-critical environments. The goal is to deliver reliable, performant, and secure database platforms with clear operational procedures.

How this skill works

The skill inspects architecture, configuration, and operational practices to identify risks and optimizations. It evaluates HA and DR designs, tuning opportunities (queries, indexes, configuration), backup strategies (PITR, snapshots, cross-region), and security controls (encryption, access, audit). It produces prioritized recommendations, remediation steps, and testing plans for rollout in production environments.

When to use it

  • Setting up production databases with high availability and documented failover procedures
  • Diagnosing and fixing performance issues: slow queries, index gaps, or configuration limits
  • Designing or validating backup and disaster recovery strategies, including PITR
  • Planning migrations between versions or to cloud managed services
  • Hardening database security: encryption, least-privilege, and auditing

Best practices

  • Prioritize query optimization and indexing before adding hardware
  • Automate backups and test restorations regularly (monthly DR drills)
  • Use connection pooling (PgBouncer / ProxySQL) to manage client load
  • Document RTO/RPO, failover steps, and rollback procedures for every change
  • Enforce encryption in transit and at rest, and apply least-privilege access controls

Example use cases

  • Design a multi-AZ PostgreSQL deployment with automated failover and PITR
  • Tune a MySQL cluster suffering from replication lag and slow read queries
  • Migrate a self-hosted MongoDB sharded cluster to a managed cloud service
  • Create a backup matrix mapping RPO requirements to WAL archiving and snapshots
  • Perform a security audit: enable TLS, tighten roles, and enable audit logging

FAQ

When should I add read replicas vs. sharding?

Use read replicas to scale read throughput and offload reporting. Choose sharding when write throughput or dataset size exceeds a single node's capacity and application can handle partitioning.

How often should I test backups?

Run full restore tests at least monthly and validate point-in-time recovery and cross-region restores as part of your disaster recovery drills.