
databases-sql skill

/skills/databases-sql

This skill helps you write efficient SQL across platforms, optimize queries, and design analytics-ready databases for data warehousing.

npx playbooks add skill pluginagentmarketplace/custom-plugin-data-analyst --skill databases-sql

Review the files below or copy the command above to add this skill to your agents.

Files (7)
SKILL.md
2.4 KB
---
name: databases-sql
description: SQL database querying, optimization, and data management for analytics
version: "2.0.0"
sasmp_version: "2.0.0"
bonded_agent: 02-sql-databases-expert
bond_type: PRIMARY_BOND

# Skill Configuration
config:
  atomic: true
  retry_enabled: true
  max_retries: 3
  backoff_strategy: exponential
  query_timeout: 120

# Parameter Validation
parameters:
  database_type:
    type: string
    required: true
    enum: [postgresql, mysql, sqlserver, sqlite, bigquery, snowflake]
    default: postgresql
  skill_level:
    type: string
    required: true
    enum: [beginner, intermediate, advanced, expert]
    default: beginner
  focus_area:
    type: string
    required: false
    enum: [queries, optimization, design, warehousing, all]
    default: all

# Observability
observability:
  logging_level: info
  metrics: [query_count, execution_time, optimization_score]
  query_logging: true
---

# Databases & SQL Skill

## Overview
Master SQL and database concepts essential for data analysts, from basic queries to advanced optimization and data warehousing.

## Core Topics

### SQL Fundamentals
- SELECT, FROM, WHERE, ORDER BY
- JOINs (INNER, LEFT, RIGHT, FULL)
- GROUP BY and aggregate functions
- Subqueries and CTEs
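A minimal sketch of these fundamentals, run against an in-memory SQLite database via Python's `sqlite3` module (the `customers`/`orders` tables are illustrative):

```python
import sqlite3

# In-memory database with two illustrative tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER,
                         amount REAL);
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO orders VALUES (1, 1, 50.0), (2, 1, 30.0), (3, 2, 20.0);
""")

# CTE + LEFT JOIN + GROUP BY: total spend per customer, highest first.
rows = conn.execute("""
    WITH totals AS (
        SELECT customer_id, SUM(amount) AS total
        FROM orders
        GROUP BY customer_id
    )
    SELECT c.name, t.total
    FROM customers c
    LEFT JOIN totals t ON t.customer_id = c.id
    ORDER BY t.total DESC
""").fetchall()
print(rows)  # [('Ada', 80.0), ('Grace', 20.0)]
```

The same query could be written with a subquery in the FROM clause; the CTE form reads top-down and is easier to extend.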

### Advanced SQL
- Window functions (ROW_NUMBER, RANK, LAG, LEAD)
- Recursive queries
- Query optimization and execution plans
- Index strategies
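A hedged window-function example on an illustrative `sales` table (in-memory SQLite, which supports window functions from version 3.25):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (day TEXT, amount REAL);
    INSERT INTO sales VALUES
        ('2024-01-01', 10), ('2024-01-02', 30), ('2024-01-03', 20);
""")

# ROW_NUMBER orders the days; LAG compares each day with the previous one.
rows = conn.execute("""
    SELECT day,
           amount,
           ROW_NUMBER() OVER (ORDER BY day)         AS rn,
           amount - LAG(amount) OVER (ORDER BY day) AS delta
    FROM sales
""").fetchall()
print(rows)
# [('2024-01-01', 10.0, 1, None),
#  ('2024-01-02', 30.0, 2, 20.0),
#  ('2024-01-03', 20.0, 3, -10.0)]
```

Note that `LAG` yields NULL (Python `None`) on the first row, since there is no preceding row in the window.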

### Database Concepts
- Relational database design principles
- Normalization and denormalization
- Data warehousing concepts (star and snowflake schemas)
- ETL processes
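A star schema can be sketched as one fact table joined to its dimensions; the table names here (`fact_sales`, `dim_product`, `dim_date`) are illustrative:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Star schema: one central fact table surrounded by dimension tables.
    CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, month TEXT);
    CREATE TABLE fact_sales (product_key INTEGER, date_key INTEGER,
                             revenue REAL);
    INSERT INTO dim_product VALUES (1, 'widget'), (2, 'gadget');
    INSERT INTO dim_date VALUES (10, '2024-01'), (11, '2024-02');
    INSERT INTO fact_sales VALUES (1, 10, 100.0), (1, 11, 150.0),
                                  (2, 10, 80.0);
""")

# Typical star-schema rollup: revenue per product per month.
rows = conn.execute("""
    SELECT p.name, d.month, SUM(f.revenue) AS revenue
    FROM fact_sales f
    JOIN dim_product p ON p.product_key = f.product_key
    JOIN dim_date d ON d.date_key = f.date_key
    GROUP BY p.name, d.month
    ORDER BY p.name, d.month
""").fetchall()
print(rows)
```

Every analytic query follows the same shape: join the fact table to the dimensions you need, then group by the dimension attributes.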

### Popular Platforms
- PostgreSQL
- MySQL
- SQL Server
- BigQuery, Redshift, Snowflake

## Learning Objectives
- Write efficient SQL queries for data extraction
- Understand database design and optimization
- Work with cloud data warehouses
- Implement ETL processes for analytics pipelines

## Error Handling

| Error Type | Cause | Recovery |
|------------|-------|----------|
| Syntax error | Invalid SQL | Validate query syntax before execution |
| Timeout | Long-running query | Add indexes, optimize query |
| Connection failed | Network/auth issue | Retry with exponential backoff |
| Permission denied | Access rights | Request appropriate permissions |
| Deadlock | Concurrent transactions | Retry transaction |

## Related Skills
- foundations (for data concepts)
- programming (for SQL with Python/R)
- advanced (for big data processing)

Overview

This skill provides practical SQL database querying, optimization, and data management techniques tailored for data analysts. It covers fundamentals through advanced topics, including window functions, query tuning, and data warehousing. The focus is on producing efficient queries and reliable analytics pipelines across popular platforms like PostgreSQL, MySQL, BigQuery, Redshift, and Snowflake.

How this skill works

The skill teaches how to write and optimize SQL queries, design relational schemas, and implement ETL patterns for analytics workflows. It walks through query structure and execution-plan analysis, explains indexing strategies, and shows when to apply normalization or denormalization depending on analytic needs. Error types and recovery steps are included for troubleshooting syntax errors, timeouts, connection issues, permission problems, and deadlocks.

When to use it

  • Extracting, transforming, and aggregating data for reports or dashboards
  • Optimizing slow queries by reviewing execution plans and adding appropriate indexes
  • Designing or refactoring schemas for analytical workloads (star/snowflake schemas)
  • Implementing ETL jobs to move data into a cloud data warehouse
  • Handling concurrency, permissions, and stability issues in production databases

Best practices

  • Start with clear requirements and sample queries before optimizing for performance
  • Use CTEs and window functions to simplify complex aggregations while monitoring performance
  • Prefer set-based operations over row-by-row processing to leverage DB engine optimizations
  • Test changes against realistic data volumes and review execution plans after schema/index changes
  • Implement retry logic and backoff for transient connection or deadlock failures
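The retry-and-backoff practice above can be sketched in Python; the callable and the `RuntimeError` stand-in are placeholders for a real driver's transient-error class (e.g. a deadlock or connection exception):

```python
import random
import time

def run_with_retries(execute, max_retries=3, base_delay=0.1):
    """Retry a query callable on transient failures with exponential backoff.

    `execute` and the caught exception type are illustrative; real code
    would catch the driver's specific deadlock/connection exception.
    """
    for attempt in range(max_retries + 1):
        try:
            return execute()
        except RuntimeError:
            if attempt == max_retries:
                raise
            # Exponential backoff with jitter: base, 2x, 4x, ...
            time.sleep(base_delay * (2 ** attempt) * (1 + random.random()))

# Simulated transient failure that succeeds on the third attempt.
attempts = {"n": 0}
def flaky_query():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RuntimeError("deadlock detected")
    return "ok"

result = run_with_retries(flaky_query, base_delay=0.01)
print(result)  # ok
```

The jitter term spreads out retries from concurrent clients so they don't re-collide on the same schedule.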

Example use cases

  • Build a nightly ETL pipeline to consolidate transactional data into a star schema for BI tools
  • Tune an onboarding report that has slowed tenfold by rewriting joins and adding covering indexes
  • Use window functions to calculate cohort retention and rolling averages for product analytics
  • Migrate queries from on-premises PostgreSQL to BigQuery and adapt them for columnar performance characteristics
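As one concrete illustration of the rolling-average use case, a trailing 3-day moving average computed with a window frame (the `daily_signups` table is illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE daily_signups (day TEXT, signups INTEGER);
    INSERT INTO daily_signups VALUES
        ('2024-01-01', 4), ('2024-01-02', 8), ('2024-01-03', 6),
        ('2024-01-04', 10);
""")

# ROWS BETWEEN 2 PRECEDING AND CURRENT ROW = trailing 3-day window.
rows = conn.execute("""
    SELECT day,
           AVG(signups) OVER (
               ORDER BY day
               ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
           ) AS avg_3d
    FROM daily_signups
""").fetchall()
print(rows)
```

The frame shrinks at the start of the series (one row, then two), so the first averages cover fewer than three days.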

FAQ

What should I do when a query times out?

Analyze the execution plan, add or adjust indexes, rewrite heavy subqueries, and consider materialized views or batching to reduce runtime.
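One way to confirm an index helps is to compare execution plans before and after creating it. SQLite's `EXPLAIN QUERY PLAN` is used here as a stand-in for your platform's plan tooling (PostgreSQL's `EXPLAIN`, etc.); exact plan wording varies by version:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (user_id INTEGER, kind TEXT);
    INSERT INTO events VALUES (1, 'click'), (2, 'view'), (1, 'view');
""")

def plan(sql):
    # The last column of EXPLAIN QUERY PLAN output describes each step.
    return " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

query = "SELECT * FROM events WHERE user_id = 1"
before = plan(query)  # full-table SCAN without an index
conn.execute("CREATE INDEX idx_events_user ON events(user_id)")
after = plan(query)   # index SEARCH once the index exists
print(before)
print(after)
```

Seeing the plan flip from a table scan to an index search is the quickest sanity check that the optimizer is actually using your new index.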

How do I choose between normalization and denormalization for analytics?

Normalize for OLTP to reduce redundancy; denormalize and adopt star schemas for analytics to speed up read-heavy queries and simplify joins.