home / skills / physics91 / claude-vibe / sql-optimizer

sql-optimizer skill

safe

This skill analyzes and optimizes SQL queries for performance, index usage, and N+1 detection across PostgreSQL, MySQL, and SQLite.

npx playbooks add skill physics91/claude-vibe --skill sql-optimizer

Review the files below or copy the command above to add this skill to your agents.

Files (1)

SKILL.md

7.1 KB

---
name: sql-optimizer
description: |
  WHEN: SQL query review, query optimization, index usage, N+1 detection, performance analysis
  WHAT: Query plan analysis + Index recommendations + N+1 detection + Join optimization + Performance tuning
  WHEN NOT: Schema design → schema-reviewer, ORM code → orm-reviewer
---

# SQL Optimizer Skill

## Purpose
Analyzes and optimizes SQL queries for performance, index usage, and best practices.

## When to Use
- SQL query optimization
- Query performance review
- Index usage analysis
- N+1 query detection
- Slow query troubleshooting

## Project Detection
- `.sql` files
- Query strings in code
- Database migration files
- ORM query logs

## Workflow

### Step 1: Analyze Query
```
**Database**: PostgreSQL/MySQL/SQLite
**Tables**: users, orders, products
**Query Type**: SELECT with JOINs
**Estimated Rows**: 100K+
```

### Step 2: Select Review Areas
**AskUserQuestion:**
```
"Which areas to review?"
Options:
- Full query optimization (recommended)
- Index usage analysis
- Join optimization
- Subquery refactoring
- N+1 detection
multiSelect: true
```

## Detection Rules

### Index Usage
| Check | Recommendation | Severity |
|-------|----------------|----------|
| Full table scan | Add appropriate index | CRITICAL |
| Index not used | Check column order | HIGH |
| Too many indexes | Consolidate indexes | MEDIUM |
| Missing composite index | Add multi-column index | HIGH |

```sql
-- BAD: No index on filter columns
SELECT * FROM orders
WHERE created_at > '2024-01-01'
AND status = 'pending';
-- Full table scan!

-- GOOD: Add composite index
CREATE INDEX idx_orders_status_created
ON orders(status, created_at);

-- Index order matters!
-- For WHERE status = ? AND created_at > ?
-- Index(status, created_at) ✓
-- Index(created_at, status) ✗ (less effective)
```

### SELECT Optimization
| Check | Recommendation | Severity |
|-------|----------------|----------|
| SELECT * | Select specific columns | HIGH |
| Unnecessary columns | Remove unused columns | MEDIUM |
| No LIMIT | Add LIMIT for large results | HIGH |

```sql
-- BAD: SELECT * with large result
SELECT * FROM orders
WHERE user_id = 123;
-- Returns all columns, no limit

-- GOOD: Specific columns, limited results
SELECT id, status, total, created_at
FROM orders
WHERE user_id = 123
ORDER BY created_at DESC
LIMIT 20;
```

### JOIN Optimization
| Check | Recommendation | Severity |
|-------|----------------|----------|
| Cartesian product | Add join condition | CRITICAL |
| Join on non-indexed column | Add index | HIGH |
| Too many joins | Consider denormalization | MEDIUM |
| Implicit join | Use explicit JOIN syntax | LOW |

```sql
-- BAD: Implicit join (harder to read, error-prone)
SELECT o.*, u.name
FROM orders o, users u
WHERE o.user_id = u.id;

-- GOOD: Explicit JOIN
SELECT o.id, o.total, u.name
FROM orders o
INNER JOIN users u ON o.user_id = u.id;

-- BAD: Join on non-indexed column
SELECT o.*, p.name
FROM orders o
JOIN products p ON o.product_code = p.code;
-- If products.code has no index → slow!

-- FIX: Add index
CREATE INDEX idx_products_code ON products(code);
```

### Subquery Optimization
| Check | Recommendation | Severity |
|-------|----------------|----------|
| Correlated subquery | Convert to JOIN | HIGH |
| IN with subquery | Use EXISTS or JOIN | MEDIUM |
| Subquery in SELECT | Move to JOIN | HIGH |

```sql
-- BAD: Correlated subquery (runs for each row)
SELECT *
FROM orders o
WHERE total > (
    SELECT AVG(total)
    FROM orders
    WHERE user_id = o.user_id
);

-- GOOD: Use window function
SELECT *
FROM (
    SELECT *,
           AVG(total) OVER (PARTITION BY user_id) as avg_total
    FROM orders
) sub
WHERE total > avg_total;

-- BAD: IN with large subquery
SELECT * FROM users
WHERE id IN (SELECT user_id FROM orders WHERE status = 'vip');

-- GOOD: Use EXISTS or JOIN
SELECT u.* FROM users u
WHERE EXISTS (
    SELECT 1 FROM orders o
    WHERE o.user_id = u.id AND o.status = 'vip'
);

-- Or with JOIN
SELECT DISTINCT u.*
FROM users u
INNER JOIN orders o ON o.user_id = u.id
WHERE o.status = 'vip';
```

### N+1 Query Detection
| Check | Recommendation | Severity |
|-------|----------------|----------|
| Loop with query | Batch fetch | CRITICAL |
| Lazy load in loop | Eager load | CRITICAL |

```sql
-- N+1 Pattern (application code)
-- Query 1: Get all users
SELECT * FROM users;

-- Then for each user (N queries):
SELECT * FROM orders WHERE user_id = 1;
SELECT * FROM orders WHERE user_id = 2;
SELECT * FROM orders WHERE user_id = 3;
-- ... N more queries

-- SOLUTION 1: JOIN
SELECT u.*, o.*
FROM users u
LEFT JOIN orders o ON o.user_id = u.id;

-- SOLUTION 2: IN query (for separate queries)
SELECT * FROM orders WHERE user_id IN (1, 2, 3, ...);
```

### Aggregation Optimization
| Check | Recommendation | Severity |
|-------|----------------|----------|
| COUNT(*) on large table | Use approximate count | MEDIUM |
| GROUP BY without index | Add index | HIGH |
| HAVING vs WHERE | Filter early with WHERE | MEDIUM |

```sql
-- BAD: COUNT on entire table
SELECT COUNT(*) FROM orders;
-- Scans entire table

-- GOOD: Approximate count (PostgreSQL)
SELECT reltuples::bigint AS estimate
FROM pg_class
WHERE relname = 'orders';

-- BAD: WHERE in HAVING
SELECT user_id, COUNT(*)
FROM orders
GROUP BY user_id
HAVING status = 'completed';  -- Wrong place!

-- GOOD: Filter before grouping
SELECT user_id, COUNT(*)
FROM orders
WHERE status = 'completed'  -- Filter first
GROUP BY user_id;

-- Index for GROUP BY
CREATE INDEX idx_orders_user_status
ON orders(user_id, status);
```

### EXPLAIN Analysis
```sql
-- PostgreSQL EXPLAIN
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT u.name, COUNT(o.id)
FROM users u
LEFT JOIN orders o ON o.user_id = u.id
GROUP BY u.id;

-- Look for:
-- ✗ Seq Scan (full table scan)
-- ✗ Nested Loop with high rows
-- ✗ Hash Join with large hash
-- ✓ Index Scan
-- ✓ Index Only Scan
-- ✓ Bitmap Index Scan
```

## Response Template
```
## SQL Query Optimization Results

**Database**: PostgreSQL 15
**Query Type**: SELECT with JOIN
**Estimated Impact**: ~10x improvement

### Index Usage
| Status | Issue | Recommendation |
|--------|-------|----------------|
| CRITICAL | Full table scan on orders | Add index on (status, created_at) |

### Join Analysis
| Status | Issue | Recommendation |
|--------|-------|----------------|
| HIGH | Non-indexed join column | Add index on products.code |

### Query Structure
| Status | Issue | Recommendation |
|--------|-------|----------------|
| HIGH | SELECT * with no LIMIT | Select specific columns, add LIMIT |

### Recommended Indexes
```sql
CREATE INDEX idx_orders_status_created ON orders(status, created_at);
CREATE INDEX idx_products_code ON products(code);
```

### Optimized Query
```sql
SELECT o.id, o.total, p.name
FROM orders o
INNER JOIN products p ON o.product_id = p.id
WHERE o.status = 'pending'
ORDER BY o.created_at DESC
LIMIT 100;
```
```

## Best Practices
1. **Indexes**: Add for WHERE, JOIN, ORDER BY columns
2. **SELECT**: Only needed columns, with LIMIT
3. **JOINs**: Explicit syntax, indexed columns
4. **Subqueries**: Prefer JOINs or CTEs
5. **EXPLAIN**: Always analyze query plans

## Integration
- `schema-reviewer`: Database design
- `orm-reviewer`: ORM query patterns
- `perf-analyzer`: Application performance

Overview

This skill analyzes SQL queries to find performance bottlenecks and produce concrete optimization suggestions. It focuses on query plan analysis, index recommendations, join and subquery refactoring, N+1 detection, and practical tuning steps. Designed to deliver actionable changes that reduce query latency and resource usage.

How this skill works

The skill inspects SQL text, EXPLAIN/ANALYZE outputs, and query patterns extracted from code or logs to detect full table scans, missing or unused indexes, inefficient joins, correlated subqueries, and N+1 loops. It ranks issues by severity and returns targeted fixes: index DDL, rewritten queries, join/index ordering advice, and guidance for batching or eager loading. Recommendations include example SQL and a short rationale so you can apply and validate changes quickly.

When to use it

Review slow queries discovered in logs or profiling
Optimize SELECTs that do full table scans or read many columns
Diagnose and fix N+1 query patterns in application code
Improve join performance or recommend composite indexes
Refactor correlated subqueries or heavy aggregations

Best practices

Add indexes on WHERE, JOIN, and ORDER BY columns; prefer composite indexes matching filter order
Avoid SELECT * — request only required columns and add LIMIT for large result sets
Convert correlated subqueries to JOINs, CTEs, or window functions when possible
Detect N+1 by checking per-row queries; use IN, JOIN, or batch fetches to reduce round trips
Run EXPLAIN ANALYZE and check for sequential scans, nested loops on large data, and missing index usage

Example use cases

Analyze an EXPLAIN ANALYZE output and get a prioritized list of index and query changes
Turn a loop of per-record queries into a single batched query to eliminate N+1
Recommend composite index (status, created_at) for frequent status + date filters
Refactor a correlated subquery into a window function or JOIN to avoid repeated scanning
Suggest denormalization or selective joins when excessive joins cause large intermediate results

FAQ

Do you modify schema or application code automatically?

No. I provide explicit SQL DDL and rewritten queries you can review and apply. Schema changes should be tested in staging.

Which databases are supported?

Advice targets PostgreSQL, MySQL, and SQLite patterns. EXPLAIN outputs and features may vary; recommendations note database-specific alternatives.