home / skills / ancoleman / ai-design-components / optimizing-sql
This skill helps you optimize SQL query performance across PostgreSQL, MySQL, and SQL Server using explain analysis, indexing strategies, and query rewriting.
npx playbooks add skill ancoleman/ai-design-components --skill optimizing-sqlReview the files below or copy the command above to add this skill to your agents.
---
name: optimizing-sql
description: Optimize SQL query performance through EXPLAIN analysis, indexing strategies, and query rewriting for PostgreSQL, MySQL, and SQL Server. Use when debugging slow queries, analyzing execution plans, or improving database performance.
---
# SQL Optimization
Provide tactical guidance for optimizing SQL query performance across PostgreSQL, MySQL, and SQL Server through execution plan analysis, strategic indexing, and query rewriting.
## When to Use This Skill
Trigger this skill when encountering:
- Slow query performance or database timeouts
- Analyzing EXPLAIN plans or execution plans
- Determining index requirements
- Rewriting inefficient queries
- Identifying query anti-patterns (N+1, SELECT *, correlated subqueries)
- Database-specific optimization needs (PostgreSQL, MySQL, SQL Server)
## Core Optimization Workflow
### Step 1: Analyze Query Performance
Run execution plan analysis to identify bottlenecks:
**PostgreSQL:**
```sql
EXPLAIN ANALYZE SELECT * FROM users WHERE email = '[email protected]';
```
**MySQL:**
```sql
EXPLAIN FORMAT=JSON SELECT * FROM products WHERE category_id = 5;
```
**SQL Server:**
Use SQL Server Management Studio: Display Estimated Execution Plan (Ctrl+L)
**Key Metrics to Monitor:**
- **Cost**: Estimated resource consumption
- **Rows**: Number of rows processed (estimated vs actual)
- **Scan Type**: Sequential scan vs index scan
- **Execution Time**: Actual time spent on operation
For detailed execution plan interpretation, see `references/explain-guide.md`.
### Step 2: Identify Optimization Opportunities
**Common Red Flags:**
| Indicator | Problem | Solution |
|-----------|---------|----------|
| Seq Scan / Table Scan | Full table scan on large table | Add index on filter columns |
| High row count | Processing excessive rows | Add WHERE filter or index |
| Nested Loop with large outer table | Inefficient join algorithm | Index join columns |
| Correlated subquery | Subquery executes per row | Rewrite as JOIN or EXISTS |
| Sort operation on large result set | Expensive sorting | Add index matching ORDER BY |
For scan type interpretation, see `references/scan-types.md`.
### Step 3: Apply Indexing Strategies
**Index Decision Framework:**
```
Is column used in WHERE, JOIN, ORDER BY, or GROUP BY?
├─ YES → Is column selective (many unique values)?
│ ├─ YES → Is table frequently queried?
│ │ ├─ YES → ADD INDEX
│ │ └─ NO → Consider based on query frequency
│ └─ NO (low selectivity) → Skip index
└─ NO → Skip index
```
**Index Types by Use Case:**
**PostgreSQL:**
- **B-tree** (default): General-purpose, supports <, ≤, =, ≥, >, BETWEEN, IN
- **Hash**: Equality comparisons only (=)
- **GIN**: Full-text search, JSONB, arrays
- **GiST**: Spatial data, geometric types
- **BRIN**: Very large tables with naturally ordered data
**MySQL:**
- **B-tree** (default): General-purpose index
- **Full-text**: Text search on VARCHAR/TEXT columns
- **Spatial**: Spatial data types
**SQL Server:**
- **Clustered**: Table data sorted by index (one per table)
- **Non-clustered**: Separate index structure (multiple allowed)
For comprehensive indexing guidance, see `references/indexing-decisions.md` and `references/index-types.md`.
### Step 4: Design Composite Indexes
For queries filtering on multiple columns, use composite indexes:
**Column Order Matters:**
1. **Equality filters first** (most selective)
2. **Additional equality filters** (by selectivity)
3. **Range filters or ORDER BY** (last)
**Example:**
```sql
-- Query pattern
SELECT * FROM orders
WHERE customer_id = 123 AND status = 'shipped'
ORDER BY created_at DESC
LIMIT 10;
-- Optimal composite index
CREATE INDEX idx_orders_customer_status_created
ON orders (customer_id, status, created_at DESC);
```
For composite index design patterns, see `references/composite-indexes.md`.
### Step 5: Rewrite Inefficient Queries
**Common Anti-Patterns to Avoid:**
**1. SELECT * (Over-fetching)**
```sql
-- ❌ Bad: Fetches all columns
SELECT * FROM users WHERE id = 1;
-- ✅ Good: Fetch only needed columns
SELECT id, name, email FROM users WHERE id = 1;
```
**2. N+1 Queries**
```sql
-- ❌ Bad: 1 + N queries
SELECT * FROM users LIMIT 100;
-- Then in loop: SELECT * FROM posts WHERE user_id = ?;
-- ✅ Good: Single JOIN
SELECT users.*, posts.id AS post_id, posts.title
FROM users
LEFT JOIN posts ON users.id = posts.user_id;
```
**3. Non-Sargable Queries** (functions on indexed columns)
```sql
-- ❌ Bad: Function prevents index usage
SELECT * FROM orders WHERE YEAR(created_at) = 2025;
-- ✅ Good: Sargable range condition
SELECT * FROM orders
WHERE created_at >= '2025-01-01' AND created_at < '2026-01-01';
```
**4. Correlated Subqueries**
```sql
-- ❌ Bad: Subquery executes per row
SELECT name,
(SELECT COUNT(*) FROM orders WHERE orders.user_id = users.id)
FROM users;
-- ✅ Good: JOIN with GROUP BY
SELECT users.name, COUNT(orders.id) AS order_count
FROM users
LEFT JOIN orders ON users.id = orders.user_id
GROUP BY users.id, users.name;
```
For complete anti-pattern reference, see `references/anti-patterns.md`.
For efficient query patterns, see `references/efficient-patterns.md`.
## Quick Reference Tables
### Index Selection Guide
| Query Pattern | Index Type | Example |
|--------------|------------|---------|
| `WHERE column = value` | Single-column B-tree | `CREATE INDEX ON table (column)` |
| `WHERE col1 = ? AND col2 = ?` | Composite B-tree | `CREATE INDEX ON table (col1, col2)` |
| `WHERE text_col LIKE '%word%'` | Full-text (GIN/Full-text) | `CREATE INDEX ON table USING GIN (to_tsvector('english', text_col))` |
| `WHERE geom && box` | Spatial (GiST) | `CREATE INDEX ON table USING GIST (geom)` |
| `WHERE json_col @> '{"key":"value"}'` | JSONB (GIN) | `CREATE INDEX ON table USING GIN (json_col)` |
### Join Optimization Checklist
- [ ] Index foreign key columns on both sides of JOIN
- [ ] Order joins starting with table returning fewest rows
- [ ] Use INNER JOIN when possible (more efficient than OUTER JOIN)
- [ ] Avoid joining more than 5 tables (break into CTEs or subqueries)
- [ ] Consider denormalization for frequently joined tables in read-heavy systems
### Execution Plan Performance Targets
| Scan Type | Performance | When Acceptable |
|-----------|-------------|-----------------|
| Index-Only Scan | Best | Always preferred |
| Index Scan | Excellent | Small-medium result sets |
| Bitmap Heap Scan | Good | Medium result sets (PostgreSQL) |
| Sequential Scan | Poor | Only for small tables (<1000 rows) or full table queries |
| Table Scan | Poor | Only for small tables or unavoidable full scans |
## Database-Specific Optimizations
### PostgreSQL-Specific Features
**Partial Indexes** (index subset of rows):
```sql
CREATE INDEX idx_active_users_login
ON users (last_login)
WHERE status = 'active';
```
**Expression Indexes** (index computed values):
```sql
CREATE INDEX idx_users_email_lower
ON users (LOWER(email));
```
**Covering Indexes** (avoid heap access):
```sql
CREATE INDEX idx_users_email_covering
ON users (email) INCLUDE (id, name);
```
For comprehensive PostgreSQL optimization, see `references/postgresql.md`.
### MySQL-Specific Features
**Index Hints** (override optimizer):
```sql
SELECT * FROM orders USE INDEX (idx_orders_customer)
WHERE customer_id = 123;
```
**Storage Engine Selection:**
- **InnoDB** (default): Transactional, row-level locks, clustered primary key
- **MyISAM**: Faster reads, no transactions, table-level locks
For comprehensive MySQL optimization, see `references/mysql.md`.
### SQL Server-Specific Features
**Query Store** (track query performance over time):
```sql
ALTER DATABASE YourDatabase SET QUERY_STORE = ON;
```
**Execution Plan Warnings:**
- Look for yellow exclamation marks in graphical execution plans
- Thick arrows indicate high row counts
For comprehensive SQL Server optimization, see `references/sqlserver.md`.
## Advanced Optimization Techniques
### Common Table Expressions (CTEs)
Break complex queries into readable, maintainable parts:
```sql
WITH active_customers AS (
SELECT id, name FROM customers WHERE status = 'active'
),
recent_orders AS (
SELECT customer_id, COUNT(*) as order_count
FROM orders
WHERE created_at > NOW() - INTERVAL '30 days'
GROUP BY customer_id
)
SELECT ac.name, COALESCE(ro.order_count, 0) as orders
FROM active_customers ac
LEFT JOIN recent_orders ro ON ac.id = ro.customer_id;
```
### EXISTS vs IN for Subqueries
Use EXISTS for better performance with large datasets:
```sql
-- ✅ Good: EXISTS stops at first match
SELECT * FROM users
WHERE EXISTS (SELECT 1 FROM orders WHERE orders.user_id = users.id);
-- ❌ Less efficient: IN builds full list
SELECT * FROM users
WHERE id IN (SELECT user_id FROM orders);
```
### Denormalization Decision Framework
Consider denormalization when:
- Query joins 3+ tables frequently
- Data is relatively static (infrequent updates)
- Read performance is critical
- Write overhead is acceptable
**Denormalization Strategies:**
1. **Duplicate columns**: Copy foreign key data into main table
2. **Summary tables**: Pre-aggregate data
3. **Materialized views**: Database-maintained denormalized views
4. **Application caching**: Redis/Memcached for frequently accessed data
## Optimization Workflow Example
**Scenario:** API endpoint taking 2 seconds to load
**Step 1: Identify Slow Query**
```
Use APM/observability tools to identify database query causing delay
```
**Step 2: Run EXPLAIN ANALYZE**
```sql
EXPLAIN ANALYZE SELECT * FROM orders
WHERE customer_id = 123
ORDER BY created_at DESC
LIMIT 10;
```
**Step 3: Analyze Output**
```
Seq Scan on orders (cost=0.00..2500.00 rows=10)
Filter: (customer_id = 123)
Rows Removed by Filter: 99990
```
**Problem**: Sequential scan filtering 99,990 rows
**Step 4: Add Composite Index**
```sql
CREATE INDEX idx_orders_customer_created
ON orders (customer_id, created_at DESC);
```
**Step 5: Verify Improvement**
```sql
EXPLAIN ANALYZE SELECT * FROM orders
WHERE customer_id = 123
ORDER BY created_at DESC
LIMIT 10;
```
```
Index Scan using idx_orders_customer_created (cost=0.42..12.44 rows=10)
Index Cond: (customer_id = 123)
```
**Result**: 200x faster (2000ms → 10ms)
## Monitoring and Maintenance
**Regular Optimization Tasks:**
- Review slow query logs weekly
- Update database statistics regularly (ANALYZE in PostgreSQL, UPDATE STATISTICS in SQL Server)
- Monitor index usage (drop unused indexes)
- Archive old data to keep tables manageable
- Review execution plans for critical queries quarterly
**PostgreSQL Statistics Update:**
```sql
ANALYZE table_name;
```
**MySQL Statistics Update:**
```sql
ANALYZE TABLE table_name;
```
**SQL Server Statistics Update:**
```sql
UPDATE STATISTICS table_name;
```
## Related Skills
- **databases-relational**: Schema design and database fundamentals
- **observability**: Performance monitoring and slow query detection
- **api-patterns**: API-level optimization (pagination, caching)
- **performance-engineering**: Application performance profiling
## Additional Resources
For comprehensive documentation, reference these files:
- `references/explain-guide.md` - Detailed EXPLAIN plan interpretation
- `references/scan-types.md` - Scan type meanings and performance implications
- `references/indexing-decisions.md` - When and how to add indexes
- `references/index-types.md` - Database-specific index types
- `references/composite-indexes.md` - Multi-column index design
- `references/anti-patterns.md` - Common anti-patterns with solutions
- `references/efficient-patterns.md` - Efficient query patterns
- `references/postgresql.md` - PostgreSQL-specific optimizations
- `references/mysql.md` - MySQL-specific optimizations
- `references/sqlserver.md` - SQL Server-specific optimizations
For working SQL examples, see `examples/` directory.
This skill optimizes SQL query performance for PostgreSQL, MySQL, and SQL Server through EXPLAIN/plan analysis, targeted indexing strategies, and query rewriting. I provide a pragmatic workflow to detect bottlenecks, design composite and covering indexes, and rewrite anti-patterns to reduce latency and resource usage.
I inspect execution plans (EXPLAIN ANALYZE, FORMAT=JSON, SSMS plans) to identify expensive operations like sequential scans, nested loops, and high row counts. Then I recommend concrete changes: which indexes to add or drop, optimal composite index ordering, sargable rewrites, and database-specific techniques (partial, expression, GIN/GiST, clustered/non-clustered). Finally, I validate improvements by re-running plans and measuring execution time.
How do I know if an index will help?
Check the execution plan for sequential scans, high rows removed by filter, or nested loops; add an index on filter/join columns that are selective and re-run EXPLAIN to confirm reduced cost.
When should I denormalize instead of indexing?
Consider denormalization when joins on 3+ tables are frequent, data changes rarely, read performance is critical, and the write overhead is acceptable; otherwise prefer indexes and materialized views.