home / skills / aj-geddes / useful-ai-prompts / database-indexing-strategy

database-indexing-strategy skill

This skill designs and implements database indexing strategies to boost query performance and reduce contention in PostgreSQL and MySQL.

npx playbooks add skill aj-geddes/useful-ai-prompts --skill database-indexing-strategy

Review the files below or copy the command above to add this skill to your agents.

Files (1)

SKILL.md

7.8 KB

---
name: database-indexing-strategy
description: Design and implement database indexing strategies. Use when creating indexes, choosing index types, or optimizing index performance in PostgreSQL and MySQL.
---

# Database Indexing Strategy

## Overview

Design comprehensive indexing strategies to improve query performance, reduce lock contention, and maintain data integrity. Covers index types, design patterns, and maintenance procedures.

## When to Use

- Index creation and planning
- Query performance optimization through indexing
- Index type selection (B-tree, Hash, GiST, BRIN)
- Composite and partial index design
- Index maintenance and monitoring
- Storage optimization with indexes
- Full-text search index design

## Index Types and Use Cases

### PostgreSQL Index Types

**B-tree Indexes (Default):**

```sql
-- Standard equality and range queries
CREATE INDEX idx_users_email ON users(email);
CREATE INDEX idx_orders_created_at ON orders(created_at DESC);

-- Composite indexes for multi-column queries
CREATE INDEX idx_orders_user_status
ON orders(user_id, status)
WHERE cancelled_at IS NULL;
```

**Hash Indexes:**

```sql
-- Exact match queries only
CREATE INDEX idx_product_sku USING hash ON products(sku);

-- Good for equality lookups on large text fields
CREATE INDEX idx_uuid_hash USING hash ON sessions(session_id);
```

**BRIN Indexes (Block Range):**

```sql
-- For large tables with monotonically increasing columns
CREATE INDEX idx_events_timestamp USING brin ON events(created_at)
WITH (pages_per_range = 128);

-- Excellent for time-series data
CREATE INDEX idx_logs_timestamp USING brin
ON application_logs(log_timestamp);
```

**GiST & GIN Indexes:**

```sql
-- GiST for spatial data and complex types
CREATE INDEX idx_locations_geom USING gist ON locations(geom);

-- GIN for JSONB and array columns
CREATE INDEX idx_products_metadata USING gin ON products(metadata);
CREATE INDEX idx_user_tags USING gin ON users(tags);
```

### MySQL Index Types

**B-tree Indexes:**

```sql
-- Standard index for most queries
CREATE INDEX idx_users_email ON users(email);
CREATE INDEX idx_orders_user_created
ON orders(user_id, created_at);

-- Prefix indexes for large columns
CREATE INDEX idx_description_prefix
ON products(description(100));
```

**FULLTEXT Indexes:**

```sql
-- Full-text search on text columns
CREATE FULLTEXT INDEX idx_products_search
ON products(name, description);

-- Query using MATCH...AGAINST
SELECT * FROM products
WHERE MATCH(name, description) AGAINST('laptop' IN BOOLEAN MODE);
```

**Spatial Indexes:**

```sql
-- For geographic data
CREATE SPATIAL INDEX idx_locations
ON locations(geom);
```

## Index Design Patterns

### Single Column Indexes

**PostgreSQL:**

```sql
-- Filtered index for active records only
CREATE INDEX idx_users_active
ON users(created_at)
WHERE deleted_at IS NULL;

-- Descending order for LIMIT queries
CREATE INDEX idx_posts_published DESC
ON posts(published_at DESC)
WHERE status = 'published';
```

**MySQL:**

```sql
-- Simple equality lookup
CREATE INDEX idx_users_verified ON users(email_verified);

-- Range queries on numeric columns
CREATE INDEX idx_products_price ON products(price);
```

### Composite Indexes

**PostgreSQL - Optimal Ordering:**

```sql
-- Order: equality columns, then range, then sort
-- Query: WHERE user_id = X AND created_at > Y ORDER BY id
CREATE INDEX idx_optimal_composite
ON orders(user_id, created_at, id);

-- Covering index to eliminate table access
CREATE INDEX idx_covering_orders
ON orders(user_id, status, created_at)
INCLUDE (total, currency);
```

**MySQL - Leftmost Prefix:**

```sql
-- MySQL uses leftmost prefix matching
-- Can be used by: (user_id), (user_id, status), (user_id, status, created_at)
CREATE INDEX idx_users_complex
ON users(user_id, status, created_at);

-- For queries: user_id + status + created_at
SELECT * FROM orders
WHERE user_id = 1 AND status = 'completed' AND created_at > '2024-01-01';
```

### Partial/Filtered Indexes

**PostgreSQL:**

```sql
-- Only index active products
CREATE INDEX idx_active_products
ON products(category_id)
WHERE active = true;

-- Reduce index size and improve performance
CREATE INDEX idx_not_cancelled_orders
ON orders(user_id, created_at)
WHERE status != 'cancelled';

-- Complex filter conditions
CREATE INDEX idx_vip_orders
ON orders(total DESC)
WHERE total > 10000 AND customer_type = 'vip';
```

### Expression Indexes

**PostgreSQL:**

```sql
-- Index on computed values
CREATE INDEX idx_users_email_lower
ON users(LOWER(email));

-- Enable case-insensitive searches
SELECT * FROM users WHERE LOWER(email) = '[email protected]';

-- Date extraction indexes
CREATE INDEX idx_orders_year
ON orders(EXTRACT(YEAR FROM created_at));
```

## Index Maintenance

**PostgreSQL Index Analysis:**

```sql
-- Check index size and usage
SELECT schemaname, tablename, indexname,
  pg_size_pretty(pg_relation_size(indexrelid)) as size,
  idx_scan as scans,
  idx_tup_read as tuples_read
FROM pg_stat_user_indexes
ORDER BY pg_relation_size(indexrelid) DESC;

-- Find unused indexes
SELECT schemaname, tablename, indexname
FROM pg_stat_user_indexes
WHERE idx_scan = 0 AND indexrelname NOT LIKE 'pg_toast%';

-- Rebuild fragmented indexes
REINDEX INDEX idx_users_email;
```

**MySQL Index Statistics:**

```sql
-- Check index cardinality
SELECT object_schema, object_name, count_star
FROM performance_schema.table_io_waits_summary_by_index_usage
WHERE object_schema != 'mysql'
ORDER BY count_star DESC;

-- Update table statistics
ANALYZE TABLE users;
ANALYZE TABLE orders;
```

## Concurrent Index Creation

**PostgreSQL - Non-blocking Index Creation:**

```sql
-- Create index without locking table (PostgreSQL 9.2+)
CREATE INDEX CONCURRENTLY idx_new_column
ON large_table(new_column);

-- Safe for production
REINDEX INDEX CONCURRENTLY idx_products_price;
```

**MySQL - Concurrent Index Creation:**

```sql
-- MySQL 8.0 supports ALGORITHM=INPLACE with LOCK=NONE
ALTER TABLE users ADD INDEX idx_created (created_at),
ALGORITHM=INPLACE, LOCK=NONE;

-- Check online DDL progress
SELECT * FROM INFORMATION_SCHEMA.INNODB_TRX;
```

## Performance Monitoring

**PostgreSQL - Index Performance:**

```sql
-- Top 10 most scanned indexes
SELECT schemaname, tablename, indexname, idx_scan
FROM pg_stat_user_indexes
ORDER BY idx_scan DESC LIMIT 10;

-- Indexes with high read/scan ratio
SELECT indexname, idx_scan, idx_tup_read,
  CASE WHEN idx_scan = 0 THEN 0
    ELSE ROUND(idx_tup_read::numeric / idx_scan, 2) END as efficiency
FROM pg_stat_user_indexes
WHERE idx_scan > 0
ORDER BY efficiency DESC;
```

**MySQL - Index Statistics:**

```sql
-- Show table index information
SHOW INDEX FROM products;

-- Check cardinality (distribution)
SELECT * FROM information_schema.STATISTICS
WHERE TABLE_SCHEMA = 'your_database'
  AND TABLE_NAME = 'products'
ORDER BY SEQ_IN_INDEX;
```

## Index Creation Checklist

- Identify slow queries with EXPLAIN/EXPLAIN ANALYZE
- Check filter columns, JOIN conditions, ORDER BY clauses
- Consider index order (equality → range → sort)
- Use partial indexes to reduce size on large tables
- Include columns for covering indexes
- Monitor index usage after creation
- Drop unused indexes to save space
- Rebuild fragmented indexes periodically

## Common Mistakes

❌ Don't create too many indexes (write performance impact)
❌ Don't create indexes without testing first
❌ Don't ignore index size and storage impact
❌ Don't forget to update table statistics after bulk operations
❌ Don't create duplicate indexes

✅ DO create indexes on foreign keys
✅ DO test index impact on INSERT/UPDATE performance
✅ DO use covering indexes for common queries
✅ DO drop unused indexes regularly
✅ DO monitor index fragmentation

## Resources

- [PostgreSQL Indexes Documentation](https://www.postgresql.org/docs/current/indexes.html)
- [MySQL Indexes Documentation](https://dev.mysql.com/doc/refman/8.0/en/optimization-indexes.html)
- [PostgreSQL Index Types](https://www.postgresql.org/docs/current/indexes-types.html)

Overview

This skill designs and implements database indexing strategies to improve query performance, reduce lock contention, and control storage cost for PostgreSQL and MySQL. It covers index type selection, composite and partial index patterns, concurrent index creation, and ongoing maintenance guidance. Use it to plan indexes before deployment and to optimize existing schemas with measurable outcomes.

How this skill works

The skill inspects query patterns (WHERE, JOIN, ORDER BY), table cardinality, and workload characteristics to recommend index types (B-tree, BRIN, GIN/GiST, Hash, FULLTEXT, spatial) and column ordering for composite indexes. It generates creation statements, suggests concurrency-safe operations, and provides monitoring queries to track index usage, size, and efficiency. It also offers a checklist for testing, maintenance, and removing unused indexes.

When to use it

Planning indexes for new schemas or features
Optimizing slow queries by adding or adjusting indexes
Choosing index types for specific data (JSONB, spatial, time-series)
Designing composite, partial/filtered, or expression indexes
Performing concurrent index creation in production
Maintaining and auditing index usage and fragmentation

Best practices

Prioritize equality columns first, then range and sort keys when ordering composite indexes
Use partial/filtered indexes to reduce size on large tables and target common filters
Prefer covering indexes (INCLUDE) to avoid table lookups for frequent queries
Monitor idx_scan, idx_tup_read and index size; drop indexes with no activity
Create indexes concurrently/online to avoid locking large production tables
Test index impact on write performance (INSERT/UPDATE/DELETE) before deploying widely

Example use cases

Add a B-tree composite index for queries filtering by user_id and a recent created_at range
Create a GIN index for JSONB metadata and common containment queries in PostgreSQL
Use BRIN on very large time-series tables where timestamps are monotonic to save space
Add a FULLTEXT index in MySQL for product search and tune queries with MATCH...AGAINST
Rebuild fragmented indexes and run ANALYZE/ANALYZE TABLE after large bulk loads to refresh planner statistics

FAQ

How do I choose between B-tree and BRIN?

Use B-tree for selective, point and range queries; use BRIN for very large, naturally clustered data (time-series) where index size and maintenance cost must be minimal.

When should I use partial or expression indexes?

Use partial indexes to index only relevant rows (e.g., active=true) to reduce size. Use expression indexes for computed lookups (LOWER(col), date parts) to support case-insensitive or derived-value queries.