home / skills / ratacat / claude-skills / postgres-query-expert

postgres-query-expert skill

safe

This skill helps you construct and optimize PostgreSQL 16 queries, diagnose performance issues, and introspect schemas for robust database work.

npx playbooks add skill ratacat/claude-skills --skill postgres-query-expert

Review the files below or copy the command above to add this skill to your agents.

Files (1)

SKILL.md

7.2 KB

---
name: postgres-query-expert
description: A comprehensive guide for interacting with PostgreSQL 16 databases. Use this skill for constructing standard and advanced SQL queries, optimizing performance, debugging errors, managing schema objects, and introspecting database structure.
allowed-tools: Read, Grep, Glob
---
# PostgreSQL Query Expert

This skill is a definitive reference for PostgreSQL 16, covering query construction, optimization, schema management, and system introspection.

## Instructions

### 1. General Query Standards

- **Syntax**: Adhere to ANSI SQL standards, but prefer PostgreSQL extensions (e.g., `DISTINCT ON`, `RETURNING`, `LATERAL`, `FILTER` clauses) when they provide cleaner logic or better performance.
- **Identifiers**: Use `snake_case` for all identifiers. Only quote identifiers (`"MyTable"`) if absolutely necessary; prefer lowercase unquoted names.
- **Safety**:
  - **Parameterization**: Always use parameters (`$1`, `$2`, …) for literal values. Never inject user input directly.
  - **Timeouts**: For exploratory queries on large databases, prepend `SET LOCAL statement_timeout = '30s';`.
  - **Transactions**: Use explicit `BEGIN` and `COMMIT` blocks for multi-step operations.

### 2. Performance & Optimization

- **Explain plans**: Use `EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS)` to diagnose bottlenecks.
- **Red flags**: `Seq Scan` on large tables, high `Buffers: shared hit` (RAM usage), or `Disk: read` (I/O).
- **Indexing**: Recommend specific index types based on usage:
  - **B-tree**: Standard equality/range (`=`, `<`, `>`) queries.
  - **GIN**: For composite types like `JSONB` (`@>`) or arrays (`&&`), and full-text search.
  - **GiST**: For geometric data and ranges.
- **CTEs**: Use Common Table Expressions (`WITH`) for readability. In PG16+, these are optimized (inlined) by default unless `MATERIALIZED` is specified.

## Introspection (Agent Capabilities)

When exploring a new database, use these queries to understand the schema.

### List All Tables

```sql
SELECT n.nspname AS schema,
       c.relname AS table,
       obj_description(c.oid) AS description
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relkind = 'r'
  AND n.nspname NOT IN ('pg_catalog', 'information_schema')
ORDER BY 1, 2;
```

### Get Table Columns & Types

```sql
SELECT a.attname AS column,
       format_type(a.atttypid, a.atttypmod) AS type,
       a.attnotnull AS not_null,
       col_description(a.attrelid, a.attnum) AS comment
FROM pg_attribute a
WHERE a.attrelid = 'public.target_table_name'::regclass
  AND a.attnum > 0
  AND NOT a.attisdropped
ORDER BY a.attnum;
```

## Reference: Data Querying (DQL)

### Advanced Aggregations

- **Filter clause**: `count(*) FILTER (WHERE status = 'active')`
- **Grouping sets**: `GROUP BY GROUPING SETS ((brand), (brand, category), ())`
- **Any value**: `any_value(col)` (PG16+) returns an arbitrary value from the group.

### Window Functions

Perform calculations across a set of table rows related to the current row.

```sql
SELECT dept,
       emp_no,
       salary,
       -- Rank employees by salary within department
       dense_rank() OVER (PARTITION BY dept ORDER BY salary DESC) AS rank,
       -- Running total of salaries
       sum(salary) OVER (
         PARTITION BY dept
         ORDER BY salary
         ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
       ) AS running_total
FROM employees;
```

### Pattern Matching

- **LIKE**: `col LIKE 'foo%'` (simple wildcard).
- **ILIKE**: `col ILIKE 'foo%'` (case-insensitive).
- **SIMILAR TO**: `col SIMILAR TO '[a-c]%'` (SQL-regex style).
- **POSIX regex**:
  - Case-sensitive: `col ~ '^[a-z]+$'`
  - Case-insensitive: `col ~* 'foo'`

## Reference: Data Modification (DML)

### MERGE (Upsert / Conditional Ops)

Standard SQL method for inserting, updating, or deleting based on join conditions (PG15+).

```sql
MERGE INTO wine_stock ws
USING wine_shipments s
  ON s.winery_id = ws.winery_id
 AND s.year = ws.year
WHEN MATCHED THEN
  UPDATE SET stock = ws.stock + s.count
WHEN NOT MATCHED THEN
  INSERT (winery_id, year, stock)
  VALUES (s.winery_id, s.year, s.count);
```

### INSERT ... ON CONFLICT (Legacy Upsert)

Postgres-specific, often more concise for simple unique-key conflicts.

```sql
INSERT INTO kv_store (key, value)
VALUES ('config', '{"a":1}')
ON CONFLICT (key)
DO UPDATE SET value = EXCLUDED.value;
```

### RETURNING Clause

Return data from modified rows immediately.

```sql
DELETE FROM archived_logs
WHERE created_at < NOW() - INTERVAL '1 year'
RETURNING id, created_at;
```

## Reference: Special Data Types

### JSONB (Binary JSON)

Prefer `jsonb` over `json` for storage and indexing.

| Operator | Description | Example |
|---|---|---|
| `->` / `->>` | Get element (JSON / text) | `data->'key'` |
| `@>` | Contains (indexable) | `data @> '{"tag": "urgent"}'` |
| `?` | Key exists | `data ? 'error'` |
| `#-` | Delete path | `data #- '{info, sensitive}'` |

SQL/JSON path (PG12+):

```sql
-- Find all items with price > 10
SELECT jsonb_path_query(data, '$.items[*] ? (@.price > 10)')
FROM orders;
```

### Arrays

```sql
SELECT ARRAY[1,2,3];           -- Creation
SELECT (ARRAY[1,2,3])[1];      -- Access (1-based index)
SELECT 1 = ANY(arr_col);       -- Check if value exists in array
SELECT unnest(arr_col) FROM t; -- Expand array to rows
```

### Range Types

Useful for scheduling and validity periods.

- `tstzrange`: timestamp with time zone range.
- `int4range`, `daterange`: integer and date ranges.
- Overlap operator (`&&`): checks if two ranges overlap.

```sql
SELECT *
FROM reservations
WHERE duration && tstzrange('2023-01-01 10:00', '2023-01-01 12:00');
```

## Reference: System Administration & Stats

### Kill Long-Running Query

```sql
SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE state = 'active'
  AND pid <> pg_backend_pid()
  AND query_start < NOW() - INTERVAL '5 minutes';
```

### Check Table Size (Disk Usage)

```sql
SELECT relname,
       pg_size_pretty(pg_total_relation_size(relid)) AS total_size,
       pg_size_pretty(pg_relation_size(relid)) AS data_size
FROM pg_catalog.pg_statio_user_tables
ORDER BY pg_total_relation_size(relid) DESC;
```

## Examples

### Scenario 1: Recursive CTE for Graph/Tree Data

Navigating an organizational hierarchy.

```sql
WITH RECURSIVE subordinates AS (
    -- Base case: the manager
    SELECT employee_id, manager_id, full_name, 0 AS level
    FROM employees
    WHERE employee_id = $1

    UNION ALL

    -- Recursive step: direct reports
    SELECT e.employee_id, e.manager_id, e.full_name, s.level + 1
    FROM employees e
    INNER JOIN subordinates s ON s.employee_id = e.manager_id
)
SELECT *
FROM subordinates;
```

### Scenario 2: Lateral Join for "Top N per Category"

Efficiently getting the latest 3 posts for each user.

```sql
SELECT u.username, p.title, p.created_at
FROM users u
CROSS JOIN LATERAL (
    SELECT title, created_at
    FROM posts
    WHERE user_id = u.id
    ORDER BY created_at DESC
    LIMIT 3
) p
WHERE u.status = 'active';
```

### Scenario 3: Full Text Search with Ranking

Searching a blog table.

```sql
SELECT id,
       title,
       ts_rank(to_tsvector('english', title || ' ' || content), query) AS rank
FROM articles,
     to_tsquery('english', 'postgres | optimization') query
WHERE to_tsvector('english', title || ' ' || content) @@ query
ORDER BY rank DESC;
```

Overview

This skill is a practical, compact reference for interacting with PostgreSQL 16. It focuses on building and optimizing SQL queries, debugging problems, managing schema objects, and inspecting database structure. Use it to produce safe, performant SQL and to interpret server-side metrics and plans.

How this skill works

The skill recommends PostgreSQL-specific syntax and ANSI-compliant patterns, enforces parameterization, and suggests protective settings like statement timeouts. It supplies diagnostic queries for schema introspection and server stats, examples for advanced SQL constructs (CTEs, window functions, MERGE, JSONB and arrays), and guidance on indexing and explain-plan analysis.

When to use it

Writing or refactoring queries for PostgreSQL 16 to improve clarity or performance
Diagnosing slow queries using EXPLAIN ANALYZE and buffer/I/O indicators
Exploring an unfamiliar database schema and extracting column/type metadata
Implementing safe data changes with transactions, MERGE, or INSERT ... ON CONFLICT
Designing indexes and choosing appropriate index types (B-tree, GIN, GiST)

Best practices

Always parameterize user input (use $1, $2, ...); never inject raw values
Use SET LOCAL statement_timeout for exploratory queries on large tables
Prefer snake_case unquoted identifiers and lowercase names unless quoting is required
Use EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS) to find Seq Scan or heavy I/O
Choose index types by access patterns: B-tree for ranges/equality, GIN for jsonb/arrays, GiST for geometric/range

Example use cases

Inspecting all non-system tables and their descriptions to map a database
Fetching column names and types for a target table to scaffold migrations or API models
Optimizing a reporting query with window functions and GROUPING SETS for aggregated rolls
Implementing safe upserts with MERGE or INSERT ... ON CONFLICT and returning affected rows
Killing runaway queries and checking table disk usage during maintenance

FAQ

When should I use CTEs versus subqueries?

Use CTEs for readability and logical separation. In PostgreSQL 16 CTEs inline by default, so prefer them unless you need MATERIALIZED for caching.

How do I choose between MERGE and INSERT ... ON CONFLICT?

Use MERGE for complex conditional workflows (update/delete/insert). Use INSERT ... ON CONFLICT for concise upserts on a single unique key.