
duckdb-cli-ai-skills skill

/skills/camelsprout/duckdb-cli-ai-skills

This skill helps you perform SQL analysis and data conversion with the DuckDB CLI, enabling fast queries, file reads, and format exports.

npx playbooks add skill openclaw/skills --skill duckdb-cli-ai-skills


SKILL.md
---
name: duckdb-en
description: DuckDB CLI specialist for SQL analysis, data processing and file conversion. Use for SQL queries, CSV/Parquet/JSON analysis, database queries, or data conversion. Triggers on "duckdb", "sql", "query", "data analysis", "parquet", "convert data".
---

# DuckDB CLI Specialist

Helps with data analysis, SQL queries, and file conversion via the DuckDB CLI.

## Quick Start

### Read data files directly with SQL
```bash
# CSV
duckdb -c "SELECT * FROM 'data.csv' LIMIT 10"

# Parquet
duckdb -c "SELECT * FROM 'data.parquet'"

# Multiple files with glob
duckdb -c "SELECT * FROM read_parquet('logs/*.parquet')"

# JSON
duckdb -c "SELECT * FROM read_json_auto('data.json')"
```

### Open persistent databases
```bash
# Create/open database
duckdb my_database.duckdb

# Read-only mode
duckdb -readonly existing.duckdb
```
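One-shot `-c` commands and a database file combine freely; a minimal sketch (the file name `demo.duckdb` is just an example):

```shell
# Create a database file with a table in it, in a single command
duckdb demo.duckdb -c "CREATE TABLE IF NOT EXISTS nums AS SELECT range AS n FROM range(5)"

# Query it later in read-only mode, as headerless CSV
duckdb -readonly -csv -noheader demo.duckdb -c "SELECT COUNT(*) FROM nums"
```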

## Command Line Arguments

### Output formats (as flags)
| Flag | Format |
|------|--------|
| `-csv` | Comma-separated |
| `-json` | JSON array |
| `-table` | ASCII table |
| `-markdown` | Markdown table |
| `-html` | HTML table |
| `-line` | One value per line |

### Execution arguments
| Argument | Description |
|----------|-------------|
| `-c COMMAND` | Run SQL and exit |
| `-f FILENAME` | Run script from file |
| `-init FILE` | Run commands from FILE instead of ~/.duckdbrc |
| `-readonly` | Open in read-only mode |
| `-echo` | Show commands before execution |
| `-bail` | Stop on first error |
| `-header` / `-noheader` | Show/hide column headers |
| `-nullvalue TEXT` | Text for NULL values |
| `-separator SEP` | Column separator |
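These flags combine freely; a sketch of a one-liner mixing output and behavior flags:

```shell
# Headerless CSV, custom text for NULLs, abort on the first error
duckdb -csv -noheader -nullvalue "NULL" -bail -c "SELECT 1 AS a, NULL AS b"
```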

## Data Conversion

### CSV to Parquet
```bash
duckdb -c "COPY (SELECT * FROM 'input.csv') TO 'output.parquet' (FORMAT PARQUET)"
```

### Parquet to CSV
```bash
duckdb -c "COPY (SELECT * FROM 'input.parquet') TO 'output.csv' (HEADER, DELIMITER ',')"
```

### JSON to Parquet
```bash
duckdb -c "COPY (SELECT * FROM read_json_auto('input.json')) TO 'output.parquet' (FORMAT PARQUET)"
```

### Convert with filtering
```bash
duckdb -c "COPY (SELECT * FROM 'data.csv' WHERE amount > 1000) TO 'filtered.parquet' (FORMAT PARQUET)"
```

## Dot Commands

### Schema inspection
| Command | Description |
|---------|-------------|
| `.tables [pattern]` | Show tables (with LIKE pattern) |
| `.schema [table]` | Show CREATE statements |
| `.databases` | Show attached databases |

### Output control
| Command | Description |
|---------|-------------|
| `.mode FORMAT` | Change output format |
| `.output file` | Send output to file |
| `.once file` | Next output to file |
| `.headers on/off` | Show/hide column headers |
| `.separator COL ROW` | Set separators |

### Queries
| Command | Description |
|---------|-------------|
| `.timer on/off` | Show execution time |
| `.echo on/off` | Show commands before execution |
| `.bail on/off` | Stop on error |
| `.read file.sql` | Run SQL from file |

### Editing
| Command | Description |
|---------|-------------|
| `.edit` or `\e` | Open query in external editor |
| `.help [pattern]` | Show help |
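Dot commands can also be scripted by feeding them to the CLI on stdin, for example via a heredoc:

```shell
# Run dot commands and SQL non-interactively
duckdb <<'EOF'
.mode csv
.headers off
SELECT 6 * 7 AS answer;
EOF
```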

## Output Formats (18 available)

### Data export
- **csv** - Comma-separated for spreadsheets
- **tabs** - Tab-separated
- **json** - JSON array
- **jsonlines** - Newline-delimited JSON (streaming)

### Readable formats
- **duckbox** (default) - Pretty ASCII with unicode box-drawing
- **table** - Simple ASCII table
- **markdown** - For documentation
- **html** - HTML table
- **latex** - For academic papers

### Specialized
- **insert TABLE** - SQL INSERT statements
- **column** - Columns with adjustable width
- **line** - One value per line
- **list** - Pipe-separated
- **trash** - Discard output
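The same query can be serialized differently just by switching the flag (or `.mode`); a quick sketch:

```shell
# JSON array output
duckdb -json -c "SELECT 1 AS x"

# Markdown table output, e.g. for pasting into docs
duckdb -markdown -c "SELECT 1 AS x"
```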

## Keyboard Shortcuts (macOS/Linux)

### Navigation
| Shortcut | Action |
|----------|--------|
| `Home` / `End` | Start/end of line |
| `Ctrl+Left/Right` | Jump word |
| `Ctrl+A` / `Ctrl+E` | Start/end of buffer |

### History
| Shortcut | Action |
|----------|--------|
| `Ctrl+P` / `Ctrl+N` | Previous/next command |
| `Ctrl+R` | Search history |
| `Alt+<` / `Alt+>` | First/last in history |

### Editing
| Shortcut | Action |
|----------|--------|
| `Ctrl+W` | Delete word backward |
| `Alt+D` | Delete word forward |
| `Alt+U` / `Alt+L` | Uppercase/lowercase word |
| `Ctrl+K` | Delete to end of line |

### Autocomplete
| Shortcut | Action |
|----------|--------|
| `Tab` | Autocomplete / next suggestion |
| `Shift+Tab` | Previous suggestion |
| `Esc+Esc` | Undo autocomplete |

## Autocomplete

Context-aware autocomplete activated with `Tab`:
- **Keywords** - SQL commands
- **Table names** - Database objects
- **Column names** - Fields and functions
- **File names** - Path completion

## Database Operations

### Create table from file
```sql
CREATE TABLE sales AS SELECT * FROM 'sales_2024.csv';
```

### Insert data
```sql
INSERT INTO sales SELECT * FROM 'sales_2025.csv';
```

### Export table
```sql
COPY sales TO 'backup.parquet' (FORMAT PARQUET);
```
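Putting these together, a sketch of a full round trip (the `sales.csv` data here is made up for illustration):

```shell
# Sample input
cat > sales.csv <<'EOF'
id,amount
1,100
2,250
EOF

# Load into a table and export it as Parquet in one session
duckdb -c "
CREATE TABLE sales AS SELECT * FROM 'sales.csv';
COPY sales TO 'backup.parquet' (FORMAT PARQUET);
"

# Verify the exported file
duckdb -csv -noheader -c "SELECT SUM(amount) FROM 'backup.parquet'"
```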

## Analysis Examples

### Quick statistics
```sql
SELECT
    COUNT(*) as count,
    AVG(amount) as average,
    SUM(amount) as total
FROM 'transactions.csv';
```

### Grouping
```sql
SELECT
    category,
    COUNT(*) as count,
    SUM(amount) as total
FROM 'data.csv'
GROUP BY category
ORDER BY total DESC;
```

### Join on files
```sql
SELECT a.*, b.name
FROM 'orders.csv' a
JOIN 'customers.parquet' b ON a.customer_id = b.id;
```

### Describe data
```sql
DESCRIBE SELECT * FROM 'data.csv';
```

## Pipe and stdin

```bash
# Read from stdin
cat data.csv | duckdb -c "SELECT * FROM read_csv('/dev/stdin')"

# Pipe to another command
duckdb -csv -c "SELECT * FROM 'data.parquet'" | head -20

# Write to stdout
duckdb -c "COPY (SELECT * FROM 'data.csv') TO '/dev/stdout' (FORMAT CSV)"
```

## Configuration

Save common settings in `~/.duckdbrc`:
```sql
.timer on
.mode duckbox
.maxrows 50
.highlight on
```

### Syntax highlighting colors
```sql
.keyword green
.constant yellow
.comment brightblack
.error red
```

## External Editor

Open complex queries in your editor:
```sql
.edit
```

Editor is chosen from: `DUCKDB_EDITOR` → `EDITOR` → `VISUAL` → `vi`

## Safe Mode

Safe mode restricts the shell's access to the file system. When enabled:
- No access to external files
- Disables `.read`, `.output`, `.import`, `.sh`, etc.
- **Cannot** be disabled later in the same session

## Tips

- Use `LIMIT` on large files for quick preview
- Parquet is faster than CSV for repeated queries
- `read_csv_auto` and `read_json_auto` guess column types
- Arguments are processed in order (like SQLite CLI)
- WSL2 may show incorrect `memory_limit` values on some Ubuntu versions

## Overview

This skill is a DuckDB CLI specialist that helps run SQL queries, inspect files, and convert between CSV, Parquet, and JSON. It focuses on fast, file-based analysis and lightweight database operations from the command line. Use it to preview large datasets, export tables, and automate data conversion workflows.

## How this skill works

The skill constructs and runs DuckDB CLI commands to read files directly with SQL, open persistent databases, and export results in many formats. It uses DuckDB functions like `read_parquet` and `read_json_auto` and supports `COPY` statements to convert data between formats. Command flags and dot commands control output format, headers, timing, and session behavior.

## When to use it

- Quickly preview large CSV, Parquet, or JSON files with SQL `LIMIT`s
- Convert data formats (CSV ↔ Parquet, JSON → Parquet) for faster downstream processing
- Run ad-hoc analytics and aggregations without loading data into another engine
- Attach or open a persistent `.duckdb` database for repeated queries
- Pipe data into or out of other shell tools in ETL scripts

## Best practices

- Use `LIMIT` when sampling large files to avoid long scans
- Prefer Parquet for repeated reads and analytical queries: it is faster and incurs less I/O
- Use `read_..._auto` functions to let DuckDB infer the schema, then materialize a table if needed
- Save commonly used settings in `~/.duckdbrc` for consistent CLI behavior
- Use `COPY` with a `WHERE` clause to create filtered, lightweight Parquet subsets for downstream jobs

## Example use cases

- Inspect the first rows of a 50 GB CSV: `duckdb -c "SELECT * FROM 'big.csv' LIMIT 10"`
- Convert JSON logs to Parquet for analytics: `duckdb -c "COPY (SELECT * FROM read_json_auto('logs.json')) TO 'logs.parquet' (FORMAT PARQUET)"`
- Join a CSV and a Parquet file for enrichment and export, without loading either into another engine
- Open an existing database read-only for safe interactive exploration: `duckdb -readonly archive.duckdb`
- Pipe CLI output to other tools: `duckdb -csv -c "SELECT * FROM 'data.parquet'" | head -20`

## FAQ

**Can I run DuckDB queries directly on local files?**

Yes. DuckDB can query CSV, Parquet, and JSON files directly, using SQL functions like `read_parquet` and `read_json_auto` or by selecting from the file path.

**How do I convert CSV to Parquet with filters?**

Use `COPY` with a `SELECT` that applies a `WHERE` clause, for example: `duckdb -c "COPY (SELECT * FROM 'input.csv' WHERE amount > 1000) TO 'out.parquet' (FORMAT PARQUET)"`