
This skill helps you write DuckDB queries with parameter binding, external database connections, and Ducklake integration for streamlined data access.

npx playbooks add skill windmill-labs/windmill --skill write-script-duckdb

Review the files below or copy the command above to add this skill to your agents.

Files (1)
SKILL.md (1.3 KB)
---
name: write-script-duckdb
description: MUST use when writing DuckDB queries.
---

## CLI Commands

Place scripts in a folder. After writing, run:
- `wmill script generate-metadata` - Generate .script.yaml and .lock files
- `wmill sync push` - Deploy to Windmill

Use `wmill resource-type list --schema` to discover available resource types.

# DuckDB

Arguments are defined with comments and used with `$name` syntax:

```sql
-- $name (text) = default
-- $age (integer)
SELECT * FROM users WHERE name = $name AND age > $age;
```

## Ducklake Integration

Attach Ducklake for data lake operations:

```sql
-- Main ducklake
ATTACH 'ducklake' AS dl;

-- Named ducklake
ATTACH 'ducklake://my_lake' AS dl;

-- Then query
SELECT * FROM dl.schema.table;
```

## External Database Connections

Connect to external databases using resources:

```sql
ATTACH '$res:path/to/resource' AS db (TYPE postgres);
SELECT * FROM db.schema.table;
```

## S3 File Operations

Read files from S3 storage:

```sql
-- Default storage
SELECT * FROM read_csv('s3:///path/to/file.csv');

-- Named storage
SELECT * FROM read_csv('s3://storage_name/path/to/file.csv');

-- Parquet files
SELECT * FROM read_parquet('s3:///path/to/file.parquet');

-- JSON files
SELECT * FROM read_json('s3:///path/to/file.json');
```

Overview

This skill provides concise guidance for writing DuckDB queries inside Windmill scripts. It explains how to declare parameters, attach Ducklake and external databases, read files from S3, and deploy scripts with the CLI. Use it to author safe, parameterized queries that integrate with external resources and data lakes.

How this skill works

Arguments are declared in SQL comments and referenced with the $name syntax, letting the runtime substitute values safely. You attach Ducklake instances or external database resources with ATTACH statements, then query them with SELECT, using the read_* functions (read_csv, read_parquet, read_json) to read files from S3. After writing a script, generate its metadata and push it with the CLI so it becomes callable as a webhook or workflow.
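
Putting these pieces together, a complete script might look like the sketch below; the dl.main.orders table, its columns, and the default values are hypothetical:

```sql
-- Hypothetical report query against an attached Ducklake
-- $region (text) = EU
-- $min_total (integer)
ATTACH 'ducklake' AS dl;

SELECT order_id, region, total
FROM dl.main.orders
WHERE region = $region AND total > $min_total;
```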

When to use it

  • When writing DuckDB queries that need runtime parameters
  • When querying data lakes or attaching a Ducklake instance
  • When reading files stored on S3 (CSV, Parquet, JSON)
  • When connecting to external databases (e.g., Postgres) via resources
  • When preparing scripts to deploy as webhooks, workflows, or UIs

Best practices

  • Declare all inputs as comment arguments (e.g., -- $name (text) = default) to ensure safe, documented parameters
  • Use ATTACH 'ducklake' AS dl for data-lake workloads and reference dl.schema.table to keep queries explicit
  • Reference external DBs through resource paths (ATTACH '$res:path' AS db (TYPE postgres)) instead of embedding credentials
  • Use read_csv/read_parquet/read_json for S3 files and include named storage when needed (s3://storage_name/path); see the sketch after this list
  • Run `wmill script generate-metadata` after edits and `wmill sync push` to deploy changes
  • Discover resource types and schemas with `wmill resource-type list --schema` before use
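
As a concrete illustration of the argument and named-storage conventions above, here is a minimal sketch; the storage name, file path, and column are hypothetical:

```sql
-- Hypothetical filter over a Parquet file on named S3 storage
-- $event_date (text) = 2024-01-01
SELECT *
FROM read_parquet('s3://datalake/events/2024.parquet')
WHERE event_date = $event_date;
```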

Example use cases

  • Ad-hoc analysis: parameterized SELECT queries against tables in a Ducklake
  • ETL step: read Parquet from S3, transform in DuckDB, and write results back to a lake
  • Cross-source joins: attach a Postgres resource and join its tables with lake tables (see the sketch after this list)
  • Webhook-backed report: deploy a script that accepts parameters and returns filtered query results
  • Data validation: run JSON/CSV reads from S3 to validate incoming file schemas
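
For the cross-source join case, a sketch could look like the following; the resource path f/data/pg and the customers/orders tables are hypothetical:

```sql
-- Hypothetical join of Postgres customers with lake orders
-- $country (text) = US
ATTACH 'ducklake' AS dl;
ATTACH '$res:f/data/pg' AS pg (TYPE postgres);

SELECT c.customer_id, c.name, SUM(o.total) AS lifetime_total
FROM pg.public.customers AS c
JOIN dl.main.orders AS o ON o.customer_id = c.customer_id
WHERE c.country = $country
GROUP BY c.customer_id, c.name;
```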

FAQ

How do I declare and use a query parameter?

Add a comment like -- $name (text) = default above the query and reference it as $name in SQL; the runtime will substitute the provided value.

How do I connect to my data lake or an external DB?

Use ATTACH 'ducklake' AS dl or ATTACH 'ducklake://my_lake' AS dl for Ducklake. For external databases, use ATTACH '$res:path/to/resource' AS db (TYPE postgres), then query db.schema.table.