
airflow-dag-generator skill

/skills/11-data-pipelines/airflow-dag-generator

This skill guides you through Airflow DAG generation tasks, producing production-ready configurations and validating results for reliable data pipelines.

npx playbooks add skill jeremylongshore/claude-code-plugins-plus-skills --skill airflow-dag-generator

Review the files below or copy the command above to add this skill to your agents.

SKILL.md
---
name: "airflow-dag-generator"
description: |
  Generate Apache Airflow DAGs and related operations. Auto-activating skill for Data Pipelines.
  Part of the Data Pipelines skill category. Use when working with Airflow DAG generation.
  Triggers on phrases like "airflow dag generator", "airflow generator", "airflow".
allowed-tools: "Read, Write, Edit, Bash(cmd:*), Grep"
version: 1.0.0
license: MIT
author: "Jeremy Longshore <[email protected]>"
---

# Airflow DAG Generator

## Overview

This skill provides automated assistance for Airflow DAG generation tasks within the Data Pipelines domain.

## When to Use

This skill activates automatically when you:
- Mention "airflow dag generator" in your request
- Ask about Airflow DAG patterns or best practices
- Need help with data pipeline work covering ETL, data transformation, workflow orchestration, or streaming data processing

## Instructions

1. Provides step-by-step guidance for Airflow DAG generation
2. Follows industry best practices and patterns
3. Generates production-ready code and configurations
4. Validates outputs against common standards

## Examples

**Example: Basic Usage**
Request: "Help me with airflow dag generator"
Result: Provides step-by-step guidance and generates appropriate configurations


## Prerequisites

- Relevant development environment configured
- Access to necessary tools and services
- Basic understanding of data pipelines concepts


## Output

- Generated configurations and code
- Best practice recommendations
- Validation results


## Error Handling

| Error | Cause | Solution |
|-------|-------|----------|
| Configuration invalid | Missing required fields | Check documentation for required parameters |
| Tool not found | Dependency not installed | Install required tools per prerequisites |
| Permission denied | Insufficient access | Verify credentials and permissions |


## Resources

- Official documentation for related tools
- Best practices guides
- Community examples and tutorials

## Related Skills

Part of the **Data Pipelines** skill category.
Tags: etl, airflow, spark, streaming, data-engineering

Overview

This skill accelerates creation and validation of Apache Airflow DAGs for data pipelines. It generates production-ready DAG code, configurations, and step-by-step guidance tailored to ETL, transformation, and streaming workflows. It activates when you mention airflow dag generator or similar triggers to streamline orchestration work.

How this skill works

The skill inspects your pipeline requirements (tasks, schedules, dependencies, operators, and resource constraints) and outputs Python DAG code, Docker/Kubernetes snippets, and configuration files. It applies common patterns and best practices, runs basic validation checks against style and syntactic issues, and suggests runtime and security configurations. You can iterate on generated DAGs with incremental prompts to refine operators, retries, and monitoring hooks.
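A minimal sketch of the kind of Python DAG code the skill produces. All names here (`example_etl`, the task ids, the owner) are illustrative placeholders, and the block assumes Apache Airflow 2.4+ is installed (earlier versions use `schedule_interval` instead of `schedule`):

```python
# Sketch of a generated DAG: a daily three-step pipeline with retries.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash import BashOperator

default_args = {
    "owner": "data-eng",                   # illustrative owner name
    "retries": 2,                          # retry each failed task twice
    "retry_delay": timedelta(minutes=5),
}

with DAG(
    dag_id="example_etl",                  # hypothetical DAG id
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
    default_args=default_args,
) as dag:
    extract = BashOperator(task_id="extract", bash_command="echo extract")
    transform = BashOperator(task_id="transform", bash_command="echo transform")
    load = BashOperator(task_id="load", bash_command="echo load")

    # Linear dependency chain: extract runs first, load runs last.
    extract >> transform >> load
```

From a scaffold like this, follow-up prompts can refine operators, add monitoring hooks, or tighten retry and timeout settings.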

When to use it

  • You need a new Airflow DAG scaffold for ETL or streaming jobs
  • You want production-ready DAG code with retries, logging, and monitoring hooks
  • You need standards-based DAG configurations (schedules, pools, task dependencies)
  • You want to migrate informal scripts into orchestrated tasks
  • You need validation and remediation suggestions for failing DAGs

Best practices

  • Define clear task boundaries and idempotent operators to avoid side effects
  • Use sensible retries, timeouts, and SLA settings for resilience
  • Parameterize connections and secrets via Airflow Variables/Connections or a secrets backend
  • Include logging, metrics hooks, and task-level alerts for observability
  • Prefer small, testable tasks and use sensors or external triggers sparingly
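The idempotency point above can be illustrated without Airflow at all: a task whose output depends only on its input (and that overwrites rather than appends) is safe for the scheduler to retry. A toy sketch:

```python
def transform(records):
    """Pure, idempotent transform: the same input always yields the same
    output, so a scheduler can retry the task without duplicating work."""
    return sorted({r.strip().lower() for r in records if r.strip()})

# Running the transform twice on the same batch gives identical results,
# which is exactly the property safe retries rely on.
batch = ["  Alice", "bob", "", "ALICE "]
first = transform(batch)
second = transform(batch)
assert first == second == ["alice", "bob"]
```

The same reasoning applies to loads: prefer upserts or overwrite-by-partition over blind appends, so a retried task cannot double-load data.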

Example use cases

  • Generate a daily ETL DAG that extracts from S3, transforms with Spark, and loads to a data warehouse
  • Create a streaming ingestion DAG integrating Kafka consumers with checkpointing
  • Scaffold DAGs for CI/CD: linting, unit tests, and deployment via Helm or CI pipelines
  • Refactor a monolithic script into discrete Airflow tasks with XComs and retries
  • Validate an existing DAG for common misconfigurations and suggest fixes
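The first use case above might come back as a TaskFlow-style scaffold like the following. This is a sketch, not a complete pipeline: the DAG name is hypothetical, and the task bodies stand in for real S3 reads and warehouse writes (it assumes Airflow 2.4+):

```python
# TaskFlow sketch of a daily extract-transform-load pipeline.
from datetime import datetime

from airflow.decorators import dag, task


@dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False)
def s3_to_warehouse():                     # hypothetical DAG name
    @task
    def extract():
        # Placeholder: a real pipeline would read from S3 here,
        # e.g. via an S3 hook or operator.
        return ["row1", "row2"]

    @task
    def transform(rows):
        return [r.upper() for r in rows]

    @task
    def load(rows):
        # Placeholder: replace with a warehouse load (Snowflake, BigQuery, ...).
        print(f"loading {len(rows)} rows")

    # TaskFlow infers dependencies from the data flow between tasks.
    load(transform(extract()))


s3_to_warehouse()
```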

FAQ

What inputs do I need to provide?

Provide task descriptions, data sources/destinations, schedule, preferred operators (e.g., BashOperator, SparkSubmitOperator), and resource constraints. Minimal prompts can produce a scaffold to refine.

Can it generate DAGs for KubernetesExecutor or Celery?

Yes. Specify the executor and runtime environment and the skill will tailor operator configurations, pod templates, and resource requests accordingly.
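For a Kubernetes runtime, the tailoring typically means pod-level resource requests on each task. A sketch, assuming the `apache-airflow-providers-cncf-kubernetes` provider is installed; the namespace, image, and resource values are placeholders:

```python
# Sketch of a task tailored for a Kubernetes runtime.
from kubernetes.client import models as k8s
from airflow.providers.cncf.kubernetes.operators.pod import KubernetesPodOperator

transform = KubernetesPodOperator(
    task_id="transform",
    name="transform-pod",
    namespace="data-pipelines",            # hypothetical namespace
    image="my-registry/transform:latest",  # hypothetical image
    container_resources=k8s.V1ResourceRequirements(
        requests={"cpu": "500m", "memory": "1Gi"},
        limits={"cpu": "1", "memory": "2Gi"},
    ),
    get_logs=True,                         # stream pod logs into task logs
)
```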