---
name: integrations-index
description:
Comprehensive index of 82+ Dagster integrations organized by the official tags.yml taxonomy, covering
AI (OpenAI, Anthropic), ETL (dbt, Fivetran, Airbyte, PySpark), Storage (Snowflake, BigQuery),
Compute (AWS, Databricks, Spark), BI (Looker, Tableau), Monitoring, Alerting, and Testing. Use
when discovering integrations or finding the right tool for a use case.
---
# Dagster Integrations Index
Navigate 82+ Dagster integrations organized by Dagster's official taxonomy. Find AI/ML tools, ETL
platforms, data storage, compute services, BI tools, and monitoring integrations.
## When to Use This Skill vs. Others
| If User Says... | Use This Skill/Command | Why |
| ------------------------- | ------------------------------------ | ---------------------------------------- |
| "which integration for X" | `/dagster-integrations` | Need to discover appropriate integration |
| "does dagster support X" | `/dagster-integrations` | Check integration availability |
| "snowflake vs bigquery" | `/dagster-integrations` | Compare integrations in same category |
| "best practices for X" | `/dagster-conventions` | Implementation patterns needed |
| "implement X integration" | `/dg:prototype` | Ready to build with specific integration |
| "how do I use dbt" | `/dagster-conventions` (dbt section) | dbt-specific implementation patterns |
| "make this code better" | `/dignified-python` | Python code review needed |
| "create new project" | `/dg:create-project` | Project initialization needed |
## Quick Reference by Category
| Category | Count | Common Tools | Reference |
| ---------------------- | ----- | ------------------------------------- | -------------------------- |
| **AI & ML** | 6 | OpenAI, Anthropic, MLflow, W&B | `references/ai.md` |
| **ETL/ELT** | 9 | dbt, Fivetran, Airbyte, PySpark | `references/etl.md` |
| **Storage** | 35+ | Snowflake, BigQuery, Postgres, DuckDB | `references/storage.md` |
| **Compute** | 15+ | AWS, Databricks, Spark, Docker, K8s | `references/compute.md` |
| **BI & Visualization** | 7 | Looker, Tableau, PowerBI, Sigma | `references/bi.md` |
| **Monitoring** | 3 | Datadog, Prometheus, Papertrail | `references/monitoring.md` |
| **Alerting** | 6 | Slack, PagerDuty, MS Teams, Twilio | `references/alerting.md` |
| **Testing** | 2 | Great Expectations, Pandera | `references/testing.md` |
| **Other** | 2+ | Pandas, Polars | `references/other.md` |
## Category Taxonomy
This index aligns with Dagster's official documentation taxonomy from tags.yml:
- **ai**: Artificial intelligence and machine learning integrations (LLM APIs, experiment tracking)
- **etl**: Extract, transform, and load tools including data replication and transformation
frameworks
- **storage**: Databases, data warehouses, object storage, and table formats
- **compute**: Cloud platforms, container orchestration, and distributed processing frameworks
- **bi**: Business intelligence and visualization platforms
- **monitoring**: Observability platforms and metrics systems for tracking performance
- **alerting**: Notification and incident management systems for pipeline alerts
- **testing**: Data quality validation and testing frameworks
- **other**: Miscellaneous integrations including DataFrame libraries
**Note**: Support levels (dagster-supported, community-supported) are shown inline in each
integration entry.
Last verified: 2026-01-27
## Finding the Right Integration
### I need to...
**Load data from external sources**
- SaaS applications → [ETL](#etl) (Fivetran, Airbyte)
- Files/databases → [ETL](#etl) (dlt, Sling, Meltano)
- Cloud storage → [Storage](#storage) (S3, GCS, Azure Blob)
**Transform data**
- SQL transformations → [ETL](#etl) (dbt)
- Distributed transformations → [ETL](#etl) (PySpark)
- DataFrame operations → [Other](#other) (Pandas, Polars)
- Large-scale processing → [Compute](#compute) (Spark, Dask, Ray)
**Store data**
- Cloud data warehouse → [Storage](#storage) (Snowflake, BigQuery, Redshift)
- Relational database → [Storage](#storage) (Postgres, MySQL)
- File/object storage → [Storage](#storage) (S3, GCS, Azure, LakeFS)
- Analytics database → [Storage](#storage) (DuckDB)
- Vector embeddings → [Storage](#storage) (Weaviate, Chroma, Qdrant)
**Validate data quality**
- Schema validation → [Testing](#testing) (Pandera)
- Quality checks → [Testing](#testing) (Great Expectations)
**Run ML workloads**
- LLM integration → [AI](#ai) (OpenAI, Anthropic, Gemini)
- Experiment tracking → [AI](#ai) (MLflow, W&B)
- Distributed training → [Compute](#compute) (Ray, Spark)
**Execute computation**
- Cloud compute → [Compute](#compute) (AWS, Azure, GCP, Databricks)
- Containers → [Compute](#compute) (Docker, Kubernetes)
- Distributed processing → [Compute](#compute) (Spark, Dask, Ray)
**Monitor pipelines**
- Team notifications → [Alerting](#alerting) (Slack, MS Teams, PagerDuty)
- Metrics tracking → [Monitoring](#monitoring) (Datadog, Prometheus)
- Log aggregation → [Monitoring](#monitoring) (Papertrail)
**Visualize data**
- BI dashboards → [BI](#bi) (Looker, Tableau, PowerBI)
- Analytics platform → [BI](#bi) (Sigma, Hex, Evidence)
## Integration Categories
### AI & ML
Artificial intelligence and machine learning platforms, including LLM APIs and experiment tracking.
**Key integrations:**
- **OpenAI** - GPT models and embeddings API
- **Anthropic** - Claude AI models
- **Gemini** - Google's multimodal AI
- **MLflow** - Experiment tracking and model registry
- **Weights & Biases** - ML experiment tracking
- **NotDiamond** - LLM routing and optimization
See `references/ai.md` for all AI/ML integrations.
### ETL/ELT
Extract, transform, and load tools for data ingestion, transformation, and replication.
**Key integrations:**
- **dbt** - SQL-based transformation with automatic dependencies
- **Fivetran** - Automated SaaS data ingestion (component-based)
- **Airbyte** - Open-source ELT platform
- **dlt** - Python-based data loading (component-based)
- **Sling** - High-performance data replication (component-based)
- **PySpark** - Distributed data transformation
- **Meltano** - ELT for the modern data stack
See `references/etl.md` for all ETL/ELT integrations.
### Storage
Data warehouses, databases, object storage, vector databases, and table formats.
**Key integrations:**
- **Snowflake** - Cloud data warehouse with IO managers
- **BigQuery** - Google's serverless data warehouse
- **DuckDB** - In-process SQL analytics
- **Postgres** - Open-source relational database
- **Weaviate** - Vector database for AI search
- **Delta Lake** - ACID transactions for data lakes
- **DataHub** - Metadata catalog and lineage
See `references/storage.md` for all storage integrations.
### Compute
Cloud platforms, container orchestration, and distributed processing frameworks.
**Key integrations:**
- **AWS** - Cloud compute services (Glue, EMR, Lambda)
- **Databricks** - Unified analytics platform
- **GCP** - Google Cloud compute (Dataproc, Cloud Run)
- **Spark** - Distributed data processing engine
- **Dask** - Parallel computing framework
- **Docker** - Container execution with Pipes
- **Kubernetes** - Cloud-native orchestration
- **Ray** - Distributed computing for ML
See `references/compute.md` for all compute integrations.
### BI & Visualization
Business intelligence and visualization platforms for analytics and reporting.
**Key integrations:**
- **Looker** - Google's BI platform
- **Tableau** - Interactive dashboards
- **PowerBI** - Microsoft's BI tool
- **Sigma** - Cloud analytics platform
- **Hex** - Collaborative notebooks
- **Evidence** - Markdown-based BI
- **Cube** - Semantic layer platform
See `references/bi.md` for all BI integrations.
### Monitoring
Observability platforms and metrics systems for tracking pipeline performance.
**Key integrations:**
- **Datadog** - Comprehensive observability platform
- **Prometheus** - Time-series metrics collection
- **Papertrail** - Centralized log management
See `references/monitoring.md` for all monitoring integrations.
### Alerting
Notification and incident management systems for pipeline alerts.
**Key integrations:**
- **Slack** - Team messaging and alerts
- **PagerDuty** - Incident management for on-call
- **MS Teams** - Microsoft Teams notifications
- **Twilio** - SMS and voice notifications
- **Apprise** - Universal notification platform
- **DingTalk** - Team communication for Asian markets
See `references/alerting.md` for all alerting integrations.
### Testing
Data quality validation and testing frameworks for ensuring data reliability.
**Key integrations:**
- **Great Expectations** - Data validation with expectations
- **Pandera** - Statistical data validation for DataFrames
See `references/testing.md` for all testing integrations.
### Other
Miscellaneous integrations including DataFrame libraries and utility tools.
**Key integrations:**
- **Pandas** - In-memory DataFrame library
- **Polars** - Fast DataFrame library with columnar storage
See `references/other.md` for other integrations.
## References
Integration details are organized in the following files:
- **AI & ML**: `references/ai.md` - AI and ML platforms, LLM APIs, experiment tracking
- **ETL/ELT**: `references/etl.md` - Data ingestion, transformation, and replication tools
- **Storage**: `references/storage.md` - Warehouses, databases, object storage, vector DBs
- **Compute**: `references/compute.md` - Cloud platforms, containers, distributed processing
- **BI & Visualization**: `references/bi.md` - Business intelligence and analytics platforms
- **Monitoring**: `references/monitoring.md` - Observability and metrics systems
- **Alerting**: `references/alerting.md` - Notifications and incident management
- **Testing**: `references/testing.md` - Data quality and validation frameworks
- **Other**: `references/other.md` - DataFrame libraries and miscellaneous tools
## Using Integrations
Most Dagster integrations follow a common pattern:
1. **Install the package**:
```bash
pip install dagster-<integration>
```
2. **Import and configure a resource**:
```python
import dagster as dg
from dagster_<integration> import <Integration>Resource

resource = <Integration>Resource(
    config_param=dg.EnvVar("ENV_VAR"),
)
```
3. **Use in your assets**:
```python
@dg.asset
def my_asset(integration: <Integration>Resource):
    # Use the integration's client or connection here
    pass
```
For component-based integrations (dbt, Fivetran, dlt, Sling), see the specific reference files for
scaffolding and configuration patterns.