home / skills / sfc-gh-dflippo / snowflake-dbt-demo / dbt-projects-snowflake-setup

dbt-projects-snowflake-setup skill

/.claude/skills/dbt-projects-snowflake-setup

This skill guides you through setting up dbt projects on Snowflake from prerequisites to automated scheduling and monitoring.

npx playbooks add skill sfc-gh-dflippo/snowflake-dbt-demo --skill dbt-projects-snowflake-setup

Review the files below or copy the command above to add this skill to your agents.

Files (1)
SKILL.md
12.1 KB
---
name: dbt-projects-snowflake-setup
description:
  Step-by-step setup guide for dbt Projects on Snowflake including prerequisites, external access
  integration, Git API integration, event table configuration, and automated scheduling. Use this
  skill when setting up dbt Projects on Snowflake for the first time or troubleshooting setup
  issues.
---

# dbt Projects on Snowflake Setup

Complete step-by-step guide for setting up dbt Projects on Snowflake from beginning to end.

## When to Use This Skill

Activate this skill when users ask about:

- Setting up dbt Projects on Snowflake for the first time
- Configuring external access integrations for dbt packages
- Setting up Git API integrations (GitHub, GitLab, Azure DevOps)
- Creating workspaces in Snowsight
- Configuring event table monitoring for dbt Projects
- Scheduling automated dbt runs with Snowflake Tasks
- Troubleshooting dbt Projects setup issues

## Prerequisites

**1. Snowflake Account**

- Account with ACCOUNTADMIN permissions for initial setup
- Personal database enabled (default for new accounts)

**2. Git Repository**

- GitHub, GitLab, or Azure DevOps repository
- Personal Access Token (PAT) for authentication

---

## Setup Steps

### Step 1: Enable Personal Database

```sql
ALTER ACCOUNT SET ENABLE_PERSONAL_DATABASE = TRUE;
```

---

### Step 2: Create External Access Integration

For `dbt deps` to work, allow external access to dbt packages:

```sql
-- Create NETWORK RULE
CREATE OR REPLACE NETWORK RULE dbt_network_rule
  MODE = EGRESS
  TYPE = HOST_PORT
  VALUE_LIST = (
    'hub.getdbt.com',
    'codeload.github.com'
  );

-- Create EXTERNAL ACCESS INTEGRATION
CREATE OR REPLACE EXTERNAL ACCESS INTEGRATION dbt_ext_access
  ALLOWED_NETWORK_RULES = (dbt_network_rule)
  ENABLED = TRUE;
```

**Purpose:** Allows dbt to download packages from hub.getdbt.com and GitHub during `dbt deps`
execution.

---

### Step 3: Create Git API Integration

Choose the appropriate integration for your Git provider:

#### GitHub

```sql
CREATE OR REPLACE API INTEGRATION git_api_integration
  API_PROVIDER = git_https_api
  API_ALLOWED_PREFIXES = (
    'https://github.com/',
    'https://github.com/organization/'
  );
```

#### GitLab

```sql
CREATE OR REPLACE API INTEGRATION git_api_integration
  API_PROVIDER = git_https_api
  API_ALLOWED_PREFIXES = (
    'https://gitlab.com/'
  );
```

#### Azure DevOps

```sql
CREATE OR REPLACE API INTEGRATION git_api_integration
  API_PROVIDER = git_https_api
  API_ALLOWED_PREFIXES = (
    'https://dev.azure.com/'
  );
```

**Purpose:** Allows Snowflake to connect to your Git repository for workspace creation and project
deployment.

---

### Step 4: Create Workspace in Snowsight

1. Navigate to Projects → My Workspace
2. Click My Workspace → Create Workspace → From Git repository
3. Enter:
   - Repository URL
   - API integration name (`git_api_integration`)
   - Authentication (PAT or OAuth)

**Note:** Workspace creation is only available through the Snowsight UI. The Snowflake CLI does not
have commands for creating workspaces.

---

### Step 5: Configure profiles.yml

In your workspace, configure `profiles.yml`:

```yaml
my_dbt_project:
  target: dev
  outputs:
    dev:
      type: snowflake
      account: "" # Uses current account context
      user: "" # Uses current user context
      warehouse: MY_WAREHOUSE
      database: MY_DATABASE
      schema: MY_SCHEMA
      role: MY_ROLE
```

**Important Notes:**

- Leave `account` and `user` empty - Snowflake provides these automatically
- Specify your warehouse, database, schema, and role
- For multiple environments, add additional outputs (staging, prod)

---

### Step 6: Deploy as DBT PROJECT Object

**UI Method:**

- Use the Deploy button in workspace

**CLI Method:**

```bash
snow dbt deploy my_project --source .
```

**Verify Deployment:**

```sql
SHOW DBT PROJECTS IN DATABASE MY_DATABASE;
```

---

## Event Table Monitoring Configuration (Optional but Recommended)

Monitor dbt Projects execution using event tables that capture telemetry data (logs, traces,
metrics) via the [OpenTelemetry data model](https://opentelemetry.io/).

### Critical Pattern: Database-Level Configuration

**Always set event tables at the DATABASE level** (not schema, not account-wide):

```sql
-- Step 1: Create event table (can be in different database)
CREATE EVENT TABLE IF NOT EXISTS MY_LOGGING_DATABASE.MY_LOGGING_SCHEMA.EVENT_LOG;

-- Step 2: Set event table where dbt Projects are deployed at DATABASE level
ALTER DATABASE MY_DBT_PROJECT_DATABASE
  SET EVENT_TABLE = MY_LOGGING_DATABASE.MY_LOGGING_SCHEMA.EVENT_LOG;

-- Step 3: Configure logging levels for the schema where dbt Project is deployed
ALTER SCHEMA MY_DBT_PROJECT_DATABASE.MY_DBT_PROJECT_SCHEMA SET LOG_LEVEL = 'INFO';
ALTER SCHEMA MY_DBT_PROJECT_DATABASE.MY_DBT_PROJECT_SCHEMA SET TRACE_LEVEL = 'ALWAYS';
ALTER SCHEMA MY_DBT_PROJECT_DATABASE.MY_DBT_PROJECT_SCHEMA SET METRIC_LEVEL = 'ALL';
```

### Why DATABASE Level?

✅ **DO:**

- Set at DATABASE level for project-level isolation
- Captures all dbt Project executions in that database
- Avoids account-wide noise
- Provides clear project boundaries

❌ **DON'T:**

- Set at account level (too much noise from all databases)
- Set at schema level (misses cross-schema operations)

### Verify Event Capture

After configuration, verify events are being captured:

```sql
-- Check recent events
SELECT
    TIMESTAMP,
    RESOURCE_ATTRIBUTES['snow.executable.name']::VARCHAR AS project_name,
    RECORD_TYPE,
    RECORD['severity_text']::VARCHAR AS severity,
    VALUE::VARCHAR AS message
FROM MY_LOGGING_DATABASE.MY_LOGGING_SCHEMA.EVENT_LOG
WHERE RESOURCE_ATTRIBUTES['snow.executable.type']::VARCHAR = 'DBT_PROJECT'
  AND TIMESTAMP >= DATEADD(hour, -1, CURRENT_TIMESTAMP())
ORDER BY TIMESTAMP DESC
LIMIT 10;
```

**For complete monitoring guide**, see the **`dbt-projects-on-snowflake` skill** for:

- Ready-to-use monitoring SQL scripts
- Best practices for event table management
- Performance metrics queries
- Alerting strategies
- Troubleshooting guide

---

## Scheduling Automated Runs (Optional)

Create a Snowflake task to run dbt on a schedule:

```sql
CREATE OR REPLACE TASK my_dbt_daily_task
  WAREHOUSE = 'MY_WAREHOUSE'
  SCHEDULE = 'USING CRON 0 6 * * * UTC'  -- Daily at 6 AM UTC
AS
  EXECUTE DBT PROJECT MY_DATABASE.MY_SCHEMA.MY_DBT_PROJECT
    args='build';

-- Enable the task
ALTER TASK my_dbt_daily_task RESUME;
```

### Customization Options

| Parameter               | Purpose                      | Example                               |
| ----------------------- | ---------------------------- | ------------------------------------- |
| Task name               | Identifies the scheduled job | `my_dbt_daily_task`                   |
| Warehouse               | Compute resources            | `MY_WAREHOUSE`                        |
| Schedule                | CRON expression              | `0 6 * * * UTC` (daily 6 AM)          |
| Database/Schema/Project | Target dbt project           | `MY_DB.MY_SCHEMA.MY_PROJECT`          |
| Args                    | dbt command arguments        | `'build'`, `'run --select tag:daily'` |

### Common Schedules

```sql
-- Hourly
SCHEDULE = 'USING CRON 0 * * * * UTC'

-- Daily at 2 AM
SCHEDULE = 'USING CRON 0 2 * * * UTC'

-- Every 15 minutes
SCHEDULE = '15 MINUTE'

-- Weekly on Monday at 8 AM
SCHEDULE = 'USING CRON 0 8 * * 1 UTC'
```

### Monitor Task Execution

```sql
-- View task history
SELECT *
FROM TABLE(INFORMATION_SCHEMA.TASK_HISTORY(
  SCHEDULED_TIME_RANGE_START => DATEADD('day', -7, CURRENT_TIMESTAMP()),
  TASK_NAME => 'MY_DBT_DAILY_TASK'
))
ORDER BY SCHEDULED_TIME DESC;
```

---

## Troubleshooting

### SSH/Network Issues

**Problem:** Can't download dbt packages or connect to Git

**Solutions:**

1. Verify external access integration exists:

   ```sql
   SHOW EXTERNAL ACCESS INTEGRATIONS;
   ```

2. Check network rules include required hosts:

   ```sql
   DESCRIBE EXTERNAL ACCESS INTEGRATION dbt_ext_access;
   ```

3. Ensure required hosts are in VALUE_LIST:
   - `hub.getdbt.com` (for dbt packages)
   - `codeload.github.com` (for GitHub packages)

### Authentication Failures

**Problem:** Git authentication fails in workspace creation

**Solutions:**

1. Verify PAT has correct scopes:
   - GitHub: `repo` scope
   - GitLab: `read_repository` scope
   - Azure DevOps: `Code (Read)` permission

2. Check API integration is created:

   ```sql
   SHOW API INTEGRATIONS;
   ```

3. Verify API allowed prefixes match your repository URL

### Package Installation Issues

**Problem:** `dbt deps` fails in workspace

**Solutions:**

1. Run `dbt deps` manually in workspace before deployment
2. Ensure external access integration is enabled:

   ```sql
   ALTER EXTERNAL ACCESS INTEGRATION dbt_ext_access SET ENABLED = TRUE;
   ```

3. Check package versions are compatible with dbt version in Snowflake

### Event Table Not Capturing Data

**Problem:** No events appearing in event table

**Solutions:**

1. Verify event table is set at DATABASE level:

   ```sql
   SHOW PARAMETERS LIKE 'EVENT_TABLE' IN DATABASE MY_DATABASE;
   ```

2. Check logging levels are set for schema:

   ```sql
   SHOW PARAMETERS LIKE '%_LEVEL' IN SCHEMA MY_DATABASE.MY_SCHEMA;
   ```

3. Ensure dbt Project has executed at least once after configuration

4. Query with correct filter:

   ```sql
   WHERE RESOURCE_ATTRIBUTES['snow.executable.type']::VARCHAR = 'DBT_PROJECT'
   ```

### Workspace Creation Fails

**Problem:** Can't create workspace from Git repository

**Solutions:**

1. Verify personal database is enabled:

   ```sql
   SHOW PARAMETERS LIKE 'ENABLE_PERSONAL_DATABASE' IN ACCOUNT;
   ```

2. Check you have required role (ACCOUNTADMIN or sufficient grants)

3. Ensure Git repository URL is correct and accessible

4. Verify Git API integration exists and has correct allowed prefixes:

   ```sql
   SHOW API INTEGRATIONS;
   DESCRIBE API INTEGRATION git_api_integration;
   ```

5. Check PAT/OAuth token has correct permissions for the repository

---

## Best Practices

### Security

✅ **DO:**

- Use key pair authentication for production deployments
- Rotate PATs regularly
- Use minimal scopes on PATs
- Set up separate integrations for dev/prod
- Use role-based access control

❌ **DON'T:**

- Share PATs between team members
- Use ACCOUNTADMIN for routine operations
- Grant excessive permissions to API integrations
- Hardcode credentials in profiles.yml

### Organization

✅ **DO:**

- Use consistent naming conventions (e.g., `{env}_dbt_project`)
- Organize projects by database
- Document integration configurations
- Set up event tables from the start
- Use separate warehouses for dev/prod

❌ **DON'T:**

- Mix development and production in same database
- Skip event table configuration
- Use default warehouse for all environments
- Deploy without testing in workspace first

### Monitoring

✅ **DO:**

- Configure event tables at database level
- Set appropriate log/trace/metric levels
- Query event tables regularly to verify capture
- Set up alerts for failures
- Archive old event data periodically

❌ **DON'T:**

- Set event tables at account level (too noisy)
- Ignore event table configuration
- Set all levels to DEBUG (storage bloat)
- Keep event data indefinitely

---

## Quick Setup Checklist

- [ ] ✅ Enable personal database
- [ ] ✅ Create external access integration (for dbt deps)
- [ ] ✅ Create Git API integration
- [ ] ✅ Create workspace from Git repository
- [ ] ✅ Configure profiles.yml
- [ ] ✅ Test in workspace
- [ ] ✅ Deploy as DBT PROJECT object
- [ ] ✅ Configure event table (recommended)
- [ ] ✅ Verify deployment with `SHOW DBT PROJECTS`
- [ ] ✅ Test execution with `EXECUTE DBT PROJECT`
- [ ] ✅ Set up scheduled tasks (if needed)
- [ ] ✅ Configure monitoring queries

---

## Related Skills

- `dbt-projects-on-snowflake` skill - Complete monitoring, execution, and management guide
- `dbt-core` skill - dbt-core setup and profiles.yml configuration
- `snowflake-connections` skill - Snowflake authentication and connection configuration
- `snowflake-cli` skill - Snowflake CLI commands and operations

---

**Goal:** Transform AI agents into experts at setting up dbt Projects on Snowflake from scratch with
proper integrations, monitoring, and automation configured from day one.

Overview

This skill is a step-by-step setup guide for dbt Projects on Snowflake, covering prerequisites, integrations, event table monitoring, and automated scheduling. It focuses on practical SQL and UI steps to enable package downloads, Git access, workspace creation, deployment, and task scheduling. Use it to get a production-ready dbt on Snowflake environment with monitoring and secure integrations.

How this skill works

The guide walks through enabling a personal database, creating an external access integration for dbt package downloads, and creating API integrations for Git providers. It shows how to create a Snowsight workspace from a Git repo, configure profiles.yml for Snowflake execution, deploy a DBT PROJECT object, and set up event tables and Snowflake Tasks for scheduling. Troubleshooting tips and verification queries are included to validate each step.

When to use it

  • Setting up dbt Projects on Snowflake for the first time
  • Configuring external network access for dbt deps and package installs
  • Integrating GitHub/GitLab/Azure DevOps with Snowflake workspaces
  • Adding event table monitoring for dbt Project telemetry
  • Scheduling automated dbt runs with Snowflake Tasks
  • Troubleshooting workspace creation, package downloads, or event capture

Best practices

  • Enable personal databases and use ACCOUNTADMIN only for initial setup
  • Create external access and API integrations with minimal allowed prefixes
  • Set event tables at the DATABASE level for project isolation and clear telemetry
  • Keep profiles.yml account/user empty to rely on Snowflake context; specify warehouse/database/schema/role
  • Use separate integrations/warehouses for dev and prod; rotate PATs and use minimal scopes

Example use cases

  • CI/CD: Deploy a dbt project from Git and run dbt build via a Snowflake Task on a schedule
  • Monitoring: Route dbt execution logs to a centralized event table and query recent failures
  • Onboarding: Create Snowsight workspaces for developers using a shared API integration and PAT
  • Troubleshooting: Verify external access and API integrations when dbt deps or Git access fails
  • Environment segregation: Maintain separate databases and event tables for dev and prod dbt projects

FAQ

Why must event tables be set at the database level?

Database-level event tables capture all project executions for that database, provide project isolation, and avoid account-wide noise or missed cross-schema events.

What hosts must external access allow for dbt deps?

Include hub.getdbt.com and codeload.github.com (or provider-specific hosts) so dbt can download packages and Git-hosted sources.

What scopes are required for a PAT?

GitHub: repo scope. GitLab: read_repository. Azure DevOps: Code (Read). Use minimal scopes and rotate tokens regularly.