---
name: dbt-migration
description:
  Complete workflow for migrating database tables, views, and stored procedures to dbt projects on
  Snowflake. Orchestrates discovery, planning, placeholder creation, view/procedure conversion,
  testing, and deployment. Delegates platform-specific syntax translation to source-specific skills.
---

# Database to dbt Migration Workflow

## Purpose and When to Use

Guide AI agents through the complete migration lifecycle from Snowflake or legacy database systems
(SQL Server, Oracle, Teradata, etc.) to production-quality dbt projects on Snowflake. This skill
defines a structured, repeatable process while delegating platform-specific syntax translation to
dedicated source-specific skills.

Activate this skill when users ask about:

- Planning a database migration to dbt
- Organizing legacy scripts for migration
- Converting views and stored procedures to dbt models
- Testing migration results against source systems
- Deploying migrated dbt projects to production

---

## Snowflake Migration Tools

### Recommended Two-Step Approach

1. **Convert to Snowflake first**: Use SnowConvert AI and its AI-powered code conversion to convert
   source database objects (from SQL Server, Oracle, Teradata, etc.) into Snowflake tables, views, and
   stored procedures.
2. **Then convert to dbt**: Use the $dbt-migration-snowflake skill to migrate Snowflake objects to
   dbt models.

### SnowConvert AI (Recommended for Supported Platforms)

[SnowConvert AI](https://docs.snowflake.com/en/migrations/snowconvert-docs/general/about) converts
source DDL, views, stored procedures, functions, and additional objects (triggers, sequences,
indexes) to Snowflake-compatible SQL.
[Download SnowConvert AI](https://docs.snowflake.com/en/migrations/snowconvert-docs/general/getting-started/download-and-access)

#### Supported Platforms

- **Full support** (tables, views, procedures, functions): SQL Server, Oracle, Teradata, Redshift,
  Azure Synapse, IBM DB2
- **Partial support** (tables, views only): Sybase IQ, BigQuery, PostgreSQL, Spark SQL/Databricks,
  Hive, Vertica, Greenplum/Netezza

#### Platform-Specific Features

- SQL Server: Direct DB connection, data migration, SSIS replatform
- Oracle, Azure Synapse, Sybase IQ, BigQuery: DDL Extraction script
- Teradata: BTEQ/MLOAD/TPUMP support
- Redshift: Direct DB connection, data migration

### Additional Snowflake Migration Tools

| Tool                                                    | Purpose                                                  |
| ------------------------------------------------------- | -------------------------------------------------------- |
| [AI Code Conversion](resources/ai-code-conversion.md)   | AI-powered validation and repair of converted code       |
| [Migration Assistant](resources/migration-assistant.md) | VS Code extension for resolving conversion issues (EWIs) |
| [Data Migration](resources/data-migration.md)           | Transfer data to Snowflake (SQL Server, Redshift)        |
| [Data Validation](resources/data-validation.md)         | GUI-based validation (SQL Server)                        |
| [Data Validation CLI](resources/data-validation-cli.md) | CLI validation (SQL Server, Teradata, Redshift)          |
| [ETL Replatform](resources/etl-replatform.md)           | Convert SSIS packages to dbt projects                    |
| [Power BI Repointing](resources/power-bi-repointing.md) | Redirect Power BI reports to Snowflake                   |

---

## Migration Workflow Overview

The migration process follows seven sequential phases. Each phase has entry criteria, deliverables,
and validation gates that must pass before advancing.

```text
1-Discovery → 2-Planning → 3-Placeholders → 4-Views → 5-Table Logic → 6-Testing → 7-Deployment
```

---

## Phase 1: Discovery and Assessment

Create a complete inventory of source database objects and understand dependencies, volumes, and
complexity to inform migration planning.

**SnowConvert AI Option**: If your platform is supported, SnowConvert AI provides extraction scripts
that automate object inventory, dependency mapping, and initial code conversion.

### Phase 1 Activities

1. **Inventory source objects**: Query system catalogs for tables, views, procedures, functions
2. **Document dependencies**: Map object dependencies to determine migration order
3. **Document volumes**: Record row counts and data sizes
4. **Assess complexity**: Categorize objects as Low/Medium/High/Custom complexity
5. **Create migration tracker**: Document objects in spreadsheet or issue tracker
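
If the source objects already live in Snowflake (or have been converted with SnowConvert AI), the
inventory can start from `INFORMATION_SCHEMA` — a minimal sketch assuming a hypothetical
`my_source_db` database; legacy platforms expose equivalent metadata through their own system
catalogs (e.g., `sys.objects` on SQL Server, `ALL_OBJECTS` on Oracle):

```sql
-- Illustrative inventory query: tables and views with row counts and sizes
select
    table_schema,
    table_name,
    table_type,          -- 'BASE TABLE' or 'VIEW'
    row_count,
    bytes
from my_source_db.information_schema.tables
where table_schema <> 'INFORMATION_SCHEMA'
order by bytes desc nulls last;
```

A similar query against `information_schema.procedures` rounds out the procedure inventory.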

### Complexity Assessment

| Complexity | Criteria                               | Examples                      |
| ---------- | -------------------------------------- | ----------------------------- |
| **Low**    | Simple SELECT, no/minimal joins        | Lookup tables, simple views   |
| **Medium** | Multiple joins, aggregations, CASE     | Summary views, report queries |
| **High**   | Procedural logic, cursors, temp tables | SCD procedures, bulk loads    |
| **Custom** | Platform-specific features             | Wrapped code, CLR functions   |

### Phase 1 Checklist

- [ ] All tables, views, procedures inventoried
- [ ] Row counts documented
- [ ] Object dependencies mapped
- [ ] Complexity assessment complete
- [ ] Migration tracker created
- [ ] Refresh frequencies identified

---

## Phase 2: Planning and Organization

Organize legacy scripts, map objects to the dbt medallion architecture, and establish naming
conventions before any conversion begins.

### Phase 2 Activities

1. **Organize legacy scripts**: Create folder structure (tables/, views/, stored_procedures/,
   functions/)
2. **Map to medallion layers**: Assign objects to Bronze/Silver/Gold with appropriate prefixes
3. **Define naming conventions**: Follow $dbt-architecture skill patterns
4. **Create dependency graph**: Visualize migration order
5. **Establish validation criteria**: Define success metrics per object
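
One possible layout for the organized legacy scripts and the target dbt project, aligned with the
layer mapping below — folder names are illustrative rather than prescriptive:

```text
legacy_scripts/
├── tables/               # source DDL, one file per table
├── views/                # source view definitions
├── stored_procedures/    # procedural ETL to convert in Phase 5
└── functions/            # scalar and table functions

models/
├── bronze/               # stg_ staging models (ephemeral)
├── silver/               # int_ intermediate models
└── gold/                 # dim_ / fct_ marts (table / incremental)
```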

### Layer Mapping Reference

| Source Object Type   | Target Layer | dbt Prefix | Materialization |
| -------------------- | ------------ | ---------- | --------------- |
| Source tables (raw)  | Bronze       | `stg_`     | ephemeral       |
| Simple views         | Bronze       | `stg_`     | ephemeral       |
| Complex views        | Silver       | `int_`     | ephemeral/table |
| Dimension procedures | Gold         | `dim_`     | table           |
| Fact procedures      | Gold         | `fct_`     | incremental     |

### Phase 2 Checklist

- [ ] Legacy scripts organized in folders
- [ ] All objects mapped to medallion layers
- [ ] Naming conventions documented
- [ ] Dependency graph created
- [ ] Migration order established
- [ ] Validation criteria defined

---

## Phase 3: Create Placeholder Models

Create empty dbt models with correct column names, data types, and schema documentation **before**
adding any transformation logic. This establishes the contract for downstream consumers.

### Phase 3 Activities

1. **Generate placeholder models**: Create SQL files with `null::datatype as column_name` pattern
   and `where false`
2. **Map datatypes**: Use platform-specific skill for datatype conversion to Snowflake types
3. **Create schema documentation**: Generate `_models.yml` with column descriptions and tests
4. **Validate compilation**: Run `dbt compile --select tag:placeholder`
5. **Track status**: Add `placeholder` tag to config for tracking

### Placeholder Model Pattern

```sql
{{ config(materialized='ephemeral', tags=['placeholder', 'bronze']) }}

select
    null::integer as column_id,
    null::varchar(100) as column_name
    -- ... additional columns with explicit types, comma-separated
where false
```
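
A matching `_models.yml` stub might look like the following (the model name, descriptions, and
tests are placeholders to refine during conversion; `data_tests:` assumes dbt 1.8+, older versions
use `tests:`):

```yaml
version: 2

models:
  - name: stg_example_table          # hypothetical placeholder model
    description: "Placeholder contract for the target table; logic arrives in Phase 4/5."
    config:
      tags: ['placeholder', 'bronze']
    columns:
      - name: column_id
        description: "Primary key carried over from the source object."
        data_tests:
          - unique
          - not_null
      - name: column_name
        description: "Descriptive name attribute."
```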

### Phase 3 Checklist

- [ ] Placeholder model created for each target table
- [ ] All columns have explicit datatype casts
- [ ] Column names follow naming conventions
- [ ] `_models.yml` created with descriptions and tests
- [ ] All placeholder models compile successfully
- [ ] Placeholder tag applied for tracking

---

## Phase 4: Convert Views

Convert source database views to dbt models, starting with simple views before tackling complex
ones. Views are typically easier than stored procedures as they contain declarative SQL.

### Phase 4 Activities

1. **Prioritize by complexity**: Simple views (no joins) → Join views → Aggregate views → Complex
   views
2. **Apply syntax translation**: Delegate to platform-specific skill (see Related Skills)
3. **Structure with CTEs**: Use standard CTE pattern from $dbt-modeling skill
4. **Add tests**: Define tests in `_models.yml` using $dbt-testing skill patterns
5. **Replace placeholder logic**: Update placeholder SELECT with converted logic
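
A sketch of steps 2 through 5: a legacy reporting view rewritten into the standard CTE pattern
with `ref()` — all object names are invented for illustration, and the upstream staging models are
assumed to exist from Phase 3:

```sql
-- Legacy view (source platform):
--   CREATE VIEW dbo.v_active_customers AS
--   SELECT c.customer_id, c.customer_name, r.region_name
--   FROM dbo.customers c
--   JOIN dbo.regions r ON r.region_id = c.region_id
--   WHERE c.is_active = 1;

-- dbt model: models/silver/int_active_customers.sql
{{ config(materialized='ephemeral', tags=['view_conversion', 'silver']) }}

with customers as (
    select * from {{ ref('stg_customers') }}
),

regions as (
    select * from {{ ref('stg_regions') }}
),

joined as (
    select
        customers.customer_id,
        customers.customer_name,
        regions.region_name
    from customers
    inner join regions
        on regions.region_id = customers.region_id
    where customers.is_active = true   -- BIT flag translated to boolean
)

select * from joined
```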

### Phase 4 Checklist

- [ ] Views prioritized by complexity
- [ ] Platform-specific syntax translated (delegate to source skills)
- [ ] CTE pattern applied consistently
- [ ] dbt tests added for each view
- [ ] Converted views compile successfully
- [ ] Inline comments document syntax changes

---

## Phase 5: Convert Table Logic from Stored Procedures

Transform procedural stored procedure logic into declarative dbt models, selecting appropriate
materializations for different ETL patterns.

### Phase 5 Activities

1. **Analyze ETL patterns**: Identify Full Refresh, SCD Type 1/2, Append, Delete+Insert patterns
2. **Map to materializations**: Use pattern-to-materialization mapping from $dbt-materializations
   skill
3. **Break complex procedures**: Split single procedures into multiple intermediate/final models
4. **Convert procedural constructs**: Replace cursors, temp tables, variables with declarative SQL
5. **Document decisions**: Add header comments explaining conversion approach

### Pattern Mapping Reference

| Source Pattern         | dbt Approach                                |
| ---------------------- | ------------------------------------------- |
| TRUNCATE + INSERT      | `materialized='table'`                      |
| UPDATE + INSERT (SCD1) | `materialized='incremental'` with merge     |
| SCD Type 2             | dbt snapshot or custom incremental          |
| INSERT only            | `materialized='incremental'` append         |
| DELETE range + INSERT  | `incremental` with `delete+insert` strategy |
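
For instance, an UPDATE + INSERT (SCD Type 1) procedure typically becomes an incremental model
with the merge strategy — a minimal sketch, assuming a hypothetical `int_customers` upstream model
with an `updated_at` watermark column:

```sql
{{ config(
    materialized='incremental',
    incremental_strategy='merge',
    unique_key='customer_id'
) }}

select
    customer_id,
    customer_name,
    customer_status,
    updated_at
from {{ ref('int_customers') }}

{% if is_incremental() %}
  -- only pick up rows changed since the last successful run
  where updated_at > (select max(updated_at) from {{ this }})
{% endif %}
```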

### Procedural to Declarative Conversion

| Procedural Pattern | dbt Equivalent                   |
| ------------------ | -------------------------------- |
| CURSOR loop        | Window function or recursive CTE |
| Temp tables        | CTEs or intermediate models      |
| Variables          | Jinja variables or macros        |
| IF/ELSE branches   | CASE expressions or `{% if %}`   |
| TRY/CATCH          | Pre-validation tests             |
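
As one concrete case from the table above, a cursor that loops over rows to accumulate a running
total collapses into a single window expression — the model and column names here are invented:

```sql
-- Replaces a cursor that looped over each customer's orders to accumulate totals
select
    customer_id,
    order_id,
    order_date,
    order_amount,
    sum(order_amount) over (
        partition by customer_id
        order by order_date
        rows between unbounded preceding and current row
    ) as running_order_total
from {{ ref('stg_orders') }}
```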

### Phase 5 Checklist

- [ ] All stored procedures analyzed for patterns
- [ ] ETL patterns mapped to dbt materializations
- [ ] Complex procedures broken into multiple models
- [ ] Procedural logic converted to declarative SQL
- [ ] Conversion decisions documented in model headers
- [ ] All converted models compile successfully

---

## Phase 6: End-to-End Testing and Validation

Verify that migrated dbt models produce results identical to the source system, using multiple
validation techniques to ensure data integrity.

**Snowflake Data Validation CLI**: For SQL Server, Teradata, or Redshift migrations, the
[Data Validation CLI](https://docs.snowflake.com/en/migrations/snowconvert-docs/data-validation-cli/index)
provides automated schema validation (columns, data types, row counts) and metrics validation (MIN,
MAX, AVG, NULL count, DISTINCT count).

### Phase 6 Activities

1. **Row count validation**: Compare total counts between source and target
2. **Column checksum validation**: Compare row-level hashes to identify differences
3. **Business rule validation**: Verify calculated fields match source logic
4. **Aggregate validation**: Compare summary metrics (sums, counts, averages)
5. **Mock data testing**: Create seed fixtures for complex transformation testing
6. **Incremental validation**: Test both full-refresh and incremental runs
7. **Document results**: Create validation report for each migrated object
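
A hedged sketch of activities 1 and 2, comparing a migrated model with its source object — table
and column names are invented, the source data is assumed to be queryable from Snowflake (e.g.,
after a one-time data load), and the queries can live under `analyses/` so `ref()` resolves:

```sql
-- Row counts
select 'source' as side, count(*) as row_count from legacy_db.legacy_schema.customers
union all
select 'target' as side, count(*) as row_count from {{ ref('dim_customers') }};

-- Row-level checksums: any row returned exists on only one side or differs
with source_rows as (
    select
        customer_id,
        sha2(concat_ws('|', customer_id::varchar, customer_name, coalesce(customer_status, ''))) as row_hash
    from legacy_db.legacy_schema.customers
),
target_rows as (
    select
        customer_id,
        sha2(concat_ws('|', customer_id::varchar, customer_name, coalesce(customer_status, ''))) as row_hash
    from {{ ref('dim_customers') }}
)
select coalesce(source_rows.customer_id, target_rows.customer_id) as customer_id
from source_rows
full outer join target_rows
    on target_rows.customer_id = source_rows.customer_id
   and target_rows.row_hash = source_rows.row_hash
where source_rows.customer_id is null
   or target_rows.customer_id is null;
```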

### Validation Techniques

| Technique      | Purpose                   | Implementation                |
| -------------- | ------------------------- | ----------------------------- |
| Row counts     | Detect missing/extra rows | Compare COUNT(\*)             |
| Checksums      | Detect value differences  | SHA2 hash comparison          |
| Business rules | Verify logic accuracy     | Singular tests                |
| Aggregates     | Validate totals           | SUM/AVG comparisons           |
| Mock data      | Test transformations      | Seed files + expected outputs |
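
Business rules can be expressed as dbt singular tests — a minimal sketch under `tests/` with
invented model and column names; the test passes when the query returns zero rows:

```sql
-- tests/assert_order_totals_match_lines.sql
-- Fails if any order header total drifts from the sum of its line amounts
select
    orders.order_id,
    orders.order_total,
    sum(order_lines.line_amount) as calculated_total
from {{ ref('fct_orders') }} as orders
inner join {{ ref('fct_order_lines') }} as order_lines
    on order_lines.order_id = orders.order_id
group by orders.order_id, orders.order_total
having abs(orders.order_total - sum(order_lines.line_amount)) > 0.01
```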

### Phase 6 Checklist

- [ ] Row count validation queries created
- [ ] Checksum comparison implemented
- [ ] Business rule tests written
- [ ] Aggregate metrics compared
- [ ] Incremental models tested (full refresh + incremental)
- [ ] All validation queries pass
- [ ] Discrepancies documented and resolved
- [ ] Validation report completed

---

## Phase 7: Deployment and Cutover

Deploy validated dbt models to production with a clear cutover plan and monitoring strategy.

### Phase 7 Activities

1. **Deploy to Development**: Run `dbt build --target dev` and validate
2. **Deploy to Test/UAT**: Run full validation suite with `--store-failures`
3. **Create cutover plan**: Document pre-cutover, cutover, post-cutover, and rollback steps
4. **Deploy to Production**: Execute deployment with production data
5. **Configure scheduled runs**: Set up Snowflake tasks or dbt Cloud scheduling
6. **Monitor post-deployment**: Track run duration, row counts, test failures, performance
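
If the project is deployed with dbt Projects on Snowflake, scheduled runs (activity 5) can be
driven by a Snowflake task. This is a sketch only: the `EXECUTE DBT PROJECT` call and all object
names are assumptions to verify against current Snowflake documentation, and teams on dbt Cloud
would use its built-in scheduler instead:

```sql
-- Assumed syntax for dbt Projects on Snowflake; adjust names and args for your environment
create or replace task analytics_db.operations.nightly_dbt_build
    warehouse = transform_wh
    schedule = 'USING CRON 0 5 * * * America/New_York'
as
    execute dbt project analytics_db.operations.my_dbt_project args='build --target prod';

-- Tasks are created suspended and must be resumed to start running
alter task analytics_db.operations.nightly_dbt_build resume;
```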

### Cutover Plan Template

| Phase              | Activities                                                                     |
| ------------------ | ------------------------------------------------------------------------------ |
| Pre-Cutover (T-1)  | Final validation, stakeholder sign-off, rollback docs, user communication      |
| Cutover (T-0)      | Disable source ETL, final sync, deploy, build, validate, update BI connections |
| Post-Cutover (T+1) | Monitor performance, verify schedules, confirm access, close tickets           |
| Rollback           | Re-enable source ETL, revert BI connections, document issues                   |

### Phase 7 Checklist

- [ ] Development deployment successful
- [ ] Test/UAT deployment successful
- [ ] Cutover plan documented
- [ ] Rollback procedure documented
- [ ] Stakeholder sign-off obtained
- [ ] Production deployment successful
- [ ] Scheduled runs configured
- [ ] Monitoring set up
- [ ] Migration marked complete

---

## Related Skills

### Platform-Specific Translation Skills

For syntax translation, delegate to the appropriate source-specific skill:

| Source Platform                  | Skill                        | Key Considerations                   |
| -------------------------------- | ---------------------------- | ------------------------------------ |
| Snowflake                        | $dbt-migration-snowflake     | Convert Snowflake objects to dbt     |
| SQL Server / Azure Synapse       | $dbt-migration-ms-sql-server | T-SQL, IDENTITY, TOP, #temp tables   |
| Oracle                           | $dbt-migration-oracle        | PL/SQL, ROWNUM, CONNECT BY, packages |
| Teradata                         | $dbt-migration-teradata      | QUALIFY, BTEQ, volatile tables       |
| BigQuery                         | $dbt-migration-bigquery      | UNNEST, STRUCT/ARRAY, backticks      |
| Redshift                         | $dbt-migration-redshift      | DISTKEY/SORTKEY, COPY/UNLOAD         |
| PostgreSQL / Greenplum / Netezza | $dbt-migration-postgres      | Array expressions, psql commands     |
| IBM DB2                          | $dbt-migration-db2           | SQL PL, FETCH FIRST, handlers        |
| Hive / Spark / Databricks        | $dbt-migration-hive          | External tables, PARTITIONED BY      |
| Vertica                          | $dbt-migration-vertica       | Projections, flex tables             |
| Sybase IQ                        | $dbt-migration-sybase        | T-SQL variant, SELECT differences    |

---

## Quick Reference: Phase Summary

<!-- AGENT_WORKFLOW_METADATA: Machine-parseable phase definitions -->

| Phase           | Key Deliverable                             | Exit Criteria                            | Primary Skill                         | Validation Focus                              | Validation Command                     |
| --------------- | ------------------------------------------- | ---------------------------------------- | ------------------------------------- | --------------------------------------------- | -------------------------------------- |
| 1. Discovery    | `migration_inventory.csv`, dependency graph | Inventory complete, dependencies mapped  | This skill                            | Object counts, dependency completeness        | Manual review                          |
| 2. Planning     | Folder structure, `_naming_conventions.md`  | Folder structure created, naming defined | $dbt-architecture                     | Folder hierarchy, naming conventions          | `ls -la models/`                       |
| 3. Placeholders | `.sql` files, `_models.yml`                 | All models compile with `where false`    | This skill                            | YAML structure, column definitions, naming    | `dbt compile --select tag:placeholder` |
| 4. Views        | Converted view models                       | All views converted and compile          | dbt-migration-{source}, $dbt-modeling | Syntax translation, CTE patterns, ref() usage | `dbt build --select tag:view`          |
| 5. Table Logic  | Converted procedure models                  | All procedures converted                 | $dbt-materializations                 | Incremental configs, materialization patterns | `dbt build --select tag:procedure`     |
| 6. Testing      | Validation queries, test results            | All validation queries pass              | $dbt-testing, $dbt-performance        | Test coverage, constraint definitions         | `dbt test --store-failures`            |
| 7. Deployment   | Production models, monitoring               | Production deployment successful         | $dbt-commands, $snowflake-cli         | Run success, schedule configuration           | `dbt build --target prod`              |

### General Skills

- $dbt-core: Local installation, configuration, package management
- $snowflake-connections: Connection setup for Snowflake CLI, Streamlit, dbt

---

## Validation Requirements

**CRITICAL: Agents must not advance to the next phase until all validations pass.**

Before advancing past each phase, verify:

1. `dbt compile` succeeds
2. `dbt test` passes
3. Validation hooks report no errors

Hook configuration is defined in `.claude/settings.local.json`.

---

## Overview

This skill provides a complete, repeatable workflow for migrating tables, views, and stored procedures into production-quality dbt projects on Snowflake. It orchestrates discovery, planning, placeholder creation, view and procedure conversion, testing, and deployment while delegating platform-specific syntax translation to source-specific skills. The goal is predictable migrations with clear validation gates and a safe cutover path.

## How This Skill Works

The workflow runs seven sequential phases: Discovery, Planning, Placeholders, Views, Table Logic, Testing, and Deployment. It first inventories and assesses source objects, creates contract-first placeholder models, and converts views and procedural ETL into declarative dbt models, then validates results against the source using row counts, checksums, and business-rule tests before cutover. Platform-specific SQL translation (e.g., Oracle to Snowflake) is handled by dedicated conversion tools before the dbt conversion begins.

## When to Use It

- Migrating an on-premises or cloud RDBMS (SQL Server, Oracle, Teradata, etc.) into dbt on Snowflake
- Replatforming legacy stored procedures and ETL jobs into declarative dbt models
- Standardizing naming, layering, and materializations across a newly migrated warehouse
- Validating parity between source systems and migrated dbt outputs before cutover
- Creating a repeatable process for large-scale, multi-object migrations

## Best Practices

- Start with a full inventory and dependency graph to define a safe migration order
- Create placeholder models with explicit datatypes and `where false` before adding logic
- Convert simple views first to build confidence, then tackle complex views and procedures
- Map procedural ETL patterns to dbt materializations (incremental, table, snapshots)
- Automate validation: row counts, checksums, aggregates, and business-rule tests
- Define a clear cutover and rollback plan with post-deployment monitoring

## Example Use Cases

- Use SnowConvert AI to translate Oracle procedures to Snowflake, then convert them to dbt models
- Break a monolithic SCD Type 2 procedure into snapshot and incremental dbt models with tests
- Create placeholders for 1,000+ target tables to enable parallel development and downstream testing
- Run checksum and aggregate validations to prove parity before switching BI tools to Snowflake
- Implement scheduled production runs with Snowflake tasks or dbt Cloud after cutover

## FAQ

**Do I need to convert source SQL to Snowflake first?**

Yes. A two-step approach is recommended: convert source platform SQL to Snowflake syntax, then migrate the Snowflake objects into dbt models.

**How do I handle complex stored procedures?**

Analyze the ETL patterns, split each procedure into intermediate models, replace procedural constructs with CTEs, window functions, snapshots, or incremental strategies, and document the conversion decisions.