home / skills / sfc-gh-dflippo / snowflake-dbt-demo / dbt-architecture
This skill guides you to structure dbt projects using medallion layers (bronze, silver, gold) with naming, folder, and dependency patterns for production-grade
npx playbooks add skill sfc-gh-dflippo/snowflake-dbt-demo --skill dbt-architectureReview the files below or copy the command above to add this skill to your agents.
---
name: dbt-architecture
description:
dbt project structure using medallion architecture (bronze/silver/gold layers). Use this skill
when planning project organization, establishing folder structure, defining naming conventions,
implementing layer-based configuration, or ensuring proper model dependencies and architectural
patterns.
---
# dbt Architecture
## Purpose
Transform AI agents into experts on dbt project architecture and medallion layer patterns, providing
guidance on structuring production-grade dbt projects with proper layer separation, naming
conventions, and configuration strategies.
## When to Use This Skill
Activate this skill when users ask about:
- Planning dbt project structure and folder organization
- Implementing medallion architecture (bronze/silver/gold)
- Establishing naming conventions for models and columns
- Configuring folder-level settings in dbt_project.yml
- Ensuring proper model dependencies and data flow
- Understanding layer separation and architectural patterns
- Setting up tag inheritance strategies
## Core Philosophy: Medallion Architecture + Best Practices Integration
Medallion architecture demonstrates how dbt best practices seamlessly integrate with a layered data
approach:
- **Bronze Layer** = **Staging Models** (`stg_`) - One-to-one source relationships
- **Silver Layer** = **Intermediate Models** (`int_`) - Business logic transformations
- **Gold Layer** = **Marts** (`dim_`, `fct_`) - Business-ready data products
Every recommendation follows both architectural principles and dbt best practices simultaneously.
---
## Medallion Architecture Quick Reference
### Three Layers
**Bronze (Staging):**
- Naming: `stg_{source}__{table}`
- Materialization: `ephemeral`
- Purpose: One-to-one source cleaning
- Rules: No joins, no business logic
**Silver (Intermediate):**
- Naming: `int_{entity}__{description}`
- Materialization: `ephemeral` or `table`
- Purpose: Business logic, enrichment
- Rules: No direct source references
**Gold (Marts):**
- Naming: `dim_{entity}` or `fct_{process}`
- Materialization: `table` or `incremental`
- Purpose: Business-ready data products
- Rules: Fully tested, documented, optimized
---
## Critical Architectural Rules
Always enforce these patterns:
1. ✅ **No Direct Joins to Source** - Models reference staging (`ref('stg_*')`), never `source()`
directly
2. ✅ **One-to-One Staging** - Each source table has exactly ONE staging model
3. ✅ **Proper Layering** - Clear flow: staging → intermediate → marts
4. ✅ **Standardized Naming** - Consistent `stg_`, `int_`, `dim_`, `fct_` prefixes
5. ✅ **Use ref() and source()** - No hard-coded table references
6. ✅ **Folder-Level Configuration** - Set common settings in dbt_project.yml
**Official dbt Documentation**:
[How we structure our dbt projects](https://docs.getdbt.com/guides/best-practices/how-we-structure/1-guide-overview)
---
## Bronze Layer: Staging Models
**Purpose**: One-to-one relationship with source tables. Light cleaning and standardization only.
**Materialization**: `ephemeral` (compiled as CTEs)
**Naming**: `stg_{source}__{table}.sql`
### Bronze Template
```sql
-- models/bronze/stg_tpc_h__customers.sql
{{ config(materialized='ephemeral') }}
select
-- Primary key (renamed)
c_custkey as customer_id,
-- Attributes (cast and renamed)
c_name as customer_name,
c_address as customer_address,
c_phone as phone_number,
c_acctbal as account_balance,
-- Metadata
current_timestamp() as dbt_loaded_at
from {{ source('tpc_h', 'customer') }}
```
### Bronze Rules
✅ **DO**:
- One source table → One staging model
- Reference sources using `{{ source() }}`
- Rename columns to standard naming
- Cast data types
- Basic cleaning (trim, upper/lower)
❌ **DON'T**:
- Join between sources
- Add business logic
- Aggregate data
- Hard-code table names
---
## Silver Layer: Intermediate Models
**Purpose**: Reusable business logic and complex transformations. Sits between staging and marts.
**Materialization**: `ephemeral` (reusable logic) or `table` (complex computations)
**Naming**: `int_{entity}__{description}.sql`
### Silver Template
```sql
-- models/silver/int_customers__with_orders.sql
{{ config(materialized='ephemeral') }}
with customers as (
select * from {{ ref('stg_tpc_h__customers') }}
),
orders as (
select * from {{ ref('stg_tpc_h__orders') }}
),
customer_metrics as (
select
customer_id,
count(*) as total_orders,
sum(order_total) as lifetime_value,
min(order_date) as first_order_date
from orders
group by customer_id
)
select
c.customer_id,
c.customer_name,
coalesce(m.total_orders, 0) as total_orders,
coalesce(m.lifetime_value, 0) as lifetime_value,
m.first_order_date
from customers c
left join customer_metrics m on c.customer_id = m.customer_id
```
### Silver Rules
✅ **DO**:
- Reference staging + other intermediate models
- Add business logic and aggregations
- Create reusable components
- Use CTEs for clarity
❌ **DON'T**:
- Reference sources directly
- Add final presentation logic
- Create one-time-use models
---
## Gold Layer: Marts Models
**Purpose**: Business-ready data products optimized for BI tools and end users.
**Materialization**: `table` (dimensions) or `incremental` (large facts)
**Naming**: `dim_{entity}` (dimensions), `fct_{process}` (facts)
### Dimension Template
```sql
-- models/gold/dim_customers.sql
{{ config(materialized='table') }}
with customers as (
select * from {{ ref('int_customers__with_orders') }}
)
select
-- Primary key
customer_id,
-- Attributes
customer_name,
customer_email,
-- Metrics
total_orders,
lifetime_value,
first_order_date,
-- Business classification
case
when lifetime_value >= 5000 then 'gold'
when lifetime_value >= 1000 then 'silver'
else 'bronze'
end as customer_tier,
-- Metadata
current_timestamp() as dbt_updated_at
from customers
```
### Fact Template
```sql
-- models/gold/fct_orders.sql
{{ config(
materialized='incremental',
unique_key='order_id',
cluster_by=['order_date', 'customer_id']
) }}
select
order_id,
customer_id,
order_date,
order_status,
order_total,
current_timestamp() as dbt_updated_at
from {{ ref('stg_tpc_h__orders') }}
{% if is_incremental() %}
where order_date > (select max(order_date) from {{ this }})
{% endif %}
```
### Gold Rules
✅ **DO**:
- Reference staging, intermediate, and other marts
- Add final business logic
- Optimize for query performance (clustering)
- Test comprehensively
- Document for business users
❌ **DON'T**:
- Reference sources directly
- Create unnecessary complexity
---
## Naming Conventions
### Model Naming
| Layer | Prefix | Example | Purpose |
| ----------------------- | ------ | ---------------------------- | ----------------- |
| **Bronze/Staging** | `stg_` | `stg_tpc_h__customers` | Clean source data |
| **Silver/Intermediate** | `int_` | `int_customers__with_orders` | Business logic |
| **Gold/Dimensions** | `dim_` | `dim_customers` | Business entities |
| **Gold/Facts** | `fct_` | `fct_orders` | Business events |
### Column Naming Standards
**Primary & Foreign Keys**:
- `{entity}_id` - customer_id, order_id, product_id
- Foreign keys use same naming as primary key in related table
**Boolean Flags**:
- `is_{condition}` - is_active, is_deleted, is_first_order
- `has_{attribute}` - has_orders, has_discount
**Dates & Timestamps**:
- `{event}_date` - order_date, created_date
- `{event}_at` - created_at, updated_at, deleted_at
- Always use UTC timezone suffix if needed - created_at_utc
**Metrics & Aggregates**:
- `{metric}_count` - order_count, customer_count
- `{metric}_amount` - total_amount, discount_amount
- Include currency suffix if applicable - amount_usd, price_eur
**Row Numbers & Sequences**:
- `{entity}_row_number` - order_row_number
- `{entity}_seq_number` - sequence_number
### Consistency Rules
✅ **DO:**
- Use snake_case for all column names
- Use consistent entity names across models
- Include currency/units in column names when relevant
- Keep names concise but descriptive
❌ **DON'T:**
- Mix naming styles (camelCase vs snake_case)
- Use abbreviations inconsistently
- Create ambiguous names without context
- Use reserved SQL keywords
---
## Folder Structure
```sql
models/
├── bronze/ # Staging layer - one-to-one with sources
│ ├── stg_tpc_h__customers.sql
│ ├── stg_tpc_h__orders.sql
│ └── stg_tpc_h__lineitem.sql
├── silver/ # Intermediate layer - business logic
│ ├── int_customers__with_orders.sql
│ ├── int_fx_rates__daily.sql
│ └── customer_segments.sql
└── gold/ # Marts layer - business-ready analytics
├── dim_customers.sql
├── dim_products.sql
├── fct_orders.sql
└── fct_order_lines.sql
```
---
## Configuration in dbt_project.yml
### Folder-Level Configuration (Reduces Repetition)
Configure common settings at the folder level to minimize model-level overrides:
```yaml
models:
your_project:
bronze:
+materialized: ephemeral
+tags: ["bronze", "staging"]
+schema: bronze
silver:
+materialized: ephemeral
+tags: ["silver"]
+schema: silver
gold:
+materialized: table
+tags: ["gold", "marts"]
+schema: gold
```
**Model-Level Configuration**: Override folder defaults only for unique requirements (incremental
settings, clustering, etc.)
---
## Tag Inheritance Strategy
✅ **LEVERAGE**: dbt's additive tag inheritance
Tags accumulate hierarchically per the
[dbt documentation](https://docs.getdbt.com/reference/resource-configs/tags). Child folders inherit
all parent tags automatically.
```yaml
# ✅ GOOD: Avoid duplicate tags
bronze:
+tags: ["bronze", "staging"]
subfolder:
+tags: ["subfolder"] # Inherits: bronze, staging, subfolder
# ❌ BAD: Redundant parent tags
bronze:
+tags: ["bronze", "staging"]
subfolder:
+tags: ["bronze", "staging", "subfolder"] # Duplicates parent tags
```
**Common Selection Patterns**:
```bash
dbt run --select tag:bronze # All bronze models
dbt run --select tag:gold # All gold models
dbt run --select tag:staging # Alternative to bronze
```
---
## Helping Users with Architecture
### Strategy for Assisting Users
When users ask for architectural guidance:
1. **Identify the layer**: Which medallion layer (bronze/silver/gold)?
2. **Clarify purpose**: What transformation or business logic is needed?
3. **Apply naming conventions**: Follow `stg_`, `int_`, `dim_`, `fct_` patterns
4. **Recommend materialization**: Based on layer and reusability
5. **Provide working examples**: Show complete, tested code patterns
6. **Validate dependencies**: Ensure proper layer flow (staging → intermediate → marts)
### Common User Questions
**"How should I structure my project?"**
- Explain medallion architecture layers
- Show folder organization by layer
- Demonstrate model dependencies flow
- Provide naming convention standards
- Show configuration strategy (folder-level first)
**"Where does this model belong?"**
- Ask: Is it cleaning source data? → Bronze
- Ask: Does it add business logic? → Silver
- Ask: Is it for end-user consumption? → Gold
**"What should I name this model?"**
- Bronze: `stg_{source}__{table}`
- Silver: `int_{entity}__{description}`
- Gold dimensions: `dim_{entity}`
- Gold facts: `fct_{process}`
---
## Related Official Documentation
- [dbt Best Practices: How We Structure Our dbt Projects](https://docs.getdbt.com/guides/best-practices/how-we-structure/1-guide-overview)
- [dbt Best Practices: Structuring Project](https://docs.getdbt.com/guides/best-practices)
- [dbt Resource Configurations: Tags](https://docs.getdbt.com/reference/resource-configs/tags)
---
**Goal**: Transform AI agents into expert dbt architects who guide users through project structure
with confidence, clarity, and production-ready patterns.
This skill helps design production-grade dbt projects using the medallion (bronze/silver/gold) architecture. It focuses on folder structure, naming conventions, layer-based configuration, and enforcing model dependency patterns. Use it to standardize projects for maintainability, testability, and clear data flow.
The skill inspects the transformation intent and places models into three layers: bronze (staging), silver (intermediate), and gold (marts). It recommends materializations, naming prefixes (stg_, int_, dim_, fct_), folder-level dbt_project.yml settings, and tag inheritance to enforce consistent behavior. It also validates rules like no direct joins to sources and one-to-one staging models.
Where should I put a model that only renames and casts source columns?
Bronze/staging. Use stg_{source}__{table} naming and materialize as ephemeral for light cleaning and standardization.
Can silver models reference sources directly?
No. Silver models should reference staging or other intermediate models via ref() to preserve clear layering and reusability.