Better Data Engineering Part 5: Mastering Snapshots in dbt

A practical guide to dbt snapshots and how to track historical changes in your warehouse.

Why?

Because data changes, and if you do not track those changes, you lose the truth.

Introduction

Most warehouses store only the latest version of a record. But businesses often need to answer questions like:

  • What was the customer's plan last month?
  • When did this price change?
  • How many users downgraded over time?

This is where dbt snapshots shine.

Snapshots record how rows change over time by keeping every version of a record, so you do not have to hand-write Type 2 slowly changing dimension (SCD) merge logic.

Rule 1: Understand the Problem - SCDs Are Everywhere

Slowly Changing Dimensions (SCDs) appear in:

  • customer profiles
  • subscription plans
  • product catalogs
  • pricing tables
  • employee records

If you overwrite data, you lose history. Snapshots preserve it.

Rule 2: Use dbt's Built-In Snapshot Framework

dbt snapshots give you:

  • automatic change detection
  • versioning of records
  • start and end timestamps for each version (dbt_valid_from and dbt_valid_to)
  • simple configuration

A typical snapshot looks like:

{% snapshot customers_snapshot %}
  {{
    config(
      target_schema='snapshots',
      unique_key='customer_id',
      strategy='timestamp',
      updated_at='updated_at'
    )
  }}

  select * from {{ source('crm', 'customers') }}

{% endsnapshot %}

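dbt handles the rest. Each time you run dbt snapshot, it compares the query result with the existing snapshot table, inserts a new row for every record that is new or has changed, and closes the previous version by setting its dbt_valid_to. Changes that happen and revert between two runs are not captured, so schedule the command as often as your history needs.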
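
Once the snapshot has run, a question like "What was the customer's plan last month?" becomes a filter on the validity columns. The query below is a minimal sketch, assuming the snapshot above was built as snapshots.customers_snapshot and that the source has a plan column. The customer id and the date are made up for illustration. dbt_valid_from and dbt_valid_to are the metadata columns dbt adds to snapshot tables.

```sql
-- Point-in-time lookup against the snapshot table.
-- Assumed names: snapshots.customers_snapshot, plan. Id and date are illustrative.
select
    customer_id,
    plan
from snapshots.customers_snapshot
where customer_id = 42
  -- the version must have started on or before the date of interest...
  and dbt_valid_from <= timestamp '2024-05-31 00:00:00'
  -- ...and must not have been closed yet at that date
  and (
      dbt_valid_to > timestamp '2024-05-31 00:00:00'
      or dbt_valid_to is null  -- by default, null marks the current version
  );
```

The same table answers "When did this price change?" style questions: each row's dbt_valid_from is the moment dbt first observed that version, which is only as precise as your snapshot schedule.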

Rule 3: Choose the Right Strategy

dbt supports two strategies:

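  1. timestamp
    Use when the source table has a reliable updated_at column. dbt treats a row as changed when that column is newer than the value stored in the snapshot.

  2. check
    Use when there is no reliable updated_at column. dbt compares the values of the columns listed in check_cols (or every column, with check_cols='all') against the latest snapshot row.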

Example:

strategy='check',
check_cols=['email', 'plan', 'status']
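
Put together, a complete check-strategy snapshot might look like the sketch below. It reuses the source and key from the earlier example. The snapshot name customers_check_snapshot is hypothetical, and the three columns are assumed to exist in the source table.

```sql
{% snapshot customers_check_snapshot %}
  {{
    config(
      target_schema='snapshots',
      unique_key='customer_id',
      strategy='check',
      check_cols=['email', 'plan', 'status']
    )
  }}

  -- No updated_at is needed: a new version is recorded whenever
  -- email, plan, or status differs from the latest snapshot row.
  select * from {{ source('crm', 'customers') }}

{% endsnapshot %}
```

Listing only the columns you care about keeps the history focused: a change to an untracked column does not create a new version.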

Rule 4: Store Snapshots in Their Own Schema

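Keep snapshots separate from your regular models. Unlike a model, a snapshot table cannot be rebuilt from the source once it is dropped, so it deserves its own schema: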

  • easier to manage
  • easier to query
  • easier to clean up

Use a dedicated schema, such as the one set by target_schema in the example above:

snapshots
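
Downstream models do not need to hard-code that schema. The sketch below is a hypothetical model, customers_current.sql, that reads the earlier customers_snapshot through ref() and keeps only the current version of each customer, which by default is the row whose dbt_valid_to is null.

```sql
-- models/customers_current.sql (hypothetical file name)
-- ref() lets dbt resolve the snapshot's schema and track the dependency.
select *
from {{ ref('customers_snapshot') }}
where dbt_valid_to is null  -- open-ended validity means this is the current version
```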

Rule 5: Use Snapshots for Business Entities, Not Everything

Snapshots are powerful, but do not snapshot:

  • high-volume event tables
  • logs
  • metrics
  • ephemeral data

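Use them only for slowly changing business entities. Append-only data such as events and logs already carries its own history, so snapshotting it only adds cost.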


Part 6 covers CI/CD, testing, and observability.
