dbt Semantic Layer cookbook recipes

Updated

Semantic Layer

dbt Cloud

Cookbook

Intermediate

Introduction

These cookbook recipes are a collection of scenario-based, real-world examples for building with the dbt Semantic Layer. They provide practical guidance on how to define semantic models and metrics in dbt to track and report on key business metrics.

The recipes cover a range of use cases, including calculating active users, annual recurring revenue (ARR), and customer acquisition cost (CAC). Each recipe includes step-by-step instructions on how to define semantic models and metrics in dbt, as well as how to validate and query your metrics.

These recipes are designed to help you get started with the dbt Semantic Layer and cover the following:

How to calculate active users and active users WoW growth

Use this cookbook recipe to understand how to calculate “Active users” and “Active users (Week-over-Week growth (WoW) growth), as well as other time-related attributes in a database with dbt using YAML configurations.

Objectives

Understand how to configure and use semantic models in dbt to track and report daily user activity.
Learn how to define and calculate metrics for active users and their WoW growth. As well as utility models for fiscal periods and other date-related flags.

Use case

Track and analyze user activity on a daily basis, including page views, time-on-site, and other key metrics. This helps in monitoring user engagement and understanding trends over time.

tip

💡 Note that we’re using the parallel subfolder approach for better organization in large projects, which means you’re creating a semantic model folder containing all your semantic layer code.

Alternatively, you can use the one-YAML-per-marts-model approach which puts documentation, data tests, unit tests, semantic models, and metrics into a unified file corresponding to a dbt-modeled mart.

Step 1: Define semantic models

Create a YAML file under the semantic_models folder: (models/semantic_models/sem_user_activity_daily.yml) to create a semantic model that tracks daily user activity. This semantic model refs the fct_user_activity_daily model.

Note that if you’re using the one-YAML-file-per-mart, then save this as user_activity_daily.yml under the models folder.

models/semantic_models/sem_user_activity_daily.yml

semantic_models:
  - name: sem_user_activity_daily
    description: |
      This model reports on daily user activity. Activity is broken down into
      pageview + time-on-site metrics for each use case that is addressed in
      dbt Cloud (ie. Deployment and Development)
    model: ref('fct_user_activity_daily')
    defaults:
      agg_time_dimension: date_day

Add the following YAML configurations in that same semantic model file:
- entities to define user-related entities (such as user or user_daily)
- dimensions to specify time dimensions (such as date_day)
- measures for a distinct count of active users (count_activity_users) aggregated by day.

models/semantic_models/sem_user_activity_daily.yml

semantic_models:
- name: sem_user_activity_daily
description: >
  This model reports on daily user activity. Activity is broken down into
  pageview + time-on-site metrics for each use case that is addressed in
  dbt Cloud (ie. Deployment and Development)
model: ref('fct_user_activity_daily')
defaults:
  agg_time_dimension: date_day
  
# Newly added blocks below 

entities:
#--- user entities
  - name: user_daily
    type: primary
    expr: user_activity_daily_id
  - name: user
    type: foreign
    expr: user_id
  #--- util entities
  - name: date
    type: foreign
    expr: date_day

dimensions:
  - name: date_day
    type: time
    type_params:
      time_granularity: day
  - name: t7d_usage
    type: categorical

measures:
  - name: count_activity_users
    description: >
      Distinct count of users via `fct_user_activity_daily`.
      To be used in metrics related to User-level activities
      (e.g. Active Users)
    agg: count_distinct
    expr: user_id
    agg_time_dimension: date_day

Create another semantic model YAML file (models/semantic_models/sem_all_days.yml). This semantic model captures company-specific fiscal periods and date flags (like fiscal_quarter). It helps in managing fiscal dates and other time-related dimensions.

models/semantic_models/sem_all_days.yml

- name: sem_all_days
    description: |
      Utility semantic model to bring in company-specific fiscal quarter and fiscal year, or date flags such as `is_end_of_week` or `is_end_of_fiscal_quarter`.
    model: ref('all_days')
    defaults:
      agg_time_dimension: date_day

    entities:
      - name: date
        type: primary
        expr: date_day
      - name: date_month
        type: foreign
        expr: month_start_date

    dimensions:
      - name: date_day
        type: time
        type_params:
          time_granularity: day
      - name: date_month
        type: time
        type_params:
          time_granularity: month
      - name: fiscal_quarter
        type: categorical
      - name: fiscal_year
        type: categorical
      - name: is_end_of_week
        type: categorical
      - name: is_yesterday
        type: categorical

Step 2: Create metrics

Create a YAML file under the metrics folder: (models/metrics/sem_user_activity_daily.yml) to create a metric that tracks active users. Note that if you’re using the one-YAML-file-per-mart, then save this as active_users.yml under the models folder.

models/metrics/sem_user_activity_daily.yml

metrics:
  - name: users_active
    label: 'Users: active'
    description: >
      Users are considered active when they reach:
      - at least 3+ actions in the week
      - in a given reporting period (weekly or monthly)
      
      **Use-case:**
      **Time Aggregations:** Weekly, Monthly
      - This is helpful for reporting counts for the current/unfinished week because we base this metric on the trailing 7 _complete_ days.
      - For example: if you query this metric on Friday, it'll look at the count based on Wed (last week) to Thurs (this week). Numbers can shift once the week is fully complete.
    type: simple
    type_params:
      measure: count_users
    filter: |
      {{ Dimension('user_daily__t7d_usage') }} >= 3 and
      (
        {{ Dimension('date__is_end_of_week') }} = True or
        {{ Dimension('date__is_yesterday') }} = True
      )
    
  - name: users_active_wow_growth
    label: 'Users: active (% WoW growth)'
    type: derived
    type_params:
      expr: div0(users_active,users_active_lw) - 1
      metrics: 
        - name: users_active
          offset_window: 7 days
          alias: users_active_lw
        - name: users_active

Step 3: Test your metrics

After saving your new or updated metric, try the following process in your development tool to test your metrics:

Rebuild your semantic_manifest.json file to ensure that your new metrics are included in the manifest. Run the following command:
```
dbt parse
```
If you see an error where the metric doesn't exist, it could be because you were developing on a separate branch and then switched back to this one so the manifest is different. If so, then use the --no-defer flag
```
dbt parse --no-defer
```

Query your metrics to confirm the results are as expected. Run the following command:

dbt sl query --metrics users_active,users_active_wow_growth --group-by metric_time__week --where "metric_time__week >= '2023-01-01'" --order-by metric_time__week

If the results looks different than your "source of truth", then look at the compiled code. Run the following command:

dbt sl query --metrics users_active,users_active_wow_growth --group-by metric_time__week --where "metric_time__week >= '2023-01-01'" --order-by metric_time__week --compile

How to calculate ARR using metrics

Use this cookbook recipe to understand how to calculate annual recurring revenue (ARR) using derived metrics in dbt, specifically leveraging the monthly recurring revenue (MRR) metric.

Objectives

Understand how to configure derived metrics in dbt.
Learn how to calculate ARR based on MRR.

Use case

Calculate and analyze ARR to monitor the financial health and growth of subscription-based businesses.

tip

💡 Note that we’re using the parallel subfolder approach for better organization in large projects, which means you’re creating a semantic model folder containing all your semantic layer code.

Step 1: Define semantic models

Create a YAML file under the semantic_models folder: (models/semantic_models/sem_revenue_models.yml) to define a semantic model for revenue data.

models/semantic_models/sem_revenue_models.yml

semantic_models:
  - name: sem_revenue
    description: |
      This model aggregates revenue data, providing metrics for business to analyze the financial 
      health of the fictional Santiago's snacks org.
    model: ref('fct_revenue')
    defaults:
      agg_time_dimension: date_month
    entities:
      - name: date
        type: primary
        expr: date_month
    measures:
      - name: mrr
        agg: sum
        expr: monthly_recurring_revenue
        agg_time_dimension: date_month
    dimensions:
      - name: date_day
        type: time
        type_params:
          time_granularity: day
      - name: date_month
        type: time
        type_params:
          time_granularity: month
      - name: is_last_day_of_month
        type: categorical

Step 2: Define the ARR metric

Create another YAML file under the metrics folder: (models/metrics/sem_revenue_metrics.yml) to define the ARR metric.

models/metrics/sem_revenue_metrics.yml

metrics:
  - name: arr
    label: 'ARR'
    description: |
      Annual recurring revenue calculated based on MRR.
    type: derived
    type_params:
      expr: mrr * 12
      metrics:
        - name: mrr

This model calculates ARR by multiplying the MRR (defined in the sem_revenue_models.yml file) by 12 to get the annual recurring revenue.

Step 3: Validate and query metric

Confirm and validate the new metric definition in your development by running the following command:
```
dbt parse
```

Query your metrics to confirm the results are as expected. Run the following command:

dbt sl query --metrics arr --group-by metric_time__year --where "metric_time__year >= '2023'" --order-by metric_time__year

How to calculate CAC using metrics

Use this cookbook recipe to understand how to calculate customer acquisition cost (CAC) using ratio metrics in dbt.

Objectives

Understand how to configure and use semantic models in dbt to track and report customer acquisition costs.
Learn how to define and calculate CAC by dividing the total acquisition cost by the number of new customers.

Use case

Track and analyze the cost of acquiring new customers to optimize marketing and sales strategies.

tip

💡 Note that we’re using the parallel subfolder approach for better organization in large projects, which means you’re creating a semantic model folder containing all your semantic layer code.

Step 1: Define semantic models

Create a YAML file under the semantic_models folder: (models/semantic_models/sem_customer_acquisition.yml) to define a semantic model for customer acquisition data.

models/semantic_models/sem_customer_acquisition.yml

semantic_models:
  - name: sem_customer_acquisition
    description: |
      This model aggregates customer acquisition data, providing metrics for business to analyze the cost of acquiring new customers.
    model: ref('fct_customer_acquisition')
    defaults:
      agg_time_dimension: date_month
    entities:
      - name: date
        type: primary
        expr: date_month
    measures:
      - name: total_acquisition_cost
        agg: sum
        expr: acquisition_cost
        agg_time_dimension: date_month
      - name: new_customers
        agg: sum
        expr: customer_count
        agg_time_dimension: date_month
    dimensions:
      - name: date_month
        type: time
        type_params:
          time_granularity: month
      - name: acquisition_channel
        type: categorical

Step 2: Define the CAC metric

Create another YAML file under the metrics folder: (models/metrics/sem_customer_acquisition_metrics.yml) to define the CAC metric.

models/metrics/sem_customer_acquisition_metrics.yml

metrics:
  - name: cac
    description: "Customer acquisition cost"
    label: 'CAC'        
    type: ratio
    type_params:
      numerator: total_acquisition_cost
      denominator: new_customers

Step 3: Validate and query metric

Confirm and validate the new metric definition in your development by running the following command:
```
dbt parse
```

Query your metrics to confirm the results are as expected. Run the following command:

dbt sl query --metrics cac --group-by metric_time__year --where "metric_time__year >= '2023'" --order-by metric_time__year

How to join multiple fact metrics

Use this cookbook recipe to understand how to join multiple fact metrics using the dbt Cloud Semantic Layer. This guide will walk you through a practical example of comparing sales and inventory metrics without creating additional views or dbt models.

Objectives

Learn how to utilize the dbt Semantic Layer to join multiple fact metrics from different semantic models.

Use case

This is useful if you want to compare fact metrics to each other in downstream tools, without needing to create anything new or configure anything new.

tip

Note that we’re using the parallel subfolder approach for better organization in large projects, which means you’re creating a semantic model folder containing all your semantic layer code.

Step 1: Define semantic models

Create a YAML file under the semantic_models folder (models/semantic_models/sem_sales.yml) to define a semantic model for tracking sales. This model should reference the fct_sales model and include relevant entities, dimensions, and measures. Note that if you’re using the one-YAML-file-per-mart, then save this as sales.yml under the models folder.

models/semantic_models/sem_sales.yml

semantic_models:
  - name: sem_sales
    description: >
      This model reports on sales activities, focusing on store-level and date-level metrics.
    model: ref('fct_sales')
    defaults:
      agg_time_dimension: date_day
    
    entities:
      - name: store
        type: primary
        expr: store_id
      - name: date
        type: foreign
        expr: date_day
    
    dimensions:
      - name: date_day
        type: time
        type_params:
          time_granularity: day
      - name: store_region
        type: categorical
    
    measures:
      - name: total_sales
        description: Sum of sales amount.
        agg: sum
        expr: sales_amount
        agg_time_dimension: date_day

Similarly, create another YAML file (models/semantic_models/sem_inventory.yml) to define a semantic model for tracking inventory. This model should reference the fct_inventory model.

models/semantic_models/sem_inventory.yml

semantic_models:
  - name: sem_inventory
    description: >
      This model reports on inventory levels, focusing on store-level and date-level metrics.
    model: ref('fct_inventory')
    defaults:
      agg_time_dimension: date_day
    
    entities:
      - name: store
        type: primary
        expr: store_id
      - name: date
        type: foreign
        expr: date_day
    
    dimensions:
      - name: date_day
        type: time
        type_params:
          time_granularity: day
      - name: store_region
        type: categorical
    
    measures:
      - name: stock_available
        description: Sum of available stock.
        agg: sum
        expr: stock_amount
        agg_time_dimension: date_day

Step 2: Create metrics

Create a YAML file under the metrics folder (models/metrics/sales_metrics.yml) to define metrics for sales, such as total sales. Note that if you’re using the one-YAML-file-per-mart, then save this as sales_metrics.yml under the corresponding models folder.

models/metrics/sales_metrics.yml

metrics:
  - name: total_sales
    label: 'Total Sales'
    description: >
      The sum of sales amount across all stores and dates.
    type: simple
    type_params:
      measure: total_sales

Similarly, create a metrics YAML file (models/metrics/inventory_metrics.yml) for inventory, such as stock on hand.

models/metrics/inventory_metrics.yml

metrics:
  - name: stock_available
    label: 'Available stock'
    description: >
      The sum of stock available across all stores and dates.
    type: simple
    type_params:
      measure: stock_available

Step 3: Join metrics

When you add a metric from both semantic models (like total_sales and stock_available) in a downstream tool, the dbt Semantic Layer automatically handles the join between these metrics based on the shared dimensions, such as date and store.

Note that:

The join operation leverages a full outer join, grabbing the necessary fields in subqueries.
Both metrics must share common dimensions (in this case, date and store) for the join to work seamlessly.

Step 4: Test your metrics

After defining your metrics, test them by querying in development to ensure the joins between metrics work as expected.

In your development tool, run a command query that includes both total_sales and stock_available.
Verify that the results reflect a correct comparison of the sales and inventory metrics across the selected dimensions.

## Rebuild your semantic_manifest.json file
dbt parse

## If you see an error where the metric doesn't exist it could be because you were developing on a separate branch and then switched back to this one so the manifest is different. If so, then use the `--no-defer` flag
dbt parse --no-defer

## Query your metric:
dbt sl query --metrics total_sales,stock_available --group-by date_day

How to test your metrics

After defining your metrics in your dbt project, test them in your development tool by running the following commands to ensure they work as expected. For a full list of available commands, refer to MetricFlow commands.

Rebuild your semantic_manifest.json file

If you've made edits to your semantic models or metrics, you need to rebuild your semantic_manifest.json file to ensure that your new metrics are included in the manifest.

Run the following command:

dbt parse

If you see an error where the metric doesn't exist, it could be because you were developing on a separate branch and then switched back to the working one so the manifest is different. If that's the case, use the --no-defer flag:

dbt parse --no-defer

Query your metric

To query the users_active and users_active_wow_growth metrics, run the following command:

dbt sl query --metrics users_active,users_active_wow_growth --group-by metric_time__week --where "metric_time__week >= '2023-01-01'" --order-by metric_time__week

This command will return the results for the users_active and users_active_wow_growth metrics, grouped by metric_time__week and ordered by metric_time__week.

Different output

If the output looks different than your "source of truth", review the compiled code using the following command (notice the --compile flag at the end of the command):

dbt sl query --metrics users_active,users_active_wow_growth --group-by metric_time__week --where "metric_time__week >= '2023-01-01'" --order-by metric_time__week --compile

Introduction​

How to calculate active users and active users WoW growth​

Objectives​

Use case​

Step 1: Define semantic models​

Step 2: Create metrics​

Step 3: Test your metrics​

How to calculate ARR using metrics​

Objectives​

Use case​

Step 1: Define semantic models​

Step 2: Define the ARR metric​

Step 3: Validate and query metric​

How to calculate CAC using metrics​

Objectives​

Use case​

Step 1: Define semantic models​

Step 2: Define the CAC metric​

Step 3: Validate and query metric​

How to join multiple fact metrics​

Objectives​

Use case​

Step 1: Define semantic models​

Step 2: Create metrics​

Step 3: Join metrics​

Step 4: Test your metrics​

How to test your metrics​

Rebuild your semantic_manifest.json file​

Query your metric​

Different output​

Introduction

How to calculate active users and active users WoW growth

Objectives

Use case

Step 1: Define semantic models

Step 2: Create metrics

Step 3: Test your metrics

How to calculate ARR using metrics

Objectives

Use case

Step 1: Define semantic models

Step 2: Define the ARR metric

Step 3: Validate and query metric

How to calculate CAC using metrics

Objectives

Use case

Step 1: Define semantic models

Step 2: Define the CAC metric

Step 3: Validate and query metric

How to join multiple fact metrics

Objectives

Use case

Step 1: Define semantic models

Step 2: Create metrics

Step 3: Join metrics

Step 4: Test your metrics

How to test your metrics

Rebuild your semantic_manifest.json file

Query your metric

Different output