Skip to content

Commit

Permalink
Merge branch 'dbt-labs:current' into patch-9
Browse files Browse the repository at this point in the history
  • Loading branch information
benc-db authored Oct 23, 2024
2 parents 6b7db37 + 60b4106 commit 9b7a28f
Show file tree
Hide file tree
Showing 54 changed files with 677 additions and 630 deletions.
1 change: 1 addition & 0 deletions .github/pull_request_template.md
Original file line number Diff line number Diff line change
Expand Up @@ -9,6 +9,7 @@ To learn more about the writing conventions used in the dbt Labs docs, see the [
- [ ] I have reviewed the [Content style guide](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/content-style-guide.md) so my content adheres to these guidelines.
- [ ] The topic I'm writing about is for specific dbt version(s) and I have versioned it according to the [version a whole page](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/single-sourcing-content.md#adding-a-new-version) and/or [version a block of content](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/single-sourcing-content.md#versioning-blocks-of-content) guidelines.
- [ ] I have added checklist item(s) to this list for anything anything that needs to happen before this PR is merged, such as "needs technical review" or "change base branch."
- [ ] The content in this PR requires a dbt release note, so I added one to the [release notes page](https://docs.getdbt.com/docs/dbt-versions/dbt-cloud-release-notes).
<!--
PRE-RELEASE VERSION OF dbt (if so, uncomment):
- [ ] Add a note to the prerelease version [Migration Guide](https://github.com/dbt-labs/docs.getdbt.com/tree/current/website/docs/docs/dbt-versions/core-upgrade)
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -241,7 +241,9 @@ measures:

## Reviewing our work

Our completed code will look like this, our first semantic model!
Our completed code will look like this, our first semantic model! Here are two examples showing different organizational approaches:

<Expandable alt_header="Co-located approach">

<File name="models/marts/orders.yml" />

Expand Down Expand Up @@ -288,6 +290,68 @@ semantic_models:
description: The total tax paid on each order.
agg: sum
```
</Expandable>

<Expandable alt_header="Parallel sub-folder approach">

<File name="models/semantic_models/sem_orders.yml" />

```yml
semantic_models:
- name: orders
defaults:
agg_time_dimension: ordered_at
description: |
Order fact table. This table is at the order grain with one row per order.
model: ref('stg_orders')
entities:
- name: order_id
type: primary
- name: location
type: foreign
expr: location_id
- name: customer
type: foreign
expr: customer_id
dimensions:
- name: ordered_at
expr: date_trunc('day', ordered_at)
# use date_trunc(ordered_at, DAY) if using BigQuery
type: time
type_params:
time_granularity: day
- name: is_large_order
type: categorical
expr: case when order_total > 50 then true else false end
measures:
- name: order_total
description: The total revenue for each order.
agg: sum
- name: order_count
description: The count of individual orders.
expr: 1
agg: sum
- name: tax_paid
description: The total tax paid on each order.
agg: sum
```
</Expandable>

As you can see, the content of the semantic model is identical in both approaches. The key differences are:

1. **File location**
- Co-located approach: `models/marts/orders.yml`
- Parallel sub-folder approach: `models/semantic_models/sem_orders.yml`

2. **File naming**
- Co-located approach: Uses the same name as the corresponding mart (`orders.yml`)
- Parallel sub-folder approach: Prefixes the file with `sem_` (`sem_orders.yml`)

Choose the approach that best fits your project structure and team preferences. The co-located approach is often simpler for new projects, while the parallel sub-folder approach can be clearer for migrating large existing projects to the Semantic Layer.

## Next steps

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -23,9 +23,6 @@ Is dbt Mesh a good fit in this scenario? Absolutely! There is no other way to sh
- Onboarding hundreds of people and dozens of projects is full of friction! The challenges of a scaled, global organization are not to be underestimated. To start the migration, prioritize teams that have strong dbt familiarity and fundamentals. dbt Mesh is an advancement of core dbt deployments, so these teams are likely to have a smoother transition.

Additionally, prioritize teams that manage strategic data assets that need to be shared widely. This ensures that dbt Mesh will help your teams deliver concrete value quickly.
- Bi-directional project dependencies -- currently, projects in dbt Mesh are treated like dbt resources in that they cannot depend on each other. However, many teams may want to be able to share data assets back and forth between teams.

We've added support for [enabling bidirectional dependencies](/best-practices/how-we-mesh/mesh-3-structures#cycle-detection) across projects. <Lifecycle status="beta"/>

If this sounds like your organization, dbt Mesh is the architecture you should pursue. ✅

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -66,7 +66,7 @@ Since the launch of dbt Mesh, the most common pattern we've seen is one where pr

Users may need to contribute models across multiple projects and this is fine. There will be some friction doing this, versus a single repo, but this is _useful_ friction, especially if upstreaming a change from a “spoke” to a “hub.” This should be treated like making an API change, one that the other team will be living with for some time to come. You should be concerned if your teammates find they need to make a coordinated change across multiple projects very frequently (every week), or as a key prerequisite for ~20%+ of their work.

### Cycle detection <Lifecycle status="beta"/>
### Cycle detection

import CycleDetection from '/snippets/_mesh-cycle-detection.md';

Expand Down
2 changes: 1 addition & 1 deletion website/docs/best-practices/how-we-mesh/mesh-5-faqs.md
Original file line number Diff line number Diff line change
Expand Up @@ -215,7 +215,7 @@ There’s model-level access within dbt, role-based access for users and groups

First things first: access to underlying data is always defined and enforced by the underlying data platform (for example, BigQuery, Databricks, Redshift, Snowflake, Starburst, etc.) This access is managed by executing “DCL statements” (namely `grant`). dbt makes it easy to [configure `grants` on models](/reference/resource-configs/grants), which provision data access for other roles/users/groups in the data warehouse. However, dbt does _not_ automatically define or coordinate those grants unless they are configured explicitly. Refer to your organization's system for managing data warehouse permissions.

[dbt Cloud Enterprise plans](https://www.getdbt.com/pricing) support [role-based access control (RBAC)](/docs/cloud/manage-access/enterprise-permissions#how-to-set-up-rbac-groups-in-dbt-cloud) that manages granular permissions for users and user groups. You can control which users can see or edit all aspects of a dbt Cloud project. A user’s access to dbt Cloud projects also determines whether they can “explore” that project in detail. Roles, users, and groups are defined within the dbt Cloud application via the UI or by integrating with an identity provider.
[dbt Cloud Enterprise plans](https://www.getdbt.com/pricing) support [role-based access control (RBAC)](/docs/cloud/manage-access/about-user-access#role-based-access-control-) that manages granular permissions for users and user groups. You can control which users can see or edit all aspects of a dbt Cloud project. A user’s access to dbt Cloud projects also determines whether they can “explore” that project in detail. Roles, users, and groups are defined within the dbt Cloud application via the UI or by integrating with an identity provider.

[Model access](/docs/collaborate/govern/model-access) defines where models can be referenced. It also informs the discoverability of those projects within dbt Explorer. Model `access` is defined in code, just like any other model configuration (`materialized`, `tags`, etc).

Expand Down
53 changes: 36 additions & 17 deletions website/docs/docs/build/cumulative-metrics.md
Original file line number Diff line number Diff line change
Expand Up @@ -18,21 +18,21 @@ Note that we use the double colon (::) to indicate whether a parameter is nested

<VersionBlock firstVersion="1.9">

| Parameter | <div style={{width:'350px'}}>Description</div> | Type |
| --------- | ----------- | ---- |
| `name` | The name of the metric. | Required |
| `description` | The description of the metric. | Optional |
| `type` | The type of the metric (cumulative, derived, ratio, or simple). | Required |
| `label` | Required string that defines the display value in downstream tools. Accepts plain text, spaces, and quotes (such as `orders_total` or `"orders_total"`). | Required |
| `type_params` | The type parameters of the metric. Supports nested parameters indicated by the double colon, such as `type_params::measure`. | Required |
| `type_params::cumulative_type_params` | Allows you to add a `window`, `period_agg`, and `grain_to_date` configuration. Nested under `type_params`. | Optional |
| `cumulative_type_params::window` | The accumulation window, such as 1 month, 7 days, 1 year. This can't be used with `grain_to_date`. | Optional |
| `cumulative_type_params::grain_to_date` | Sets the accumulation grain, such as `month`, which will accumulate data for one month and then restart at the beginning of the next. This can't be used with `window`. | Optional |
| `cumulative_type_params::period_agg` | Specifies how to aggregate the cumulative metric when summarizing data to a different granularity. Can be used with grain_to_date. Options are <br /> - `first` (Takes the first value within the period) <br /> - `last` (Takes the last value within the period <br /> - `average` (Calculates the average value within the period). <br /> <br /> Defaults to `first` if no `window` is specified. | Optional |
| `type_params::measure` | A dictionary describing the measure you will use. | Required |
| `measure::name` | The measure you are referencing. | Optional |
| `measure::fill_nulls_with` | Set the value in your metric definition instead of null (such as zero). | Optional |
| `measure::join_to_timespine` | Boolean that indicates if the aggregated measure should be joined to the time spine table to fill in missing dates. Default `false`. | Optional |
| Parameter | <div style={{width:'350px'}}>Description</div> | Type |
|-------------|---------------------------------------------------|-----------|
| `name` | The name of the metric. | Required |
| `description` | The description of the metric. | Optional |
| `type` | The type of the metric (cumulative, derived, ratio, or simple). | Required |
| `label` | Required string that defines the display value in downstream tools. Accepts plain text, spaces, and quotes (such as `orders_total` or `"orders_total"`). | Required |
| `type_params` | The type parameters of the metric. Supports nested parameters indicated by the double colon, such as `type_params::measure`. | Required |
| `type_params::measure` | The measure associated with the metric. Supports both shorthand (string) and object syntax. The shorthand is used if only the name is needed, while the object syntax allows specifying additional attributes. | Required |
| `measure::name` | The name of the measure being referenced. Required if using object syntax for `type_params::measure`. | Optional |
| `measure::fill_nulls_with` | Sets a value (for example, 0) to replace nulls in the metric definition. | Optional |
| `measure::join_to_timespine` | Boolean indicating if the aggregated measure should be joined to the time spine table to fill in missing dates. Default is `false`. | Optional |
| `type_params::cumulative_type_params` | Configures the attributes like `window`, `period_agg`, and `grain_to_date` for cumulative metrics. | Optional |
| `cumulative_type_params::window` | Specifies the accumulation window, such as `1 month`, `7 days`, or `1 year`. Cannot be used with `grain_to_date`. | Optional |
| `cumulative_type_params::grain_to_date` | Sets the accumulation grain, such as `month`, restarting accumulation at the beginning of each specified grain period. Cannot be used with `window`. | Optional |
| `cumulative_type_params::period_agg` | Defines how to aggregate the cumulative metric when summarizing data to a different granularity: `first`, `last`, or `average`. Defaults to `first` if `window` is not specified. | Optional |

</VersionBlock>

Expand All @@ -45,15 +45,34 @@ Note that we use the double colon (::) to indicate whether a parameter is nested
| `type` | The type of the metric (cumulative, derived, ratio, or simple). | Required |
| `label` | Required string that defines the display value in downstream tools. Accepts plain text, spaces, and quotes (such as `orders_total` or `"orders_total"`). | Required |
| `type_params` | The type parameters of the metric. Supports nested parameters indicated by the double colon, such as `type_params::measure`. | Required |
| `window` | The accumulation window, such as 1 month, 7 days, 1 year. This can't be used with `grain_to_date`. | Optional |
| `window` | The accumulation window, such as `1 month`, `7 days`, or `1 year`. This can't be used with `grain_to_date`. | Optional |
| `grain_to_date` | Sets the accumulation grain, such as `month`, which will accumulate data for one month and then restart at the beginning of the next. This can't be used with `window`. | Optional |
| `type_params::measure` | A list of measure inputs | Required |
| `measure:name` | The measure you are referencing. | Optional |
| `measure:name` | The name of the measure being referenced. Required if using object syntax for `type_params::measure`. | Optional |
| `measure:fill_nulls_with` | Set the value in your metric definition instead of null (such as zero).| Optional |
| `measure:join_to_timespine` | Boolean that indicates if the aggregated measure should be joined to the time spine table to fill in missing dates. Default `false`. | Optional |

</VersionBlock>

<Expandable alt_header="Explanation of type_params::measure">

The`type_params::measure` configuration can be written in different ways:
- Shorthand syntax &mdash; To only specify the name of the measure, use a simple string value. This is a shorthand approach when no other attributes are required.
```yaml
type_params:
measure: revenue
```
- Object syntax &mdash; To add more details or attributes to the measure (such as adding a filter, handling `null` values, or specifying whether to join to a time spine), you need to use the object syntax. This allows for additional configuration beyond just the measure's name.

```yaml
type_params:
measure:
name: order_total
fill_nulls_with: 0
join_to_timespine: true
```
</Expandable>

### Complete specification
The following displays the complete specification for cumulative metrics, along with an example:

Expand Down
2 changes: 2 additions & 0 deletions website/docs/docs/build/incremental-microbatch.md
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,8 @@ The `microbatch` strategy is available in beta for [dbt Cloud Versionless](/docs

Read and participate in the discussion: [dbt-core#10672](https://github.com/dbt-labs/dbt-core/discussions/10672)

Refer to [Supported incremental strategies by adapter](/docs/build/incremental-strategy#supported-incremental-strategies-by-adapter) for a list of supported adapters.

:::

## What is "microbatch" in dbt?
Expand Down
8 changes: 4 additions & 4 deletions website/docs/docs/build/measures.md
Original file line number Diff line number Diff line change
Expand Up @@ -102,7 +102,7 @@ semantic_models:
description: A record of every transaction that takes place. Carts are considered multiple transactions for each SKU.
model: ref('schema.transactions')
defaults:
agg_time_dimensions: metric_time
agg_time_dimension: transaction_date
# --- entities ---
entities:
Expand Down Expand Up @@ -167,7 +167,7 @@ semantic_models:
# --- dimensions ---
dimensions:
- name: metric_time
- name: transaction_date
type: time
expr: date_trunc('day', ts) # expr refers to underlying column ts
type_params:
Expand Down Expand Up @@ -204,15 +204,15 @@ semantic_models:
description: A subscription table with one row per date for each active user and their subscription plans.
model: ref('your_schema.subscription_table')
defaults:
agg_time_dimension: metric_time
agg_time_dimension: subscription_date
entities:
- name: user_id
type: foreign
primary_entity: subscription_table
dimensions:
- name: metric_time
- name: subscription_date
type: time
expr: date_transaction
type_params:
Expand Down
14 changes: 12 additions & 2 deletions website/docs/docs/build/metricflow-commands.md
Original file line number Diff line number Diff line change
Expand Up @@ -28,18 +28,28 @@ In dbt Cloud, run MetricFlow commands directly in the [dbt Cloud IDE](/docs/clou

For dbt Cloud CLI users, MetricFlow commands are embedded in the dbt Cloud CLI, which means you can immediately run them once you install the dbt Cloud CLI and don't need to install MetricFlow separately. You don't need to manage versioning because your dbt Cloud account will automatically manage the versioning for you.

<!--remove when fixed -->
Note: The **Defer to staging/production** [toggle](/docs/cloud/about-cloud-develop-defer#defer-in-the-dbt-cloud-ide) button doesn't apply when running Semantic Layer commands in the dbt Cloud IDE. To use defer for Semantic layer commands in the IDE, toggle the button on and manually add the `--defer` flag to the command. This is a temporary workaround and will be available soon.
</TabItem>

<TabItem value="core" label="MetricFlow with dbt Core">

You can install [MetricFlow](https://github.com/dbt-labs/metricflow#getting-started) from [PyPI](https://pypi.org/project/dbt-metricflow/). You need to use `pip` to install MetricFlow on Windows or Linux operating systems:

<VersionBlock lastVersion="1.7">

1. Create or activate your virtual environment `python -m venv venv`
2. Run `pip install dbt-metricflow`
* You can install MetricFlow using PyPI as an extension of your dbt adapter in the command line. To install the adapter, run `python -m pip install "dbt-metricflow[your_adapter_name]"` and add the adapter name at the end of the command. For example, for a Snowflake adapter run `python -m pip install "dbt-metricflow[snowflake]"`

</VersionBlock>

<VersionBlock firstVersion="1.8">

1. Create or activate your virtual environment `python -m venv venv`
2. Run `pip install dbt-metricflow`
* You can install MetricFlow using PyPI as an extension of your dbt adapter in the command line. To install the adapter, run `python -m pip install "dbt-metricflow[adapter_package_name]"` and add the adapter name at the end of the command. For example, for a Snowflake adapter run `python -m pip install "dbt-metricflow[dbt-snowflake]"`

</VersionBlock>

**Note**, you'll need to manage versioning between dbt Core, your adapter, and MetricFlow.

Something to note, MetricFlow `mf` commands return an error if you have a Metafont latex package installed. To run `mf` commands, uninstall the package.
Expand Down
Loading

0 comments on commit 9b7a28f

Please sign in to comment.