Skip to content

Commit

Permalink
Fix #9672: Artifact Schema Check
Browse files Browse the repository at this point in the history
  • Loading branch information
aranke committed May 1, 2024
1 parent a36057d commit 6137fc3
Show file tree
Hide file tree
Showing 2 changed files with 30 additions and 23 deletions.
50 changes: 28 additions & 22 deletions .github/workflows/schema-check.yml
Original file line number Diff line number Diff line change
Expand Up @@ -13,20 +13,18 @@
name: Artifact Schema Check

on:
pull_request:
types: [ opened, reopened, labeled, unlabeled, synchronize ]
paths-ignore: [ '.changes/**', '.github/**', 'tests/**', '**.md', '**.yml' ]

workflow_dispatch:
pull_request: #TODO: remove before merging
push:
branches:
- "develop"
- "*.latest"
- "releases/*"

# no special access is needed
permissions: read-all

env:
LATEST_SCHEMA_PATH: ${{ github.workspace }}/new_schemas
SCHEMA_DIFF_ARTIFACT: ${{ github.workspace }}/schema_schanges.txt
SCHEMA_DIFF_ARTIFACT: ${{ github.workspace }}/schema_changes.txt
DBT_REPO_DIRECTORY: ${{ github.workspace }}/dbt
SCHEMA_REPO_DIRECTORY: ${{ github.workspace }}/schemas.getdbt.com

Expand All @@ -46,15 +44,32 @@ jobs:
with:
path: ${{ env.DBT_REPO_DIRECTORY }}

- name: Check for changes in core/dbt/artifacts
# https://github.com/marketplace/actions/paths-changes-filter
uses: dorny/paths-filter@v3
id: check_artifact_changes
with:
filters: |
artifacts_changed:
- 'core/dbt/artifacts/**'
list-files: shell
working-directory: ${{ env.DBT_REPO_DIRECTORY }}

- name: Succeed if no artifacts have changed
if: steps.check_artifact_changes.outputs.artifacts_changed == 'false'
run: |
echo "No artifact changes found in core/dbt/artifacts. CI check passed."
- name: Checkout schemas.getdbt.com repo
if: steps.check_artifact_changes.outputs.artifacts_changed == 'true'
uses: actions/checkout@v4
with:
repository: dbt-labs/schemas.getdbt.com
ref: 'main'
ssh-key: ${{ secrets.SCHEMA_SSH_PRIVATE_KEY }}
path: ${{ env.SCHEMA_REPO_DIRECTORY }}

- name: Generate current schema
if: steps.check_artifact_changes.outputs.artifacts_changed == 'true'
run: |
cd ${{ env.DBT_REPO_DIRECTORY }}
python3 -m venv env
Expand All @@ -65,26 +80,17 @@ jobs:
# Copy generated schema files into the schemas.getdbt.com repo
# Do a git diff to find any changes
# Ignore any date or version changes though
# Ignore any lines with date-like (yyyy-mm-dd) or version-like (x.y.z) changes
- name: Compare schemas
if: steps.check_artifact_changes.outputs.artifacts_changed == 'true'
run: |
cp -r ${{ env.LATEST_SCHEMA_PATH }}/dbt ${{ env.SCHEMA_REPO_DIRECTORY }}
cd ${{ env.SCHEMA_REPO_DIRECTORY }}
diff_results=$(git diff -I='*[0-9]{4}-(0[1-9]|1[0-2])-(0[1-9]|[1-2][0-9]|3[0-1])T' \
-I='*[0-9]{1}.[0-9]{2}.[0-9]{1}(rc[0-9]|b[0-9]| )' --compact-summary)
if [[ $(echo diff_results) ]]; then
echo $diff_results
echo "Schema changes detected!"
git diff -I='*[0-9]{4}-(0[1-9]|1[0-2])-(0[1-9]|[1-2][0-9]|3[0-1])T' \
-I='*[0-9]{1}.[0-9]{2}.[0-9]{1}(rc[0-9]|b[0-9]| )' > ${{ env.SCHEMA_DIFF_ARTIFACT }}
exit 1
else
echo "No schema changes detected"
fi
git diff -I='*[0-9]{4}-[0-9]{2}-[0-9]{2}' -I='*[0-9]+\.[0-9]+\.[0-9]+' --exit-code > ${{ env.SCHEMA_DIFF_ARTIFACT }}
- name: Upload schema diff
uses: actions/upload-artifact@v4
if: ${{ failure() }}
if: ${{ failure() && steps.check_artifact_changes.outputs.artifacts_changed == 'true' }}
with:
name: 'schema_schanges.txt'
name: 'schema_changes.txt'
path: '${{ env.SCHEMA_DIFF_ARTIFACT }}'
3 changes: 2 additions & 1 deletion core/dbt/artifacts/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,8 @@
## Overview
This directory is meant to be a lightweight module that is independent (and upstream of) the rest of dbt-core internals.

It's primary responsibility is to define simple data classes that represent the versioned artifact schemas that dbt writes as JSON files throughout execution.
Its primary responsibility is to define simple data classes that represent the versioned artifact schemas that dbt
writes as JSON files throughout execution.

Long term, this module may be released as a standalone package (e.g. dbt-artifacts) to support stable parsing dbt artifacts programmatically.

Expand Down

0 comments on commit 6137fc3

Please sign in to comment.