Add mart_gtfs.fct_vehicle_locations_dwell table #3646

Closed
tiffanychu90 wants to merge 19 commits

tiffanychu90
Member

Description

Describe your changes and why you're making them. Please include the context, motivation, and relevant dependencies.

Part of #3645

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation

How has this been tested?

Include commands/logs/screenshots as relevant.

For making a new table:

jovyan@jupyter-tiffanychu90 ~/data-infra/warehouse (vp-dwell) $ poetry run dbt run -s fct_vehicle_locations_dwell
18:20:03  Running with dbt=1.5.1
18:20:09  [WARNING]: Configuration paths exist in your dbt_project.yml file which do not apply to any resources.
There are 1 unused configuration paths:
- models.calitp_warehouse.mart.ad_hoc
18:20:09  Found 477 models, 1018 tests, 0 snapshots, 0 analyses, 852 macros, 0 operations, 12 seed files, 178 sources, 4 exposures, 0 metrics, 0 groups
18:20:09  
18:20:12  Concurrency: 8 threads (target='dev')
18:20:12  
18:20:12  1 of 1 START sql table model tiffany_mart_gtfs.fct_vehicle_locations_dwell ..... [RUN]
18:20:21  1 of 1 OK created sql table model tiffany_mart_gtfs.fct_vehicle_locations_dwell  [CREATE TABLE (392.0k rows, 2.1 GiB processed) in 8.31s]
18:20:21  
18:20:21  Finished running 1 table model in 0 hours 0 minutes and 11.33 seconds (11.33s).
18:20:21  

For adding docs:

jovyan@jupyter-tiffanychu90 ~/data-infra/warehouse (vp-dwell) $ poetry run dbt docs generate
18:25:18  Running with dbt=1.5.1
18:25:21  [WARNING]: Configuration paths exist in your dbt_project.yml file which do not apply to any resources.
There are 1 unused configuration paths:
- models.calitp_warehouse.mart.ad_hoc
18:25:22  Found 477 models, 1018 tests, 0 snapshots, 0 analyses, 852 macros, 0 operations, 12 seed files, 178 sources, 4 exposures, 0 metrics, 0 groups
18:25:22  
18:25:31  Concurrency: 8 threads (target='dev')
18:25:31  
18:26:16  Building catalog
18:26:40  Catalog written to /home/jovyan/data-infra/warehouse/target/catalog.json

Post-merge follow-ups

Document any actions that must be taken post-merge to deploy or otherwise implement the changes in this PR (for example, running a full refresh of some incremental model in dbt). If these actions will take more than a few hours after the merge or if they will be completed by someone other than the PR author, please create a dedicated follow-up issue and link it here to track resolution.

  • No action required
  • Actions required (specified below)

tiffanychu90 removed the request for review from evansiroky on January 14, 2025 18:43
@tiffanychu90
Member Author

@vevetron I only tested the SQL on one operator for one day, to see whether I'm getting the results I want. Can you give me comments on how to better structure the SQL?

tiffanychu90 marked this pull request as draft on January 14, 2025 19:26
@vevetron
Contributor

#3651 should help with the Python dbt checks. Something might need to be updated...

@tiffanychu90
Member Author

Feedback:

  • Rename the CTEs to be more readable. As is, merged and get_next should be renamed.
  • Add vp_primary_direction. A single extra column in mart_gtfs is not that consequential!
  • Run dbt test with -s +fct_vehicle_locations_dwell+ to make sure the model, its upstream parents, and its downstream children all pass.
  • Use a formatter for the dbt SQL.
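
For reference on the `-s +fct_vehicle_locations_dwell+` selector in the feedback above: in dbt's node selection syntax, a `+` before a model name adds its ancestors and a `+` after it adds its descendants. The following is only a toy Python sketch of that selection, not how dbt implements it. The function name and every `hypothetical_*` model name are made up for illustration and are not taken from this project's DAG.

```python
def select_plus_model_plus(edges, model):
    """Return `model` plus all of its ancestors and descendants.

    `edges` is an iterable of (parent, child) pairs describing a DAG.
    """
    parents, children = {}, {}
    for parent, child in edges:
        parents.setdefault(child, set()).add(parent)
        children.setdefault(parent, set()).add(child)

    def reachable(start, neighbors):
        # Depth-first walk that collects every node reachable from `start`.
        seen, stack = set(), [start]
        while stack:
            for nxt in neighbors.get(stack.pop(), ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    return {model} | reachable(model, parents) | reachable(model, children)


# Hypothetical graph: only fct_vehicle_locations_dwell is a real model name.
toy_edges = [
    ("hypothetical_staging", "hypothetical_intermediate"),
    ("hypothetical_intermediate", "fct_vehicle_locations_dwell"),
    ("fct_vehicle_locations_dwell", "hypothetical_report"),
    ("hypothetical_staging", "hypothetical_unrelated"),
]
selected = select_plus_model_plus(toy_edges, "fct_vehicle_locations_dwell")
print(sorted(selected))
```

In this toy graph, `hypothetical_unrelated` is left out because it is neither upstream nor downstream of the model, which is why the selector is a reasonable scope for checking a single new table.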

tiffanychu90 marked this pull request as ready for review on January 17, 2025 21:46

Warehouse report 📦

Checks/potential follow-ups

Checks indicate the following action items may be necessary.

  • For new models, do they all have a surrogate primary key that is tested to be not-null and unique?

New models 🌱

calitp_warehouse.mart.gtfs.fct_vehicle_locations_dwell

DAG

Legend (in order of precedence)

| Resource type | Indicator | Resolution |
| --- | --- | --- |
| Large table-materialized model | Orange | Make the model incremental |
| Large model without partitioning or clustering | Orange | Add partitioning and/or clustering |
| View with more than one child | Yellow | Materialize as a table or incremental |
| Incremental | Light green | |
| Table | Green | |
| View | White | |

@vevetron
Contributor

  • Is this more accurately where the bus is stopped, not dwelling? From what I understand, dwell happens at a pickup stop, but this query will find any place a bus has stopped moving.
  • Change the materialized config to something else, or perhaps set no option at all, so that we make these calculations on demand only.
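
To make the stopped-versus-dwelling distinction in the first bullet concrete, here is a rough Python sketch. It is not the SQL in this PR, which is not shown in this thread. It assumes that "stopped" means consecutive pings for a trip at exactly the same latitude/longitude, and that "dwelling" additionally requires being within a distance of a known stop. The `Ping` fields, the function names, the stop list, and the 25 m threshold are all hypothetical.

```python
import math
from dataclasses import dataclass
from itertools import groupby


@dataclass(frozen=True)
class Ping:
    trip_id: str
    ts: int  # seconds since an arbitrary reference time
    lat: float
    lon: float


def stationary_periods(pings, min_pings=2):
    """Collapse consecutive same-position pings per trip into one row each."""
    ordered = sorted(pings, key=lambda p: (p.trip_id, p.ts))
    periods = []
    for trip_id, trip_pings in groupby(ordered, key=lambda p: p.trip_id):
        # groupby only merges *consecutive* equal keys, which is what we want.
        for (lat, lon), run in groupby(trip_pings, key=lambda p: (p.lat, p.lon)):
            run = list(run)
            if len(run) >= min_pings:
                periods.append({
                    "trip_id": trip_id, "lat": lat, "lon": lon,
                    "start_ts": run[0].ts, "end_ts": run[-1].ts,
                    "n_pings": len(run),
                })
    return periods


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters on a spherical Earth."""
    r = 6_371_000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def flag_near_stop(periods, stops, max_dist_m=25.0):
    """Mark each stationary period as near a stop (assumed threshold) or not."""
    for period in periods:
        period["near_stop"] = any(
            haversine_m(period["lat"], period["lon"], s_lat, s_lon) <= max_dist_m
            for s_lat, s_lon in stops
        )
    return periods


# Made-up data: the vehicle waits at a stop, moves, then waits ~111 m away.
pings = [
    Ping("t1", 0, 34.0, -118.0), Ping("t1", 20, 34.0, -118.0),
    Ping("t1", 40, 34.0, -118.0), Ping("t1", 60, 34.0005, -118.0),
    Ping("t1", 80, 34.001, -118.0), Ping("t1", 100, 34.001, -118.0),
]
stops = [(34.0, -118.0)]
periods = flag_near_stop(stationary_periods(pings), stops)
for period in periods:
    print(period)
```

Under these assumptions both periods count as "stopped", but only the first would count as "dwell". A table named for dwell would need some stop-proximity filter like this, or a name that reflects stopped positions in general.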

@tiffanychu90
Member Author

tiffanychu90 commented Jan 23, 2025

Closing this PR to rebase off of the new image for Python 3.11 (#3657). The follow-up should also reflect that we do not want to materialize this as a table (change it to incremental), and get the dbt tests passing!
