Used columns in col_vals_expr() are extracted lazily #570

Merged: yjunechoe merged 2 commits into rstudio:main from extract-used-columns-lazy on Sep 14, 2024


@yjunechoe (Collaborator) commented Sep 12, 2024

In #505, used columns were extracted eagerly, when the validation step was created. This forced the preconditions to be evaluated even when active was set to FALSE. To respect active and avoid materializing the table until it is actually needed, this PR moves the "extract used columns" behavior into interrogate().

Consequently, inactive steps revert to the old col_vals_expr() behavior of not showing used columns, since the columns available in the table are unknown until the precondition is triggered.
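
The eager-versus-lazy distinction can be sketched in base R. This is only an illustration under stated assumptions, not pointblank's implementation: make_step(), interrogate_step(), and extract_used_columns() are hypothetical helpers, and using all.vars() to find column names is just one plausible way to do the extraction.

```r
# Hypothetical helper (not pointblank API): names appearing in a one-sided
# formula that are also columns of `tbl`.
extract_used_columns <- function(expr, tbl) {
  intersect(all.vars(expr), colnames(tbl))
}

# A hypothetical "step" only records its inputs. Nothing touches the table
# here, so `column` starts out as NA.
make_step <- function(expr, preconditions = identity, active = TRUE) {
  list(
    expr = expr,
    preconditions = preconditions,
    active = active,
    column = NA_character_
  )
}

# Lazy variant: preconditions run, and used columns are extracted, only when
# the step is active at interrogation time.
interrogate_step <- function(step, tbl) {
  if (!isTRUE(step$active)) {
    return(step) # inactive: preconditions are never called, column stays NA
  }
  tbl <- step$preconditions(tbl)
  step$column <- extract_used_columns(step$expr, tbl)
  step
}

inactive <- make_step(
  ~ Petal.Volume > Petal.Width,
  preconditions = function(x) x[x$Petal.Volume > -1, ],
  active = "Petal.Volume" %in% colnames(iris)
)
active <- make_step(~ Petal.Length > Petal.Width)

res_inactive <- interrogate_step(inactive, iris)
res_active <- interrogate_step(active, iris)

res_inactive$column
res_active$column
```

In this sketch the inactive step keeps its NA placeholder because its preconditions are never called, while the active step reports the two columns referenced in its expression.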

(modified) reprex from #569

library(pointblank)

agent <- create_agent(tbl = iris) |>
  col_vals_expr(
    ~ Petal.Volume > Petal.Width,
    preconditions = \(x) x[x$Petal.Volume > -1, ],
    active = has_columns(iris, Petal.Volume)
  ) |>
  interrogate()

str(
  agent$validation_set[, c("eval_active", "column")]
)
#> tibble [1 × 2] (S3: tbl_df/tbl/data.frame)
#>  $ eval_active: logi FALSE
#>  $ column     :List of 1
#>   ..$ : chr NA

For the same reason, uninterrogated agents will not show used columns for col_vals_expr(), but interrogated agents will.

agent_uninterrogated <- create_agent(tbl = iris) |> 
  col_vals_expr(~ Petal.Length > Petal.Width)
str(
  agent_uninterrogated$validation_set[, c("eval_active", "column")]
)
#> tibble [1 × 2] (S3: tbl_df/tbl/data.frame)
#>  $ eval_active: logi NA
#>  $ column     :List of 1
#>   ..$ : chr NA

agent_interrogated <- agent_uninterrogated |>
  interrogate()
str(
  agent_interrogated$validation_set[, c("eval_active", "column")]
)
#> tibble [1 × 2] (S3: tbl_df/tbl/data.frame)
#>  $ eval_active: logi TRUE
#>  $ column     :List of 1
#>   ..$ : chr [1:2] "Petal.Length" "Petal.Width"

Active and interrogated col_vals_expr() steps will continue to show used columns in the report:

get_agent_report(agent_interrogated)

[image: screenshot of the agent report]

@rich-iannone (Member) left a comment


This is great, thank you!

@rich-iannone (Member) commented

(Feel free to merge whenever.)

@yjunechoe merged commit 4688c5f into rstudio:main on Sep 14, 2024 (12 checks passed)
@yjunechoe deleted the extract-used-columns-lazy branch on September 17, 2024 20:42