Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

quantile regression implementation notes #30

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

topepo
Copy link
Member

@topepo topepo commented Aug 5, 2024

Added here to discuss

@topepo topepo requested review from hfrick and simonpcouch August 5, 2024 17:36
@topepo
Copy link
Member Author

topepo commented Aug 5, 2024

@ryantibs @dajmcdon please review and/or add discussion below.

Copy link

@ryantibs ryantibs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wrote a couple of thoughts on terminology and evaluation metrics.

The format of the list column would be

```
tibble(.quantile = numeric(), .pred_quantile = numeric())
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll just note that I find the use of "quantile" here (and in other places, used similarly) a bit confusing. I think that you are referring to a "probability level" for the quantile, such as 0.5 or 0.9. This is sometimes called also "quantile level".

Somewhat confusingly, it's also sometimes called just "quantile". But then that can be easily confused with the associated quantile value itself. So I prefer something like "level" for that reason, to disambiguate.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can use "level". That should be intuitive for people.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've been using:

.pred_quantile = tibble(.value = numeric(), .quantile_level = numeric())

for specificity.

It may also be worth considering giving the a class or using vctrs::vctrs_rcrd.

  1. A user might want to, for example, perform math on the .value.
  2. It might also help with the tune methods as well. When tuning, what if we end up comparing models fit with different sets of quantiles (possibly with an overlap)? So evaluation may need to compare one set with .quantile_level = c(.2, .5, .8) and another with .quantile_level = c(.1, .2, .5, .8, .9).


## yardstick

We need one or more model performance metrics for quantile loss that work across all quantiles. They will need a new class too (say `“quantile_metric”`).
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the most obvious metric is just to add up the quantile loss over all quantile levels. This is given in (for example) equation (10) of these notes: https://www.stat.berkeley.edu/~ryantibs/statlearn-s23/lectures/calibration.pdf.

As the notes explain, this is also equivalent (up to a factor of 2) to weighted interval score (WIS). That is defined in equation (8) and the equivalence is stated in equation (11).

Some people in the field do not know of this equivalence and I've even seen papers reporting both quantile loss and WIS as separate columns of a predictive perrformance metrics table. So it'd be good for this package to clearly state they are equivalent! Perhaps in the documentation.

WIS is nice because it offers a complementary view on the same equivalent metric. As explained in this paper: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1008618, you can decompose the score by inspecting the two summands separately (summed up over all levels), and interpret these loosely as a spread/sharpness measure (first one) and calibration measure (second one). Some people like looking at that as consituents of error.

Other than quantile loss and equivalently WIS, I'm not aware of any other score based on predicted quantiles that is proper. People in forecast scoring care a lot about propriety. However, I'll just note that it would be reasonable in general to additionally look at absolute error (AE) using the predicted 50% quantile (predicted median) if it's available in the list of quantiles, and also to look at coverage (empirical coverage of each quantile versus the nominal coverage). I think people often do this in practice.

Finally, in an off-the-cuff remark, I'll note that I'm becoming more and more bothered by quantile loss/WIS as a general go-to metric. I think it can be too lenient for underdispersed forecasters. I have an ongoing project with students exploring this. I don't have anything concrete to recommend as an alternative at the moment, maybe we will at some future point.

The format of the list column would be

```
tibble(.quantile = numeric(), .pred_quantile = numeric())
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've been using:

.pred_quantile = tibble(.value = numeric(), .quantile_level = numeric())

for specificity.

It may also be worth considering giving the a class or using vctrs::vctrs_rcrd.

  1. A user might want to, for example, perform math on the .value.
  2. It might also help with the tune methods as well. When tuning, what if we end up comparing models fit with different sets of quantiles (possibly with an overlap)? So evaluation may need to compare one set with .quantile_level = c(.2, .5, .8) and another with .quantile_level = c(.1, .2, .5, .8, .9).

* There could be confusion around the type of model and prediction being made. Suppose that `rpart` could make quantile predictions (it cannot). Would use an engine of “part” result in the regular CART model or one optimized via quantile loss?
* Since the list of quintiles would be specified upfront, it would be advantageous to do this with the new mode (`set_mode(“quantgreg,” quantiles = (1:9) / 10)`).

To test using a new mode, there is a [parsnip branch](https://github.com/tidymodels/parsnip/tree/quantile-mode) that enables that so you can use:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect this is the preferred way to do this, because of the cons listed above. Thanks for the testing branch. I'll try reimplementing some of our engines off this branch in the next few days and link back here describing any difficulties or benefits I encounter.

@topepo
Copy link
Member Author

topepo commented Aug 15, 2024

I took a few minutes and built a small hardhat template for making our on quantile regression models: topepo/salmon#1

@dajmcdon
Copy link

I made a small template for a few engines: https://github.com/dajmcdon/quantrines

@topepo
Copy link
Member Author

topepo commented Aug 29, 2024

@dajmcdon 😬 I didn't see your repo and did some implementing this afternoon in that parsnip branch.

I added some infrastructure to parsnip to pick up the quantile levels and insert them into the fit calls (instead of having to pass them in as engine args with potentially different names). I have the api adding them to set_mode() so it can validate them at the same times as the model/engine/mode combination. There's a code example below.

We can combine both sets of work since they don't 100% overlap.

If you are good with it, I'll setup some weekly or every-other-week time to catch up on this.

# pak::pak(c("tidymodels/parsnip@quantile-mode"), ask = FALSE)

library(tidymodels)
qunt_reg_spec <- 
  linear_reg() %>% 
  set_engine("quantreg") %>% 
  set_mode("quantile regression", quantile_level = c(.1, .5, .9))
qunt_reg_spec
#> Linear Regression Model Specification (quantile regression)
#> 
#> Computational engine: quantreg
#> Quantile levels: 0.1, 0.5, and 0.9.
qunt_reg_f_fit <- 
  qunt_reg_spec %>% fit(price ~ sqft, data = Sacramento)
qunt_reg_f_fit
#> parsnip model object
#> 
#> Call:
#> quantreg::rq(formula = price ~ sqft, tau = quantile_level, data = data)
#> 
#> Coefficients:
#>                tau= 0.1  tau= 0.5   tau= 0.9
#> (Intercept) -21815.3846 5946.2931 29181.5323
#> sqft           107.6923  136.0187   190.4032
#> 
#> Degrees of freedom: 932 total; 930 residual

# or use:
qunt_reg_xy_fit <- 
  qunt_reg_spec %>% fit_xy(x =Sacramento %>% select(sqft), y = Sacramento$price)
pred_vals <- predict(qunt_reg_f_fit, Sacramento[1:2,], type = "quantile")
pred_vals$.pred_quantile[[1]]
#> # A tibble: 3 × 2
#>   .pred_quantile .quantile_level
#>            <dbl>           <dbl>
#> 1         68215.             0.1
#> 2        119658.             0.5
#> 3        188359.             0.9

# or just
predict(qunt_reg_xy_fit, Sacramento[1:2,])
#> # A tibble: 2 × 1
#>   .pred_quantile  
#>   <list>          
#> 1 <tibble [3 × 2]>
#> 2 <tibble [3 × 2]>

Created on 2024-08-29 with reprex v2.1.0

@dajmcdon
Copy link

@topepo That looks great! (I should have probably tagged you on the repo directly). How would you like to proceed from here?

  1. I modified the rand_forest() implementation I have in that package to use your updates, and it works really well!

  2. I removed the .onLoad() for my quantreg version to avoid the

Error in `is_discordant_info()`:
! The combination of engine `quantreg` and mode `quantile regression` "and
   prediction type 'quantile'" already has predict data for model `linear_reg` and
   the new information being registered is different.

error.

One thing with quantreg::rq() is that it produces different output objects with just one quantile_level requested. So the post function in the predictions will need a slight modification.

library(parsnip)
tib <- data.frame(
  y = rnorm(100), x = rnorm(100), z = rnorm(100),
  f = factor(sample(letters[1:3], 100, replace = TRUE))
)
spec <- linear_reg(engine = "quantreg") %>%
  set_mode(mode = "quantile regression", quantile_level = c(.1, .5, .9))
out <- fit(spec, formula = y ~ x + z, data = tib)
predict(out, new_data = tib[1:5, ])
#> # A tibble: 5 × 1
#>   .pred_quantile  
#>   <list>          
#> 1 <tibble [3 × 2]>
#> 2 <tibble [3 × 2]>
#> 3 <tibble [3 × 2]>
#> 4 <tibble [3 × 2]>
#> 5 <tibble [3 × 2]>

spec <- linear_reg(engine = "quantreg") %>%
  set_mode(mode = "quantile regression", quantile_level = c(.5))
out <- fit(spec, formula = y ~ x + z, data = tib)
predict(out, new_data = tib[1:5, ])
#> Warning in rep(quantile_level, each = n): first element used of 'each' argument
#> Error in 1:n: argument of length 0

Created on 2024-08-29 with reprex v2.1.1

@topepo
Copy link
Member Author

topepo commented Sep 1, 2024

One thing with quantreg::rq() is that it produces different output objects with just one quantile_level

Yep! I have a TODO in there for that. I always check that and make sure that it works with a 1 row data frame/matrix.

For the random forest engine, do you want to branch off of this branch and make a PR into parsnip (after this one is merged)?

Happy to keep it somewhere else but this puts the support overhead to us. Either is fine with us. I only added quantreg to the PR since I needed an engine to test against. We can keep that in your repo too.

@topepo
Copy link
Member Author

topepo commented Sep 16, 2024

Some notes on changes for existing quantile prediction methods: tidymodels/parsnip#1203

@jrosell
Copy link

jrosell commented Sep 17, 2024

I currently use lightgbm engine parameters objective='quantile' and alpha using regression mode. Is the plan to support this approach and quantile mode too?

@topepo
Copy link
Member Author

topepo commented Sep 17, 2024

@jrosell I believe that @dajmcdon has an interest in including more engines (#30 (comment)).

lightgbm would require a separate model fit for each requested quantile level.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants