[Feature Request] Inverse CDF of distributions #262

ghost · 2023-04-24T12:05:13Z

Is your feature request related to a problem? Please describe.
Inverse CDFs are useful for calculating credible intervals for a given distribution, among other things.

Describe the solution you'd like
It would be great to have inverse CDFs for all distributions. Starting with the normal distribution would already be a good first step.

bvenn · 2023-04-24T12:20:41Z

The normal InvCDF for mean = 0 and sigma = 1 is already implemented, but in an inappropriate location:

FSharp.Stats/src/FSharp.Stats/Signal/QQPlot.fs

Lines 91 to 92 in b74ecf2

    
    let inverseCDF x =
        sqrt 2. * Errorfunction.inverf (2. * x - 1.)

I agree, we should add quantile functions as InvCDF for all distributions 👍
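For reference, a standard normal quantile like the one above extends to arbitrary mean and sigma through the location-scale rule x = mean + sigma * z. The following is a minimal Python sketch of that rule, not the FSharp.Stats implementation: `normal_inv_cdf` is a hypothetical name, and the standard normal quantile comes from Python's `statistics.NormalDist` instead of an inverse error function.

```python
import math
from statistics import NormalDist


def normal_inv_cdf(mean: float, sigma: float, p: float) -> float:
    """Quantile of N(mean, sigma) via the location-scale rule x = mean + sigma * z.

    Hypothetical helper for illustration; not part of FSharp.Stats.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    # NormalDist.inv_cdf rejects p = 0 and p = 1, so map them to the limits here.
    if p == 0.0:
        return -math.inf
    if p == 1.0:
        return math.inf
    z = NormalDist().inv_cdf(p)  # standard normal quantile (mean 0, sigma 1)
    return mean + sigma * z


print(normal_inv_cdf(0.0, 1.0, 0.5))
print(normal_inv_cdf(3.0, 0.01, 0.01))
```

Handling p = 0 and p = 1 explicitly keeps the edge cases at -infinity and infinity, which is an assumption about the desired behavior.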

bvenn · 2023-04-25T15:26:41Z

I've added an InvCDF member to all distributions in 3d6a220.
I noticed that the approximation of the inverse error function leads to some discrepancies compared to R's qnorm procedure when extreme values are chosen.

// Testing FSharp.Stats
(Distributions.Continuous.Normal.InvCDF 0. 1. 0.5)  //0.

# Testing R
qnorm(0.5,0,1)

# Testing Python
from scipy.stats import norm
norm.ppf(0.5, loc=0, scale=1)

| Mean | StDev | X | Result FSharp.Stats | Result R | Result Python |
|---|---|---|---|---|---|
| 0 | 1 | 0.5 | 1.253321755e-09 | 0 | 0 |
| 0 | 1 | 0 | -infinity | -infinity | -infinity |
| 0 | 1 | 1 | infinity | infinity | infinity |
| 3 | 0.01 | 0.01 | 2.97673652985179 | 2.97673652126 | 2.97673652125959 |
| -300000 | 5000 | 0.99 | -288368.2649258 | -288368.2606298 | -288368.2606297958 |

While the deviation is small and only occurs at extreme values, it would be worth checking whether the approximation presented in Wichura, "Algorithm AS 241: The Percentage Points of the Normal Distribution", 1988, should be implemented instead.

add tests
check accuracy

References

https://github.com/SurajGupta/r-source/blob/master/src/nmath/qnorm.c

#262

bvenn · 2023-05-02T06:20:04Z

I've implemented the quantile function of the normal distribution as described by Wichura. It is accurate to 15 decimal places.
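One way to check the accuracy of a quantile function is a round trip: CDF(InvCDF(p)) should reproduce p. The following is a minimal Python sketch of such a check, assuming Python's `statistics.NormalDist` as a stand-in for the FSharp.Stats normal distribution. `max_round_trip_error` is a hypothetical helper, and the probability grid is an arbitrary choice.

```python
from statistics import NormalDist


def max_round_trip_error(dist: NormalDist, probabilities) -> float:
    """Largest absolute deviation |CDF(InvCDF(p)) - p| over the given probabilities."""
    return max(abs(dist.cdf(dist.inv_cdf(p)) - p) for p in probabilities)


# Probabilities 0.001, 0.002, ..., 0.999 (the end points 0 and 1 are excluded).
grid = [i / 1000 for i in range(1, 1000)]

print(max_round_trip_error(NormalDist(0.0, 1.0), grid))
print(max_round_trip_error(NormalDist(-300000.0, 5000.0), grid))
```

A round trip only shows that CDF and InvCDF are consistent with each other. Comparing against reference values such as those from R's qnorm remains a separate check.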

fslaborg#262

bvenn · 2023-07-04T08:07:52Z

bvenn · 2023-07-20T12:59:19Z

Many distributions have no closed form of the quantile function. Besides published approximations, it would be beneficial to add a member to each distribution that approximates the x for a given p. The CDF is continuous and monotonically increasing, so a root-finding approach should work just fine. I propose the following:

type MyDistribution =

    static member PDF a b x = ...

    static member CDF a b x = ...

    static member InvCDF a b p = ... // possibly no closed form exists

    static member InvCDFApprox a b p accuracy = 
        // tryFindRoot parameters: function (float -> float); accuracy (float); minimum (float); maximum (float); maxIterations
        // the search bounds 0. and 1. are placeholders; they must bracket the quantile within the distribution's support
        let tmp = Optimization.Bisection.tryFindRoot (fun x -> MyDistribution.CDF a b x - p) accuracy 0. 1. 1000
        match tmp with 
        | Some x -> x
        | None -> failwith "no InvCDF found to satisfy the given conditions"
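The root-finding idea can be sketched independently of the library. The Python snippet below is a minimal illustration under stated assumptions, not the FSharp.Stats `Optimization.Bisection` code: `try_inv_cdf_bisection` and `std_normal_cdf` are hypothetical names, and the caller must supply bounds that bracket the quantile.

```python
import math
from typing import Callable, Optional


def try_inv_cdf_bisection(
    cdf: Callable[[float], float],
    p: float,
    lo: float,
    hi: float,
    accuracy: float,
    max_iterations: int = 1000,
) -> Optional[float]:
    """Approximate the x with cdf(x) = p by bisection on [lo, hi].

    Returns None when [lo, hi] does not bracket the quantile or when the
    iteration budget is exhausted before the interval is narrow enough.
    """
    if cdf(lo) - p > 0.0 or cdf(hi) - p < 0.0:
        return None  # the root of cdf(x) - p is not inside [lo, hi]
    for _ in range(max_iterations):
        mid = 0.5 * (lo + hi)
        if hi - lo <= accuracy:
            return mid
        # The CDF is non-decreasing, so the root lies right of mid if cdf(mid) < p.
        if cdf(mid) - p < 0.0:
            lo = mid
        else:
            hi = mid
    return None


def std_normal_cdf(x: float) -> float:
    """CDF of the standard normal distribution, used here only as a test function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


print(try_inv_cdf_bisection(std_normal_cdf, 0.975, -10.0, 10.0, 1e-9))
print(try_inv_cdf_bisection(std_normal_cdf, 0.975, 0.0, 1.0, 1e-9))  # None: bounds do not bracket
```

The second call shows why fixed bounds such as 0 and 1 are not sufficient in general: the 97.5 % quantile of the standard normal lies outside that interval, so no root is found.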

Drawbacks

While this should be feasible for any distribution, the optimization step may be quite slow.
If the CDF itself is an approximation, error propagation would inflate the InvCDF error.

To discuss:

Should InvCDFApprox fail or return nan when no root can be identified?
Can maxIterations be determined from accuracy? As I understand it, the range between min and max is halved during each iteration, so accuracy and the maximum number of iterations should be coupled by $$accuracy = (max - min) \cdot 0.5^{maxIterations}$$ which reduces to $$accuracy = 0.5^{maxIterations}$$ for a search interval of width 1. If this is correct, maxIterations could be set to System.Math.Log(accuracy / (max - min), 0.5), rounded up.
Should the accuracy be given as a float, or modelled as a type like Expecto.Accuracy?
Is there a better alternative to Optimization.Bisection, e.g. NelderMead with (fun x -> abs (MyDistribution.CDF a b x - p)) as the objective function?
Are there better naming options than InvCDFApprox?
Are there more concerns that I missed?
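On the maxIterations question, the coupling can be made concrete. Each bisection step halves the search interval, so after n steps its width is (max - min) * 0.5^n, and the smallest sufficient n is the rounded-up base-2 logarithm of (max - min) / accuracy. The following is a minimal Python sketch under that assumption; `bisection_iterations` is a hypothetical helper, not an existing library function.

```python
import math


def bisection_iterations(lo: float, hi: float, accuracy: float) -> int:
    """Smallest n such that (hi - lo) * 0.5**n <= accuracy."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    if accuracy <= 0.0:
        raise ValueError("accuracy must be positive")
    # No halving is needed if the interval is already narrower than the accuracy.
    return max(0, math.ceil(math.log2((hi - lo) / accuracy)))


print(bisection_iterations(0.0, 1.0, 1e-6))
print(bisection_iterations(-10.0, 10.0, 1e-9))
```

The required number of iterations grows only logarithmically with the interval width, so widening the search bounds adds few iterations.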

bvenn pushed a commit that referenced this issue Apr 30, 2023

add wichura invCDF

6c9eb18

#262

bvenn pushed a commit that referenced this issue Apr 30, 2023

add normal invCDF tests

900fc8b

#262

bvenn added the up-for-grabs label May 3, 2023

DoganCK pushed a commit to DoganCK/FSharp.Stats that referenced this issue Jun 12, 2023

add invCDF for lognormal

7b21036

fslaborg#262

DoganCK pushed a commit to DoganCK/FSharp.Stats that referenced this issue Jun 12, 2023

add lognormal invCDF tests

ce67272

fslaborg#262

DoganCK mentioned this issue Jun 12, 2023

Lognormal Inv cdf #268

Merged

2 tasks
