-
Notifications
You must be signed in to change notification settings - Fork 227
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Negative Binomial CDF and Quantile do not always roundtrip #935
Comments
Note that this failure only occured on our macos_arm64_test. This is the same platform in which we see the overflow warnings. Actually, they could be related. I didn't realize it when I wrote
but we are silencing overflow warnings from the nbinom PPF calculation, too. |
I wonder if this is truly a bug, but more particularly this comes down to what we actually want the quantile to do here as different libraries take slightly different approaches. Our goal for quantile is to find integer k such that So I guess my question would be: is the original bug a result of a needed feature, or simply someone doing a naive sanity check? |
In general, the PPF is the smallest integer k such that the CDF >= probability, right?
I can't speak for whether users will actually need the edge behavior to be perfect for science/engineering use. Nonetheless, users check cases like this to verify their understanding of the function and to check that it's working as they expect, and they will find and report edge cases like this as a bug. This can cause confusion and erode confidence in SciPy, so we try to get the edge cases right, too.
It sounds like you are using a typical bracketing rootfinder, say bisection, and the ends of the bracket are Either way, can't you terminate as soon as the bracket width drops below 1? At that point (or as soon as Please forgive me if I'm confusing myself. I've certainly done that before when thinking about this problem! |
Me too! I think there may be a solution though:
We might even save one or two CDF evaluations per quantile call that way. |
I think the situation is improved by terminating early, yup. Less likely to run into the trouble you described. This is making me wonder about how
In the round-trip case. If the CDF were numerically monotonic, this wouldn't happen, right? This would mean that the bracket was no longer valid because the root is at the boundary, but now both ends of the bracket are on one side of the boundary. Not a bracket. But since there are more floating point values within the support of a discrete distribution than there are between 0 and 1 (the range of the CDF), it seems that the CDF will always be a bit step-wise. So when "one end randomly pops over the integer boundary", does this happen such that I think if this is the case and you round up, I think you run the risk of returning one integer too high. This would be very rare if you terminate as soon as possible, but I can see how it would happen if you continue to refine the bracket. This brings another question to mind - how is the CDF defined if |
Just looking through our code at this.... looks like this case should already be handled (albeit it may be possible to be more efficient), so I pushed this test case to investigate more what's actually happening rather than just speculating: https://github.com/boostorg/math/pull/938/files#diff-f08ad2489004236ba31ecf0b1d938d4433525c9bfecd328bba33398f65642e18 But it's passing the tests, what am I missing? https://drone.cpp.al/boostorg/math/1163/1/2 |
Maybe the architecture? |
Ah, arm64, missed that, thought it was M1, I think we can emulate arm64/Linux at least, will try and see. |
macOS ARM should be limited to M1 |
@jzmaddock I don't know if it matters but these are the policies being used in the build:
On M1 |
OK, is it a release build as well? |
Here is the file with all the compiler flags. I don't see anything specifying debug or release builds. I am likely just ignorant of their build system. |
Hmm, still passing, what's the clang version? We have "Clang version 14.0.0 (clang-1400.0.29.202)" |
C++ compiler for the host machine: c++ (clang 13.1.6 "Apple clang version 13.1.6 (clang-1316.0.21.2.3)") |
Thanks Matt, sadly, the test is still passing, can I get you take a look locally if you have an M1 machine there? There might be ways to re-write the code, which might also be more efficient, and might fix the issue, but I'd really like a reproducer so I can understand what's going on. |
I can't reproduce on your branch with M1 using clang 14 and GCC-12 with C++14,17,20,and 2b. I also had a slightly modified version here that runs cleanly. |
Then I don't know what to make of the report, @mdhaber do you know the exact command line used to build, and/or can you provide a reproducer? I feel we're missing something vital here. |
I don't. Maybe @mckib2 can help with that |
I was seeing this locally on an M1 Mac Mini (along with the aforementioned overflows). I'll try to get a reproducer on that and report the environment specifics later this evening EDIT: scratch that, I'm seeing "FloatingPointError: overflow encountered in _nbinom_ppf" when trying to run @mborland's minimal reproducer at the top of this issue. I'll try to see if I can resolve that, or it's taking too long, I'll ignore that exception and press on |
Alright, with #941 applied, I can reproduce the issue with the original reproducer up top. Here's info on my Mac Mini, please ask for any other interesting info: macOS Monterey version 12.6
Interestingly, meson is using a different compiler than the first one encountered on the PATH as above:
Not sure which one I'd trust -- maybe what meson configure reports? |
Sorry @mckib2 : do we have a reproducer or no? Did you try the C++ file referenced: https://github.com/boostorg/math/pull/936/files ? |
Still no reproduction here on:
We've tried clang-14.0 as well. |
Sorry -- forgot to go back and do this. I checked out mborland/935 and added this to the root cmakelists to get an exectuable: add_executable(issue935 test/git_issue_935.cpp)
target_compile_features(issue935 PRIVATE cxx_std_17)
target_link_libraries(issue935 PUBLIC Boost::math) Running the resulting executable, I get the following:
So yes, this does reproduce the round-trip failure for me |
If it's interesting to compare to your own, here are my CMakeCache.txt and build.ninja files |
Failure can be found here: https://github.com/scipy/scipy/pull/17432/checks?check_run_id=10961417302
Found on PR: scipy/scipy#17432
Minimal reproducer:
Note in their code says:
In very rare cases, the finite precision calculation of ppf(cdf(supp))
can produce an array in which an element is off by one. We nudge the
CDF values down by 10 ULPs help to avoid this.
At 15 ULPs it seems to be correct.
The text was updated successfully, but these errors were encountered: