
Prevent passing denormals in calculation. #1141


Merged: 13 commits merged into develop on Jun 2, 2024

Conversation

@jzmaddock
Collaborator
codecov bot commented May 24, 2024

Codecov Report

Attention: Patch coverage is 99.32886% with 1 line in your changes missing coverage. Please review.

Project coverage is 93.71%. Comparing base (cf0d343) to head (20f44d1).
Report is 4 commits behind head on develop.

Additional details and impacted files


@@             Coverage Diff             @@
##           develop    #1141      +/-   ##
===========================================
+ Coverage    93.69%   93.71%   +0.02%     
===========================================
  Files          772      774       +2     
  Lines        61168    61271     +103     
===========================================
+ Hits         57311    57422     +111     
+ Misses        3857     3849       -8     
Files | Coverage Δ
----- | ----------
include/boost/math/concepts/std_real_concept.hpp | 100.00% <ø> (ø)
...t/math/distributions/detail/hypergeometric_pdf.hpp | 96.62% <100.00%> (ø)
...h/distributions/detail/hypergeometric_quantile.hpp | 96.84% <100.00%> (ø)
...lude/boost/math/distributions/non_central_beta.hpp | 91.96% <100.00%> (+0.16%) ⬆️
include/boost/math/distributions/non_central_t.hpp | 97.62% <100.00%> (-0.03%) ⬇️
...e/boost/math/quadrature/detail/exp_sinh_detail.hpp | 96.71% <100.00%> (+1.47%) ⬆️
...clude/boost/math/special_functions/jacobi_zeta.hpp | 100.00% <ø> (ø)
test/exp_sinh_quadrature_test.cpp | 99.13% <100.00%> (+0.05%) ⬆️
test/nc_t_pdf_data.ipp | 100.00% <ø> (ø)
test/tanh_sinh_quadrature_test.cpp | 100.00% <100.00%> (ø)
... and 4 more

... and 3 files with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update cf0d343...20f44d1. Read the comment docs.

@jzmaddock jzmaddock merged commit 70cdb37 into develop Jun 2, 2024
78 checks passed
@dschmitz89
Contributor

@jzmaddock: this looks impressive! One question regarding the addition to scipy: do you have an external source of reference values for the noncentral t PDF? I did not check the complete (large) testing diff, so I might have missed it.

If not, I would implement the formula currently used in scipy with the high-precision mpmath Python library.


struct nc_t_pdf_gen
{
mp_t operator()(mp_t v, mp_t mu, mp_t x)
Contributor

Ah, this might be a Boost-style arbitrary-precision implementation that generates the test values after all?

@jzmaddock
Collaborator Author

Nod. I used the hypergeometric formula and multiprecision arithmetic for the new test values.

One thing I should have mentioned: the integration method, though accurate, is about 10x slower than the alternatives :( Other than truncation to zero there really is no alternative though... as I'm typing, I'm wondering whether wolframalpha would generate a series for the integral...
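
For context, here is a minimal sketch (not the actual generator behind the new .ipp test data) of how a high-precision reference value for just the integral factor could be computed with Boost.Multiprecision and exp_sinh quadrature. The 50-digit type, the helper name `nc_t_integral_part`, and the driver are illustrative assumptions; the SciPy test case from further down the thread (x = -1, v = 8, mu = 16) is used as an example:

```cpp
// Illustrative sketch only -- not the generator used for the .ipp test data.
// Evaluates just the integral factor
//   I = int_0^inf y^v * exp(-(y - mu*x/sqrt(x^2 + v))^2 / 2) dy
// at 50 decimal digits using exp_sinh quadrature over [0, inf).
#include <boost/math/quadrature/exp_sinh.hpp>
#include <boost/multiprecision/cpp_bin_float.hpp>
#include <iomanip>
#include <iostream>

using mp_t = boost::multiprecision::cpp_bin_float_50;

mp_t nc_t_integral_part(mp_t v, mp_t mu, mp_t x)
{
   boost::math::quadrature::exp_sinh<mp_t> integrator;
   const mp_t shift = mu * x / sqrt(x * x + v);
   return integrator.integrate([&](const mp_t& y) -> mp_t
      {
         const mp_t t = y - shift;
         return pow(y, v) * exp(-t * t / 2);
      });
}

int main()
{
   // The SciPy test case discussed further down: x = -1, v = 8, mu = 16.
   std::cout << std::setprecision(30) << nc_t_integral_part(8, 16, -1) << '\n';
}
```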

@ckormanyos
Member

just wondering if wolframalpha would generate a series

I'm not all that familiar with this function, but I've recently been doing a lot of computer algebra for series and Padé approximants in other projects. Does anyone have a link to a representation of the integral mentioned? I could try for an approximation.

@jzmaddock
Collaborator Author

I'm not all that familiar with this function, but I've recently been doing a lot of computer algebra for series and Padé approximants in other projects. Does anyone have a link to a representation of the integral mentioned? I could try for an approximation.

Cool, it's:

$$\int_{0}^{\infty} y^{\nu} \exp\!\left(-\tfrac{1}{2}\left(y - \frac{\mu x}{\sqrt{x^{2} + \nu}}\right)^{2}\right) dy$$

which is the integral part of this formula: https://wikimedia.org/api/rest_v1/media/math/render/svg/0059c7e6e3afb0e4496fecdcb3b85296c162f065

Ideally we need this whenever x < 0, in which domain the result is expected to be tiny.

@ckormanyos
Member

ckormanyos commented Jun 7, 2024

try for an approximation

The first thing I tried was the exponential part alone, without integrating over $dy$, to see whether a series expansion and subsequent integration would work. It is normalized by dividing by the large exponential term at the end of the expression. If the series is continued in $x$, then integration against $y^{\nu}$ should be straightforward enough. It's been a long day, but @jzmaddock your intuition is correct: it seems a series could do the trick.

Wolfram Alpha didn't do it; I needed a bit more power, so I ended up doing this first try locally.

Normal[Series[Exp[-1/2 ((y - mu*x/Sqrt[x*x + v]))^2], {x, Infinity, 4}] / Exp[-(1/2) (y - mu)^2]]

Note that we use Normal[] to truncate the series.

Here is a picture of the input and its output.

[image: the Mathematica input above and its series output]

@ckormanyos
Member

ckormanyos commented Jun 8, 2024

OK, I included the scaling by the exponential term, added the multiplication by $y^{\nu}$ in the integrand, and did the definite integral. It gives results like the following to order $4$; see below.

Is there a numerical example I could use to check the approximation to order $4$?

[image: Mathematica output of the definite integral to order 4]

@ckormanyos
Member

ckormanyos commented Jun 8, 2024

Actually, @jzmaddock I don't know if expanding at $x{\rightarrow}{\infty}$ was correct. Are the presumptions in my approximation above even correct?

@jzmaddock
Collaborator Author

Actually, @jzmaddock I don't know if expanding at $x{\rightarrow}{\infty}$ was correct. Are the presumptions in my approximation above even correct?

Since we're really interested in x < 0, I would say expansion either at x == 0 or x = -infinity would be better?

I also note that the curve being integrated is extremely "pointy" - all the area is concentrated in a tiny region - so this might be getting out of hand, but a series around the maximum (in y) would be optimal, I guess?

If we fix the SciPy test case, with x = -1, v = 8 and mu = 16, then https://www.wolframalpha.com/input?i=x%5E8+exp%28%28%28x+-+16+*+-1+%2F+sqrt%281+%2B+8%29%29%29%5E2+%2F+-2%29%3B shows the curve nicely. Most of the peak in this case is outside of [0, INF], but that's not always the case. Interestingly, with all the variables fixed like this, wolframalpha gives a simple(-ish) closed form for the integral, and for the summit of the curve. Does this help at all? Possibly not, since evaluating the definite integral in this case is the difference between two large values?

The best way to sanity check the series would be against the integral itself:

            boost::math::quadrature::exp_sinh<T, Policy> integrator;
            // Prefactor v^(v/2) * exp(-v*mu^2 / (2*(x^2 + v))).
            // Remove this line to check just the integral part:
            T integral = pow(v, v / 2) * exp(-v * mu * mu / (2 * (x * x + v)));
            if (integral != 0)
            {
               // Integrand: y^v * exp(-(y - mu*x/sqrt(x^2 + v))^2 / 2),
               // switching to a log-scale evaluation when y^v would overflow.
               integral *= integrator.integrate([&x, v, mu](T y)
                  {
                     T p;
                     if (v * log(y) < tools::log_max_value<T>())
                        p = pow(y, v) * exp(boost::math::pow<2>((y - mu * x / sqrt(x * x + v))) / -2);
                     else
                        p = exp(log(y) * v + boost::math::pow<2>((y - mu * x / sqrt(x * x + v))) / -2);
                     return p;
                  });
            }

Extracted from non_central_t.hpp.

@ckormanyos
Member

Hi John (@jzmaddock), OK, I am getting closer.

I just tried:

Integrate[(y^nu) Exp[-((y - ((mu x)/Sqrt[x x + nu]))^2)/2], {y, 0, Infinity}]

and received a closed form answer.

[image: the closed-form answer from Mathematica]

@jzmaddock
Collaborator Author

There seems to be something wrong there if the answer depends only on x^2?

@jzmaddock
Collaborator Author

I think it should have been Integrate[ [x^v exp(-1/2 ((x - mu * t / sqrt(t^2 + v)))^2)], {x, 0, Infinity}] but wolframalpha chokes on that.

@ckormanyos
Member

ckormanyos commented Jun 8, 2024

think it should have been

Geez this is challenging. Like this then...

Integrate[x^v Exp[-1/2 ((x - mu t/Sqrt[t^2 + v]))^2], {x, 0, Infinity}]

With the following answer.

[image: the closed-form answer from Mathematica]

But I think there are some very large terms and this is numerically unstable. I haven't quite figured out what to do with that yet. I'll keep trying off and on; it does not seem to be critical. If I ever come up with something, I'll report back.

@ckormanyos
Member

ckormanyos commented Jun 8, 2024

Then I evaluated the result numerically at the test case, but internally the computer algebra system enlarges the precision, so the cancellations do not influence the result. I haven't figured out what to do at fixed precision, though.

[image: numerical evaluation of the closed form at the test case]

@ckormanyos
Member

It's a nightmare. Here are the two added terms in the result. The cancellations quench the result for machine precision.

I wish I knew more about asymptotics and perturbation theory... Arrrggghhh

[image: the two large, nearly cancelling terms in the result]

@jzmaddock
Collaborator Author

Well the good news is that 2.29e-9 is the correct answer ;)

But yes, you've basically re-invented the hypergeometric approximation, and the cancellation is indeed terrible :(

We might well be out of ideas at this point, but I appreciate the effort!

@ckormanyos
Member

ckormanyos commented Jun 9, 2024

might well be out of ideas at this point

I tried expanding in parameters - the first trick of asymptotics. Expanding in ${\mu}$ worked but still suffered from severe cancellations. Expansion in ${\nu}$ did not converge.

You know, this would be a great spot for the infamous double-double? I don't know if you'd like to throw Multiprecision at it, but if you double the working precision, you get, as you know, the digits back. But Multiprecision doesn't quite work with standalone Math. Somewhere down the line, Math could benefit from its own kind of a double-double.

Other than that, I'm out of ideas also.
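
For illustration, here is a minimal sketch of the classic error-free transformations (Knuth's two-sum and an FMA-based two-product) that a hypothetical double-double type along the lines suggested above would be built from. This is not existing Boost.Math code, just a sketch of the idea that a second double can carry the digits the first one loses:

```cpp
#include <cmath>
#include <cstdio>
#include <utility>

// Error-free transformations: the building blocks of a double-double type.
// Each returns the rounded result plus its exact rounding error, so two
// doubles together carry roughly twice the working precision.

// Knuth's two-sum: a + b == s + e exactly, with s = fl(a + b).
std::pair<double, double> two_sum(double a, double b)
{
   const double s = a + b;
   const double bb = s - a;
   const double e = (a - (s - bb)) + (b - bb);
   return { s, e };
}

// FMA-based two-product: a * b == p + e exactly, with p = fl(a * b).
std::pair<double, double> two_prod(double a, double b)
{
   const double p = a * b;
   const double e = std::fma(a, b, -p);
   return { p, e };
}

int main()
{
   // The error term captures the digits a plain double addition would lose.
   const auto [s, e] = two_sum(1.0, 1e-20);
   std::printf("s = %.17g, e = %.17g\n", s, e);
}
```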

@jzmaddock
Collaborator Author

The issue with the hypergeometrics is that when v is small, the largest term in the hypergeometric series is of the order (mu^2)!, which will quite rapidly overflow. And even if it doesn't, you end up subtracting two arbitrarily large numbers :( So I don't think even a double-double would do it; better to suck it up and use numerical integration, I would guess.
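
To put a rough number on that overflow: for mu = 16 (the value in the SciPy test case above), the largest term is of the order (16^2)! = 256!. A quick, purely illustrative magnitude check via lgamma (not code from the PR):

```cpp
#include <cmath>
#include <cstdio>

int main()
{
   // log10((mu^2)!) = lgamma(mu^2 + 1) / ln(10); for mu = 16 this is ~507,
   // far beyond the ~308 decimal-exponent range of an IEEE double.
   const double mu = 16.0;
   const double log10_largest = std::lgamma(mu * mu + 1.0) / std::log(10.0);
   std::printf("log10((mu^2)!) ~= %.1f\n", log10_largest);
}
```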

@ckormanyos
Member

ckormanyos commented Jun 9, 2024

The issue with the hypergeometrics is that when v is small, the largest term in the hypergeometric series is of the order (mu^2)!, which will quite rapidly overflow. And even if it doesn't, you end up subtracting two arbitrarily large numbers :( So I don't think even a double-double would do it, ...

Indeed. I even tried expansion in ${\mu}$ and although the expansion was promising, the cancellation was also present.

better to suck it up and use numerical integration I would guess.

Yes. Thanks John. I'll hang up my hat on this one now.

@NAThompson NAThompson deleted the nc_t_improvements branch June 9, 2024 16:07