
Define fallback implementations for mean, var, and entropy #1875

Closed · wants to merge 1 commit

Conversation

@ararslan (Member)

We have the internal `expectation` function that uses `quadgk` to compute integrals, as well as a fallback implementation for `kldivergence` that uses `expectation`, so it seems reasonable to similarly define fallbacks for other quantities trivially computable using `expectation`: `mean`, `var`, and `entropy` (see the sketch below). We could probably do `skewness` and `kurtosis` as well.

To do:

  • Determine the set of functions that should support expectation-based fallbacks
  • Add tests
  • Document that the fallbacks exist

(n.b. I skipped CI on the initial commit purposefully as I haven't given this a ton of thought yet and wanted to see whether anybody was strongly for or against it before doing meaningful work here)
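For concreteness, here is a rough sketch of what such fallbacks could look like inside Distributions.jl, assuming the internal `expectation(f, d)` helper that integrates `f` against the density of `d` via `quadgk`. The `mean` and `var` bodies are illustrative guesses; only the `entropy` definition matches the diff further down.

```julia
# Sketch only, not the PR's exact code (apart from `entropy`).
# Assumes the internal `expectation(f, d)` helper is in scope.
mean(d::UnivariateDistribution)    = expectation(identity, d)
var(d::UnivariateDistribution)     = expectation(x -> abs2(x - mean(d)), d)
entropy(d::UnivariateDistribution) = expectation(x -> -log(pdf(d, x)), d)
```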

@devmotion (Member)

Copied from #1874 (comment):

My worry (also expressed in issues such as #968) is that numerical integration is generally challenging, and a fallback might lead to silently incorrect results. It seems such a fallback would be wrong (or at least problematic) if, for example, the moments are not finite (as for Cauchy).

So my general feeling is that numerical integration should maybe be restricted to a smaller subset of distributions, or maybe even only be available as a separate function. In case we want to use it more broadly, I think it would also be safer to error if the integration error estimate is too large, to reduce the probability of silently incorrect results.
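As an illustration of that last suggestion (the helper name and tolerance below are hypothetical, not part of Distributions.jl), a numerical-expectation fallback could refuse to return a value when the quadrature error estimate is too large:

```julia
using Distributions, QuadGK

# Hypothetical sketch of a "checked" expectation: integrate f against the
# density of d, but raise an error instead of silently returning a value
# when the quadrature error estimate exceeds the tolerance.
function checked_expectation(f, d::ContinuousUnivariateDistribution; rtol = 1e-6)
    val, err = quadgk(x -> f(x) * pdf(d, x), extrema(d)...)
    if err > rtol * max(abs(val), one(val))
        error("numerical expectation did not converge: value $val, error estimate $err")
    end
    return val
end
```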

@@ -214,7 +214,7 @@ skewness(d::UnivariateDistribution)

 Compute the entropy value of distribution `d`.
 """
-entropy(d::UnivariateDistribution)
+entropy(d::UnivariateDistribution) = expectation(x -> -log(pdf(d, x)), d)
Contributor

Suggested change:
-entropy(d::UnivariateDistribution) = expectation(x -> -log(pdf(d, x)), d)
+entropy(d::UnivariateDistribution) = expectation(x -> -logpdf(d, x), d)

pdf can overflow
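For context, a small illustration (not from the thread) of why the log-space form is safer: besides the overflow mentioned above, `pdf` can also underflow to zero in the tails, which makes `-log(pdf(d, x))` infinite while `-logpdf(d, x)` stays finite.

```julia
using Distributions

d = Normal()
pdf(d, 40.0)          # 0.0: the density underflows this far out in the tail
-log(pdf(d, 40.0))    # Inf, which would derail the quadrature
-logpdf(d, 40.0)      # ≈ 800.92, finite and well-behaved
```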

@bvdmitri (Contributor)

Generally I agree with @devmotion. I've hit a similar issue for kldivergence here. In general, it's not a good idea to have a silent approximation method; it's better to let the user decide.

@ararslan (Member, Author)

Yep, fair enough.

@ararslan closed this Jul 26, 2024
@ararslan deleted the aa/fallback-expectation branch July 26, 2024 00:16