-
Notifications
You must be signed in to change notification settings - Fork 432
Implementing Hypergeometric sampling to remove RFunctions dependency #1994
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
👍 Can you update the PR to follow the design of Distributions.jl, ie create a new sampler in src/samplers instead of changing the distribution type? 🙂 |
Tanks for taking a look at it, will do. Will be back with an update in the near future. (Thinking about it, this may solve the test that is currently failing, I'm dispatching on a type parameter to switch between algorithms and this seems to be against the intended pattern here) |
This comment was marked as resolved.
This comment was marked as resolved.
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #1994 +/- ##
==========================================
+ Coverage 86.28% 86.45% +0.17%
==========================================
Files 146 147 +1
Lines 8787 8861 +74
==========================================
+ Hits 7582 7661 +79
+ Misses 1205 1200 -5 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
I'm back. The errors seen before, after further investigation, also happen with other distributions due to the lack of support for the logpdf function for Dual values of x in StatsFuns. Those other distributions are already excluded from the test. I've included the Hypergeometric distribution to this exclusion rule (it is still tested with Float64 values of x in the logpdf). Also added one more test to cover the modes function which was uncovered before. |
Checks are passing and test coverage seems to be complete now. @devmotion , please review this when you have time, thanks. |
Following the TODO in the Hypergeometric distribution sampling code, this PR implements the H2PE/HIM algorithm presented by Kachitvichyanukul & Schmeiser in 1984, removing this dependency on RFunctions and improving the sampling performance with a penalty in setup time and some extra stored constants.
Added some test cases for complete coverage of the distribution code. More extensive testing with the distribution of the generated variables was done (locally, manually) on both of the sub-cases of this algorithm.
Example case of HIM
The HIM algorith runs for smaller distributions. To exemplify, the
Hypergeometric(20, 20, 10)
:Setup Benchmark (with RFunctions):

Setup Benchmark (HIM):

Sampling Benchmark (with RFunctions):

Sampling Benchmark (HIM):

QQPlot with 1000 samples
Example case of H2PE
The H2PE algorithm is used for larger cases. To examplify,
Hypergeometric(200, 200, 100)
:Setup Benchmark (with RFunctions):

Setup Benchmark (H2PE)

Sampling Benchmark (with RFunctions):

Sampling Benchmark (H2PE):

QQPlot with 1000 samples

Feel free to ask any questions and do/request any adjustments. Have a nice week.