Alea is a collection of utilities to work with most known probability distributions, written in pure Crystal.
Note: This project is still under development. Many distributions and cumulative distribution functions are missing, so keep in mind that breaking changes may occur frequently.
Crystal compiles to fast native code without sacrificing the conveniences of a modern programming language, and it provides a clean interface.
- PRNG implementations
- Random sampling (single/double precision)
- Cumulative Distribution Functions (single/double precision)
Distribution | Sampling (32 / 64) | CDF (32 / 64) |
---|---|---|
Beta | Y Y | N N |
Chi-Square | Y Y | Y Y |
Exponential | Y Y | Y Y |
F-Snedecor | Y Y | N N |
Gamma | Y Y | Y Y |
Laplace | Y Y | Y Y |
Log-Normal | Y Y | Y Y |
Normal | Y Y | Y Y |
Poisson | N Y | N Y |
T-Student | Y Y | N N |
Uniform | Y Y | Y Y |
- Distribution and empirical data statistical properties
- Quantile Functions
- Add the dependency to your shard.yml:
dependencies:
  alea:
    github: nin93/alea
- Run shards install
- Import the library:
require "alea"
Random is the interface provided to perform sampling:
random = Alea::Random(Alea::XSR128).new
random.normal # => -0.36790519967553736 : Float64
# Append '32' to call the single-precision version
random.normal32 # => 0.19756398 : Float32
It also accepts an initial seed to reproduce the same seemingly random events across runs:
seed = 9377
random = Alea::Random(Alea::XSR128).new(seed)
random.exp # => 0.10203669577353723 : Float64
Plain sampling methods (such as #normal, #gamma32) perform checks on the arguments passed, to prevent bad data generation or internal exceptions.
To skip these checks (which might be slow when generating large amounts of data), use their unsafe versions by prepending next_ to the method name:
random = Alea::Random(Alea::XSR128).new
random.normal(loc: 0, sigma: 0) # raises Alea::UndefinedError: sigma is 0 or negative.
random.next_normal(loc: 0, sigma: 0) # these might raise internal exceptions.
Timings are definitely comparable, though: see the benchmarks for direct comparisons between these methods.
Random is actually a wrapper over a well defined pseudo-random number generator.
The basic generation of integers and floats comes from the underlying engine, more specifically from #next_u32, returning a random UInt32, and #next_u64, returning a random UInt64.
Floats are obtained by ldexp (load exponent) operations upon generated unsigned integers; signed integers are obtained by raw cast.
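To make the idea concrete, here is a minimal sketch of one common way to perform these conversions. The helper names and the exact bit widths (53 bits for Float64, 24 bits for Float32) are assumptions chosen for illustration; Alea's own source may do this differently.

```crystal
# Hypothetical helpers, not Alea's actual source: one common way to map
# raw unsigned integers to floats in [0, 1) with a load-exponent operation.

# Keep the top 53 bits (the Float64 mantissa width), then scale by 2^-53.
def u64_to_f64(u : UInt64) : Float64
  Math.ldexp((u >> 11).to_f64, -53)
end

# Keep the top 24 bits (the Float32 mantissa width), then scale by 2^-24.
def u32_to_f32(u : UInt32) : Float32
  Math.ldexp((u >> 8).to_f32, -24)
end

# A raw cast reinterprets the same bits; `to_i64!` wraps instead of raising.
def u64_to_i64(u : UInt64) : Int64
  u.to_i64!
end

puts u64_to_f64(0_u64)                 # 0.0
puts(u64_to_f64(UInt64::MAX) < 1.0)    # true
puts u64_to_i64(UInt64::MAX)           # -1
```

Discarding the low bits before scaling means every result is exactly representable, so the output never rounds up to 1.0.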
Currently implemented engines:
- XSR128 backed by xoroshiro128++ (32/64 bit)
- XSR256 backed by xoshiro256++ (32/64 bit)
- MT19937 backed by Mersenne Twister (32/64 bit)
The digits in the class name stand for the overall period of the PRNG as a power of 2: (2^N) - 1, where N is that number.
The XSR256 and XSR128 engines come from the xoshiro (XOR/shift/rotate) collection, designed by Sebastiano Vigna and David Blackman: very fast generators that also promise excellent statistical properties.
The MT19937 engine is an implementation of the famous Mersenne Twister, developed by Makoto Matsumoto and Takuji Nishimura: one of the most widely used PRNGs, which passes most of the strict statistical tests.
All PRNGs in this library include a common module: PRNG. You can build your own custom PRNG by including the module and defining the methods needed by Alea::Random to ensure proper repeatability and sampling, as described in this example.
It is worth noting that in these implementations #next_u32 and #next_u64 depend on different states and thus they are independent from each other, as well as #next_f32 and #next_f64 or #next_i32 and #next_i64.
It is still fine, though, for both #next_u32 and #next_u64 to rely on the same state, if you prefer. I chose not to, as it makes state advancements unpredictable.
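The separate-state design can be illustrated with a toy engine. Everything below is hypothetical: the class name, the seeding scheme and the xorshift steps are assumptions for illustration only. The sketch does not include Alea's PRNG module, because the methods that module requires are not listed here.

```crystal
# Hypothetical toy engine, NOT part of Alea and not statistically vetted:
# it only illustrates keeping one state per output width.
class TwoStateEngine
  @s64 : UInt64
  @s32 : UInt32

  def initialize(seed : UInt64)
    # Both states must be non-zero for xorshift to work.
    @s64 = seed | 1_u64
    @s32 = (seed >> 32).to_u32! | 1_u32
  end

  # Marsaglia xorshift64 step; touches only @s64.
  def next_u64 : UInt64
    @s64 ^= @s64 << 13
    @s64 ^= @s64 >> 7
    @s64 ^= @s64 << 17
    @s64
  end

  # Marsaglia xorshift32 step; touches only @s32.
  def next_u32 : UInt32
    @s32 ^= @s32 << 13
    @s32 ^= @s32 >> 17
    @s32 ^= @s32 << 5
    @s32
  end
end

a = TwoStateEngine.new(9377_u64)
b = TwoStateEngine.new(9377_u64)

# Drawing 32-bit values from `a` does not advance its 64-bit stream.
3.times { a.next_u32 }
puts a.next_u64 == b.next_u64 # true
```

With a single shared state, the 64-bit value drawn from `a` would depend on how many 32-bit values had been drawn before it, which is the unpredictability mentioned above.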
CDF is the interface used to calculate the Cumulative Distribution Functions.
Given X ~ D and a fixed quantile x, CDFs are defined as the functions that
associate x with the probability that the real-valued random variable X from the
distribution D will take a value less than or equal to x.
Arguments passed to CDF methods to shape the distributions are analogous to those used for sampling:
Alea::CDF.normal(0.0) # => 0.5 : Float64
Alea::CDF.normal(2.0, loc: 1.0, sigma: 0.5) # => 0.9772498680518208 : Float64
Alea::CDF.chisq(5.279, df: 5.0) # => 0.6172121213841358 : Float64
Alea::CDF.chisq32(5.279, df: 5.0) # => 0.61721206 : Float32
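As a cross-check on the normal values above, the closed-form normal CDF can be written with the error function from Crystal's standard library. This is a minimal sketch under stated assumptions: normal_cdf is a hypothetical helper, not Alea's implementation, and its parameter names simply mirror the calls shown above.

```crystal
# Hypothetical helper, not Alea's implementation: closed-form normal CDF,
#   F(x) = 0.5 * (1 + erf((x - loc) / (sigma * sqrt(2))))
def normal_cdf(x : Float64, loc : Float64 = 0.0, sigma : Float64 = 1.0) : Float64
  raise ArgumentError.new("sigma must be positive") unless sigma > 0.0
  z = (x - loc) / (sigma * Math.sqrt(2.0))
  0.5 * (1.0 + Math.erf(z))
end

puts normal_cdf(0.0)                       # 0.5
puts normal_cdf(2.0, loc: 1.0, sigma: 0.5) # approximately 0.97725
```

The second call evaluates the standard normal CDF at two standard deviations above the mean, which agrees with the Alea::CDF.normal result listed above to the precision shown here.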
Documentation is hosted on GitHub Pages.
Here is a list of the projects including alea:
Fully listed in LICENSE.md:
- Crystal Random module for uniform sampling
- NumPy random module for pseudo-random sampling methods
- NumPy mt19937 prng implementation
- JuliaLang random module for ziggurat methods
- IncGammaBeta.jl for incomplete gamma functions
- Fork it (https://github.com/nin93/alea/fork)
- Create your feature branch (git checkout -b my-new-feature)
- Commit your changes (git commit -am 'Add some feature')
- Push to the branch (git push origin my-new-feature)
- Create a new Pull Request
- Elia Franzella - creator and maintainer