Description
Summary
Poisson distribution where, at high lambda values, the sample variance deviates noticeably from the mean.
During a Kolmogorov-Smirnov (KS) test, the Poisson distribution fails when lambda approaches MAX_LAMBDA
. I conducted a simple test to examine the sample mean and variance, and observed the following results:
lambda: 1e1, mean: 1.00e1, var: 9.98e0
lambda: 1e10, mean: 1.00e10, var: 1.00e10
lambda: 1e12, mean: 1.00e12, var: 1.00e12
lambda: 1e14, mean: 1.00e14, var: 1.11e14
lambda: 1e15, mean: 1.00e15, var: 1.38e15
lambda: 1e16, mean: 1.00e16, var: 8.05e16
lambda: 1e17, mean: 1.00e17, var: 2.42e18
lambda: 1e18, mean: 1.00e18, var: 7.84e19
lambda: 2e19, mean: 1.84e19, var: 5.41e21
- Mean: ΣX_i/n
- Variance: Σ(X_i - λ)^2/(n-1)
For samples from a Poisson distribution, the sample mean and variance should be very close to lambda.
For reference, I also checked the sample variance of Poisson generators in NumPy and R (I haven't conducted a full goodness-of-fit test yet).
NumPy implementation for lam>=10
is based on the paper (The transformed rejection method for generating Poisson random variables)
lambda = 1e+01, mean = 1.00e+01, std = 1.00e+01
lambda = 1e+10, mean = 1.00e+10, std = 1.00e+10
lambda = 1e+12, mean = 1.00e+12, std = 9.99e+11
lambda = 1e+14, mean = 1.00e+14, std = 1.01e+14
lambda = 1e+15, mean = 1.00e+15, std = 1.04e+15
lambda = 1e+16, mean = 1.00e+16, std = 1.42e+16
lambda = 1e+17, mean = 1.00e+17, std = 1.64e+17
lambda = 1e+18, mean = 1.00e+18, std = 1.52e+18
Algorithm PTRD can be safely used for 10≤ μ ≤10^8. Algorithm PTRS for 10 ≤ μ ≤10^7.
R implementation is based on the paper (Computer generation of Poisson deviates from modified normal distributions)
lambda = 10, mean = 10, std = 10.0191
lambda = 1e+10, mean = 1e+10, std = 9.99373e+09
lambda = 1e+12, mean = 1e+12, std = 1.00086e+12
lambda = 1e+14, mean = 1e+14, std = 9.9847e+13
lambda = 1e+15, mean = 1e+15, std = 1.00205e+15
lambda = 1e+16, mean = 1e+16, std = 1.00097e+16
lambda = 1e+17, mean = 1e+17, std = 9.98917e+16
lambda = 1e+18, mean = 1e+18, std = 1.00145e+18
Code sample
use rand::{rngs::SmallRng, SeedableRng};
use rand_distr::{Distribution, Poisson};
const SAMPLES: usize = 1_000_000;
const MAX_LAMBDA: f64 = Poisson::<f64>::MAX_LAMBDA;
fn poisson_mean_var(lambda: f64) -> (f64, f64) {
let mut rng = SmallRng::seed_from_u64(0);
let dist = Poisson::new(lambda).unwrap();
let samples = (0..SAMPLES)
.map(|_| dist.sample(&mut rng))
.collect::<Vec<f64>>();
if samples.iter().any(|&x| x.is_nan()) {
panic!("NaN in samples");
}
if samples.iter().any(|&x| x.is_infinite()) {
panic!("Infinite in samples");
}
let mean = samples.iter().map(|&x| x / SAMPLES as f64).sum::<f64>();
let var = samples.iter().map(|&x| (x - lambda).powi(2)).sum::<f64>() / (SAMPLES as f64 - 1.0);
(mean, var)
}
fn main() {
let lambdas = [10.0, 1e10, 1e12, 1e14, 1e15, 1e16, 1e17, 1e18, MAX_LAMBDA];
for lambda in lambdas.iter() {
let (mean, var) = poisson_mean_var(*lambda);
println!("|{:.0e}| {:.2e}|{:.2e}|", lambda, mean, var);
}
}
[dependencies]
rand = { path = "../rand" }
rand_distr = { path = "../rand/rand_distr"}