Skip to content

Poisson variance deviation at high lambda #1515

Closed
@JamboChen

Description

Summary

Poisson distribution where, at high lambda values, the sample variance deviates noticeably from the mean.

During a Kolmogorov-Smirnov (KS) test, the Poisson distribution fails when lambda approaches MAX_LAMBDA. I conducted a simple test to examine the sample mean and variance, and observed the following results:

lambda: 1e1, mean: 1.00e1, var: 9.98e0
lambda: 1e10, mean: 1.00e10, var: 1.00e10
lambda: 1e12, mean: 1.00e12, var: 1.00e12
lambda: 1e14, mean: 1.00e14, var: 1.11e14
lambda: 1e15, mean: 1.00e15, var: 1.38e15
lambda: 1e16, mean: 1.00e16, var: 8.05e16
lambda: 1e17, mean: 1.00e17, var: 2.42e18
lambda: 1e18, mean: 1.00e18, var: 7.84e19
lambda: 2e19, mean: 1.84e19, var: 5.41e21
  • Mean: ΣX_i/n
  • Variance: Σ(X_i - λ)^2/(n-1)

For samples from a Poisson distribution, the sample mean and variance should be very close to lambda.

For reference, I also checked the sample variance of Poisson generators in NumPy and R (I haven't conducted a full goodness-of-fit test yet).

NumPy implementation for lam>=10 is based on the paper (The transformed rejection method for generating Poisson random variables)

lambda = 1e+01, mean = 1.00e+01, std = 1.00e+01
lambda = 1e+10, mean = 1.00e+10, std = 1.00e+10
lambda = 1e+12, mean = 1.00e+12, std = 9.99e+11
lambda = 1e+14, mean = 1.00e+14, std = 1.01e+14
lambda = 1e+15, mean = 1.00e+15, std = 1.04e+15
lambda = 1e+16, mean = 1.00e+16, std = 1.42e+16
lambda = 1e+17, mean = 1.00e+17, std = 1.64e+17
lambda = 1e+18, mean = 1.00e+18, std = 1.52e+18

Algorithm PTRD can be safely used for 10≤ μ ≤10^8. Algorithm PTRS for 10 ≤ μ ≤10^7.

R implementation is based on the paper (Computer generation of Poisson deviates from modified normal distributions)

lambda = 10, mean = 10, std = 10.0191
lambda = 1e+10, mean = 1e+10, std = 9.99373e+09
lambda = 1e+12, mean = 1e+12, std = 1.00086e+12
lambda = 1e+14, mean = 1e+14, std = 9.9847e+13
lambda = 1e+15, mean = 1e+15, std = 1.00205e+15
lambda = 1e+16, mean = 1e+16, std = 1.00097e+16
lambda = 1e+17, mean = 1e+17, std = 9.98917e+16
lambda = 1e+18, mean = 1e+18, std = 1.00145e+18

Code sample

use rand::{rngs::SmallRng, SeedableRng};
use rand_distr::{Distribution, Poisson};
const SAMPLES: usize = 1_000_000;
const MAX_LAMBDA: f64 = Poisson::<f64>::MAX_LAMBDA;

fn poisson_mean_var(lambda: f64) -> (f64, f64) {
    let mut rng = SmallRng::seed_from_u64(0);
    let dist = Poisson::new(lambda).unwrap();
    let samples = (0..SAMPLES)
        .map(|_| dist.sample(&mut rng))
        .collect::<Vec<f64>>();
    if samples.iter().any(|&x| x.is_nan()) {
        panic!("NaN in samples");
    }
    if samples.iter().any(|&x| x.is_infinite()) {
        panic!("Infinite in samples");
    }

    let mean = samples.iter().map(|&x| x / SAMPLES as f64).sum::<f64>();
    let var = samples.iter().map(|&x| (x - lambda).powi(2)).sum::<f64>() / (SAMPLES as f64 - 1.0);

    (mean, var)
}

fn main() {
    let lambdas = [10.0, 1e10, 1e12, 1e14, 1e15, 1e16, 1e17, 1e18, MAX_LAMBDA];
    for lambda in lambdas.iter() {
        let (mean, var) = poisson_mean_var(*lambda);
        println!("|{:.0e}| {:.2e}|{:.2e}|", lambda, mean, var);
    }
}
[dependencies]
rand = { path = "../rand" }
rand_distr = { path = "../rand/rand_distr"}

Metadata

Assignees

No one assigned

    Labels

    X-bugType: bug report

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions