
[Feature Request] Zero-Inflated Poisson and Negative Binomial distributions #1134

Open

minaskar opened this issue Oct 20, 2020 · 13 comments

@minaskar
Are there any plans to add Zero-Inflated Poisson (ZIP) and Zero-Inflated Negative Binomial (ZINB) distributions to TFP? These are common in other packages, and they shouldn't be hard to implement.

@jeffpollock9
Contributor

Hi @minaskar

If it's at all useful, I've previously coded up a zero-inflated Poisson as a Mixture of a Deterministic and a Poisson. Something like this:

import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
import tensorflow_probability as tfp

tfd = tfp.distributions

zero_prob = 0.3        # probability of a structural zero
poisson_log_rate = 2.5

# ZIP as a two-component mixture: a point mass at zero and a Poisson
zero_inflated_poisson = tfd.Mixture(
    cat=tfd.Categorical(probs=[zero_prob, 1.0 - zero_prob]),
    components=[tfd.Deterministic(loc=0.0), tfd.Poisson(log_rate=poisson_log_rate)],
)

samples = zero_inflated_poisson.sample(1_000)

values, counts = np.unique(samples, return_counts=True)

plt.bar(values, counts)
plt.grid()
plt.show()


@minaskar
Author

Hi @jeffpollock9 ,

This looks very nice! How would that work as a layer? I tried the following but it doesn't work:

tfpl.DistributionLambda(
    make_distribution_fn=lambda t: tfd.Mixture(
        cat=tfd.Categorical(probs=[t[0], 1.0 - t[0]]),
        components=[tfd.Deterministic(loc=0.0), tfd.Poisson(log_rate=t[1])],
    ),
    convert_to_tensor_fn=lambda s: s.sample(),
)

@brianwa84
Contributor

brianwa84 commented Oct 20, 2020 via email

@minaskar
Author

@brianwa84

I'm getting an error message saying ValueError: Shapes must be equal rank, but are 0 and 1

@jeffpollock9
Contributor

I'm not 100% sure, as I don't use those layers, but I think you need to capture any batch dimensions in t:

tfpl.DistributionLambda(
    make_distribution_fn=lambda t: tfd.Mixture(
        cat=tfd.Categorical(logits=[t[..., 0], 0.0]),
        components=[
            tfd.Deterministic(loc=0.0),
            tfd.Poisson(log_rate=t[..., 1]),
        ],
    ),
    convert_to_tensor_fn=lambda s: s.sample(),
)

at least that appears to be the pattern in https://www.tensorflow.org/probability/examples/Probabilistic_Layers_Regression#case_4_aleatoric_epistemic_uncertainty
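As a sanity check on the `logits=[t[..., 0], 0.0]` parameterization above: pairing the learned logit with a fixed 0 means the zero-inflation probability is exactly the sigmoid of that logit, since `softmax([x, 0])[0] == sigmoid(x)`. A quick check in plain Python (names here are illustrative):

```python
import math

def softmax2(a, b):
    # numerically stable two-way softmax
    m = max(a, b)
    ea, eb = math.exp(a - m), math.exp(b - m)
    return ea / (ea + eb), eb / (ea + eb)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

x = 1.3  # an arbitrary logit value
assert abs(softmax2(x, 0.0)[0] - sigmoid(x)) < 1e-12
```

So the first mixture weight is an ordinary logistic transform of the network output, which keeps the layer well-behaved under optimization.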

@brianwa84
Contributor

brianwa84 commented Oct 20, 2020 via email

@minaskar
Author

@jeffpollock9 Yes, this is exactly what I tried next; I still get the same error message.

@brianwa84 the log_prob has a closed form for both distributions, so it shouldn't be very hard to implement.
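For reference, the closed form mentioned here is straightforward. A minimal sketch of the ZIP log-pmf in plain Python (the function name and parameterization are illustrative; `pi` is the zero-inflation probability, `lam` the Poisson rate):

```python
import math

def zip_log_prob(k, pi, lam):
    # P(X = 0) = pi + (1 - pi) * exp(-lam)
    # P(X = k) = (1 - pi) * exp(-lam) * lam**k / k!   for k >= 1
    if k == 0:
        return math.log(pi + (1.0 - pi) * math.exp(-lam))
    poisson_log_pmf = k * math.log(lam) - lam - math.lgamma(k + 1)
    return math.log(1.0 - pi) + poisson_log_pmf

# sanity check: the pmf sums to one
total = sum(math.exp(zip_log_prob(k, 0.3, 2.0)) for k in range(100))
assert abs(total - 1.0) < 1e-9
```

The ZINB case is analogous, with the negative binomial log-pmf in place of the Poisson one.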

@dirmeier
Contributor

Hey @brianwa84, if no one else has already started working on this, I'll have a look and implement them.
Cheers, Simon

@brianwa84
Contributor

brianwa84 commented Jun 28, 2021 via email

@shtoneyan

shtoneyan commented Jun 28, 2021

Hey guys! I need a zero-inflated Poisson loss ASAP and have this code so far, which throws an error:

import tensorflow as tf
import tensorflow_probability as tfp
tfd = tfp.distributions
tf.random.set_seed(42)
y_true = tf.random.uniform((2, 100, 4), minval=0, maxval=2, dtype=tf.int32)
y_pred = tf.random.uniform((2, 100, 4), minval=0, maxval=1, dtype=tf.float32)
# multinomial part of loss function
rate = tf.math.exp(y_pred)
nonzero_prob = tf.math.divide(
    tf.cast(tf.math.count_nonzero(y_pred, axis=(1, 2)), tf.float32),
    tf.cast(tf.size(y_pred), tf.float32))
cat = tfd.Categorical(probs=tf.stack([1-nonzero_prob, nonzero_prob], -1))
components = [tfd.Deterministic(loc=tf.zeros_like(rate)), tfd.Poisson(rate)]
# Error here...
zip_dist = tfd.Mixture(cat=cat, components=components)

Any input on whether this implementation is wrong, or on what could be causing the error, would be very much appreciated!
The error message is:

ValueError                                Traceback (most recent call last)
<ipython-input-72-ea5c1ffcaa5b> in <module>
    11 components = [tfd.Deterministic(loc=tf.zeros_like(rate)), tfd.Poisson(rate)]
    12 # Error here...
---> 13 zip_dist = tfd.Mixture(cat=cat, components=components)

<decorator-gen-281> in __init__(self, cat, components, validate_args, allow_nan_stats, use_static_graph, name)

~/tf_2/lib/python3.7/site-packages/tensorflow_probability/python/distributions/distribution.py in wrapped_init(***failed resolving arguments***)
   274       # called, here is the place to do it.
   275       self_._parameters = None
--> 276       default_init(self_, *args, **kwargs)
   277       # Note: if we ever want to override things set in `self` by subclass
   278       # `__init__`, here is the place to do it.

~/tf_2/lib/python3.7/site-packages/tensorflow_probability/python/distributions/mixture.py in __init__(self, cat, components, validate_args, allow_nan_stats, use_static_graph, name)
   140         raise ValueError(
   141             "components[{}] batch shape must be compatible with cat "
--> 142             "shape and other component batch shapes".format(di))
   143       static_event_shape = tensorshape_util.merge_with(
   144           static_event_shape, d.event_shape)

ValueError: components[0] batch shape must be compatible with cat shape and other component batch shapes

@ColCarroll
Contributor

I think you have the batch dimension wrong for cat in the mixture:

zip_dist = tfd.Mixture(
    cat=tfd.Categorical(probs=tf.stack([nonzero_prob, 1 - nonzero_prob], -1)),
    components=[tfd.Deterministic(tf.zeros_like(rate)), tfd.Poisson(rate)],
)

at least has the right shape, but I might be parsing the batch and event shape of this problem wrong...
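For what it's worth, the mismatch in the traceback seems to come from `nonzero_prob` having shape `(2,)` (it is reduced over axes 1 and 2), while both components have batch shape `(2, 100, 4)`; `tfd.Mixture` requires the `cat` batch shape to be compatible with the components'. A NumPy sketch of broadcasting the probabilities up to the components' batch shape (assuming one zero-probability per batch element is what's intended; the variable names mirror the snippet above):

```python
import numpy as np

batch_shape = (2, 100, 4)            # batch shape of the Deterministic/Poisson components
nonzero_prob = np.array([0.4, 0.6])  # shape (2,): one probability per batch element

# insert singleton axes so it broadcasts against the full batch shape
p = np.broadcast_to(nonzero_prob[:, None, None], batch_shape)
cat_probs = np.stack([1.0 - p, p], axis=-1)  # shape (2, 100, 4, 2): batch shape + 2 classes
assert cat_probs.shape == batch_shape + (2,)
```

With `cat_probs` shaped like this, the Categorical's batch shape matches the components', which is what the `Mixture` constructor checks.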

@shtoneyan

I just edited the code but seem to be running into the same issue... thanks for the help though!

@ruiw-uber

@shtoneyan do you have any updates on using TFP for the zero-inflated Poisson?
