-
Notifications
You must be signed in to change notification settings - Fork 5.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
【Hackathon 5th No.18】Add Binomial and Poisson API -part #57856
Conversation
你的PR提交成功,感谢你对开源项目的贡献! |
Sorry to inform you that 4c1bfd4's CIs have passed for more than 7 days. To prevent PR conflicts, you need to re-run all CIs manually. |
Sorry to inform you that 8a1f336's CIs have passed for more than 7 days. To prevent PR conflicts, you need to re-run all CIs manually. |
可以麻烦两位老师看一下 CI-Coverage 和 Paddle-Doc-Preview 是哪里的问题吗? |
请问两位老师还有什么需要修改的地方吗?如果没什么问题我就开始写中文文档。 |
可以先同步写中文文档 |
""" | ||
|
||
def __init__(self, total_count, probability): | ||
self.dtype = 'float32' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if isinstance(probability, np.ndarray): | ||
probability = paddle.to_tensor(probability) | ||
total_count = paddle.cast(total_count, dtype=self.dtype) | ||
probability = paddle.cast(probability, dtype=self.dtype) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to_tensor 可以指定dtype参数,此处代码可以简化
if isinstance(total_count, np.ndarray): | ||
total_count = paddle.to_tensor(total_count) | ||
if isinstance(probability, np.ndarray): | ||
probability = paddle.to_tensor(probability) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
上述文档中描述参数类型是 float|Tensor,实际支持类型和文档描述保持一致;此处建议去掉numpy类型支持
Tensor: Sampled data with shape `sample_shape` + `batch_shape`. | ||
""" | ||
if not isinstance(shape, Iterable): | ||
raise TypeError('sample shape must be Iterable object.') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Iterable 和 Sequence 语义略有区别 https://stackoverflow.com/questions/72157296/what-is-the-difference-between-type-hinting-a-variable-as-an-iterable-versus-a-s ,此处应该是只需要传入Sequence
self.probability, shape=output_shape | ||
) | ||
if in_dynamic_mode(): | ||
broadcast_func = np.frompyfunc(binomial_sample, nin=2, nout=1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
使用Paddle原生采样算法,numpy仅支持cpu采样; Paddle GPU底层基础采样算法使用curand进行加速,大规模数据采样性能远超过CPU
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
binomial的原生采样算法我目前没有找到直接对应哪个API,如果用 paddle.bernoulli() 来做的话,对于这种情况: Binomial( [ n_1, n_2 ], [ p_1, p_2 ] ) 的采样可能会比较麻烦,不知道这块要怎么做好一点
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
binomial可以通过均匀分布逆采样实现, 参考pytorch/aten/src/ATen/native/Distributions.h sample_binomial;你的代码其实已经基于numpy实现了类似逻辑
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这个题目涉及到新增c++ Kernel,难度会有所提升,已和组委会同学沟通 @luotao1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
好的,我试着写一下
break | ||
u -= f | ||
y += 1 | ||
f *= g / y - p_q |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
使用Paddle原生采样算法及逆变换
/ 166320.0 | ||
) | ||
if A <= tmp: | ||
return n - y if p > 0.5 else y |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
同上
""" | ||
|
||
def __init__(self, rate): | ||
self.dtype = 'float32' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
同上
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
已修改
super().__init__(batch_shape) | ||
|
||
def _to_tensor(self, rate): | ||
"""Convert the input parameters into tensors with dtype = float32 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
with global default dtype.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
已修改
rate = paddle.to_tensor([rate], dtype=self.dtype) | ||
if isinstance(rate, np.ndarray): | ||
rate = paddle.to_tensor(rate) | ||
rate = paddle.cast(rate, dtype=self.dtype) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
同上
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
已修改
x=output_rate, | ||
out=output, | ||
) | ||
return output |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
同上,使用paddle原生采样算法及逆变换实现
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
已修改,用 paddle.poisson()
# step H | ||
if c * abs(u) <= py * math.exp(px + E) - fy * math.exp(fx + E): | ||
break | ||
return K |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
同上
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
已删除
np.ndarray, binomial samples. | ||
""" | ||
broadcast_func = np.frompyfunc(poisson_sample, nin=1, nout=1) | ||
return np.array(broadcast_func(rr), dtype='float32') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
同上
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
已删除
self._np_mean(), | ||
rtol=config.RTOL.get(str(self.probability.dtype)), | ||
atol=config.ATOL.get(str(self.probability.dtype)), | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
使用 scipy.stats.binom.mean
self._np_variance(), | ||
rtol=config.RTOL.get(str(self.probability.dtype)), | ||
atol=config.ATOL.get(str(self.probability.dtype)), | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
使用 scipy.stats.binom.var
已解决 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
[2.94053698, 3.00781751, 2.51124287]) | ||
""" | ||
|
||
def __init__(self, total_count, probability): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use probs
instead of probability
will be better? consistent with other similar APIs like paddle.distribution.Bernoulli
, paddle.distribution.Geometric
and paddle.distribution.Multinomial
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rfc use probs
which is inconsistent with the code
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
已修改
|
||
In the above equation: | ||
|
||
* :math:`total_count = n`: is the size, meaning the total number of Bernoulli experiments. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* :math:`total_count = n`: is the size, meaning the total number of Bernoulli experiments. | |
* :math:`total\_count = n`: is the size, meaning the total number of Bernoulli experiments. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
已修改
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
PR types
New features
PR changes
APIs
Description
Add Binomial and Poisson API