[Precision Depth Alignment] paddle.nn.functional.softplus accuracy and torch alignment #75363
Merged
zhengshengning merged 1 commit into PaddlePaddle:develop from zhengshengning:accuracy_stable_softplus on Sep 21, 2025
Conversation
Your PR was submitted successfully. Thank you for your contribution to this open-source project!
wanghuancoder (Contributor) approved these changes Sep 19, 2025
LGTM
zrr1999 approved these changes Sep 19, 2025
zhengshengning added a commit to zhengshengning/Paddle that referenced this pull request Oct 24, 2025
zhengshengning added a commit that referenced this pull request Oct 27, 2025
* CallScalarFunction uses the dtype of 'self' as the type of 'other' when optype is 'div' (#75237)
* LinspaceKernel uses the dtype of 'self' as the type of 'step' when tensor is floating (#75238)
* fix CudaSigmoidGradFunctor and CudaSiluGradFunctor (#75341)
* Softplus accuracy and torch alignment (#75363)
* [Precision Depth Alignment] paddle.tan backward: dx = dout * (1 + tan(x)^2) (#75335)
* [Precision Depth Alignment] add cuDNN support to paddle.nn.functional.grid_sample to align with torch accuracy (#75355)
* correlation supports big tensor (#75383)
* paddle.tanh grad and torch alignment (float16) (#75454)
* [Precision Depth Alignment] paddle.sin and paddle.cos align with torch precision (#75503)
* [Depth Alignment] Divide (#75379)
* [Precision Depth Alignment] fix precision for float16 of paddle.tan backward (#75525)
* [Precision Depth Alignment] fix precision for paddle.expm1 (#75549)
* big tensor investigation and fixes in Paddle/paddle/phi/kernels/funcs (#75523)
* [Precision Depth Alignment] fix beta and threshold of paddle.nn.functional.softplus to double (#75426)
* fix (#75605)
* [Depth Alignment] dot (#75717)
* [Precision Depth Alignment] paddle.log aligns with torch precision (#75799)
* [Precision Depth Alignment] fix eps of paddle.logit from float to double (#75816)
* [Precision Depth Alignment] paddle.log_sigmoid (#75898)
* [Precision Depth Alignment] change the negative_slope parameter of the paddle.nn.functional.leaky_relu API to double (#75547)
* [big tensor] Paddle/paddle/phi/kernels/funcs GPU big tensor (#75856)
* [Fix] log sigmoid complex (#75953): add specialized LogSigmoidFunctor and CudaLogSigmoidFunctor for complex inputs using direct formulas, cache exp(-x) to avoid redundant computation, and rewrite the LogSigmoidFunctor formula for numerical stability

Co-authored-by: Zhan Rongrui <46243324+zrr1999@users.noreply.github.com>
Co-authored-by: 正在学习 <62892980+cszdrg@users.noreply.github.com>
Co-authored-by: Bvicii <98971614+scyyh11@users.noreply.github.com>
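The numerically stable log-sigmoid rewrite mentioned in the last commit above can be illustrated with the standard real-valued identity log(sigmoid(x)) = min(x, 0) - log1p(exp(-|x|)). This is only a sketch of the technique, not Paddle's actual functor (which also handles complex inputs):

```python
import math

def log_sigmoid(x):
    # Naive -log(1 + exp(-x)) overflows exp(-x) for very negative x.
    # min(x, 0) - log1p(exp(-|x|)) is algebraically identical and keeps
    # the exponent argument non-positive, so it is stable at both ends.
    return min(x, 0.0) - math.log1p(math.exp(-abs(x)))
```

For x >= 0 the expression reduces to -log1p(exp(-x)); for x < 0 it gives x - log1p(exp(x)) = log(exp(x) / (1 + exp(x))), the same value without overflow.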
PR Category
Operator Mechanism
PR Types
Bug fixes
Description
Torch implementation path: src/pytorch/aten/src/ATen/native/cuda/ActivationSoftplusKernel.cu
Main changes (for precision alignment with PyTorch):
Forward softplus numerical stability:
Added a complex overload of log1p_local and changed log(1 + exp(bx)) to log1p_local(exp(bx)). log1p_local calls log1p directly, which improves precision when bx is very negative (avoiding the cancellation error of log(1 + y) for small y). The complex overload log1p_local(Complex) computes log(1 + exp(x)), covering the case where no complex log1p is available.
Backward gradient rewritten in a stable form:
With z = exp(bx), dout / (1 + exp(-bx)) becomes dout * z / (z + 1), avoiding the overflow/underflow caused by computing exp(-bx); this matches PyTorch's expression. For complex types, dout / conj(1 + exp(-bx)) becomes dout * conj(z / (z + 1)).
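As a plain NumPy sketch of the formulas described above (not the actual CUDA functors; the function names and the linear-region threshold handling here are illustrative assumptions):

```python
import numpy as np

def softplus_forward(x, beta=1.0, threshold=20.0):
    # softplus(x) = (1/beta) * log(1 + exp(beta*x)), linear above the threshold.
    bx = beta * x
    # log1p(exp(bx)) keeps precision when bx is very negative, where the
    # naive log(1 + exp(bx)) loses digits in the 1 + y addition.
    # exp's argument is capped so the unselected branch cannot overflow.
    return np.where(bx > threshold, x,
                    np.log1p(np.exp(np.minimum(bx, threshold))) / beta)

def softplus_backward(dout, x, beta=1.0, threshold=20.0):
    bx = beta * x
    # With z = exp(bx), dout * z / (z + 1) avoids the overflow of exp(-bx)
    # that the naive dout / (1 + exp(-bx)) form hits for very negative bx.
    z = np.exp(np.minimum(bx, threshold))
    return np.where(bx > threshold, dout, dout * z / (z + 1.0))
```

For very negative inputs z underflows gracefully to 0, giving a gradient near 0, while the naive form would first overflow in exp(-bx).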
Re-test results:
paddle.nn.functional.softplus aligned cases: 29 in total.
Unaligned cases (3): the cause is a difference in function signature; the beta parameter in Paddle only accepts float, and a precision problem appears when beta is cast to double during computation.
These 3 unaligned double-type cases require changing the function signature, so they are fixed in a separate PR (https://github.com/PaddlePaddle/Paddle/pull/75426).
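The float-to-double cast issue behind those 3 cases can be reproduced in isolation (a hypothetical beta value of 0.1; Python floats stand in for the C++ double the kernel computes in):

```python
import numpy as np

beta_f32 = np.float32(0.1)   # beta as Paddle's float parameter
beta_f64 = float(beta_f32)   # widened to double at compute time
# Single precision already rounded 0.1, so the widened value differs from
# the double 0.1 that torch receives directly in its signature.
print(f"{beta_f64:.17f}")    # 0.10000000149011612
print(abs(beta_f64 - 0.1))   # ~1.5e-09 discrepancy carried into softplus
```

Widening cannot recover the bits lost in the original float rounding, which is why the fix requires changing the parameter's type in the signature rather than casting.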
pcard-67164