[Cherry-pick Fleety_12] Bigtensor and api precision #76028
Conversation
- …en opotype is 'div' (PaddlePaddle#75237)
- …nsor is floating (PaddlePaddle#75238): align LinspaceKernel; update meta; update gpu kernel; fix LinspaceKernelInner; improve kernel
- … *(1 + tan(x)^2) (PaddlePaddle#75335): Tan reverse calculation: dx = dout * (1 + tan(x)^2)
- …onal.grid_sample to align with torch accuracy (PaddlePaddle#75355): accuracy_stable_grid_sample; fixes
- …ch precision (PaddlePaddle#75503): accuracy_stable_sin; accuracy_stable_cos; fixes
- …ackward (PaddlePaddle#75525): fix precision for float16 of paddle.tan backward; fix else branch of CudaTanGradFunctor
- …dle#75549): accuracy_stable_expm1; fix
- …ional.softplus to double (PaddlePaddle#75426): fix beta and threshold of Softplus to double; fix test_softplus_activation_fuse_pass v1; fix test_activation_zero; fix float of SoftplusDoubleGradKernel to double; add op_patches for softplus; add yaml for ops/yaml/legacy; fix infershape/operator for FLOAT64; add SoftPlusOpTranscriber; fix coverage; fix dcu
- …addlePaddle#75799): accuracy_stable_log; fixes
- …ble (PaddlePaddle#75816): accuracy_stable_logit; add LogitOpTranscriber; fix coverage; fix 0yaml
- accuracy_stable_log_sigmoid; fix test_activation_stride_op.py
- …e paddle.nn.functional.leaky_relu API to double (PaddlePaddle#75547)
- …le#75856): fix funcs; gpu; update PADDLE_ENFORCE message; fix cpu error; fix dcu
- feature: Add specialized LogSigmoidFunctor and CudaLogSigmoidFunctor for complex numbers. This introduces specialized implementations of LogSigmoidFunctor and CudaLogSigmoidFunctor for complex-number inputs; the new implementations use direct formulas for improved accuracy and stability with complex types. refactor: Optimize LogSigmoidFunctor and CudaLogSigmoidFunctor for complex types by caching exp(-x) to reduce redundant computation while preserving accuracy. refactor: Rework the formula in LogSigmoidFunctor so it is numerically stable.
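The LinspaceKernel changes above (#75238) concern how the intermediate points are generated. A minimal sketch of the usual accuracy-friendly formulation, computing each point directly from the step rather than accumulating a running sum (an illustration only, not the committed kernel):

```python
import numpy as np

def linspace_step(start, stop, num):
    # Each point is computed directly as start + i * step, so rounding
    # error does not accumulate across iterations.
    step = (stop - start) / (num - 1)
    out = np.array([start + i * step for i in range(num)], dtype=np.float64)
    out[-1] = stop  # pin the endpoint exactly
    return out
```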
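The tan backward commits above (#75335, #75525) use the identity d/dx tan(x) = 1 + tan(x)^2 = 1/cos(x)^2. A quick NumPy check of that formula against a central finite difference (illustrative only; the real change lives in the CUDA gradient functors):

```python
import numpy as np

def tan_grad(x, dout):
    # dx = dout * (1 + tan(x)^2), equivalent to dout / cos(x)^2
    t = np.tan(x)
    return dout * (1.0 + t * t)

x = 0.3
eps = 1e-6
# central finite difference of tan at x
numeric = (np.tan(x + eps) - np.tan(x - eps)) / (2.0 * eps)
analytic = tan_grad(x, 1.0)
```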
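accuracy_stable_expm1 (#75549) targets the catastrophic cancellation that exp(x) - 1 suffers for small x, which a dedicated expm1 avoids. A small NumPy illustration (not the kernel itself):

```python
import numpy as np

x = np.float64(1e-12)
naive = np.exp(x) - 1.0    # cancellation: only a few significant digits survive
accurate = np.expm1(x)     # correct to full double precision
rel_err = abs(naive - accurate) / accurate
```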
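The LogSigmoid stability work above rewrites log(1/(1 + exp(-x))) so that no large positive argument is ever exponentiated. A common real-valued formulation is min(x, 0) - log1p(exp(-|x|)); this sketch shows the idea for real inputs (the committed functors additionally handle complex types, which are not reproduced here):

```python
import numpy as np

def log_sigmoid_naive(x):
    # overflows exp(-x) for large negative x, producing -inf
    return np.log(1.0 / (1.0 + np.exp(-x)))

def log_sigmoid_stable(x):
    # min(x, 0) - log1p(exp(-|x|)): the exponent is always <= 0
    return np.minimum(x, 0.0) - np.log1p(np.exp(-np.abs(x)))
```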
Your PR was submitted successfully. Thank you for your contribution to the open-source project!
wanghuancoder
left a comment
LGTM
XiaoguangHu01
left a comment
LGTM
Merged dc7ca3f into PaddlePaddle:fleety_12
PR Category: Operator Mechanism
PR Types: New features
Description
Cherry-pick the following PRs from paddle develop into fleety_12:
Big tensor:
#75856
#75523
#75383
Bitwise precision alignment:
#75717
#75379
#75605
#75799
#75341
#75503
#75355
#75363
#75426
#75454
#75367
#75335
#75525
#75549
#75816
#75547
#74638
#75237
#75238
#75898
Among these, #74638 and #75367 have already been cherry-picked in other PRs.
Some of the fluid, pir, onednn, pass, composite-operator, and auto-parallel changes exist because the kernel signatures were changed from float to double, in order to avoid losing attribute precision and to align with Torch. However, these kernels had been float-typed all the way from the Maker in the final operator library, which raised many compatibility issues, such as:
In addition, after the computation logic of some operators was adjusted, the corresponding composite operators had to adjust their decomposition logic as well, which led to some composite-operator changes.
The cherry-pick completed without issues: no conflicts occurred, and no prerequisite PRs needed to be cherry-picked first.
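The float-to-double attribute change described above matters because a float32 attribute cannot represent every user-supplied value exactly. A small NumPy sketch of the representation error (the value 20.000000001 is hypothetical, chosen only to make the round-off visible):

```python
import numpy as np

# A user-supplied attribute (e.g. a threshold) arrives as a Python float (double).
# Forcing it through float32 -- the old kernel signature -- perturbs it;
# keeping the signature double preserves the value bit-for-bit.
threshold = 20.000000001                        # hypothetical attribute value
as_float32 = np.float64(np.float32(threshold))  # old float path: round-trip via float32
loss = abs(threshold - as_float32)              # precision lost by the float signature
```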