@zhengshengning zhengshengning commented Oct 24, 2025

PR Category

Operator Mechanism

PR Types

New features

Description

Cherry-pick the following PRs from Paddle develop into fleety_12.

Big Tensor:
#75856
#75523
#75383

Bit-wise precision alignment:
#75717
#75379
#75605
#75799
#75341
#75503
#75355
#75363
#75426
#75454
#75367
#75335
#75525
#75549
#75816
#75547
#74638
#75237
#75238
#75898

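Of these, #74638 and #75367 have already been cherry-picked in other PRs.

Some of the changes to fluid, PIR, oneDNN, passes, composite operators, and auto-parallel are there for one reason: to avoid losing attribute precision and to align with Torch, the kernel signatures were changed from float to double. These kernels, however, have been float ever since the Maker in the legacy Operator library, so a number of compatibility issues have to be handled, such as:

  • Save/Load compatibility
  • Compatibility with the legacy operator library
  • Composite operator compatibility
  • Inference pass compatibility, and consistency between regular operators and fused operators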
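The precision loss that motivates the float-to-double change can be illustrated with a small sketch. It is plain Python and does not call Paddle: `to_float32` and `softplus` are hypothetical helpers written for this illustration, and the values `beta = 0.1` and `threshold = 20.0` are assumed examples rather than values taken from the PRs.

```python
import math
import struct


def to_float32(value: float) -> float:
    """Round a Python float (a double) to the nearest float32 and widen it back."""
    return struct.unpack("f", struct.pack("f", value))[0]


def softplus(x: float, beta: float, threshold: float) -> float:
    """Reference softplus: log(1 + exp(beta * x)) / beta, linear above the threshold."""
    if beta * x > threshold:
        return x
    return math.log1p(math.exp(beta * x)) / beta


beta = 0.1                   # attribute value as the user wrote it (a double)
beta_f32 = to_float32(beta)  # the same attribute after passing through a float slot

# Error introduced in the attribute itself by the float round trip.
attr_error = abs(beta_f32 - beta)

# The attribute error propagates into the operator output.
out_double_attr = softplus(1.0, beta, 20.0)
out_float_attr = softplus(1.0, beta_f32, 20.0)
output_error = abs(out_double_attr - out_float_attr)
```

Here `attr_error` is small but nonzero, and `output_error` is nonzero as well, which is enough to break a bit-wise comparison against a framework that keeps the attribute as a double.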

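In addition, after the computation logic of some operators was adjusted, the corresponding composite operators had to adjust their composition logic as well, which led to some composite operator changes.

The cherry-pick went through without anomalies: no conflicts occurred, and no prerequisite PRs had to be brought in alongside.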

zrr1999 and others added 22 commits October 24, 2025 07:18
…nsor is floating (PaddlePaddle#75238)

* align LinspaceKernel

* update meta

* update gpu kernel

* fix LinspaceKernelInner

* improve kernel
… *(1 + tan(x)^2) (PaddlePaddle#75335)

* Tan reverse calculation: dx = dout *(1 + tan(x)^2)
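The backward formula named in this commit can be checked numerically. The sketch below is plain Python rather than the Paddle kernel; the function names and the sample point `x = 0.7` are assumptions made for illustration.

```python
import math


def tan_grad(x: float, dout: float) -> float:
    """Backward of tan in the form from the commit title: dout * (1 + tan(x)^2)."""
    t = math.tan(x)
    return dout * (1.0 + t * t)


def tan_grad_via_cos(x: float, dout: float) -> float:
    """Algebraically equivalent form: dout / cos(x)^2."""
    c = math.cos(x)
    return dout / (c * c)


def central_difference(x: float, h: float = 1e-6) -> float:
    """Numerical derivative of tan at x, used only as an independent check."""
    return (math.tan(x + h) - math.tan(x - h)) / (2.0 * h)


x, dout = 0.7, 2.0
analytic = tan_grad(x, dout)
alternative = tan_grad_via_cos(x, dout)
numeric = dout * central_difference(x)
```

The two closed forms agree mathematically but can round differently in low precision, which is presumably why a bit-wise alignment effort has to fix one specific form.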
…onal.grid_sample to align with torch accuracy. (PaddlePaddle#75355)

* accuracy_stable_grid_sample

* fix
…ch precision. (PaddlePaddle#75503)

* accuracy_stable_sin

* accuracy_stable_cos
* fix

* fix

* fix

* fix

* fix
…ackward (PaddlePaddle#75525)

* fix precision for float16 of paddle.tan backward

* fix else branch of CudaTanGradFunctor
…ional.softplus to double (PaddlePaddle#75426)

* fix beta and threshold of Softplus to double

* fix test_softplus_activation_fuse_pass v1

* fix test_activation_zero

* fix flaot of SoftplusDoubleGradKernel to double

* add op_patches for softplus

* add yaml for ops/yaml/legacy

* fix infershape/operator for FLOAT64

* fix

* add SoftPlusOpTranscriber

* fix

* fix

* fix1

* fix2

* fix coverage

* fix coverage2
…addlePaddle#75799)

* accuracy_stable_log

* accuracy_stable_log

* fix

* fix

* fix

* fix

* fix5
…ble (PaddlePaddle#75816)

* accuracy_stable_logit

* add LogitOpTranscriber

* fix coverage

* fix 0yaml
* accuracy_stable_log_sigmoid

* fix test_activation_stride_op.py
…le#75856)

* fix funcs

* gpu

* fix

* fix

* Update PADDLE_ENFORCE messages

* fix cpu error

* fix dcu

* fix dcu

* fix
* feature: Add specialized LogSigmoidFunctor and CudaLogSigmoidFunctor for complex numbers

This commit introduces specialized implementations of LogSigmoidFunctor and CudaLogSigmoidFunctor to handle complex number inputs. The new implementations utilize direct formulas for improved accuracy and stability in calculations involving complex types.

* refactor: Optimize LogSigmoidFunctor and CudaLogSigmoidFunctor for complex types by caching exp(-x) to reduce redundant computations. This change enhances performance while maintaining accuracy in calculations.

* refactor: modified the formula in LogSigmoidFunctor to make it numerical stable
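The commit messages do not spell out the formula they adopt. As a hedged illustration of what "numerically stable" usually means for log-sigmoid, the sketch below contrasts the direct formula with a commonly used stable form for real inputs. It is plain Python, not the Paddle functor, it does not cover the complex-number case, and both function names are hypothetical.

```python
import math


def log_sigmoid_naive(x: float) -> float:
    """Direct formula log(1 / (1 + exp(-x))); exp(-x) overflows for very negative x."""
    return math.log(1.0 / (1.0 + math.exp(-x)))


def log_sigmoid_stable(x: float) -> float:
    """min(x, 0) - log1p(exp(-|x|)); the exponent is never positive, so exp cannot overflow."""
    return min(x, 0.0) - math.log1p(math.exp(-abs(x)))


# For moderate inputs both forms agree.
moderate_naive = log_sigmoid_naive(0.5)
moderate_stable = log_sigmoid_stable(0.5)

# For a very negative input only the stable form returns a value.
extreme_stable = log_sigmoid_stable(-800.0)
try:
    log_sigmoid_naive(-800.0)
    naive_overflowed = False
except OverflowError:
    naive_overflowed = True
```

Evaluating `exp` only on a non-positive argument keeps the intermediate value in `[0, 1]`, and `log1p` preserves accuracy when that value is tiny.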

paddle-bot bot commented Oct 24, 2025

Your PR has been submitted. Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

@wanghuancoder wanghuancoder left a comment

LGTM

@zhengshengning zhengshengning changed the title from "Bigtensor precision" to "[Cherry-pick Fleety_12] Bigtensor and api precision" Oct 24, 2025
@XiaoguangHu01 XiaoguangHu01 left a comment

LGTM

@zhengshengning zhengshengning merged commit dc7ca3f into PaddlePaddle:fleety_12 Oct 27, 2025
96 of 105 checks passed
10 participants