[release/3.3] Fix log10 precision issues #77894
Open
zrr1999 wants to merge 5 commits into PaddlePaddle:release/3.3
Conversation
Your PR has been submitted successfully. Thank you for your contribution to this open-source project!
…ix float literal constants

- Remove the fast but less accurate `__log10f` intrinsic in favor of the standard `::log10`
- Fix GPU gradient functor: change `log(10.0f)` to `log(10.0)` to preserve double precision
- Fix complex forward path: change `log(10.0f)` to `log(10.0)`
- Fix complex gradient path: change `log(10.0f)` to `log(10.0)`

These changes align Paddle's log10 implementation with PyTorch's precision by:

1. Using IEEE-compliant `log10` instead of a fast approximation
2. Computing the ln(10) constant in full double precision instead of truncating it to float

Expected impact:

- Float32 forward: reduce max error from ~4.7e-07 to near float32 epsilon
- Float64 backward: reduce max error from ~3.5e-15 to near float64 epsilon
- Complex types: improved constant precision

File modified: `paddle/phi/kernels/funcs/activation_functor.h` (4 precision-critical changes)
… constant

Replace the runtime `log(10.0)` computation with the compile-time constant `2.30258509299404568401` in `CudaLog10GradFunctor` to match PyTorch's `M_LN10` precision.

- Changed from: `T log_ten = static_cast<T>(log(static_cast<MPType>(10.0)));`
- Changed to: `T log_ten = static_cast<T>(2.30258509299404568401);`

This eliminates machine-epsilon drift in float64 backward gradients by avoiding double rounding through the runtime `log()` computation. The constant matches the value used in `paddle/fluid/platform/device/ipu/` and provides full IEEE 754 double precision.

Impact: fixes float64 backward accuracy errors at the ~2e-16 to 2e-15 level.

File: `paddle/phi/kernels/funcs/activation_functor.h`
… order with PyTorch

- Update the ln(10) constant from the 21-digit literal to PyTorch's 16-digit literal (`2.3025850929940456`)
- Reorder operations from `dout/(x*log_ten)` to `(dout/x)/log_ten` for better rounding
- Target: reduce float64 backward 1-ULP errors (~4.4e-16 absolute difference)
- Session: 20260211-095341, Round 3, AD iteration 2
- Change GPU backward from `(dout/x)/log_ten` to `dout/(x*log_ten)`
- Change CPU backward to use the precomputed constant `2.3025850929940456` instead of runtime `log(10)`
- Matches PyTorch's derivatives.yaml: `grad / (self * 2.3025850929940456)`
- Ensures CPU/GPU consistency and bit-exact PyTorch alignment for float64 precision
The `log10_local` template function calls `::log10(x)` in device code. On Windows MSVC, when instantiated with integral types (`int`, `long long`), this resolves to the standard library's host-only `std::log10(int)` overload, causing the NVCC error: "calling a __host__ function from a __device__ function is not allowed".

Fix: in `CudaLog10Functor::operator()`, cast `MPType` to a floating-point type before calling `log10_local`. Use `std::conditional_t` to select `float` for integral `MPType`, preserving the original type for floating-point `MPType`. This ensures `log10_local` is only instantiated with FP types, avoiding the host-only overload.

No precision impact: the int→float cast was already implicit in `::log10()`; moving it earlier produces bit-identical results.

Verified: build succeeds on Linux; Windows CI will confirm CUDA compilation.
Force-pushed: 6d497e2 to 1eed3b5
Backport PR #77855 to branch release/3.3
This backport includes the commits from the original PR that fix log10 precision.