
Accuracy problem with F16 NEFullyConnectedLayer #1112

Open
allnes opened this issue Jun 18, 2024 · 2 comments

allnes commented Jun 18, 2024

Hi,

I am trying to use NEFullyConnectedLayer with FP16 in our OpenVINO CPU plugin, and my tests fail on accuracy.

Attaching the reproducer:
main.txt

Also attaching the data with which it was reproduced:
out_ref.txt
out_acl.txt
arg1.txt
arg0.txt

And attaching the debug log from the application:
log.txt

I get accuracy degradation roughly every 10-20 runs of the reproducer.
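The reproducer itself is attached (main.txt) rather than inlined. As an illustration only, not the actual reproducer, the sketch below shows why an FP16-accumulated dot product over the inner dimension of 120 (the K implied by the logged shapes 120,59 and 1,120) can drift from an FP32-accumulated reference; the data and seed here are made up:

```python
import numpy as np

rng = np.random.default_rng(0)
K = 120  # inner dimension implied by the logged shapes (120,59) x (1,120)
a = rng.standard_normal(K).astype(np.float16)
w = rng.standard_normal(K).astype(np.float16)

# Pure-FP16 path: every product and every partial sum is rounded to half precision.
acc16 = np.float16(0)
for x, y in zip(a, w):
    acc16 = np.float16(acc16 + np.float16(x * y))

# Mixed-precision path: the same FP16 inputs, but accumulation in FP32.
acc32 = np.float32(a).dot(np.float32(w))

# The two results generally differ by a small rounding-induced amount.
print(abs(float(acc16) - float(acc32)))
```

Whether a given run lands on one side or the other of an FP16 rounding boundary depends on the input data, which would be consistent with the failure appearing only on some runs.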


allnes commented Jun 19, 2024

Also attaching environment info.

Output of 'strings libarm_compute.so | grep arm_compute_version':

arm_compute_version.embed - f2eda6665c12d568e179f5b0e7a24ccdc0ac824d

Platform:

Mac Studio - Apple M2 Max

Operating System:

MacOS Sonoma - Version 14.5 (23F79)

@morgolock

Hi @allnes

I tried to reproduce this on Neoverse N1 but I can't see the accuracy problem. I built ACL from the latest main branch with the following options:

scons -j32 Werror=0 neon=1 opencl=0 embed_kernels=0 validation_tests=1 os=linux arch=armv8a build=native multi_isa=1 fixed_format_kernels=1 openmp=1 cppthreads=0 asserts=1 debug=1 logging=1

See the result:

LD_LIBRARY_PATH=../ComputeLibrary/build/:$LD_LIBRARY_PATH ./fctest 
 [ComputeLibrary][10-10-2024 03:39:51][INFO]  arm_compute::NEFullyConnectedLayer::configure() : 
 input: ITensor->info(): Shape=120,59,DataLayout=NCHW,DataType=F16
 weights: ITensor->info(): Shape=1,120,DataLayout=NCHW,DataType=F16
 biases: nullptr
 output: ITensor->info(): Shape=1,59,DataLayout=NCHW,DataType=F16
 fc_info: {activation_info=, weights_trained_layout=NCHW, transpose_weights=0, are_weights_reshaped=0, retain_internal_weights=0, fp_mixed_precision=1}
 
 [ComputeLibrary][10-10-2024 03:39:51][INFO]  arm_compute::cpu::CpuFullyConnected::configure() : 
 src: Shape=120,59,DataLayout=NCHW,DataType=F16
 weights: Shape=1,120,DataLayout=NCHW,DataType=F16
 biases: nullptr
 dst: Shape=1,59,DataLayout=NCHW,DataType=F16
 fc_info: {activation_info=, weights_trained_layout=NCHW, transpose_weights=0, are_weights_reshaped=0, retain_internal_weights=0, fp_mixed_precision=1}
 
 [ComputeLibrary][10-10-2024 03:39:51][INFO]  arm_compute::cpu::CpuGemm::configure() : 
 a: Shape=120,59,DataLayout=NCHW,DataType=F16
 b: Shape=1,120,DataLayout=NCHW,DataType=F16
 c: nullptr
 d: Shape=1,59,DataLayout=NCHW,DataType=F16
 alpha: 1.000000
 beta: 1.000000
 gemm_info: {is_a_reshaped=0,is_b_reshaped=0,reshape_b_only_on_first_run=1,depth_output_gemm3d=0,reinterpret_input_as_3d=0,retain_internal_weights=0,fp_mixed_precision=0,broadcast_bias=0,pretranspose_B=0,}
 
Index of max absolute difference: 0
Max absolute difference: 0
diff out_acl.txt out_ref.txt 
37c37
< -79.250000
---
> -79.312500
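For what it's worth, the two values in the diff above differ by exactly 0.0625, which is one unit in the last place for FP16 at that magnitude (for |x| in [64, 128) the FP16 spacing is 2**-4). A quick numpy check, assuming numpy is available:

```python
import numpy as np

# The two disagreeing outputs from the diff above.
ref = np.float16(-79.3125)
acl = np.float16(-79.25)

# Both values are exactly representable in FP16, and their bit patterns
# are consecutive: the outputs disagree by exactly one ULP.
bits_ref = int(ref.view(np.uint16))
bits_acl = int(acl.view(np.uint16))
print(bits_ref - bits_acl)  # → 1 (adjacent FP16 values)
print(abs(float(acl) - float(ref)))  # → 0.0625, one ULP in [64, 128)
```

So the mismatch is a single last-place rounding difference, the kind of variation one would expect between FP16 and FP32 accumulation orders rather than a gross numerical error.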

Could you please try this again with v24.09 or the latest main?

Hope this helps
