WYQ-Github changed the title from "Problem compiling the PaddleOCR C++ inference demo when deploying on an NVIDIA Jetson TX2" to "Problem compiling the PaddleOCR C++ inference demo on an NVIDIA Jetson TX2 with --use_tensorrt=true" on Jun 1, 2023.
The full error output is as follows:
./build/ppocr --limit_side_len=960 --visualize=true --precision=fp32 --gpu_mem=400 --use_tensorrt=true --visualize=true
In PP-OCRv3, rec_image_shape parameter defaults to '3, 48, 320',if you are using recognition model with PP-OCRv2 or an older version, please set --rec_image_shape='3,32,320
total images num: 50
WARNING: Logging before InitGoogleLogging() is written to STDERR
I0601 16:09:40.643213 13695 analysis_predictor.cc:881] TensorRT subgraph engine is enabled
--- Running analysis [ir_graph_build_pass]
--- Running analysis [ir_graph_clean_pass]
--- Running analysis [ir_analysis_pass]
--- Running IR pass [adaptive_pool2d_convert_global_pass]
I0601 16:09:40.696204 13695 fuse_pass_base.cc:57] --- detected 8 subgraphs
--- Running IR pass [shuffle_channel_detect_pass]
--- Running IR pass [quant_conv2d_dequant_fuse_pass]
--- Running IR pass [delete_quant_dequant_op_pass]
--- Running IR pass [delete_quant_dequant_filter_op_pass]
--- Running IR pass [delete_weight_dequant_linear_op_pass]
--- Running IR pass [delete_quant_dequant_linear_op_pass]
--- Running IR pass [add_support_int8_pass]
I0601 16:09:41.050448 13695 fuse_pass_base.cc:57] --- detected 237 subgraphs
--- Running IR pass [simplify_with_basic_ops_pass]
--- Running IR pass [embedding_eltwise_layernorm_fuse_pass]
--- Running IR pass [preln_embedding_eltwise_layernorm_fuse_pass]
--- Running IR pass [multihead_matmul_fuse_pass_v2]
--- Running IR pass [multihead_matmul_fuse_pass_v3]
--- Running IR pass [skip_layernorm_fuse_pass]
--- Running IR pass [preln_skip_layernorm_fuse_pass]
--- Running IR pass [conv_bn_fuse_pass]
I0601 16:09:41.226161 13695 fuse_pass_base.cc:57] --- detected 33 subgraphs
--- Running IR pass [unsqueeze2_eltwise_fuse_pass]
--- Running IR pass [trt_squeeze2_matmul_fuse_pass]
--- Running IR pass [trt_reshape2_matmul_fuse_pass]
--- Running IR pass [trt_flatten2_matmul_fuse_pass]
--- Running IR pass [trt_map_matmul_v2_to_mul_pass]
--- Running IR pass [trt_map_matmul_v2_to_matmul_pass]
--- Running IR pass [trt_map_matmul_to_mul_pass]
--- Running IR pass [fc_fuse_pass]
--- Running IR pass [conv_elementwise_add_fuse_pass]
I0601 16:09:41.302793 13695 fuse_pass_base.cc:57] --- detected 49 subgraphs
--- Running IR pass [tensorrt_subgraph_pass]
I0601 16:09:41.380200 13695 tensorrt_subgraph_pass.cc:145] --- detect a sub-graph with 187 nodes
I0601 16:09:41.415360 13695 tensorrt_subgraph_pass.cc:433] Prepare TRT engine (Optimize model structure, Select OP kernel etc). This process may cost a lot of time.
I0601 16:09:43.589931 13695 engine.cc:222] Run Paddle-TRT Dynamic Shape mode.
W0601 16:09:43.592581 13695 helper.h:107] DLA requests all profiles have same min, max, and opt value. All dla layers are falling back to GPU
I0601 16:12:27.171494 13695 engine.cc:462] ====== engine info ======
I0601 16:12:27.190670 13695 engine.cc:467] Layers:
conv2d (Output: batch_norm_0.tmp_315)
PWN(PWN((Unnamed Layer* 1) [Activation]), hard_swish (Output: hardswish_0.tmp_017))
conv2d (Output: batch_norm_1.tmp_330) + relu (Output: relu_0.tmp_032)
conv2d (Output: depthwise_conv2d_0.tmp_035) + batchnorm_add_scale (Output: batch_norm_2.tmp_345) + relu (Output: relu_1.tmp_047)
conv2d (Output: batch_norm_3.tmp_360) + elementwise (Output: elementwise_add_062)
conv2d (Output: batch_norm_4.tmp_375) + relu (Output: relu_2.tmp_077)
conv2d (Output: depthwise_conv2d_1.tmp_080) + batchnorm_add_scale (Output: batch_norm_5.tmp_390) + relu (Output: relu_3.tmp_092)
conv2d (Output: batch_norm_6.tmp_3105)
conv2d (Output: batch_norm_7.tmp_3118) + relu (Output: relu_4.tmp_0120)
conv2d (Output: depthwise_conv2d_2.tmp_0123) + batchnorm_add_scale (Output: batch_norm_8.tmp_3133) + relu (Output: relu_5.tmp_0135)
conv2d (Output: batch_norm_9.tmp_3148) + elementwise (Output: elementwise_add_1150)
conv2d (Output: batch_norm_10.tmp_3163) + relu (Output: relu_6.tmp_0165)
conv2d (Output: conv2d_114.tmp_0775)
pool2d (Output: pool2d_3.tmp_0777)
conv2d (Output: depthwise_conv2d_3.tmp_0168) + batchnorm_add_scale (Output: batch_norm_11.tmp_3178) + relu (Output: relu_7.tmp_0180)
conv2d (Output: conv2d_115.tmp_1783) + relu (Output: relu_15.tmp_0785)
conv2d (Output: conv2d_116.tmp_1791)
conv2d (Output: batch_norm_12.tmp_3193)
conv2d (Output: batch_norm_13.tmp_3206) + relu (Output: relu_8.tmp_0208)
conv2d (Output: depthwise_conv2d_4.tmp_0211) + batchnorm_add_scale (Output: batch_norm_14.tmp_3221) + relu (Output: relu_9.tmp_0223)
conv2d (Output: batch_norm_15.tmp_3236) + elementwise (Output: elementwise_add_2238)
conv2d (Output: batch_norm_16.tmp_3251) + relu (Output: relu_10.tmp_0253)
conv2d (Output: depthwise_conv2d_5.tmp_0256) + batchnorm_add_scale (Output: batch_norm_17.tmp_3266) + relu (Output: relu_11.tmp_0268)
conv2d (Output: batch_norm_18.tmp_3281) + elementwise (Output: elementwise_add_3283)
conv2d (Output: batch_norm_19.tmp_3296)
conv2d (Output: conv2d_111.tmp_0750)
pool2d (Output: pool2d_2.tmp_0752)
PWN(PWN((Unnamed Layer* 53) [Activation]), hard_swish (Output: hardswish_1.tmp_0298))
conv2d (Output: conv2d_112.tmp_1758) + relu (Output: relu_14.tmp_0760)
conv2d (Output: depthwise_conv2d_6.tmp_0301) + batchnorm_add_scale (Output: batch_norm_20.tmp_3311)
conv2d (Output: conv2d_113.tmp_1766)
PWN(PWN((Unnamed Layer* 60) [Activation]), hard_swish (Output: hardswish_2.tmp_0313))
conv2d (Output: batch_norm_21.tmp_3326)
conv2d (Output: batch_norm_22.tmp_3339)
PWN(PWN((Unnamed Layer* 67) [Activation]), hard_swish (Output: hardswish_3.tmp_0341))
conv2d (Output: depthwise_conv2d_7.tmp_0344) + batchnorm_add_scale (Output: batch_norm_23.tmp_3354)
PWN(PWN((Unnamed Layer* 72) [Activation]), hard_swish (Output: hardswish_4.tmp_0356))
conv2d (Output: batch_norm_24.tmp_3369) + elementwise (Output: elementwise_add_4371)
conv2d (Output: batch_norm_25.tmp_3384)
PWN(PWN((Unnamed Layer* 77) [Activation]), hard_swish (Output: hardswish_5.tmp_0386))
conv2d (Output: depthwise_conv2d_8.tmp_0389) + batchnorm_add_scale (Output: batch_norm_26.tmp_3399)
PWN(PWN((Unnamed Layer* 81) [Activation]), hard_swish (Output: hardswish_6.tmp_0401))
conv2d (Output: batch_norm_27.tmp_3414) + elementwise (Output: elementwise_add_5416)
conv2d (Output: batch_norm_28.tmp_3429)
PWN(PWN((Unnamed Layer* 86) [Activation]), hard_swish (Output: hardswish_7.tmp_0431))
conv2d (Output: depthwise_conv2d_9.tmp_0434) + batchnorm_add_scale (Output: batch_norm_29.tmp_3444)
PWN(PWN((Unnamed Layer* 90) [Activation]), hard_swish (Output: hardswish_8.tmp_0446))
conv2d (Output: batch_norm_30.tmp_3459) + elementwise (Output: elementwise_add_6461)
conv2d (Output: batch_norm_31.tmp_3474)
PWN(PWN((Unnamed Layer* 95) [Activation]), hard_swish (Output: hardswish_9.tmp_0476))
conv2d (Output: depthwise_conv2d_10.tmp_0479) + batchnorm_add_scale (Output: batch_norm_32.tmp_3489)
PWN(PWN((Unnamed Layer* 99) [Activation]), hard_swish (Output: hardswish_10.tmp_0491))
conv2d (Output: batch_norm_33.tmp_3504)
conv2d (Output: batch_norm_34.tmp_3517)
PWN(PWN((Unnamed Layer* 103) [Activation]), hard_swish (Output: hardswish_11.tmp_0519))
conv2d (Output: depthwise_conv2d_11.tmp_0522) + batchnorm_add_scale (Output: batch_norm_35.tmp_3532)
PWN(PWN((Unnamed Layer* 107) [Activation]), hard_swish (Output: hardswish_12.tmp_0534))
conv2d (Output: batch_norm_36.tmp_3547) + elementwise (Output: elementwise_add_7549)
conv2d (Output: batch_norm_37.tmp_3562)
conv2d (Output: conv2d_108.tmp_0725)
pool2d (Output: pool2d_1.tmp_0727)
PWN(PWN((Unnamed Layer* 113) [Activation]), hard_swish (Output: hardswish_13.tmp_0564))
conv2d (Output: conv2d_109.tmp_1733) + relu (Output: relu_13.tmp_0735)
conv2d (Output: depthwise_conv2d_12.tmp_0567) + batchnorm_add_scale (Output: batch_norm_38.tmp_3577)
conv2d (Output: conv2d_110.tmp_1741)
PWN(PWN((Unnamed Layer* 120) [Activation]), hard_swish (Output: hardswish_14.tmp_0579))
conv2d (Output: batch_norm_39.tmp_3592)
conv2d (Output: batch_norm_40.tmp_3605)
PWN(PWN((Unnamed Layer* 127) [Activation]), hard_swish (Output: hardswish_15.tmp_0607))
conv2d (Output: depthwise_conv2d_13.tmp_0610) + batchnorm_add_scale (Output: batch_norm_41.tmp_3620)
PWN(PWN((Unnamed Layer* 132) [Activation]), hard_swish (Output: hardswish_16.tmp_0622))
conv2d (Output: batch_norm_42.tmp_3635) + elementwise (Output: elementwise_add_8637)
conv2d (Output: batch_norm_43.tmp_3650)
PWN(PWN((Unnamed Layer* 137) [Activation]), hard_swish (Output: hardswish_17.tmp_0652))
conv2d (Output: depthwise_conv2d_14.tmp_0655) + batchnorm_add_scale (Output: batch_norm_44.tmp_3665)
PWN(PWN((Unnamed Layer* 141) [Activation]), hard_swish (Output: hardswish_18.tmp_0667))
conv2d (Output: batch_norm_45.tmp_3680) + elementwise (Output: elementwise_add_9682)
conv2d (Output: batch_norm_46.tmp_3695)
PWN(PWN((Unnamed Layer* 146) [Activation]), hard_swish (Output: hardswish_19.tmp_0697))
conv2d (Output: conv2d_105.tmp_0700)
pool2d (Output: pool2d_0.tmp_0702)
conv2d (Output: conv2d_106.tmp_1708) + relu (Output: relu_12.tmp_0710)
conv2d (Output: conv2d_107.tmp_1716)
PWN(PWN(PWN(hard_sigmoid (Output: hardsigmoid_0.tmp_0718)), elementwise (Output: tmp_0720)), elementwise (Output: tmp_1722))
nearest_interp_v2 (Output: nearest_interp_v2_0.tmp_0799)
conv2d (Output: conv2d_117.tmp_0812)
PWN(PWN(PWN(PWN(hard_sigmoid (Output: hardsigmoid_1.tmp_0743)), elementwise (Output: tmp_2745)), elementwise (Output: tmp_3747)), elementwise (Output: tmp_8801))
pool2d (Output: pool2d_4.tmp_0814)
nearest_interp_v2 (Output: nearest_interp_v2_1.tmp_0803)
conv2d (Output: conv2d_120.tmp_0837)
conv2d (Output: conv2d_118.tmp_1820) + relu (Output: relu_16.tmp_0822)
PWN(PWN(PWN(PWN(hard_sigmoid (Output: hardsigmoid_2.tmp_0768)), elementwise (Output: tmp_4770)), elementwise (Output: tmp_5772)), elementwise (Output: tmp_9805))
pool2d (Output: pool2d_5.tmp_0839)
nearest_interp_v2 (Output: nearest_interp_v2_2.tmp_0807)
conv2d (Output: conv2d_123.tmp_0862)
conv2d (Output: conv2d_121.tmp_1845) + relu (Output: relu_17.tmp_0847)
conv2d (Output: conv2d_119.tmp_1828)
PWN(PWN(PWN(PWN(hard_sigmoid (Output: hardsigmoid_3.tmp_0793)), elementwise (Output: tmp_6795)), elementwise (Output: tmp_7797)), elementwise (Output: tmp_10809))
pool2d (Output: pool2d_6.tmp_0864)
conv2d (Output: conv2d_126.tmp_0887)
conv2d (Output: conv2d_124.tmp_1870) + relu (Output: relu_18.tmp_0872)
conv2d (Output: conv2d_122.tmp_1853)
pool2d (Output: pool2d_7.tmp_0889)
PWN(PWN(PWN(hard_sigmoid (Output: hardsigmoid_4.tmp_0830)), elementwise (Output: tmp_11832)), elementwise (Output: tmp_12834))
conv2d (Output: conv2d_127.tmp_1895) + relu (Output: relu_19.tmp_0897)
conv2d (Output: conv2d_125.tmp_1878)
nearest_interp_v2 (Output: nearest_interp_v2_3.tmp_0911)
PWN(PWN(PWN(hard_sigmoid (Output: hardsigmoid_5.tmp_0855)), elementwise (Output: tmp_13857)), elementwise (Output: tmp_14859))
conv2d (Output: conv2d_128.tmp_1903)
nearest_interp_v2 (Output: nearest_interp_v2_4.tmp_0913)
PWN(PWN(PWN(hard_sigmoid (Output: hardsigmoid_6.tmp_0880)), elementwise (Output: tmp_15882)), elementwise (Output: tmp_16884))
nearest_interp_v2 (Output: nearest_interp_v2_5.tmp_0915)
PWN(PWN(PWN(hard_sigmoid (Output: hardsigmoid_7.tmp_0905)), elementwise (Output: tmp_17907)), elementwise (Output: tmp_18909))
nearest_interp_v2_3.tmp_0911 copy
nearest_interp_v2_4.tmp_0913 copy
nearest_interp_v2_5.tmp_0915 copy
conv2d (Output: batch_norm_47.tmp_3930) + relu (Output: batch_norm_47.tmp_4932)
conv2d_transpose (Output: conv2d_transpose_4.tmp_0935) + (Unnamed Layer* 201) [Constant] + (Unnamed Layer* 207) [Shuffle] + elementwise (Output: elementwise_add_10.tmp_0938) + batchnorm_add_scale (Output: batch_norm_48.tmp_3948) + relu (Output: batch_norm_48.tmp_4950)
conv2d_transpose (Output: conv2d_transpose_5.tmp_0953) + (Unnamed Layer* 212) [Constant] + (Unnamed Layer* 218) [Shuffle] + elementwise (Output: elementwise_add_11.tmp_0956)
PWN(sigmoid (Output: sigmoid_0.tmp_0958))
Bindings:
x
sigmoid_0.tmp_0958
I0601 16:12:27.191135 13695 engine.cc:469] ====== engine info end ======
--- Running IR pass [conv_bn_fuse_pass]
--- Running IR pass [conv_elementwise_add_act_fuse_pass]
--- Running IR pass [conv_elementwise_add2_act_fuse_pass]
--- Running IR pass [transpose_flatten_concat_fuse_pass]
--- Running analysis [ir_params_sync_among_devices_pass]
I0601 16:12:27.238207 13695 ir_params_sync_among_devices_pass.cc:100] Sync params from CPU to GPU
terminate called after throwing an instance of 'phi::enforce::EnforceNotMet'
what():
C++ Traceback (most recent call last):
0 paddle_infer::CreatePredictor(paddle::AnalysisConfig const&)
1 paddle_infer::Predictor::Predictor(paddle::AnalysisConfig const&)
2 std::unique_ptr<paddle::PaddlePredictor, std::default_delete<paddle::PaddlePredictor> > paddle::CreatePaddlePredictor<paddle::AnalysisConfig, (paddle::PaddleEngineKind)2>(paddle::AnalysisConfig const&)
3 paddle::AnalysisPredictor::Init(std::shared_ptr<paddle::framework::Scope> const&, std::shared_ptr<paddle::framework::ProgramDesc> const&)
4 paddle::AnalysisPredictor::PrepareProgram(std::shared_ptr<paddle::framework::ProgramDesc> const&)
5 paddle::AnalysisPredictor::OptimizeInferenceProgram()
6 paddle::inference::analysis::Analyzer::RunAnalysis(paddle::inference::analysis::Argument*)
7 paddle::inference::analysis::IrParamsSyncAmongDevicesPass::RunImpl(paddle::inference::analysis::Argument*)
8 paddle::inference::analysis::IrParamsSyncAmongDevicesPass::CopyParamsToGpu(paddle::inference::analysis::Argument*)
9 paddle::framework::TensorCopySync(phi::DenseTensor const&, phi::Place const&, phi::DenseTensor*)
10 phi::DenseTensor::mutable_data(phi::Place const&, paddle::experimental::DataType, unsigned long)
11 paddle::memory::AllocShared(phi::Place const&, unsigned long)
12 paddle::memory::allocation::AllocatorFacade::AllocShared(phi::Place const&, unsigned long)
13 paddle::memory::allocation::AllocatorFacade::Alloc(phi::Place const&, unsigned long)
14 paddle::memory::allocation::StatAllocator::AllocateImpl(unsigned long)
15 paddle::memory::allocation::Allocator::Allocate(unsigned long)
16 paddle::memory::allocation::RetryAllocator::AllocateImpl(unsigned long)
17 paddle::memory::allocation::NaiveBestFitAllocator::AllocateImpl(unsigned long)
18 paddle::memory::legacy::AllocVisitor::result_type paddle::platform::VisitPlace<paddle::memory::legacy::AllocVisitor>(phi::Place const&, paddle::memory::legacy::AllocVisitor const&)
19 void* paddle::memory::legacy::Alloc<phi::GPUPlace>(phi::GPUPlace const&, unsigned long)
20 paddle::memory::legacy::GetGPUBuddyAllocator(int)
21 paddle::memory::legacy::GPUBuddyAllocatorList::Get(int)
22 paddle::memory::legacy::GPUBuddyAllocatorList::Get(int)::{lambda()#1}::operator()() const
23 paddle::platform::GpuMaxChunkSize()
24 paddle::platform::GpuMaxAllocSize()
25 phi::enforce::EnforceNotMet::EnforceNotMet(phi::ErrorSummary const&, char const*, int)
26 phi::enforce::GetCurrentTraceBackString[abi:cxx11]
Error Message Summary:
ResourceExhaustedError: Not enough available GPU memory.
[Hint: Expected available_to_alloc >= alloc_bytes, but received available_to_alloc:318272921 < alloc_bytes:419430400.] (at /home/paddle/data/xly/workspace/23303/Paddle/paddle/fluid/platform/device/gpu/gpu_info.cc:99)
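The figures in the hint line up with the command-line flags, which suggests (though the log does not state it explicitly) that the failed allocation is the initial GPU memory pool requested via --gpu_mem=400: 419430400 bytes is exactly 400 MiB, while only ~303 MiB remained free on the TX2, whose memory is shared between CPU and GPU. A quick sanity check of that arithmetic:

```python
# Cross-check the figures in the ResourceExhaustedError hint against the
# ppocr command-line flags (all values copied from the log above).
alloc_bytes = 419_430_400         # alloc_bytes from the hint
available_to_alloc = 318_272_921  # available_to_alloc from the hint
gpu_mem_flag_mb = 400             # --gpu_mem=400 from the ppocr command

# The failed allocation is exactly the 400 MiB pool requested via --gpu_mem.
assert alloc_bytes == gpu_mem_flag_mb * 1024 * 1024

print(f"requested: {alloc_bytes / 2**20:.1f} MiB, "
      f"available: {available_to_alloc / 2**20:.1f} MiB")
# → requested: 400.0 MiB, available: 303.5 MiB
```

If this reading is right, lowering --gpu_mem below the free memory reported at startup (or freeing memory held by other processes before running the demo) would be the natural first thing to try.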
Aborted (core dumped)