Memory not released after Paddle inference finishes #43346

Open
Tian14267 opened this issue Jun 9, 2022 · 21 comments

@Tian14267

Please ask your question

Hi, I've run into a problem: memory is not released after a Paddle model finishes inference. Details below:

Service: the model is loaded and served via WebSocket.
Details:
1. When the model and the service are first loaded, about 1.7 GB of GPU memory is consumed (see screenshot):
[screenshot: GPU memory at ~1.7 GB]
2. After loading, GPU memory grows while speech synthesis runs. Over the course of a synthesis job it climbs to roughly 13 GB (see screenshot), and once the job finishes this memory is never released, i.e. it never drops from 13 GB back to 1.7 GB.
[screenshot: GPU memory at ~13 GB]
Does Paddle have a memory-release mechanism that can free this extra memory after a job finishes, without shutting down the service?

https://github.com/PaddlePaddle/PaddleSpeech/issues/2024

@paddle-bot-old

paddle-bot-old bot commented Jun 9, 2022

Hi! We've received your issue and will arrange for technicians to answer it as soon as possible; please be patient. Please double-check that you have provided a clear problem description, reproduction code, environment & version, and error messages. You can also look for answers in the official API docs, the FAQ, historical issues, and the AI community. Have a nice day!

@hp03
Contributor

hp03 commented Jun 9, 2022

[screenshot of suggested code — later replies indicate it showed calling clear_intermediate_tensor() and try_shrink_memory() after predictor.run()]

You can give this a try.

If the predictor will still be used afterwards, memory is not actively released after predictor.run(), because subsequent inference will reuse it; malloc-ing on every run and freeing afterwards would hurt performance.

If the predictor is no longer needed, in C++ you can simply delete it, which releases the memory it holds.

In addition, the framework keeps some static variables that, once initialized, are only released when the process exits, but these do not amount to that much memory; the problem you are seeing is unrelated to them.
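For reference, a minimal sketch of how these calls look with the Paddle Inference Python API (the screenshot itself isn't legible here; the method names clear_intermediate_tensor() and try_shrink_memory() are taken from the follow-up replies, and the model paths and input shape are placeholders):

```python
import numpy as np
from paddle.inference import Config, create_predictor

config = Config("model.pdmodel", "model.pdiparams")  # placeholder model files
config.enable_use_gpu(200, 0)                        # 200 MB initial GPU memory pool on GPU 0
predictor = create_predictor(config)

input_handle = predictor.get_input_handle(predictor.get_input_names()[0])
input_handle.copy_from_cpu(np.random.rand(1, 3, 224, 224).astype("float32"))  # placeholder input
predictor.run()

output_handle = predictor.get_output_handle(predictor.get_output_names()[0])
result = output_handle.copy_to_cpu()

# Only worth doing if this predictor is idle or will not be reused soon:
predictor.clear_intermediate_tensor()  # drop intermediate tensors of the last run
predictor.try_shrink_memory()          # give cached memory blocks back to the allocator
```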

@Tian14267
Author

(quoting @hp03's reply above)

I tried those two calls, but I get an error:

  File "inference_fastspeech2_pwgan_onnx.py", line 202, in inference
    self.model.clear_intermediate_tensor()
  File "/root/anaconda3/envs/paddlespeech/lib/python3.7/site-packages/paddle/fluid/dygraph/layers.py", line 1110, in __getattr__
    return object.__getattribute__(self, name)
AttributeError: 'DurationFastSpeech2' object has no attribute 'clear_intermediate_tensor'

Here 'DurationFastSpeech2' is a custom network class; no 'predictor' is used at all.
Is there any other way to release the GPU cache produced by inference?
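For context, clear_intermediate_tensor() and try_shrink_memory() are methods of the paddle.inference predictor, not of dygraph nn.Layer models such as DurationFastSpeech2, which is why the AttributeError appears. For a dygraph model, a hedged sketch of one option in recent Paddle releases is to drop the GPU references and then call paddle.device.cuda.empty_cache(); note it only frees cached blocks that no live tensor still uses ('am', 'voc' and 'text_ids' below are placeholders):

```python
import paddle

# 'am', 'voc' and 'text_ids' stand in for the TTS models and their input.
with paddle.no_grad():
    mel = am(text_ids)
    wav = voc(mel)
    wav_np = wav.numpy()          # copy the result to host memory

del mel, wav                      # drop the GPU references held by Python
paddle.device.cuda.empty_cache()  # release cached, unreferenced GPU memory back to the device
```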

@Tian14267
Author

Also, both TensorFlow and PyTorch have Python-level options to limit GPU memory usage, i.e. to cap how much GPU memory is used. Does Paddle have similar Python code or a similar setting?
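For reference, Paddle's GPU memory behaviour is mostly controlled through environment flags rather than a single Python call; a hedged sketch of the commonly used knobs (the values are only examples):

```python
import os

# Must be set before paddle is imported / the predictor is created.
os.environ["FLAGS_allocator_strategy"] = "auto_growth"      # allocate on demand instead of up front
os.environ["FLAGS_fraction_of_gpu_memory_to_use"] = "0.3"   # fraction of free GPU memory per allocation chunk

from paddle.inference import Config

config = Config("model.pdmodel", "model.pdiparams")  # placeholder paths
# For the inference API, this only sets the INITIAL memory pool size in MB;
# it is not a hard cap on total GPU memory usage.
config.enable_use_gpu(500, 0)
```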

@kouhinn

kouhinn commented Jul 11, 2022

(quoting @hp03's reply above)

Same problem here. After calling the clear... and shrink... methods above following every predictor.run(), GPU memory usage is indeed kept under control, but memory / GPU-memory related errors show up from time to time.
The stack trace is:
terminate called after throwing an instance of 'paddle::platform::EnforceNotMet'
what():


C++ Traceback (most recent call last):

0 paddle::AnalysisPredictor::ZeroCopyRun()
1 paddle::framework::NaiveExecutor::Run()
2 paddle::framework::OperatorBase::Run(paddle::framework::Scope const&, paddle::platform::Place const&)
3 paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, paddle::platform::Place const&) const
4 paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, paddle::platform::Place const&, paddle::framework::RuntimeContext*) const
5 std::_Function_handler<void (paddle::framework::ExecutionContext const&), paddle::framework::OpKernelRegistrarFunctor<paddle::platform::CUDAPlace, false, 5ul, paddle::operators::ShapeKernel, paddle::operators::ShapeKernel, paddle::operators::ShapeKernel, paddle::operators::ShapeKernel, paddle::operators::ShapeKernel, paddle::operators::ShapeKernel, paddle::operators::ShapeKernel, paddle::operators::ShapeKernelpaddle::platform::float16, paddle::operators::ShapeKernel<paddle::platform::complex >, paddle::operators::ShapeKernel<paddle::platform::complex > >::operator()(char const*, char const*, int) const::{lambda(paddle::framework::ExecutionContext const&)#1}>::_M_invoke(std::_Any_data const&, paddle::framework::ExecutionContext const&)
6 paddle::framework::Tensor::mutable_data(paddle::platform::Place const&, paddle::framework::proto::VarType_Type, unsigned long)
7 std::_Sp_counted_deleter<paddle::memory::allocation::Allocation*, paddle::memory::allocation::Allocator::AllocationDeleter, std::allocator, (__gnu_cxx::_Lock_policy)2>::_M_dispose()
8 paddle::memory::allocation::RetryAllocator::FreeImpl(paddle::memory::allocation::Allocation*)
9 paddle::memory::allocation::NaiveBestFitAllocator::FreeImpl(paddle::memory::allocation::Allocation*)
10 paddle::memory::detail::BuddyAllocator::Free(void*)
11 paddle::memory::detail::MetadataCache::LoadDesc(paddle::memory::detail::MemoryBlock*)
12 paddle::platform::EnforceNotMet::EnforceNotMet(paddle::platform::ErrorSummary const&, char const*, int)
13 paddle::platform::GetCurrentTraceBackStringabi:cxx11


Error Message Summary:

NotFoundError: The memory block is not found in cache
[Hint: Expected iter != cache_.end(), but received iter == cache_.end().] (at ......../Paddle_2.2.2/Paddle/paddle/fluid/memory/detail/meta_cache.cc:30)

@kouhinn

kouhinn commented Jul 12, 2022

Running for about 24 hours reproduces it almost every time. Sometimes the error is the following instead:
***** FATAL SIGNAL RECEIVED *******
Received fatal signal: SIGABRT(6) PID: 18711

***** SIGNAL SIGABRT(6)

******* STACKDUMP *******
stack dump [1] /usr/local/lib/libg3log.so.2.1.0-0+0x1465a [0x7f1261e8165a]
stack dump [2] /lib/x86_64-linux-gnu/libpthread.so.0+0x12980 [0x7f127d0d5980]
stack dump [3] /lib/x86_64-linux-gnu/libc.so.6gsignal+0xc7 [0x7f1261519e87]
stack dump [4] /lib/x86_64-linux-gnu/libc.so.6abort+0x141 [0x7f126151b7f1]
stack dump [5] /usr/lib/x86_64-linux-gnu/libstdc++.so.6+0x8c957 [0x7f1261b70957]
stack dump [6] /usr/lib/x86_64-linux-gnu/libstdc++.so.6+0x92ae6 [0x7f1261b76ae6]
stack dump [7] /usr/lib/x86_64-linux-gnu/libstdc++.so.6+0x91b49 [0x7f1261b75b49]
stack dump [8] /usr/lib/x86_64-linux-gnu/libstdc++.so.6__gxx_personality_v0+0x2a8 [0x7f1261b764b8]
stack dump [9] /lib/x86_64-linux-gnu/libgcc_s.so.1+0x10573 [0x7f12618dc573]
stack dump [10] /lib/x86_64-linux-gnu/libgcc_s.so.1_Unwind_Resume+0x125 [0x7f12618dcdf5]
stack dump [11] /opt/paddle_lib/paddle_inference/paddle/lib/libpaddle_inference.so+0x1ecc378 [0x7f1267f44378]

stack dump [12]  /opt/paddle_lib/paddle_inference/paddle/lib/libpaddle_inference.so : paddle::memory::allocation::NaiveBestFitAllocator::FreeImpl(paddle::memory::allocation::Allocation*)+0xc5 [0x7f126e181c95]

stack dump [13]  /opt/paddle_lib/paddle_inference/paddle/lib/libpaddle_inference.so : paddle::memory::allocation::RetryAllocator::FreeImpl(paddle::memory::allocation::Allocation*)+0x41 [0x7f126e194e31]

stack dump [14]  /opt/paddle_lib/paddle_inference/paddle/lib/libpaddle_inference.so : std::_Sp_counted_deleter<paddle::memory::allocation::Allocation*, paddle::memory::allocation::Allocator::AllocationDeleter, std::allocator<void>, (__gnu_cxx::_Lock_policy)2>::_M_dispose()+0x25 [0x7f1268f4ff85]
stack dump [15]  /opt/paddle_lib/paddle_inference/paddle/lib/libpaddle_inference.so+0x227e757 [0x7f12682f6757]

stack dump [16]  /opt/paddle_lib/paddle_inference/paddle/lib/libpaddle_inference.so : paddle::framework::Tensor::mutable_data(paddle::platform::Place const&, paddle::framework::proto::VarType_Type, unsigned long)+0xc5 [0x7f1268608fa5]

stack dump [17]  /opt/paddle_lib/paddle_inference/paddle/lib/libpaddle_inference.so : int paddle::operators::PackTensorsIntoVector<float>(paddle::framework::ExecutionContext const&, std::vector<paddle::framework::Tensor const*, std::allocator<paddle::framework::Tensor const*> >*, std::vector<paddle::framework::Tensor*, std::allocator<paddle::framework::Tensor*> >*, paddle::framework::Tensor*)+0x1dc [0x7f1268fd3bac]

stack dump [18]  /opt/paddle_lib/paddle_inference/paddle/lib/libpaddle_inference.so : std::_Function_handler<void (paddle::framework::ExecutionContext const&), paddle::framework::OpKernelRegistrarFunctor<paddle::platform::CUDAPlace, false, 0ul, paddle::operators::ElementwiseAddKernel<paddle::platform::CUDADeviceContext, float>, paddle::operators::ElementwiseAddKernel<paddle::platform::CUDADeviceContext, double>, paddle::operators::ElementwiseAddKernel<paddle::platform::CUDADeviceContext, int>, paddle::operators::ElementwiseAddKernel<paddle::platform::CUDADeviceContext, long>, paddle::operators::ElementwiseAddKernel<paddle::platform::CUDADeviceContext, paddle::platform::float16>, paddle::operators::ElementwiseAddKernel<paddle::platform::CUDADeviceContext, paddle::platform::complex<float> >, paddle::operators::ElementwiseAddKernel<paddle::platform::CUDADeviceContext, paddle::platform::complex<double> > >::operator()(char const*, char const*, int) const::{lambda(paddle::framework::ExecutionContext const&)#1}>::_M_invoke(std::_Any_data const&, paddle::framework::ExecutionContext const&)+0x5b [0x7f12690a9aeb]

stack dump [19]  /opt/paddle_lib/paddle_inference/paddle/lib/libpaddle_inference.so : paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, paddle::platform::Place const&, paddle::framework::RuntimeContext*) const+0x312 [0x7f126e063dd2]

stack dump [20]  /opt/paddle_lib/paddle_inference/paddle/lib/libpaddle_inference.so : paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, paddle::platform::Place const&) const+0x148 [0x7f126e064628]

stack dump [21]  /opt/paddle_lib/paddle_inference/paddle/lib/libpaddle_inference.so : paddle::framework::OperatorBase::Run(paddle::framework::Scope const&, paddle::platform::Place const&)+0x1c7 [0x7f126e0604c7]

stack dump [22]  /opt/paddle_lib/paddle_inference/paddle/lib/libpaddle_inference.so : paddle::framework::NaiveExecutor::Run()+0x130 [0x7f12686795d0]

stack dump [23]  /opt/paddle_lib/paddle_inference/paddle/lib/libpaddle_inference.so : paddle::AnalysisPredictor::ZeroCopyRun()+0x293 [0x7f1268324e73]

stack dump [24]  ./XXXX : PaddleDetection::ObjectDetector::Predict(std::vector<cv::Mat, std::allocator<cv::Mat> >, double, int, int, std::vector<PaddleDetection::ObjectResult, std::allocator<PaddleDetection::ObjectResult> >*, std::vector<int, std::allocator<int> >*, std::vector<double, std::allocator<double> >*, int)+0xee4 [0x559aca875082]

@kouhinn

kouhinn commented Jul 12, 2022

@hp03 Do the clear... and shrink... calls above have to be made after every single predictor.run()? If they are called after some runs but not after others, could that cause the problems above, or is there some other cause?

@kouhinn

kouhinn commented Jul 13, 2022

+1:
***** FATAL SIGNAL RECEIVED *******
Received fatal signal: SIGABRT(6) PID: 30571

***** SIGNAL SIGABRT(6)

******* STACKDUMP *******
stack dump [1] /usr/local/lib/libg3log.so.2.1.0-0+0x1465a [0x7fa3a64e865a]
stack dump [2] /lib/x86_64-linux-gnu/libpthread.so.0+0x12980 [0x7fa3c173c980]
stack dump [3] /lib/x86_64-linux-gnu/libc.so.6gsignal+0xc7 [0x7fa3a5b80e87]
stack dump [4] /lib/x86_64-linux-gnu/libc.so.6abort+0x141 [0x7fa3a5b827f1]
stack dump [5] /usr/lib/x86_64-linux-gnu/libstdc++.so.6+0x8c957 [0x7fa3a61d7957]
stack dump [6] /usr/lib/x86_64-linux-gnu/libstdc++.so.6+0x92ae6 [0x7fa3a61ddae6]
stack dump [7] /usr/lib/x86_64-linux-gnu/libstdc++.so.6+0x92b21 [0x7fa3a61ddb21]
stack dump [8] /usr/lib/x86_64-linux-gnu/libstdc++.so.6+0x92d54 [0x7fa3a61ddd54]
stack dump [9] /opt/paddle_lib/paddle_inference/paddle/lib/libpaddle_inference.so+0x1ebe224 [0x7fa3ac59d224]

stack dump [10]  /opt/paddle_lib/paddle_inference/paddle/lib/libpaddle_inference.so : paddle::framework::NaiveExecutor::Run()+0x130 [0x7fa3acce05d0]

stack dump [11]  /opt/paddle_lib/paddle_inference/paddle/lib/libpaddle_inference.so : paddle::AnalysisPredictor::ZeroCopyRun()+0x293 [0x7fa3ac98be73]

stack dump [12]  ./xxx : doInference(paddle_infer::Predictor&, std::vector<float, std::allocator<float> > const&, std::vector<int, std::allocator<int> > const&, std::vector<float, std::allocator<float> >&)+0x10d [0x563a23f08aad]

@kouhinn

kouhinn commented Jul 14, 2022

terminate called after throwing an instance of 'paddle::platform::EnforceNotMet'
what():

Compile Traceback (most recent call last):
File "C:\Users\admin\anaconda3\envs\py37_tensorflow1_14\Scripts\x2paddle-script.py", line 33, in
sys.exit(load_entry_point('x2paddle==1.3.5', 'console_scripts', 'x2paddle')())
File "C:\Users\admin\anaconda3\envs\py37_tensorflow1_14\lib\site-packages\x2paddle-1.3.5-py3.7.egg\x2paddle\convert.py", line 373, in main
lite_model_type=args.lite_model_type)
File "C:\Users\admin\anaconda3\envs\py37_tensorflow1_14\lib\site-packages\x2paddle-1.3.5-py3.7.egg\x2paddle\convert.py", line 234, in onnx2paddle
mapper.paddle_graph.gen_model(save_dir)
File "C:\Users\admin\anaconda3\envs\py37_tensorflow1_14\lib\site-packages\x2paddle-1.3.5-py3.7.egg\x2paddle\core\program.py", line 296, in gen_model
self.dygraph2static(save_dir, input_shapes, input_types)
File "C:\Users\admin\anaconda3\envs\py37_tensorflow1_14\lib\site-packages\x2paddle-1.3.5-py3.7.egg\x2paddle\core\program.py", line 580, in dygraph2static
osp.join(save_dir, "inference_model/model"))
File "", line 2, in save

File "C:\Users\admin\anaconda3\envs\py37_tensorflow1_14\lib\site-packages\paddle\fluid\wrapped_decorator.py", line 25, in __impl__
  return wrapped_func(*args, **kwargs)
File "C:\Users\admin\anaconda3\envs\py37_tensorflow1_14\lib\site-packages\paddle\fluid\dygraph\base.py", line 51, in __impl__
  return func(*args, **kwargs)
File "C:\Users\admin\anaconda3\envs\py37_tensorflow1_14\lib\site-packages\paddle\fluid\dygraph\jit.py", line 744, in save
  inner_input_spec)
File "C:\Users\admin\anaconda3\envs\py37_tensorflow1_14\lib\site-packages\paddle\fluid\dygraph\dygraph_to_static\program_translator.py", line 517, in concrete_program_specify_input_spec
  *desired_input_spec)
File "C:\Users\admin\anaconda3\envs\py37_tensorflow1_14\lib\site-packages\paddle\fluid\dygraph\dygraph_to_static\program_translator.py", line 427, in get_concrete_program
  concrete_program, partial_program_layer = self._program_cache[cache_key]
File "C:\Users\admin\anaconda3\envs\py37_tensorflow1_14\lib\site-packages\paddle\fluid\dygraph\dygraph_to_static\program_translator.py", line 723, in __getitem__
  self._caches[item] = self._build_once(item)
File "C:\Users\admin\anaconda3\envs\py37_tensorflow1_14\lib\site-packages\paddle\fluid\dygraph\dygraph_to_static\program_translator.py", line 714, in _build_once
  **cache_key.kwargs)
File "<decorator-gen-99>", line 2, in from_func_spec
  
File "C:\Users\admin\anaconda3\envs\py37_tensorflow1_14\lib\site-packages\paddle\fluid\wrapped_decorator.py", line 25, in __impl__
  return wrapped_func(*args, **kwargs)
File "C:\Users\admin\anaconda3\envs\py37_tensorflow1_14\lib\site-packages\paddle\fluid\dygraph\base.py", line 51, in __impl__
  return func(*args, **kwargs)
File "C:\Users\admin\anaconda3\envs\py37_tensorflow1_14\lib\site-packages\paddle\fluid\dygraph\dygraph_to_static\program_translator.py", line 662, in from_func_spec
  outputs = static_func(*inputs)
File "personbasemodelonnx2paddle\x2paddle_code.py", line 315, in forward
  x2paddle_convolution_output96 = self.conv1(x2paddle_convolution_output96_paded)
File "C:\Users\admin\anaconda3\envs\py37_tensorflow1_14\lib\site-packages\paddle\fluid\dygraph\layers.py", line 917, in __call__
  return self._dygraph_call_func(*inputs, **kwargs)
File "C:\Users\admin\anaconda3\envs\py37_tensorflow1_14\lib\site-packages\paddle\fluid\dygraph\layers.py", line 907, in _dygraph_call_func
  outputs = self.forward(*inputs, **kwargs)
File "C:\Users\admin\anaconda3\envs\py37_tensorflow1_14\lib\site-packages\paddle\nn\layer\conv.py", line 677, in forward
  use_cudnn=self._use_cudnn)
File "C:\Users\admin\anaconda3\envs\py37_tensorflow1_14\lib\site-packages\paddle\nn\functional\conv.py", line 148, in _conv_nd
  type=op_type, inputs=inputs, outputs=outputs, attrs=attrs)
File "C:\Users\admin\anaconda3\envs\py37_tensorflow1_14\lib\site-packages\paddle\fluid\layer_helper.py", line 43, in append_op
  return self.main_program.current_block().append_op(*args, **kwargs)
File "C:\Users\admin\anaconda3\envs\py37_tensorflow1_14\lib\site-packages\paddle\fluid\framework.py", line 3184, in append_op
  attrs=kwargs.get("attrs", None))
File "C:\Users\admin\anaconda3\envs\py37_tensorflow1_14\lib\site-packages\paddle\fluid\framework.py", line 2224, in __init__
  for frame in traceback.extract_stack():

C++ Traceback (most recent call last):

0 paddle::AnalysisPredictor::ZeroCopyRun()
1 paddle::framework::NaiveExecutor::Run()
2 paddle::framework::OperatorBase::Run(paddle::framework::Scope const&, paddle::platform::Place const&)
3 paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, paddle::platform::Place const&) const
4 paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, paddle::platform::Place const&, paddle::framework::RuntimeContext*) const
5 std::_Function_handler<void (paddle::framework::ExecutionContext const&), paddle::framework::OpKernelRegistrarFunctor<paddle::platform::CUDAPlace, false, 0ul, paddle::operators::CUDNNConvFusionOpKernel, paddle::operators::CUDNNConvFusionOpKernel >::operator()(char const*, char const*, int) const::{lambda(paddle::framework::ExecutionContext const&)#1}>::_M_invoke(std::_Any_data const&, paddle::framework::ExecutionContext const&)
6 paddle::operators::CUDNNConvFusionOpKernel::Compute(paddle::framework::ExecutionContext const&) const
7 void paddle::platform::CudnnWorkspaceHandle::RunFunc<paddle::operators::CUDNNConvFusionOpKernel::Compute(paddle::framework::ExecutionContext const&) const::{lambda(void*)#2}&>(paddle::operators::CUDNNConvFusionOpKernel::Compute(paddle::framework::ExecutionContext const&) const::{lambda(void*)#2}&, unsigned long)
8 paddle::platform::EnforceNotMet::EnforceNotMet(paddle::platform::ErrorSummary const&, char const*, int)
9 paddle::platform::GetCurrentTraceBackStringabi:cxx11


Error Message Summary:

ExternalError: CUDNN error(8), CUDNN_STATUS_EXECUTION_FAILED.
[Hint: Please search for the error code(8) on website (https://docs.nvidia.com/deeplearning/cudnn/api/index.html#cudnnStatus_t) to get Nvidia's official solution and advice about CUDNN Error.] (at /home/xiangbin_train_workspace/PaddlePaddleWorkspace/Paddle_2.2.2/Paddle/paddle/fluid/operators/fused/conv_fusion_op.cu:381)
[operator < conv2d_fusion > error]
2022/07/13 08:15:47 454804

@anexplore

Any progress on this?

@anexplore

Any progress on this?

For the memory problem, my approach: 1. limit the input size, i.e. crop/downscale the data to a maximum size, e.g. resize all images to 128*128; 2. limit the batch size.
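A minimal sketch of this workaround (bounding peak memory by bounding input resolution and batch size); the 128*128 target and the batch size of 4 are just example values:

```python
import cv2
import numpy as np

MAX_SIDE = 128   # resize every image to MAX_SIDE x MAX_SIDE, as suggested above
MAX_BATCH = 4    # example cap on the batch size

def make_batches(images):
    """Resize images and split them into fixed-size NCHW batches."""
    resized = [cv2.resize(img, (MAX_SIDE, MAX_SIDE)) for img in images]
    data = np.stack(resized).astype("float32").transpose(0, 3, 1, 2)  # NHWC -> NCHW
    return [data[i:i + MAX_BATCH] for i in range(0, len(data), MAX_BATCH)]
```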

@BabyBoy-Yuan

Folks above, how did you solve this?

@Lanme

Lanme commented Dec 5, 2022

(quoting @anexplore's workaround above)

What the issue means is that memory is never released as inference keeps running; compressing the images only increases how many images the model can process, doesn't it? That doesn't seem to solve the memory problem.

@anexplore

(quoting the exchange above)

In my case I just keep GPU memory capped at an upper bound; whether it is released doesn't matter, because inference runs continuously.

@Lanme

Lanme commented Dec 5, 2022

(quoting the exchange above)

Got it. I thought GPU memory would keep increasing with every inference; I just tested it, and it only grows until it reaches the upper bound and then stays there.

@leiqing1

leiqing1 commented Jan 12, 2023

@Tian14267 @Lanme @2742195759 @anexplore Hi everyone, I'm Lei Qing, a product manager on the Paddle team. Has this problem been resolved for you?

Note: this issue is actually about GPU memory; the screenshots all show GPU memory figures.

[Releasing GPU memory] GPU memory is normally released only when there are no further inference tasks (i.e. the predictor will not be used after Predict). If inference will continue, releasing GPU memory frequently would cause the model to be reloaded into GPU memory over and over.
If this problem is still unresolved for you, you can add me on WeChat (18813190139) and we will work on it as a dedicated effort.

[Setting a fixed amount of GPU memory] Paddle currently has no way to manually set the GPU memory budget. If you need this, feel free to file a feature request in the FastDeploy repo:
https://github.com/PaddlePaddle/FastDeploy/issues

[Other deployment needs] If you have other deployment requirements, you are also welcome to file them in the FastDeploy repo at any time:
https://github.com/PaddlePaddle/FastDeploy/issues

@git3210

git3210 commented Mar 21, 2023

Why does GPU memory usage keep increasing during processing? Paddle is very unstable and keeps core-dumping.

@ZhangYuef

Is there any recent progress on this ISSUE? The same problem shows up with CPU inference: memory usage keeps growing as the number of inference calls on images increases.

@panp4n

panp4n commented Nov 24, 2023

For CPU inference, turning off MKLDNN acceleration solves it.
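One likely reason is that MKLDNN caches its primitives per input shape, so the cache keeps growing when input shapes vary. A hedged sketch of the two options with paddle.inference.Config (MKLDNN is off unless explicitly enabled; model paths are placeholders):

```python
from paddle.inference import Config, create_predictor

config = Config("model.pdmodel", "model.pdiparams")  # placeholder paths
config.disable_gpu()                                 # CPU inference

# Option 1: simply do not call config.enable_mkldnn().

# Option 2: keep MKLDNN but bound its per-shape primitive cache,
# which is usually what grows with variable input shapes.
# config.enable_mkldnn()
# config.set_mkldnn_cache_capacity(10)   # cache at most 10 different input shapes

predictor = create_predictor(config)
```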

@LRENAC1

LRENAC1 commented Jan 3, 2024

This problem does exist: when a larger image is predicted, new GPU memory is allocated and the previous allocation is not released. I found that adding one line here fixes it:
[screenshot of the code change]

@GloriaYY

I'm hitting a similar problem, but not during inference; GPU memory consumption keeps growing during training. Does Paddle have a statement that reports the GPU memory consumed in real time, so I can track down which step keeps eating GPU memory?
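For querying GPU memory from Python, recent Paddle releases expose counters under paddle.device.cuda; a hedged sketch (availability depends on the Paddle version):

```python
import paddle

def report(tag):
    # Memory held by live tensors vs. memory reserved by Paddle's caching allocator.
    allocated = paddle.device.cuda.memory_allocated() / 1024**2
    reserved = paddle.device.cuda.memory_reserved() / 1024**2
    peak = paddle.device.cuda.max_memory_allocated() / 1024**2
    print(f"[{tag}] allocated={allocated:.1f} MB, reserved={reserved:.1f} MB, peak={peak:.1f} MB")

report("before step")
# ... run one training step here ...
report("after step")
```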
