Memory not released after Paddle inference finishes #43346

Open
Tian14267 opened this issue Jun 9, 2022 · 21 comments

@Tian14267

Please ask your question

Hi, I've run into a problem: memory is not released after a Paddle model finishes inference. Details below:

Service: the model is loaded and served via WebSocket.
Details:
1. When the model and the service are first loaded, about 1.7 GB of GPU memory is consumed (see screenshot):
[screenshot: GPU memory at ~1.7 GB]
2. After loading, GPU memory grows while speech synthesis runs. Over the course of a synthesis job it climbs to roughly 13 GB (see screenshot), and once the job finishes this memory is never released, i.e. it never drops from 13 GB back to 1.7 GB.
[screenshot: GPU memory at ~13 GB]
Does Paddle have a memory-release mechanism that can free this extra memory after a job finishes, without shutting down the service?

https://github.com/PaddlePaddle/PaddleSpeech/issues/2024

@paddle-bot-old

paddle-bot-old bot commented Jun 9, 2022

Hi! We've received your issue and will arrange for technicians to answer it as soon as possible; please be patient. Please double-check that you have provided a clear problem description, reproduction code, environment & version, and error messages. You can also look for answers in the official API docs, the FAQ, historical issues, and the AI community. Have a nice day!

@hp03
Contributor

hp03 commented Jun 9, 2022

[screenshot of suggested code — later replies indicate it showed calling clear_intermediate_tensor() and try_shrink_memory() after predictor.run()]

You can give this a try.

If the predictor will still be used afterwards, memory is not actively released after predictor.run(), because subsequent inference will reuse it; malloc-ing on every run and freeing afterwards would hurt performance.

If the predictor is no longer needed, in C++ you can simply delete it, which releases the memory it holds.

In addition, the framework keeps some static variables that, once initialized, are only released when the process exits, but these do not amount to that much memory; the problem you are seeing is unrelated to them.
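For reference, a minimal sketch of how these calls look with the Paddle Inference Python API (the screenshot itself isn't legible here; the method names clear_intermediate_tensor() and try_shrink_memory() are taken from the follow-up replies, and the model paths and input shape are placeholders):

```python
import numpy as np
from paddle.inference import Config, create_predictor

config = Config("model.pdmodel", "model.pdiparams")  # placeholder model files
config.enable_use_gpu(200, 0)                        # 200 MB initial GPU memory pool on GPU 0
predictor = create_predictor(config)

input_handle = predictor.get_input_handle(predictor.get_input_names()[0])
input_handle.copy_from_cpu(np.random.rand(1, 3, 224, 224).astype("float32"))  # placeholder input
predictor.run()

output_handle = predictor.get_output_handle(predictor.get_output_names()[0])
result = output_handle.copy_to_cpu()

# Only worth doing if this predictor is idle or will not be reused soon:
predictor.clear_intermediate_tensor()  # drop intermediate tensors of the last run
predictor.try_shrink_memory()          # give cached memory blocks back to the allocator
```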

@Tian14267
Author

(quoting @hp03's reply above)

I tried those two calls, but I get an error:

  File "inference_fastspeech2_pwgan_onnx.py", line 202, in inference
    self.model.clear_intermediate_tensor()
  File "/root/anaconda3/envs/paddlespeech/lib/python3.7/site-packages/paddle/fluid/dygraph/layers.py", line 1110, in __getattr__
    return object.__getattribute__(self, name)
AttributeError: 'DurationFastSpeech2' object has no attribute 'clear_intermediate_tensor'

Here 'DurationFastSpeech2' is a custom network class; no 'predictor' is used at all.
Is there any other way to release the GPU cache produced by inference?
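For context, clear_intermediate_tensor() and try_shrink_memory() are methods of the paddle.inference predictor, not of dygraph nn.Layer models such as DurationFastSpeech2, which is why the AttributeError appears. For a dygraph model, a hedged sketch of one option in recent Paddle releases is to drop the GPU references and then call paddle.device.cuda.empty_cache(); note it only frees cached blocks that no live tensor still uses ('am', 'voc' and 'text_ids' below are placeholders):

```python
import paddle

# 'am', 'voc' and 'text_ids' stand in for the TTS models and their input.
with paddle.no_grad():
    mel = am(text_ids)
    wav = voc(mel)
    wav_np = wav.numpy()          # copy the result to host memory

del mel, wav                      # drop the GPU references held by Python
paddle.device.cuda.empty_cache()  # release cached, unreferenced GPU memory back to the device
```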

@Tian14267
Author

Also, both TensorFlow and PyTorch have Python-level options to limit GPU memory usage, i.e. to cap how much GPU memory is used. Does Paddle have similar Python code or a similar setting?
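For reference, Paddle's GPU memory behaviour is mostly controlled through environment flags rather than a single Python call; a hedged sketch of the commonly used knobs (the values are only examples):

```python
import os

# Must be set before paddle is imported / the predictor is created.
os.environ["FLAGS_allocator_strategy"] = "auto_growth"      # allocate on demand instead of up front
os.environ["FLAGS_fraction_of_gpu_memory_to_use"] = "0.3"   # fraction of free GPU memory per allocation chunk

from paddle.inference import Config

config = Config("model.pdmodel", "model.pdiparams")  # placeholder paths
# For the inference API, this only sets the INITIAL memory pool size in MB;
# it is not a hard cap on total GPU memory usage.
config.enable_use_gpu(500, 0)
```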

@kouhinn

kouhinn commented Jul 11, 2022

(quoting @hp03's reply above)

Same problem here. After calling the clear... and shrink... methods above following every predictor.run(), GPU memory usage is indeed kept under control, but memory / GPU-memory related errors show up from time to time.
The stack trace is:
terminate called after throwing an instance of 'paddle::platform::EnforceNotMet'
what():


C++ Traceback (most recent call last):

0 paddle::AnalysisPredictor::ZeroCopyRun()
1 paddle::framework::NaiveExecutor::Run()
2 paddle::framework::OperatorBase::Run(paddle::framework::Scope const&, paddle::platform::Place const&)
3 paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, paddle::platform::Place const&) const
4 paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, paddle::platform::Place const&, paddle::framework::RuntimeContext*) const
5 std::_Function_handler<void (paddle::framework::ExecutionContext const&), paddle::framework::OpKernelRegistrarFunctor<paddle::platform::CUDAPlace, false, 5ul, paddle::operators::ShapeKernel, paddle::operators::ShapeKernel, paddle::operators::ShapeKernel, paddle::operators::ShapeKernel, paddle::operators::ShapeKernel, paddle::operators::ShapeKernel, paddle::operators::ShapeKernel, paddle::operators::ShapeKernelpaddle::platform::float16, paddle::operators::ShapeKernel<paddle::platform::complex >, paddle::operators::ShapeKernel<paddle::platform::complex > >::operator()(char const*, char const*, int) const::{lambda(paddle::framework::ExecutionContext const&)#1}>::_M_invoke(std::_Any_data const&, paddle::framework::ExecutionContext const&)
6 paddle::framework::Tensor::mutable_data(paddle::platform::Place const&, paddle::framework::proto::VarType_Type, unsigned long)
7 std::_Sp_counted_deleter<paddle::memory::allocation::Allocation*, paddle::memory::allocation::Allocator::AllocationDeleter, std::allocator, (__gnu_cxx::_Lock_policy)2>::_M_dispose()
8 paddle::memory::allocation::RetryAllocator::FreeImpl(paddle::memory::allocation::Allocation*)
9 paddle::memory::allocation::NaiveBestFitAllocator::FreeImpl(paddle::memory::allocation::Allocation*)
10 paddle::memory::detail::BuddyAllocator::Free(void*)
11 paddle::memory::detail::MetadataCache::LoadDesc(paddle::memory::detail::MemoryBlock*)
12 paddle::platform::EnforceNotMet::EnforceNotMet(paddle::platform::ErrorSummary const&, char const*, int)
13 paddle::platform::GetCurrentTraceBackStringabi:cxx11


Error Message Summary:

NotFoundError: The memory block is not found in cache
[Hint: Expected iter != cache_.end(), but received iter == cache_.end().] (at ......../Paddle_2.2.2/Paddle/paddle/fluid/memory/detail/meta_cache.cc:30)

@kouhinn

kouhinn commented Jul 12, 2022

Running for about 24 hours reproduces it almost every time. Sometimes the error is the following instead:
***** FATAL SIGNAL RECEIVED *******
Received fatal signal: SIGABRT(6) PID: 18711

***** SIGNAL SIGABRT(6)

******* STACKDUMP *******
stack dump [1] /usr/local/lib/libg3log.so.2.1.0-0+0x1465a [0x7f1261e8165a]
stack dump [2] /lib/x86_64-linux-gnu/libpthread.so.0+0x12980 [0x7f127d0d5980]
stack dump [3] /lib/x86_64-linux-gnu/libc.so.6gsignal+0xc7 [0x7f1261519e87]
stack dump [4] /lib/x86_64-linux-gnu/libc.so.6abort+0x141 [0x7f126151b7f1]
stack dump [5] /usr/lib/x86_64-linux-gnu/libstdc++.so.6+0x8c957 [0x7f1261b70957]
stack dump [6] /usr/lib/x86_64-linux-gnu/libstdc++.so.6+0x92ae6 [0x7f1261b76ae6]
stack dump [7] /usr/lib/x86_64-linux-gnu/libstdc++.so.6+0x91b49 [0x7f1261b75b49]
stack dump [8] /usr/lib/x86_64-linux-gnu/libstdc++.so.6__gxx_personality_v0+0x2a8 [0x7f1261b764b8]
stack dump [9] /lib/x86_64-linux-gnu/libgcc_s.so.1+0x10573 [0x7f12618dc573]
stack dump [10] /lib/x86_64-linux-gnu/libgcc_s.so.1_Unwind_Resume+0x125 [0x7f12618dcdf5]
stack dump [11] /opt/paddle_lib/paddle_inference/paddle/lib/libpaddle_inference.so+0x1ecc378 [0x7f1267f44378]

stack dump [12]  /opt/paddle_lib/paddle_inference/paddle/lib/libpaddle_inference.so : paddle::memory::allocation::NaiveBestFitAllocator::FreeImpl(paddle::memory::allocation::Allocation*)+0xc5 [0x7f126e181c95]

stack dump [13]  /opt/paddle_lib/paddle_inference/paddle/lib/libpaddle_inference.so : paddle::memory::allocation::RetryAllocator::FreeImpl(paddle::memory::allocation::Allocation*)+0x41 [0x7f126e194e31]

stack dump [14]  /opt/paddle_lib/paddle_inference/paddle/lib/libpaddle_inference.so : std::_Sp_counted_deleter<paddle::memory::allocation::Allocation*, paddle::memory::allocation::Allocator::AllocationDeleter, std::allocator<void>, (__gnu_cxx::_Lock_policy)2>::_M_dispose()+0x25 [0x7f1268f4ff85]
stack dump [15]  /opt/paddle_lib/paddle_inference/paddle/lib/libpaddle_inference.so+0x227e757 [0x7f12682f6757]

stack dump [16]  /opt/paddle_lib/paddle_inference/paddle/lib/libpaddle_inference.so : paddle::framework::Tensor::mutable_data(paddle::platform::Place const&, paddle::framework::proto::VarType_Type, unsigned long)+0xc5 [0x7f1268608fa5]

stack dump [17]  /opt/paddle_lib/paddle_inference/paddle/lib/libpaddle_inference.so : int paddle::operators::PackTensorsIntoVector<float>(paddle::framework::ExecutionContext const&, std::vector<paddle::framework::Tensor const*, std::allocator<paddle::framework::Tensor const*> >*, std::vector<paddle::framework::Tensor*, std::allocator<paddle::framework::Tensor*> >*, paddle::framework::Tensor*)+0x1dc [0x7f1268fd3bac]

stack dump [18]  /opt/paddle_lib/paddle_inference/paddle/lib/libpaddle_inference.so : std::_Function_handler<void (paddle::framework::ExecutionContext const&), paddle::framework::OpKernelRegistrarFunctor<paddle::platform::CUDAPlace, false, 0ul, paddle::operators::ElementwiseAddKernel<paddle::platform::CUDADeviceContext, float>, paddle::operators::ElementwiseAddKernel<paddle::platform::CUDADeviceContext, double>, paddle::operators::ElementwiseAddKernel<paddle::platform::CUDADeviceContext, int>, paddle::operators::ElementwiseAddKernel<paddle::platform::CUDADeviceContext, long>, paddle::operators::ElementwiseAddKernel<paddle::platform::CUDADeviceContext, paddle::platform::float16>, paddle::operators::ElementwiseAddKernel<paddle::platform::CUDADeviceContext, paddle::platform::complex<float> >, paddle::operators::ElementwiseAddKernel<paddle::platform::CUDADeviceContext, paddle::platform::complex<double> > >::operator()(char const*, char const*, int) const::{lambda(paddle::framework::ExecutionContext const&)#1}>::_M_invoke(std::_Any_data const&, paddle::framework::ExecutionContext const&)+0x5b [0x7f12690a9aeb]

stack dump [19]  /opt/paddle_lib/paddle_inference/paddle/lib/libpaddle_inference.so : paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, paddle::platform::Place const&, paddle::framework::RuntimeContext*) const+0x312 [0x7f126e063dd2]

stack dump [20]  /opt/paddle_lib/paddle_inference/paddle/lib/libpaddle_inference.so : paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, paddle::platform::Place const&) const+0x148 [0x7f126e064628]

stack dump [21]  /opt/paddle_lib/paddle_inference/paddle/lib/libpaddle_inference.so : paddle::framework::OperatorBase::Run(paddle::framework::Scope const&, paddle::platform::Place const&)+0x1c7 [0x7f126e0604c7]

stack dump [22]  /opt/paddle_lib/paddle_inference/paddle/lib/libpaddle_inference.so : paddle::framework::NaiveExecutor::Run()+0x130 [0x7f12686795d0]

stack dump [23]  /opt/paddle_lib/paddle_inference/paddle/lib/libpaddle_inference.so : paddle::AnalysisPredictor::ZeroCopyRun()+0x293 [0x7f1268324e73]

stack dump [24]  ./XXXX : PaddleDetection::ObjectDetector::Predict(std::vector<cv::Mat, std::allocator<cv::Mat> >, double, int, int, std::vector<PaddleDetection::ObjectResult, std::allocator<PaddleDetection::ObjectResult> >*, std::vector<int, std::allocator<int> >*, std::vector<double, std::allocator<double> >*, int)+0xee4 [0x559aca875082]

@kouhinn

kouhinn commented Jul 12, 2022

@hp03 Do the clear... and shrink... calls above have to be made after every single predictor.run()? If they are called after some runs but not after others, could that cause the problems above, or is there some other cause?

@kouhinn

kouhinn commented Jul 13, 2022

+1:
***** FATAL SIGNAL RECEIVED *******
Received fatal signal: SIGABRT(6) PID: 30571

***** SIGNAL SIGABRT(6)

******* STACKDUMP *******
stack dump [1] /usr/local/lib/libg3log.so.2.1.0-0+0x1465a [0x7fa3a64e865a]
stack dump [2] /lib/x86_64-linux-gnu/libpthread.so.0+0x12980 [0x7fa3c173c980]
stack dump [3] /lib/x86_64-linux-gnu/libc.so.6gsignal+0xc7 [0x7fa3a5b80e87]
stack dump [4] /lib/x86_64-linux-gnu/libc.so.6abort+0x141 [0x7fa3a5b827f1]
stack dump [5] /usr/lib/x86_64-linux-gnu/libstdc++.so.6+0x8c957 [0x7fa3a61d7957]
stack dump [6] /usr/lib/x86_64-linux-gnu/libstdc++.so.6+0x92ae6 [0x7fa3a61ddae6]
stack dump [7] /usr/lib/x86_64-linux-gnu/libstdc++.so.6+0x92b21 [0x7fa3a61ddb21]
stack dump [8] /usr/lib/x86_64-linux-gnu/libstdc++.so.6+0x92d54 [0x7fa3a61ddd54]
stack dump [9] /opt/paddle_lib/paddle_inference/paddle/lib/libpaddle_inference.so+0x1ebe224 [0x7fa3ac59d224]

stack dump [10]  /opt/paddle_lib/paddle_inference/paddle/lib/libpaddle_inference.so : paddle::framework::NaiveExecutor::Run()+0x130 [0x7fa3acce05d0]

stack dump [11]  /opt/paddle_lib/paddle_inference/paddle/lib/libpaddle_inference.so : paddle::AnalysisPredictor::ZeroCopyRun()+0x293 [0x7fa3ac98be73]

stack dump [12]  ./xxx : doInference(paddle_infer::Predictor&, std::vector<float, std::allocator<float> > const&, std::vector<int, std::allocator<int> > const&, std::vector<float, std::allocator<float> >&)+0x10d [0x563a23f08aad]

@kouhinn

kouhinn commented Jul 14, 2022

terminate called after throwing an instance of 'paddle::platform::EnforceNotMet'
what():

Compile Traceback (most recent call last):
File "C:\Users\admin\anaconda3\envs\py37_tensorflow1_14\Scripts\x2paddle-script.py", line 33, in
sys.exit(load_entry_point('x2paddle==1.3.5', 'console_scripts', 'x2paddle')())
File "C:\Users\admin\anaconda3\envs\py37_tensorflow1_14\lib\site-packages\x2paddle-1.3.5-py3.7.egg\x2paddle\convert.py", line 373, in main
lite_model_type=args.lite_model_type)
File "C:\Users\admin\anaconda3\envs\py37_tensorflow1_14\lib\site-packages\x2paddle-1.3.5-py3.7.egg\x2paddle\convert.py", line 234, in onnx2paddle
mapper.paddle_graph.gen_model(save_dir)
File "C:\Users\admin\anaconda3\envs\py37_tensorflow1_14\lib\site-packages\x2paddle-1.3.5-py3.7.egg\x2paddle\core\program.py", line 296, in gen_model
self.dygraph2static(save_dir, input_shapes, input_types)
File "C:\Users\admin\anaconda3\envs\py37_tensorflow1_14\lib\site-packages\x2paddle-1.3.5-py3.7.egg\x2paddle\core\program.py", line 580, in dygraph2static
osp.join(save_dir, "inference_model/model"))
File "", line 2, in save

File "C:\Users\admin\anaconda3\envs\py37_tensorflow1_14\lib\site-packages\paddle\fluid\wrapped_decorator.py", line 25, in __impl__
  return wrapped_func(*args, **kwargs)
File "C:\Users\admin\anaconda3\envs\py37_tensorflow1_14\lib\site-packages\paddle\fluid\dygraph\base.py", line 51, in __impl__
  return func(*args, **kwargs)
File "C:\Users\admin\anaconda3\envs\py37_tensorflow1_14\lib\site-packages\paddle\fluid\dygraph\jit.py", line 744, in save
  inner_input_spec)
File "C:\Users\admin\anaconda3\envs\py37_tensorflow1_14\lib\site-packages\paddle\fluid\dygraph\dygraph_to_static\program_translator.py", line 517, in concrete_program_specify_input_spec
  *desired_input_spec)
File "C:\Users\admin\anaconda3\envs\py37_tensorflow1_14\lib\site-packages\paddle\fluid\dygraph\dygraph_to_static\program_translator.py", line 427, in get_concrete_program
  concrete_program, partial_program_layer = self._program_cache[cache_key]
File "C:\Users\admin\anaconda3\envs\py37_tensorflow1_14\lib\site-packages\paddle\fluid\dygraph\dygraph_to_static\program_translator.py", line 723, in __getitem__
  self._caches[item] = self._build_once(item)
File "C:\Users\admin\anaconda3\envs\py37_tensorflow1_14\lib\site-packages\paddle\fluid\dygraph\dygraph_to_static\program_translator.py", line 714, in _build_once
  **cache_key.kwargs)
File "<decorator-gen-99>", line 2, in from_func_spec
  
File "C:\Users\admin\anaconda3\envs\py37_tensorflow1_14\lib\site-packages\paddle\fluid\wrapped_decorator.py", line 25, in __impl__
  return wrapped_func(*args, **kwargs)
File "C:\Users\admin\anaconda3\envs\py37_tensorflow1_14\lib\site-packages\paddle\fluid\dygraph\base.py", line 51, in __impl__
  return func(*args, **kwargs)
File "C:\Users\admin\anaconda3\envs\py37_tensorflow1_14\lib\site-packages\paddle\fluid\dygraph\dygraph_to_static\program_translator.py", line 662, in from_func_spec
  outputs = static_func(*inputs)
File "personbasemodelonnx2paddle\x2paddle_code.py", line 315, in forward
  x2paddle_convolution_output96 = self.conv1(x2paddle_convolution_output96_paded)
File "C:\Users\admin\anaconda3\envs\py37_tensorflow1_14\lib\site-packages\paddle\fluid\dygraph\layers.py", line 917, in __call__
  return self._dygraph_call_func(*inputs, **kwargs)
File "C:\Users\admin\anaconda3\envs\py37_tensorflow1_14\lib\site-packages\paddle\fluid\dygraph\layers.py", line 907, in _dygraph_call_func
  outputs = self.forward(*inputs, **kwargs)
File "C:\Users\admin\anaconda3\envs\py37_tensorflow1_14\lib\site-packages\paddle\nn\layer\conv.py", line 677, in forward
  use_cudnn=self._use_cudnn)
File "C:\Users\admin\anaconda3\envs\py37_tensorflow1_14\lib\site-packages\paddle\nn\functional\conv.py", line 148, in _conv_nd
  type=op_type, inputs=inputs, outputs=outputs, attrs=attrs)
File "C:\Users\admin\anaconda3\envs\py37_tensorflow1_14\lib\site-packages\paddle\fluid\layer_helper.py", line 43, in append_op
  return self.main_program.current_block().append_op(*args, **kwargs)
File "C:\Users\admin\anaconda3\envs\py37_tensorflow1_14\lib\site-packages\paddle\fluid\framework.py", line 3184, in append_op
  attrs=kwargs.get("attrs", None))
File "C:\Users\admin\anaconda3\envs\py37_tensorflow1_14\lib\site-packages\paddle\fluid\framework.py", line 2224, in __init__
  for frame in traceback.extract_stack():

C++ Traceback (most recent call last):

0 paddle::AnalysisPredictor::ZeroCopyRun()
1 paddle::framework::NaiveExecutor::Run()
2 paddle::framework::OperatorBase::Run(paddle::framework::Scope const&, paddle::platform::Place const&)
3 paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, paddle::platform::Place const&) const
4 paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, paddle::platform::Place const&, paddle::framework::RuntimeContext*) const
5 std::_Function_handler<void (paddle::framework::ExecutionContext const&), paddle::framework::OpKernelRegistrarFunctor<paddle::platform::CUDAPlace, false, 0ul, paddle::operators::CUDNNConvFusionOpKernel, paddle::operators::CUDNNConvFusionOpKernel >::operator()(char const*, char const*, int) const::{lambda(paddle::framework::ExecutionContext const&)#1}>::_M_invoke(std::_Any_data const&, paddle::framework::ExecutionContext const&)
6 paddle::operators::CUDNNConvFusionOpKernel::Compute(paddle::framework::ExecutionContext const&) const
7 void paddle::platform::CudnnWorkspaceHandle::RunFunc<paddle::operators::CUDNNConvFusionOpKernel::Compute(paddle::framework::ExecutionContext const&) const::{lambda(void*)#2}&>(paddle::operators::CUDNNConvFusionOpKernel::Compute(paddle::framework::ExecutionContext const&) const::{lambda(void*)#2}&, unsigned long)
8 paddle::platform::EnforceNotMet::EnforceNotMet(paddle::platform::ErrorSummary const&, char const*, int)
9 paddle::platform::GetCurrentTraceBackStringabi:cxx11


Error Message Summary:

ExternalError: CUDNN error(8), CUDNN_STATUS_EXECUTION_FAILED.
[Hint: Please search for the error code(8) on website (https://docs.nvidia.com/deeplearning/cudnn/api/index.html#cudnnStatus_t) to get Nvidia's official solution and advice about CUDNN Error.] (at /home/xiangbin_train_workspace/PaddlePaddleWorkspace/Paddle_2.2.2/Paddle/paddle/fluid/operators/fused/conv_fusion_op.cu:381)
[operator < conv2d_fusion > error]
2022/07/13 08:15:47 454804

@anexplore

Any progress on this?

@anexplore

Any progress on this?

For the memory problem, my approach: 1. limit the input size, i.e. crop/downscale the data to a maximum size, e.g. resize all images to 128*128; 2. limit the batch size.
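A minimal sketch of this workaround (bounding peak memory by bounding input resolution and batch size); the 128*128 target and the batch size of 4 are just example values:

```python
import cv2
import numpy as np

MAX_SIDE = 128   # resize every image to MAX_SIDE x MAX_SIDE, as suggested above
MAX_BATCH = 4    # example cap on the batch size

def make_batches(images):
    """Resize images and split them into fixed-size NCHW batches."""
    resized = [cv2.resize(img, (MAX_SIDE, MAX_SIDE)) for img in images]
    data = np.stack(resized).astype("float32").transpose(0, 3, 1, 2)  # NHWC -> NCHW
    return [data[i:i + MAX_BATCH] for i in range(0, len(data), MAX_BATCH)]
```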

@BabyBoy-Yuan

Folks above, how did you solve this?

@Lanme

Lanme commented Dec 5, 2022

(quoting @anexplore's workaround above)

What the issue means is that memory is never released as inference keeps running; compressing the images only increases how many images the model can process, doesn't it? That doesn't seem to solve the memory problem.

@anexplore

(quoting the exchange above)

In my case I just keep GPU memory capped at an upper bound; whether it is released doesn't matter, because inference runs continuously.

@Lanme

Lanme commented Dec 5, 2022

(quoting the exchange above)

Got it. I thought GPU memory would keep increasing with every inference; I just tested it, and it only grows until it reaches the upper bound and then stays there.

@leiqing1

leiqing1 commented Jan 12, 2023

@Tian14267 @Lanme @2742195759 @anexplore Hi everyone, I'm Lei Qing, a product manager on the Paddle team. Has this problem been resolved for you?

Note: this issue is actually about GPU memory; the screenshots all show GPU memory figures.

[Releasing GPU memory] GPU memory is normally released only when there are no further inference tasks (i.e. the predictor will not be used after Predict). If inference will continue, releasing GPU memory frequently would cause the model to be reloaded into GPU memory over and over.
If this problem is still unresolved for you, you can add me on WeChat (18813190139) and we will work on it as a dedicated effort.

[Setting a fixed amount of GPU memory] Paddle currently has no way to manually set the GPU memory budget. If you need this, feel free to file a feature request in the FastDeploy repo:
https://github.com/PaddlePaddle/FastDeploy/issues

[Other deployment needs] If you have other deployment requirements, you are also welcome to file them in the FastDeploy repo at any time:
https://github.com/PaddlePaddle/FastDeploy/issues

@git3210

git3210 commented Mar 21, 2023

Why does GPU memory usage keep increasing during processing? Paddle is very unstable and keeps core-dumping.

@ZhangYuef

Is there any recent progress on this ISSUE? The same problem shows up with CPU inference: memory usage keeps growing as the number of inference calls on images increases.

@panp4n

panp4n commented Nov 24, 2023

For CPU inference, turning off MKLDNN acceleration solves it.
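One likely reason is that MKLDNN caches its primitives per input shape, so the cache keeps growing when input shapes vary. A hedged sketch of the two options with paddle.inference.Config (MKLDNN is off unless explicitly enabled; model paths are placeholders):

```python
from paddle.inference import Config, create_predictor

config = Config("model.pdmodel", "model.pdiparams")  # placeholder paths
config.disable_gpu()                                 # CPU inference

# Option 1: simply do not call config.enable_mkldnn().

# Option 2: keep MKLDNN but bound its per-shape primitive cache,
# which is usually what grows with variable input shapes.
# config.enable_mkldnn()
# config.set_mkldnn_cache_capacity(10)   # cache at most 10 different input shapes

predictor = create_predictor(config)
```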

@LRENAC1

LRENAC1 commented Jan 3, 2024

This problem does exist: when a larger image is predicted, new GPU memory is allocated and the previous allocation is not released. I found that adding one line here fixes it:
[screenshot of the code change]

@GloriaYY

I'm hitting a similar problem, but not during inference; GPU memory consumption keeps growing during training. Does Paddle have a statement that reports the GPU memory consumed in real time, so I can track down which step keeps eating GPU memory?
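For querying GPU memory from Python, recent Paddle releases expose counters under paddle.device.cuda; a hedged sketch (availability depends on the Paddle version):

```python
import paddle

def report(tag):
    # Memory held by live tensors vs. memory reserved by Paddle's caching allocator.
    allocated = paddle.device.cuda.memory_allocated() / 1024**2
    reserved = paddle.device.cuda.memory_reserved() / 1024**2
    peak = paddle.device.cuda.max_memory_allocated() / 1024**2
    print(f"[{tag}] allocated={allocated:.1f} MB, reserved={reserved:.1f} MB, peak={peak:.1f} MB")

report("before step")
# ... run one training step here ...
report("after step")
```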
