[Bug fix] Release global variables #3624

Merged: 2 commits into master from dev_fix_bug_release_global on Sep 29, 2020

Conversation

poohRui (Contributor) commented Sep 28, 2020

Fixes the issue where the old-version CtrlServer would not shut down on the Python side: some Global objects were not being released.

@oneflow-ci-bot oneflow-ci-bot merged commit 9966d1d into master Sep 29, 2020
@oneflow-ci-bot oneflow-ci-bot deleted the dev_fix_bug_release_global branch September 29, 2020 02:18
jackalcooper (Collaborator) commented

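Since this change was merged, strange errors have been showing up frequently. A possible cause is that, after Global<EnvGlobalObjectsScope>::Delete() is called, some other code still uses objects or memory owned by env. Until the object lifetimes have been fully sorted out (this may be related to moving the python session and similar code to cpp), could we consider not calling this delete for now? As it stands, it lowers the reliability of the system.
The errors seen include: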

free(): invalid next size (fast)
Fatal Python error: Aborted

or

#13 0x00007f99f1bef41b in oneflow::MemoryAllocatorImpl::Deallocate(void*, oneflow::MemoryCase) ()
   from /python_scripts/oneflow/_oneflow_internal.cpython-36m-x86_64-linux-gnu.so
(gdb) 
#14 0x00007f99f1befdc2 in oneflow::MemoryAllocator::Deallocate(char*, oneflow::MemoryCase) ()
   from /python_scripts/oneflow/_oneflow_internal.cpython-36m-x86_64-linux-gnu.so
(gdb) 
#15 0x00007f99f1bf03e3 in std::_Function_handler<void (), std::_Bind<void (oneflow::MemoryAllocator::*(oneflow::MemoryAllocator*, char*, oneflow::MemoryCase))(char*, oneflow::MemoryCase)> >::_M_invoke(std::_Any_data const&) () from /python_scripts/oneflow/_oneflow_internal.cpython-36m-x86_64-linux-gnu.so
(gdb) 
#16 0x00007f99f1bef608 in oneflow::MemoryAllocator::~MemoryAllocator() () from /python_scripts/oneflow/_oneflow_internal.cpython-36m-x86_64-linux-gnu.so
(gdb) 
#17 0x00007f99f1a84c6f in oneflow::Runtime::DeleteAllGlobal() () from /python_scripts/oneflow/_oneflow_internal.cpython-36m-x86_64-linux-gnu.so
(gdb) 
#18 0x00007f99f1a854be in oneflow::Runtime::~Runtime() () from /python_scripts/oneflow/_oneflow_internal.cpython-36m-x86_64-linux-gnu.so
(gdb) quit
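
The lifetime concern raised in this comment can be illustrated with a minimal sketch. Everything here is hypothetical: `Global`, `Env`, and `Allocator` are simplified stand-ins written for this example, not OneFlow's actual classes, and the real cause of the crashes above is not confirmed by this thread. The sketch only shows why teardown order matters when one global object's destructor reaches into another global object, and how a null check makes the wrong order harmless instead of a use-after-free.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a Global<T> registry (NOT OneFlow's implementation).
template <typename T>
class Global {
 public:
  // Returns the current instance, or nullptr if none exists.
  static T* Get() { return Ptr().get(); }

  template <typename... Args>
  static void New(Args&&... args) {
    Ptr() = std::make_unique<T>(std::forward<Args>(args)...);
  }

  // Destroys the instance; later Get() calls return nullptr.
  static void Delete() { Ptr().reset(); }

 private:
  static std::unique_ptr<T>& Ptr() {
    static std::unique_ptr<T> ptr;
    return ptr;
  }
};

// Hypothetical environment object that other globals depend on.
struct Env {
  std::vector<std::string> freed;
};

// Hypothetical allocator whose destructor touches Env.
struct Allocator {
  ~Allocator() {
    Env* env = Global<Env>::Get();
    // Without this check, using Env after it was deleted would be a
    // use-after-free, the kind of bug that can surface as heap corruption.
    if (env != nullptr) { env->freed.push_back("buffer"); }
  }
};

// Safe order: the dependent object goes first, so Env is still alive.
bool TeardownDependentsFirst() {
  Global<Env>::New();
  Global<Allocator>::New();
  Global<Allocator>::Delete();
  bool recorded = Global<Env>::Get()->freed.size() == 1;
  Global<Env>::Delete();
  return recorded;
}

// Risky order: Env goes first; the null check in ~Allocator avoids the crash.
bool TeardownEnvFirst() {
  Global<Env>::New();
  Global<Allocator>::New();
  Global<Env>::Delete();
  bool env_gone = (Global<Env>::Get() == nullptr);
  Global<Allocator>::Delete();
  return env_gone;
}
```

Both functions return true when called. In a real codebase the dependent objects may hold raw pointers captured earlier rather than calling `Get()` again, in which case a null check does not help and only a correct teardown order (or skipping the delete, as suggested above) avoids the problem.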

liujuncheng added a commit that referenced this pull request Jun 3, 2021
Co-authored-by: Juncheng <liujuncheng1022@gmail.com>
Former-commit-id: 9966d1d