-
Notifications
You must be signed in to change notification settings - Fork 901
Fix singleton tmp files cleanup #13261
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Only in singleton mode, directory cleaning needs to be done by the program itself. There are some problems with these parts of the code that cause the directory to not be cleaned. This commit fixes these issues. Signed-off-by: xbw <78337767+xbw22109@users.noreply.github.com>
The btl cleanup looks good; looks like that was a miss that we never noticed because pmix cleaned it up for us. I'm a little dubious of the session directory cleanup:
|
Yes - you remove proc, then job, then session (i.e., work your way upwards)
Can't say - however, PMIx and PRRTE always set that flag to "true". I suppose that in this one special case (a singleton), you could technically just call dirpath_destroy on the top-level session directory with recursive set to "true" since no other job can share the tree. 🤷♂️
The top session directory's name includes the pid of the singleton process - this is necessary to ensure that some other process doesn't attempt to remove it from underneath you. Accordingly, no other process can use it, so it is safe to remove it. |
Is this code specific to the singleton case? If so, I missed that. |
Yeah, it's a little convoluted, but it still is restricted to singleton case. You only set the "destroy_xxx" flag if we are a singleton and therefore created the session directory ourselves. So you only execute the destruct code if that flag is set, which means you have to be a singleton. You might want to double-check that the logic behind that didn't get changed and is correct. Just glancing, it looked like it was still okay. |
Thank you @rhc54 for the detailed explanation — I completely agree with your points and appreciate the context you provided.
Yes. And here is some additional information. When PRRTE is present, it is responsible for both the creation and cleanup of the session directory. In
If PRRTE is running correctly, In the In this case, Open MPI should create the session directory itself, handle its cleanup, and use the PID to distinguish the top-level session directory. These responsibilities were not fully handled in the original implementation. In the original implementation, the This structure requires additional directory parsing to perform a full cleanup, which the original implementation does not handle. As a result, the directory |
In addition, based on my observation, starting from Open MPI 5.0.x, the singleton no longer launches a thread that runs pmix server. This is why singleton processes can no longer clean up their paths upon abnormal termination. In contrast, for non-singleton programs, PRRTE or OpenPMIx takes care of path cleanup after an abnormal termination. So, what should we do? Adding a signal handler for singleton programs may not be a good idea, since it would be registered into the "user program" during MPI_Init. On the other hand, we would need an additional "registration system" to track and manage files created by sm. Personally, I lean toward documenting this behavior—indicating that when a singleton process terminates abnormally, we cannot guarantee the cleanup of temporary files. |
I’ve taken a closer look and have now understood it myself — it turns out to be just a variable initialization. There’s no need to follow up on this further. |
Errr...that's not true. You initialized PMIx, and so the PMIx library is indeed running its progress thread. Technically, you could follow the same code as in Bit of work, so up to you guys - suspect users will complain if it doesn't clean up, but... 🤷♂️ |
I hadn't noticed that before — I'll go take a look. My earlier question was based on the assumption that the PMIx library is not running at all in singleton mode in 5.0.x. If what you said is correct, then following the same code as in prte.c to handle file cleanup — rather than just documenting the behavior — would be the best solution. |
In singleton mode, directory cleaning needs to be done by the ompi library.
However, there are issues in the current implementation that prevent complete cleanup of the directory.
Singleton Mode Cleanup Behavior
When the program terminates normally
1. The session dir is only partially cleaned.(Fixed)
The directory
/tmp/ompi.<username>.<uid>
will still be left behind, along with some subdirectories.I modified the directory structure and enabled recursive deletion.
2. Segment files created by btl/sm are not properly removed.(Fixed)
In
ompi/opal/mca/btl/sm/btl_sm_module.c: function sm_finalize
, unlink will never be done, singleton or not. However, in non-singleton mode, the file will be cleaned by pmix afteropal_pmix_register_cleanup
succeeds.I registered the segment cleanup with OPAL, which is similar to
mca_btl_smcuda_component_init
.Question : Although not directly related to this pr. In
opal/mca/btl/sm/btl_sm_component.c
around line 421, is this correct? is this need to be removed?When the program terminates by ctrl-C (see issue #13002)
Question : No file cleaning up is possible in this case. Should we explicitly document this behavior or add a signal handling mechanism? It is not easy to add a signal handling mechanism in
btl/sm
.