Description
Bug report
Bug description:
I must admit that this issue only happens in very specific conditions, namely when using the multiprocessing lib on a system where the temp folder is set to a long path by the user. One could argue this is a user error, but bear with me a second.
The problem is in multiprocessing.util.get_temp_dir
def get_temp_dir():
    # get name of a temp directory which will be automatically cleaned up
    tempdir = process.current_process()._config.get('tempdir')  # This could be set to an arbitrarily long path by users
    if tempdir is None:
        import shutil, tempfile
        tempdir = tempfile.mkdtemp(prefix='pymp-')
        info('created temp directory %s', tempdir)
        # keep a strong reference to shutil.rmtree(), since the finalizer
        # can be called late during Python shutdown
        Finalize(None, _remove_temp_dir, args=(shutil.rmtree, tempdir),
                 exitpriority=-100)
        process.current_process()._config['tempdir'] = tempdir
    return tempdir
This function is used in multiprocessing.connection.arbitrary_address:
def arbitrary_address(family):
    '''
    Return an arbitrary free address for the given family
    '''
    if family == 'AF_INET':
        return ('localhost', 0)
    elif family == 'AF_UNIX':
        return tempfile.mktemp(prefix='listener-', dir=util.get_temp_dir())
    elif family == 'AF_PIPE':
        return tempfile.mktemp(prefix=r'\\.\pipe\pyc-%d-%d-' %
                               (os.getpid(), next(_mmap_counter)), dir="")
    else:
        raise ValueError('unrecognized family')
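To get a feel for how much the two functions above add on top of the base temp directory, the following rough sketch measures it directly instead of assuming a number. The helper name socket_path_overhead is hypothetical and only mirrors the two tempfile calls quoted above; it is not part of multiprocessing.

```python
import os
import shutil
import tempfile


def socket_path_overhead():
    """Return how many bytes the 'pymp-*/listener-*' suffix adds to the temp dir.

    Hypothetical helper: it repeats the tempfile calls made by get_temp_dir()
    and arbitrary_address('AF_UNIX') and compares the result with the base
    temp directory.
    """
    base = tempfile.gettempdir()
    pymp_dir = tempfile.mkdtemp(prefix='pymp-')
    try:
        # mktemp() only generates a name; nothing is created on disk here.
        address = tempfile.mktemp(prefix='listener-', dir=pymp_dir)
        return len(os.fsencode(address)) - len(os.fsencode(base))
    finally:
        shutil.rmtree(pymp_dir)


if __name__ == '__main__':
    print('bytes added to the temp dir:', socket_path_overhead())
```

Whatever this prints has to be subtracted from the platform's socket path limit to see how long the user's temp directory may be before bind() starts failing.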
This becomes a problem because arbitrary_address is used in multiprocessing.forkserver.ForkServer.ensure_running, which creates a socket and binds it to the address that arbitrary_address returns:
with socket.socket(socket.AF_UNIX) as listener:
    address = connection.arbitrary_address('AF_UNIX')
    listener.bind(address)
    if not util.is_abstract_socket_namespace(address):
        os.chmod(address, 0o600)
    listener.listen()
Since UNIX socket paths have a maximum length that varies by platform, typically between 92 and 108 bytes [1], it would make sense for the standard library to have safeguards against generating paths that are too long in the first place. Sure, users can use workarounds such as setting TMPDIR="/tmp" [2], but this requires each and every user of the multiprocessing library to be aware of the limitation.
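For illustration, here is a minimal way to see the failure outside of multiprocessing. The helper try_bind and the path names are made up for this example; it only relies on socket.bind() raising OSError when the AF_UNIX path does not fit.

```python
import os
import socket
import tempfile


def try_bind(path):
    """Bind an AF_UNIX socket to path; return None on success, else the OSError."""
    with socket.socket(socket.AF_UNIX) as listener:
        try:
            listener.bind(path)
        except OSError as exc:
            return exc
        finally:
            # bind() creates a socket file on success; remove it again.
            if os.path.exists(path):
                os.unlink(path)
    return None


with tempfile.TemporaryDirectory() as base:
    short_path = os.path.join(base, 'listener-short')
    # The long directory is never created; the path is rejected for its length.
    long_path = os.path.join(base, 'x' * 120, 'listener-long')
    print('short:', try_bind(short_path))
    print('long: ', try_bind(long_path))
```

With a short temp directory the first bind should succeed and the second should report an error; with a temp directory that is already long, both will fail, which is the situation described in this report.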
I propose we fix the problem once and for all at the source: check the length of the path created by tempfile.mktemp(prefix='listener-', dir=util.get_temp_dir()) and, should it be greater than or equal to the maximum length supported by the platform, fall back to what get_temp_dir() does when tempdir is None.
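A rough sketch of what I have in mind, not a patch. The names UNIX_PATH_MAX, path_fits_unix_socket and arbitrary_unix_address are hypothetical, and the limit of 92 is an assumed conservative value taken from the lower end of the range in [1]; a real fix would need the platform's actual limit and would register the fallback directory for cleanup the way get_temp_dir() does.

```python
import os
import tempfile

# Assumption: conservative lower bound for sun_path, not a stdlib constant.
UNIX_PATH_MAX = 92


def path_fits_unix_socket(path, limit=UNIX_PATH_MAX):
    """Return True if the encoded path plus a trailing NUL fits within limit bytes."""
    return len(os.fsencode(path)) < limit


def arbitrary_unix_address(preferred_dir, limit=UNIX_PATH_MAX):
    """Return a listener path in preferred_dir, or in a fresh temp dir if too long."""
    candidate = tempfile.mktemp(prefix='listener-', dir=preferred_dir)
    if path_fits_unix_socket(candidate, limit):
        return candidate
    # Fall back to a new 'pymp-' directory in tempfile's default location,
    # mirroring what get_temp_dir() does when no tempdir is configured.
    # Cleanup of this directory is left out of the sketch.
    fallback_dir = tempfile.mkdtemp(prefix='pymp-')
    return tempfile.mktemp(prefix='listener-', dir=fallback_dir)
```

Note that the fallback only helps when the configured tempdir is the long part; if the default temp location itself is too long, the fallback path will be too long as well and a different base directory would have to be chosen.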
[1] https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/sys_un.h.html#tag_13_67_04
[2] https://patchwork.ozlabs.org/project/qemu-devel/patch/20220722182508.89761-2-peter@pjd.dev/#2938322
CPython versions tested on:
3.11
Operating systems tested on:
Linux