
Ensure that multiprocessing.util.get_temp_dir() can be used to create socket files with limited path length #132124

Open
@Poupine

Bug report

Bug description:

Admittedly, this issue only occurs under very specific conditions: when the multiprocessing library is used on a system where the user has set the temp folder to a long path. One could argue this is user error, but bear with me for a second.
The problem is in multiprocessing.util.get_temp_dir:

def get_temp_dir():
    # get name of a temp directory which will be automatically cleaned up
    tempdir = process.current_process()._config.get('tempdir')  # This could be set to an arbitrarily long path by users
    if tempdir is None:
        import shutil, tempfile
        tempdir = tempfile.mkdtemp(prefix='pymp-')
        info('created temp directory %s', tempdir)
        # keep a strong reference to shutil.rmtree(), since the finalizer
        # can be called late during Python shutdown
        Finalize(None, _remove_temp_dir, args=(shutil.rmtree, tempdir),
                 exitpriority=-100)
        process.current_process()._config['tempdir'] = tempdir
    return tempdir

This function is used in multiprocessing.connection.arbitrary_address:

def arbitrary_address(family):
    '''
    Return an arbitrary free address for the given family
    '''
    if family == 'AF_INET':
        return ('localhost', 0)
    elif family == 'AF_UNIX':
        return tempfile.mktemp(prefix='listener-', dir=util.get_temp_dir())
    elif family == 'AF_PIPE':
        return tempfile.mktemp(prefix=r'\\.\pipe\pyc-%d-%d-' %
                               (os.getpid(), next(_mmap_counter)), dir="")
    else:
        raise ValueError('unrecognized family')

This is problematic because arbitrary_address is used in multiprocessing.forkserver.ForkServer.ensure_running, which uses the returned address to create a socket and bind to it:

with socket.socket(socket.AF_UNIX) as listener:
    address = connection.arbitrary_address('AF_UNIX')
    listener.bind(address)
    if not util.is_abstract_socket_namespace(address):
        os.chmod(address, 0o600)
    listener.listen()

Since UNIX socket paths are limited in length (the limit is between 92 and 108 characters, depending on the platform [1]), it would make sense for the stdlib to have safeguards against generating paths that are too long to begin with. Sure, users can use workarounds such as setting TMPDIR="/tmp" [2], but this requires each and every user of the multiprocessing library to be aware of this limitation.
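For illustration, the failure can be reproduced without multiprocessing at all. The following is a minimal sketch, assuming a POSIX system where socket.AF_UNIX is available and where the default temp directory itself is short; try_bind is a hypothetical helper, not part of the stdlib:

```python
import os
import socket
import tempfile


def try_bind(path):
    """Hypothetical helper: bind a UNIX socket to `path`.

    Returns None on success, or the OSError raised by bind().
    """
    with socket.socket(socket.AF_UNIX) as listener:
        try:
            listener.bind(path)
        except OSError as exc:
            return exc
    # bind() created the socket file; remove it again.
    os.unlink(path)
    return None


with tempfile.TemporaryDirectory() as base:
    # A directory name long enough to push the full socket path
    # past the 108-byte upper bound mentioned above.
    long_dir = os.path.join(base, "x" * 120)
    os.mkdir(long_dir)

    short_error = try_bind(os.path.join(base, "listener-short"))
    long_error = try_bind(os.path.join(long_dir, "listener-long"))

print("short path:", short_error)
print("long path:", long_error)
```

On Linux I would expect the short path to bind fine and the long one to fail with an OSError, which is the same kind of failure ensure_running hits when the configured tempdir is long.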

I propose we fix the problem once and for all at the source: check the length of the path created by tempfile.mktemp(prefix='listener-', dir=util.get_temp_dir()) and, should it be greater than or equal to the maximum length supported by the platform, fall back to what get_temp_dir() does when tempdir is None.

[1] https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/sys_un.h.html#tag_13_67_04
[2] https://patchwork.ozlabs.org/project/qemu-devel/patch/20220722182508.89761-2-peter@pjd.dev/#2938322
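A rough sketch of what the proposed check could look like, outside of the multiprocessing internals. Everything here is an assumption for illustration: the helper name safe_unix_address, the constant _ASSUMED_SUN_PATH_MAX (set to the conservative lower bound of 92 from [1], since Python does not expose the platform's real limit), and the choice to measure the path in bytes:

```python
import os
import tempfile

# Assumption: conservative lower bound for sun_path taken from [1].
# The real limit is platform-specific (up to 108).
_ASSUMED_SUN_PATH_MAX = 92


def safe_unix_address(configured_dir, max_len=_ASSUMED_SUN_PATH_MAX):
    """Hypothetical helper: return a socket path that fits in sun_path.

    Prefers `configured_dir`; if the resulting path is too long, falls
    back to a fresh 'pymp-' directory, as get_temp_dir() does when no
    tempdir is configured.
    """
    candidate = tempfile.mktemp(prefix='listener-', dir=configured_dir)
    # The limit applies to the encoded path, so measure bytes.
    if len(os.fsencode(candidate)) < max_len:
        return candidate
    fallback_dir = tempfile.mkdtemp(prefix='pymp-')
    return tempfile.mktemp(prefix='listener-', dir=fallback_dir)


# Example: a configured directory that is far too long for a socket path.
too_long_dir = os.path.join(tempfile.gettempdir(), "x" * 150)
address = safe_unix_address(too_long_dir)
print(address)
```

One caveat with this approach: the fallback still goes through tempfile.gettempdir(), so if TMPDIR itself points to a long path the fallback can be too long as well. A real fix would also need to register cleanup for the fallback directory, as get_temp_dir() does with Finalize.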

CPython versions tested on:

3.11

Operating systems tested on:

Linux
