Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Python 3.12+ breaks backwards compatibility for logging QueueHandler with some Queue classes #119819

Closed
shmilee opened this issue May 31, 2024 · 20 comments
Labels
stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@shmilee
Copy link

shmilee commented May 31, 2024

Bug report

Bug description:

Related bug: #111615
Related pull: #111638
Related codes:

if 'queue' in config:
from multiprocessing.queues import Queue as MPQueue
qspec = config['queue']
if not isinstance(qspec, (queue.Queue, MPQueue)):
if isinstance(qspec, str):
q = self.resolve(qspec)
if not callable(q):
raise TypeError('Invalid queue specifier %r' % qspec)
q = q()
elif isinstance(qspec, dict):
if '()' not in qspec:
raise TypeError('Invalid queue specifier %r' % qspec)
q = self.configure_custom(dict(qspec))
else:
raise TypeError('Invalid queue specifier %r' % qspec)
config['queue'] = q

reproducible example using Python 3.12.3

import logging.config


def main(q):
    config = {
        'version': 1,
        'handlers': {
            'sink': {
                'class': 'logging.handlers.QueueHandler',
                'queue': q,
            },
        },
        'root': {
            'handlers': ['sink'],
        },
    }
    logging.config.dictConfig(config)


if __name__ == '__main__':
    import multiprocessing as mp
    main(mp.Manager().Queue())
    #import asyncio
    #main(asyncio.Queue())  # broken too

error:

Traceback (most recent call last):
  File "/usr/lib/python3.12/logging/config.py", line 581, in configure
    handler = self.configure_handler(handlers[name])
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/logging/config.py", line 801, in configure_handler
    raise TypeError('Invalid queue specifier %r' % qspec)
TypeError: Invalid queue specifier <AutoProxy[Queue] object, typeid 'Queue' at 0x7f8b07189d90>

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/xxx/logging-queue-handler-bug.py", line 33, in <module>
    main(mp.Manager().Queue())
  File "/home/xxx/logging-queue-handler-bug.py", line 29, in main
    logging.config.dictConfig(config)
  File "/usr/lib/python3.12/logging/config.py", line 914, in dictConfig
    dictConfigClass(config).configure()
  File "/usr/lib/python3.12/logging/config.py", line 588, in configure
    raise ValueError('Unable to configure handler '
ValueError: Unable to configure handler 'sink'

Queue classes to check for logging QueueHandler

  • queue.Queue(), <queue.Queue at 0x7fb86eeef170>, works.
  • multiprocessing
    • multiprocessing.Queue(), <multiprocessing.queues.Queue at 0x7fb871790500>, works.
    • multiprocessing.Manager().Queue() <AutoProxy[Queue] object, typeid 'Queue' at 0x7fb86ef4a840>, broken.
      Its class is multiprocessing.managers.AutoProxy[Queue], a subclass of multiprocessing.managers.BaseProxy.
  • asyncio.queues.Queue() <Queue at 0x7fb86f0be4e0 maxsize=0>, broken.
  • any other Queue?

discuss: how to fix

Check all Queue classes mentioned above in

if not isinstance(qspec, (queue.Queue, MPQueue)):

or just focus on handling special cases: str and dict, while leaving other cases un-checked (e.g., queue.Queue, multiprocessing.queues.Queue).

CPython versions tested on:

3.12

Operating systems tested on:

Linux

Linked PRs

@shmilee shmilee added the type-bug An unexpected behavior, bug, or error label May 31, 2024
@Eclips4 Eclips4 added the stdlib Python modules in the Lib dir label May 31, 2024
shmilee added a commit to shmilee/gdpy3 that referenced this issue May 31, 2024
* wait for python/cpython#119819

* temporarily fix error in gdpy3:

```
Process ForkPoolWorker-246:
Traceback (most recent call last):
  File "/usr/lib/python3.12/logging/config.py", line 581, in configure
    handler = self.configure_handler(handlers[name])
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/logging/config.py", line 801, in configure_handler
    raise TypeError('Invalid queue specifier %r' % qspec)
TypeError: Invalid queue specifier <AutoProxy[Queue] object, typeid 'Queue' at 0x7f43d2fee7e0>

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/lib/python3.12/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/usr/lib/python3.12/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/lib/python3.12/multiprocessing/pool.py", line 109, in worker
    initializer(*initargs)
  File "/usr/lib/python3.12/site-packages/gdpy3/glogger.py", line 167, in __call__
    logging.config.dictConfig(self.glogger_config)
  File "/usr/lib/python3.12/logging/config.py", line 914, in dictConfig
    dictConfigClass(config).configure()
  File "/usr/lib/python3.12/logging/config.py", line 588, in configure
    raise ValueError('Unable to configure handler '
ValueError: Unable to configure handler 'queue'
```
@picnixz
Copy link
Contributor

picnixz commented May 31, 2024

I'm not really sure about this but would it actually make sense to only check for protocol-based queues and not the actual types of queues? strictly speaking, we only need a queue-like interface right? If so, I can work on that one.

@shmilee
Copy link
Author

shmilee commented May 31, 2024

After reading the codes above and this pull request, I tend to not check the queue or queue-like classes, as the main purpose of the code from lines 789 to 804 is to handle the special cases of isinstance(qspec, str) and isinstance(qspec, dict) and then update config['queue'] to be a valid queue instance.
Other cases are considered that config['queue'] is already a queue, like an instance of queue.Queue or any other Queue type, which is the common use case in the previous version of Python.

Perhaps we can ask @vsajip for some advice. Is there anything else we should consider?

@vsajip
Copy link
Member

vsajip commented May 31, 2024

Can you explain the use cases which require multiprocessing.Manager().Queue() (the AutoProxy)? Also, since logging isn't async, why would you want to pass an asyncio.queues.Queue()?

@shmilee
Copy link
Author

shmilee commented Jun 1, 2024

I have only used multiprocessing.Manager().Queue(). The asyncio.queues.Queue() mentioned is just a possible use case I imagined after searching the standard library for queues. So I can only talk about the former.

Using Python 3.12 on Linux, I change manager.Queue() (manager is a Manager instance ) to multiprocessing.Queue(), in my codes logging also works. So, currently, it seems that using manager.Queue() is not mandatory.
The reason for choosing it before is that in previous versions of Python, multiprocessing.Queue() may cause some errors related to multiprocessing Pool, spawn, and fork things, on Linux or Windows.

Sorry for leaking an example code with logging, because tests on Windows, and installing previous versions of Python, may take a lot of time. However, some information about using use a queue created with a manager to avoid some potential issues can be provided below after I searched:

  1. python-multiprocessing-queue-vs-multiprocessing-manager-queue
  2. https://docs.python.org/3.12/library/multiprocessing.html#pipes-and-queues

    If they really bother you then you can instead use a queue created with a manager.

  3. google keywords: RuntimeError: Queue objects should only be shared between processes through inheritance.

@vsajip
Copy link
Member

vsajip commented Jun 1, 2024

Do you have an actual use case where you're running into problems? Or is it just a theoretical exercise? If an actual use case, please give details.

@shmilee
Copy link
Author

shmilee commented Jun 1, 2024

Here is an actual use case,and quickly show the problem:

cd /tmp/
git clone --depth 1 --branch v0.8.3 https://github.com/shmilee/gdpy3.git
cd gdpy3/
python3.12 -m unittest src/processors/tests/test_multiprocessor.py

Then I will get errors from more and more ForkPoolWorker-XXX, XXX is an increasing number.

Process ForkPoolWorker-210:
Traceback (most recent call last):
  File "/usr/lib/python3.12/logging/config.py", line 581, in configure
    handler = self.configure_handler(handlers[name])
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/logging/config.py", line 801, in configure_handler
    raise TypeError('Invalid queue specifier %r' % qspec)
TypeError: Invalid queue specifier <AutoProxy[Queue] object, typeid 'Queue' at 0x7f5250423d70>

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/lib/python3.12/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/usr/lib/python3.12/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/lib/python3.12/multiprocessing/pool.py", line 109, in worker
    initializer(*initargs)
  File "/tmp/gdpy3/src/glogger.py", line 167, in __call__
    logging.config.dictConfig(self.glogger_config)
  File "/usr/lib/python3.12/logging/config.py", line 914, in dictConfig
    dictConfigClass(config).configure()
  File "/usr/lib/python3.12/logging/config.py", line 588, in configure
    raise ValueError('Unable to configure handler '
ValueError: Unable to configure handler 'queue'

@shmilee
Copy link
Author

shmilee commented Jun 1, 2024

  1. Using multiprocessing.Queue() can fix it. see here

  2. I also test this patch on logging/config.py. The error also disappears.
    (cannot find class AutoProxy[Queue] to use in managers module, so just test with BaseProxy)

--- config.py	2024-05-31 10:16:24.905893630 +0800
+++ config.py-fix3.12	2024-05-31 13:18:07.223555882 +0800
@@ -786,8 +786,11 @@
                     # raise ValueError('No handlers specified for a QueueHandler')
                 if 'queue' in config:
                     from multiprocessing.queues import Queue as MPQueue
+                    from multiprocessing.managers import BaseProxy as MPBaseProxy                    
                     qspec = config['queue']
-                    if not isinstance(qspec, (queue.Queue, MPQueue)):
+                    print('============', isinstance(qspec, MPBaseProxy))
+                    print(qspec, type(qspec))
+                    if not isinstance(qspec, (queue.Queue, MPQueue, MPBaseProxy)):

@vsajip
Copy link
Member

vsajip commented Jun 1, 2024

No, that's not the use case I meant. What I meant is, do you have a real problem where you have to use the proxy and can't use multiprocessing.Queue?

@shmilee
Copy link
Author

shmilee commented Jun 1, 2024

No. For me, using logging with multiprocessing.Queue also works fine.

@shmilee
Copy link
Author

shmilee commented Jun 1, 2024

And I think using the proxy "manager.Queue()" should not be excluded / droped, it was working fine before 3.12.
Even in version 3.12, there is still a way to bypass the existing restrictions (Line-792), using the much more powerful dict qspec configuration. About the listener part, I haven't figured out how to set up a "manager.Queue()" together with handler yet.

import logging.config
import multiprocessing as mp
import asyncio

def main(i, qd):
    print('%d. qspec dict: ' % i, qd)
    config = {
        'version': 1,
        'handlers': {
            'sink': {
                'class': 'logging.handlers.QueueHandler',
                'queue': qd,
            },
        },
        'root': {
            'handlers': ['sink'],
        },
    }
    logging.config.dictConfig(config)
    l = logging.getLogger()
    s = l.handlers[0]
    print('USE: ', id(s.queue), s.queue, type(s.queue))
    print('------')

def get_mp_queue(manager=None, maxsize=None):
    q = manager.Queue(maxsize)
    print('GET: ', id(q), q)
    return q

def get_asy_queue(maxsize=None):
    q = asyncio.Queue(maxsize)
    print('GET: ', id(q), q)
    return q

if __name__ == '__main__':
    m = mp.Manager()
    main(1, {'()': get_mp_queue, 'manager': m, 'maxsize': 10})
    main(2, {'()': m.Queue, 'maxsize': 20})
    main(3, {'()': get_asy_queue, 'maxsize': 10})  # not async, so the user is responsible for his choice.

Output:

1. qspec dict:  {'()': <function get_mp_queue at 0x7fd645687100>, 'manager': <multiprocessing.managers.SyncManager object at 0x7fd644d2c440>, 'maxsize': 10}
GET:  140558252776848 <queue.Queue object at 0x7fd644bccbf0>
USE:  140558252776848 <queue.Queue object at 0x7fd644bccbf0> <class 'multiprocessing.managers.AutoProxy[Queue]'>
------
2. qspec dict:  {'()': <bound method BaseManager.register.<locals>.temp of <multiprocessing.managers.SyncManager object at 0x7fd644d2c440>>, 'maxsize': 20}
USE:  140558252622480 <queue.Queue object at 0x7fd644bccf50> <class 'multiprocessing.managers.AutoProxy[Queue]'>
------
3. qspec dict:  {'()': <function get_asy_queue at 0x7fd645532200>, 'maxsize': 10}
GET:  140558252955872 <Queue maxsize=10>
USE:  140558252955872 <Queue maxsize=10> <class 'asyncio.queues.Queue'>
------

@vsajip
Copy link
Member

vsajip commented Jun 2, 2024

And I think using the proxy "manager.Queue()" should not be excluded / droped

Why should it not be excluded, since multiprocessing.Queue () is available to be used? You can't check the class easily as AutoProxy could be used to proxy other things, not just queues.

@picnixz
Copy link
Contributor

picnixz commented Jun 2, 2024

My 2 cents: one concrete use case I can imagine is when you want to pass that queue to a pool without using a global variable and a custom initializer. AFAIK, you need a manager-based queue since multiprocessing.Queue cannot be pickled. There are other (possibly rare) issues mentioned at the end of the https://docs.python.org/3.10/library/multiprocessing.html#pipes-and-queues paragraph, so maybe some people might be affected?

When I suggested the protocol-based approach, it was mostly to deal with the issue arising from the proxy interface because checking the runtime type would not help in that case.

@shmilee
Copy link
Author

shmilee commented Jun 3, 2024

Why should it not be excluded, since multiprocessing.Queue () is available to be used?

Except for the mentioned reasons, 1) some people may get errors when updating to py3.12 due to their previous code using proxy queue, 2) py3.12 excludes but doesn't completely exclude proxy queue (dict qspec), I don't have any other new reasons.

You can't check the class easily as AutoProxy could be used to proxy other things, not just queues.

Yes, that's true and makes checking the type of proxy indeed tricky.

  1. If it really needs to be checked, I think picnixz's suggestion is quite good.
    (PS: In my personal code, I might directly use its __class__ to check the class, because, for personal projects, I treat the class names in the standard library basically unchanged. Of course, in a standard library this will be weird.)

  2. If the final decision is not to check proxy, perhaps the documentation should explicitly mention it.
    Because even though its class name is <class 'multiprocessing.managers.AutoProxy[Queue]'>, it still is a <queue.Queue object at xxx>. Perhaps, besides me, someone also believes they are the same AutoProxy[Queue] can be treated as queue.Queue.

SyncManager.register('Queue', queue.Queue)
SyncManager.register('JoinableQueue', queue.Queue)

@vsajip
Copy link
Member

vsajip commented Jun 4, 2024

Please check if the changes in the linked PR #120030 fix the problem reported..

miss-islington pushed a commit to miss-islington/cpython that referenced this issue Jun 4, 2024
…ultipr… (pythonGH-120030)

(cherry picked from commit 99d945c)

Co-authored-by: Vinay Sajip <vinay_sajip@yahoo.co.uk>
miss-islington pushed a commit to miss-islington/cpython that referenced this issue Jun 4, 2024
…ultipr… (pythonGH-120030)

(cherry picked from commit 99d945c)

Co-authored-by: Vinay Sajip <vinay_sajip@yahoo.co.uk>
vsajip pushed a commit that referenced this issue Jun 4, 2024
vsajip pushed a commit that referenced this issue Jun 4, 2024
miss-islington pushed a commit to miss-islington/cpython that referenced this issue Jun 4, 2024
…le. (pythonGH-120067)

(cherry picked from commit 109e108)

Co-authored-by: Vinay Sajip <vinay_sajip@yahoo.co.uk>
miss-islington pushed a commit to miss-islington/cpython that referenced this issue Jun 5, 2024
…tiproc… (pythonGH-120090)

(cherry picked from commit 983efcf)

Co-authored-by: Vinay Sajip <vinay_sajip@yahoo.co.uk>
pythongh-119819: Update logging configuration to support joinable multiprocessing manager queues.
vsajip pushed a commit that referenced this issue Jun 5, 2024
vsajip pushed a commit that referenced this issue Jun 5, 2024
barneygale pushed a commit to barneygale/cpython that referenced this issue Jun 5, 2024
barneygale pushed a commit to barneygale/cpython that referenced this issue Jun 5, 2024
barneygale pushed a commit to barneygale/cpython that referenced this issue Jun 5, 2024
…tiproc… (pythonGH-120090)

pythongh-119819: Update logging configuration to support joinable multiprocessing manager queues.
freakboy3742 added a commit that referenced this issue Jun 15, 2024
…sing subprocess support (#120476)

Skip tests that require multiprocessing subprocess support.
miss-islington pushed a commit to miss-islington/cpython that referenced this issue Jun 15, 2024
…processing subprocess support (pythonGH-120476)

Skip tests that require multiprocessing subprocess support.
(cherry picked from commit 92f6d40)

Co-authored-by: Russell Keith-Magee <russell@keith-magee.com>
miss-islington pushed a commit to miss-islington/cpython that referenced this issue Jun 15, 2024
…processing subprocess support (pythonGH-120476)

Skip tests that require multiprocessing subprocess support.
(cherry picked from commit 92f6d40)

Co-authored-by: Russell Keith-Magee <russell@keith-magee.com>
freakboy3742 added a commit that referenced this issue Jun 15, 2024
…iprocessing subprocess support (GH-120476) (#120531)

gh-119819: Conditional skip of logging tests that require multiprocessing subprocess support (GH-120476)

Skip tests that require multiprocessing subprocess support.
(cherry picked from commit 92f6d40)

Co-authored-by: Russell Keith-Magee <russell@keith-magee.com>
freakboy3742 added a commit that referenced this issue Jun 15, 2024
…iprocessing subprocess support (GH-120476) (#120532)

gh-119819: Conditional skip of logging tests that require multiprocessing subprocess support (GH-120476)

Skip tests that require multiprocessing subprocess support.
(cherry picked from commit 92f6d40)

Co-authored-by: Russell Keith-Magee <russell@keith-magee.com>
mrahtz pushed a commit to mrahtz/cpython that referenced this issue Jun 30, 2024
…processing subprocess support (python#120476)

Skip tests that require multiprocessing subprocess support.
noahbkim pushed a commit to hudson-trading/cpython that referenced this issue Jul 11, 2024
noahbkim pushed a commit to hudson-trading/cpython that referenced this issue Jul 11, 2024
noahbkim pushed a commit to hudson-trading/cpython that referenced this issue Jul 11, 2024
…tiproc… (pythonGH-120090)

pythongh-119819: Update logging configuration to support joinable multiprocessing manager queues.
noahbkim pushed a commit to hudson-trading/cpython that referenced this issue Jul 11, 2024
…processing subprocess support (python#120476)

Skip tests that require multiprocessing subprocess support.
@noreentry
Copy link

noreentry commented Jul 13, 2024

multiprocessing.Pool with spawn method breaks logging.handlers.QueueHandler configuration

import logging
import logging.config
import multiprocessing
import time


def _init(q):
    logging.config.dictConfig({
        'version': 1,
        'disable_existing_loggers': True,
        'handlers': {'log_to_parent': {'class': 'logging.handlers.QueueHandler', 'queue': q}},
        'root': {'handlers': ['log_to_parent'], 'level': 'DEBUG'}
    })
    logging.getLogger().info('log from child')


if __name__ == '__main__':    
    multiprocessing.set_start_method('spawn')
    with multiprocessing.Manager() as manager:
        # q = manager.Queue()
        q = multiprocessing.Queue()
        
        # listen for log messages from child processes
        # ...

        with multiprocessing.Pool(processes=1, maxtasksperchild=1, initializer=_init, initargs=(q,)) as pool:
            time.sleep(1)

got exception (when q = manager.Queue() or q = multiprocessing.Queue())

Process SpawnPoolWorker-2:
Traceback (most recent call last):
  File "/home/foo/miniconda3/envs/py312forge/lib/python3.12/logging/config.py", line 581, in configure
    handler = self.configure_handler(handlers[name])
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/foo/miniconda3/envs/py312forge/lib/python3.12/logging/config.py", line 792, in configure_handler
    proxy_queue = MM().Queue()
                  ^^^^
  File "/home/foo/miniconda3/envs/py312forge/lib/python3.12/multiprocessing/context.py", line 57, in Manager
    m.start()
  File "/home/foo/miniconda3/envs/py312forge/lib/python3.12/multiprocessing/managers.py", line 562, in start
    self._process.start()
  File "/home/foo/miniconda3/envs/py312forge/lib/python3.12/multiprocessing/process.py", line 118, in start
    assert not _current_process._config.get('daemon'), \
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError: daemonic processes are not allowed to have children

The above exception was the direct cause of the following exception:
...

@vsajip
Copy link
Member

vsajip commented Jul 13, 2024

Not sure this is the same error - what exact version of 3.12 is in use? What happens with 3.13?

@noreentry
Copy link

it works on 3.12.3
doesn't work on 3.12.4
can't test it on 3.13

estyxx pushed a commit to estyxx/cpython that referenced this issue Jul 17, 2024
estyxx pushed a commit to estyxx/cpython that referenced this issue Jul 17, 2024
…tiproc… (pythonGH-120090)

pythongh-119819: Update logging configuration to support joinable multiprocessing manager queues.
estyxx pushed a commit to estyxx/cpython that referenced this issue Jul 17, 2024
…processing subprocess support (python#120476)

Skip tests that require multiprocessing subprocess support.
freakboy3742 added a commit to freakboy3742/cpython that referenced this issue Aug 5, 2024
…e multiprocessing subprocess support (python#120476)

Skip tests that require multiprocessing subprocess support.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error
Projects
Development

No branches or pull requests

5 participants