Messages are getting overlapped when logging to sys.stderr using multiprocessing #108
Comments
Hey @chkoar, thanks for the bug report! I can reproduce this, but only when using Windows (works fine with Linux). Are you using Windows too?
Correct. I can confirm that on Linux it works as intended.
There is surely something that can be done about this, but I don't know what exactly yet. 😛 This is related to the lack of `fork` on Windows: everything that is outside of the `if __name__ == "__main__":` block gets re-executed in each child process. I will need to investigate this further, notably by looking at the recipes from the standard logging library: Logging to a single file from multiple processes. In the meantime, you can add `enqueue=True` to your handler.
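For illustration, a minimal sketch of that workaround, assuming a plain stderr sink (`enqueue=True` routes messages through an internal queue so a single writer emits them one at a time):

```python
import sys
from loguru import logger

logger.remove()  # drop the default handler first
logger.add(sys.stderr, enqueue=True)  # messages are serialized through a queue
logger.info("Configured with enqueue=True")
```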
Well, actually adding `enqueue=True` seems to solve the problem.
I think it should be stated in the docs that if you use multiprocessing with Loguru, you should set `enqueue=True` on your handlers.
Instead of updating the documentation, I think I should find the proper implementation to prevent logs from being mixed up, no matter what the value of `enqueue` is.
Ok, I think I better understand what is going on. So here are a few notes to myself.
Many headaches ahead. :)
@Delgan Should the use of Loguru in a multiprocessing context on Windows be discouraged then? I ran into a similar issue, see the following snippet:

```python
import time
import multiprocessing

from loguru import logger

LOGGER_CONFIG = {
    "handlers": [
        dict(
            sink="test.log",
            serialize=True,
            enqueue=True,
            level=20,
        ),
    ],
}


class BusyLogProcess(multiprocessing.Process):
    """A process that logs very fast."""

    def run(self):
        print(f"Busy {logger}")
        print(id(logger))
        while True:
            logger.info("a")
            time.sleep(0.01)


if __name__ == "__main__":
    # multiprocessing.set_start_method("spawn")

    # Set up logger
    print(logger._handlers)
    logger.configure(**LOGGER_CONFIG)
    logger.info("testing in main")

    busy_log_process = BusyLogProcess()
    busy_log_process.start()

    # Let BusyLogProcess create some log entries
    time.sleep(0.5)

    busy_log_process.terminate()
    print(logger._handlers)
```

When running this under Linux, all log entries go into `test.log`.
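As a side note, a quick way to confirm which start method is in effect (relevant because `fork` on Linux lets children inherit the parent's handlers, while `spawn` does not); this is a generic check, not part of the snippet above:

```python
import multiprocessing

if __name__ == "__main__":
    # "fork" by default on Linux, "spawn" on Windows (and macOS since Python 3.8).
    print(multiprocessing.get_start_method())
```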
Hi @pylipp. Thanks for the code snippet, it is another useful illustration of the problem. I really did not expect that. You are right: unfortunately, it seems that Loguru's handlers are not inherited by child processes on Windows. I'm thinking about a possible workaround, but it's still not precisely defined in my mind.
I'm not getting any log messages from the child processes (created with `multiprocessing.Pool`), even with `enqueue` set to `True`. :/ This is my config: `logger.remove(0)` followed by my own handlers. And yes, I'm also on Windows.
Hey @Joolius, would it be possible to please share a complete code snippet, including the `multiprocessing.Pool` usage?
@Delgan I see a similar issue when using multiprocessing: messages via sys.stderr overlap, plus any messages sent by my child libraries are not captured. My child libraries are using traditional logging; I thought I could work around the problem, but no luck. I tried to relocate the handler generation inside of main, but no luck. I also tried my own queue handler. Oddly, before I moved to Loguru it worked okay with the normal logging module plus a custom queue handler (below). Here are the relevant sections of code:

```python
def main():
    arguments = docopt(__doc__, version='1.0.0')
    logfile = arguments["--logfile"]

    # Logging Set-Up Loguru
    logger.remove()

    # Intercept Logging Messages
    logging.basicConfig(handlers=[InterceptHandler()], level=0)
    logger.add(MultiprocessingLogHandler('c:\dev\mystreamhandler.log'), level='DEBUG')

    if arguments["debug"]:
        logger.add(sys.stdout, level='DEBUG', filter='__main__')
    else:
        logger.add(sys.stdout, level='INFO', filter='__main__')

    if logfile is not None:
        try:
            if arguments["debug"]:
                logger.add(logfile, level="DEBUG", enqueue=True, retention="10 days", backtrace=True, diagnose=True)
            else:
                logger.add(logfile, level="INFO", enqueue=True, retention="10 days")
        except Exception as e:
            logger.exception('Failed to add file handlers')

    ### DO OTHER THINGS

    for index, row in records.iterrows():
        results = pool.apply_async(func=calculations.Run_Allocation_Passes,
                                   args=(tm1, max_passes, row['Scenario'], row['Cluster'], row['Time Monthly'],
                                         row['Service'], output_file_path,),
                                   callback=update)

    logger.debug('this will work')
```

Logging inside the main module works, but sub-modules called via `apply_async` seem to go nowhere. As I mentioned, the sub-modules are using logging like this:

```python
import logging

logger = logging.getLogger(__name__)
```

Custom handler that worked with traditional logging:

```python
class MultiprocessingLogHandler(logging.Handler):
    """multiprocessing log handler

    This handler makes it possible for several processes
    to log to the same file by using a queue.
    """

    def __init__(self, fname):
        logging.Handler.__init__(self)

        self._handler = FH(fname)
        self.queue = multiprocessing.Queue(-1)

        thrd = threading.Thread(target=self.receive)
        thrd.daemon = True
        thrd.start()

    def setFormatter(self, fmt):
        logging.Handler.setFormatter(self, fmt)
        self._handler.setFormatter(fmt)

    def receive(self):
        while True:
            try:
                record = self.queue.get()
                self._handler.emit(record)
            except (KeyboardInterrupt, SystemExit):
                raise
            except EOFError:
                break
            except:
                traceback.print_exc(file=sys.stderr)

    def send(self, s):
        self.queue.put_nowait(s)

    def _format_record(self, record):
        if record.args:
            record.msg = record.msg % record.args
            record.args = None
        if record.exc_info:
            dummy = self.format(record)
            record.exc_info = None
        return record

    def emit(self, record):
        try:
            s = self._format_record(record)
            self.send(s)
        except (KeyboardInterrupt, SystemExit):
            raise
        except:
            self.handleError(record)

    def close(self):
        self._handler.close()
        logging.Handler.close(self)
```
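For readers following along: `InterceptHandler` is referenced in the snippet above but not shown. A typical sketch, close to the recipe in Loguru's documentation, that redirects standard `logging` records to Loguru looks roughly like this (the exact class used above may differ):

```python
import logging
from loguru import logger


class InterceptHandler(logging.Handler):
    """Redirect standard logging records to the Loguru logger."""

    def emit(self, record):
        # Map the standard level name to a Loguru level if one exists.
        try:
            level = logger.level(record.levelname).name
        except ValueError:
            level = record.levelno

        # Walk back up the frames to report the original caller, not this handler.
        frame, depth = logging.currentframe(), 2
        while frame and frame.f_code.co_filename == logging.__file__:
            frame = frame.f_back
            depth += 1

        logger.opt(depth=depth, exception=record.exc_info).log(level, record.getMessage())
```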
Hey @rclapp, thanks for the detailed explanations! The problem is that while using `spawn` (the default start method on Windows), child processes do not inherit the handlers configured in the parent process. This is why, on Windows, the messages emitted by your sub-modules seem to go nowhere. Also, concerning the interoperability with standard logging: your `InterceptHandler` set-up suffers from the same limitation, as it has to be configured in the child processes as well.
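To make that inheritance issue concrete, here is a hypothetical reproduction (the file name is made up): under the `spawn` start method, a handler added inside the `__main__` block of the parent simply does not exist in the child.

```python
import multiprocessing

from loguru import logger


def child():
    # With "spawn", this process re-imports the module but never executes the
    # __main__ block below, so the "parent_only.log" handler is absent here.
    logger.info("hello from the child")  # goes to the default stderr handler only


if __name__ == "__main__":
    multiprocessing.set_start_method("spawn", force=True)  # default on Windows
    logger.add("parent_only.log")  # hypothetical sink, added in the parent only

    p = multiprocessing.Process(target=child)
    p.start()
    p.join()

    logger.info("hello from the parent")  # ends up in parent_only.log
```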
Ok, I finally managed to fix that. 😛 Dealing with `multiprocessing` was trickier than expected: I had to rethink the management of handlers (see #157). It's not perfect, but the final solution is satisfactory enough in my opinion.

Basically, you will need to make your child processes "inherit" the parent logger. It should not be passed as an argument once the child is started; it should be passed during construction of the child. To do so, most `multiprocessing` tools accept `initializer` and `initargs` parameters that can be used for this purpose.

So, the initial snippet of this issue may look like this once updated:

```python
import sys
import time
import random
from concurrent.futures import ProcessPoolExecutor

from loguru import logger


def set_logger(logger_):
    global logger
    logger = logger_


def worker(seconds):
    time.sleep(seconds)
    logger.success("My function executed successfully")


def main():
    with ProcessPoolExecutor(initializer=set_logger, initargs=(logger, )) as executor:
        seconds = [random.randint(1, 3) for i in range(10)]
        executor.map(worker, seconds)


if __name__ == "__main__":
    logger.remove()
    logger.add(sys.stderr, enqueue=True)
    main()
```

I added some code snippets and explanations in the documentation; I highly recommend reading them if you need to deal with `multiprocessing`. I will leave this issue open until the next release is published.
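As a side note, the same pattern should work with `multiprocessing.Pool`, which also accepts `initializer`/`initargs`; this is just a sketch mirroring the snippet above, not an additional official recipe:

```python
import sys
import multiprocessing

from loguru import logger


def set_logger(logger_):
    global logger
    logger = logger_


def worker(item):
    logger.info("Processed item {}", item)
    return item


if __name__ == "__main__":
    logger.remove()
    logger.add(sys.stderr, enqueue=True)

    # The configured parent logger is handed to each worker at creation time.
    with multiprocessing.Pool(initializer=set_logger, initargs=(logger,)) as pool:
        pool.map(worker, [1, 2, 3])
```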
@Delgan that is great! When do you plan to make a new release?
@chkoar My "goal" was to publish
Wow, amazing, looking forward to it! Sorry for not having posted my code; will test this out once released.
BTW, this problem also happens on Linux when using the 'forkserver' start method.
I refactored some of my code to leverage the class and static method approach. When I have `sys.stdout` as one of my sinks prior to initializing my pool, I get a pickle error. If I initialize my pool and then add `sys.stdout` as a sink, it works okay:

```
File "C:\Program Files\JetBrains\PyCharm 2019.2.2\helpers\pydev\pydevd.py", line 2108, in <module>
File "C:\Program Files\JetBrains\PyCharm 2019.2.2\helpers\pydev\pydevd.py", line 2099, in main
File "C:\Program Files\JetBrains\PyCharm 2019.2.2\helpers\pydev\pydevd.py", line 1408, in run
File "C:\Program Files\JetBrains\PyCharm 2019.2.2\helpers\pydev\pydevd.py", line 1415, in _exec
File "C:\Program Files\JetBrains\PyCharm 2019.2.2\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
File "C:/Dev/AWS_Management_Allocations/src\Management_Allocations.py", line 331, in <module>
File "C:/Dev/AWS_Management_Allocations/src\Management_Allocations.py", line 291, in run_allocations
File "C:/Dev/AWS_Management_Allocations/src\Management_Allocations.py", line 226, in run_allocations
File "C:/Dev/AWS_Management_Allocations/src\Management_Allocations.py", line 145, in init_processing_pool
File "C:/Dev/AWS_Management_Allocations/src\Management_Allocations.py", line 142, in init_processing_pool
File "C:\Program Files\Python37\lib\multiprocessing\context.py", line 119, in Pool
File "C:\Program Files\Python37\lib\multiprocessing\pool.py", line 176, in __init__
File "C:\Program Files\Python37\lib\multiprocessing\pool.py", line 241, in _repopulate_pool
File "C:\Program Files\Python37\lib\multiprocessing\process.py", line 112, in start
File "C:\Program Files\Python37\lib\multiprocessing\context.py", line 322, in _Popen
File "C:\Program Files\Python37\lib\multiprocessing\popen_spawn_win32.py", line 89, in __init__
File "C:\Program Files\Python37\lib\multiprocessing\reduction.py", line 60, in dump
TypeError: cannot serialize '_io.TextIOWrapper' object
```
@Joolius That's an interesting new module. However, I'm not sure it's suitable for Loguru: it's useful to share data between processes, but it requires the data to be serialized to bytes, and it doesn't do much else. Actually, I think the queue used when `enqueue=True` already fills that role.

@rclapp Indeed, this is the "expected" behavior with the new version. While initializing sub-processes on Windows, Python requires the init arguments to be picklable. However, a handler that is not using `enqueue=True` holds a reference to the raw stream (`sys.stdout` here), which cannot be pickled, hence the error when such a handler exists before the pool is created.
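A tiny illustration of the constraint just described, purely to show the pickling failure (this is generic Python, not Loguru internals):

```python
import pickle
import sys

try:
    pickle.dumps(sys.stdout)
except TypeError as error:
    # Raises a TypeError about the '_io.TextIOWrapper' object (exact wording
    # varies across Python versions), the same error seen in the traceback above.
    print(error)
```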
@Delgan Thanks, I forgot to add that parameter; that resolves the error. However, I now have a new set of errors. Any light you can shed on this would be helpful.
@rclapp Hum... Would it be possible to see the handlers you are using, please?
Sure thing:
```python
# Logging Set-Up Loguru
logger.remove()

# Intercept Logging Messages
# Disabled for testing
# logging.basicConfig(handlers=[InterceptHandler()], level=0)

if arguments["debug"]:
    # logger.add(sys.stdout, level='DEBUG')
    logger.add(sys.stdout, level='DEBUG', filter='__main__', enqueue=True)
else:
    logger.add(sys.stdout, level='INFO', filter='__main__', enqueue=True)

try:
    os.makedirs(logfile.parents[0], exist_ok=True)
    logger.info(f"Directory {logfile.parents[0]} created successfully")
except OSError as error:
    logger.exception(f"Directory {logfile.parent()} can not be created", exc_info=True)

if logfile is not None:
    try:
        if arguments["debug"]:
            logger.add(logfile, level="DEBUG", enqueue=True, retention="10 days", backtrace=True, diagnose=True)
        else:
            logger.add(logfile, level="INFO", enqueue=True, retention="10 days", backtrace=True, diagnose=True)
        logging.info('logging setup complete')
        # return listener
    except Exception as e:
        logger.exception('Failed to add file handlers', exc_info=True)
```
@rclapp Thanks. It does not seem to be related to your handlers. It's hard to tell without seeing your whole code, but I bet you are somehow attaching a non-picklable instance to the `logger` (via `bind()` or `patch()`, for example). When `enqueue=True` is used, logged messages are sent through a queue and hence need to be picklable. Actually, I had been misled by the logging message; I think you need to figure out where such a "non picklable" object gets bound to the `logger`.
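One hypothetical way to hunt for the offending value is to try pickling everything that gets attached via `bind()`/`patch()`; the sample values below are made up for illustration:

```python
import pickle

# Stand-ins for whatever the real application binds to the logger.
suspects = {
    "handle": open(__file__),  # open file objects are not picklable
    "callback": lambda x: x,   # lambdas are not picklable either
    "retries": 3,              # plain data is fine
}

for name, value in suspects.items():
    try:
        pickle.dumps(value)
        print(f"{name!r} is picklable")
    except Exception as error:
        print(f"{name!r} is NOT picklable: {error}")
```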
I can finally close this issue! As a reminder:
Got error: `cannot pickle '_io.TextIOWrapper' object`... Any idea?
@wytxty That error means a non-picklable object (such as an open stream or file) is being sent to a child process. If you need further assistance, please open a new issue and detail your configuration.
I am trying to log to `sys.stderr` using multiprocessing, but the log messages are getting overlapped. Is this the intended behavior? Here is the code that generated the above output.

Cheers,
Chris