Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not log warnings to stdout #3135

Closed
apyrgio opened this issue Feb 6, 2024 · 5 comments
Closed

Do not log warnings to stdout #3135

apyrgio opened this issue Feb 6, 2024 · 5 comments
Labels

Comments

@apyrgio
Copy link
Contributor

apyrgio commented Feb 6, 2024

Description of the bug

What happened?

If one installs the latest version of PyMuPDF, and they initialize a Pixmap object using Python bytes, they will see a log message in the process' stdout:

__init__: using mupdf.python_buffer_data()

(note, this applies only to the new fitz module, not fitz_old)

The underlying cause is that PyMuPDF has the following logging function:

PyMuPDF/src/__init__.py

Lines 31 to 37 in 0a6f26d

def log( text, caller=1):
frame_record = inspect.stack( context=0)[ caller]
filename = os.path.relpath(frame_record.filename)
line = frame_record.lineno
function = frame_record.function
print( f'{filename}:{line}:{function}: {text}', file=sys.stdout)
sys.stdout.flush()

This function does not employ regular Python logging, but simply writes to stdout. The stdout of a process though is not the place to write warnings, since it may interfere with binary data.

How to reproduce the bug

What I expected to happen?

Initializing a Pixmap object should not write anything to stdout.

How to reproduce it?

  1. Import the fitz_new module
  2. Initialize a Pixmap class using a bytes object for the samples argument (see Pixmap constructor)

Or simply run:

import fitz                                                                                                                                                                                                                    
fitz.Pixmap(fitz.Colorspace(fitz.CS_RGB), 10, 10, b'a' * 100 * 3, False)  

PyMuPDF version

1.23.14

Operating system

Linux

Python version

3.11

@apyrgio
Copy link
Contributor Author

apyrgio commented Feb 6, 2024

Also, one thing I'm not sure about is if this log() invocation is necessary at all, or if it's a dev relic. It's probably best to remove it altogether.

apyrgio added a commit to apyrgio/PyMuPDF that referenced this issue Feb 6, 2024
Do not use sys.stdout / sys.stderr directly, since it's difficult to
have common logging logic this way. Instead, replace these functions
with just our log() helper.

Fixes pymupdf#3135
@apyrgio
Copy link
Contributor Author

apyrgio commented Mar 7, 2024

If someone else has stumbled on this issue, note that you can circumvent it with https://docs.python.org/3/library/contextlib.html#contextlib.redirect_stdout


@julian-smith-artifex-com: Following up on your comment here, since I can no longer comment in the closed PR:

I'll discuss internally about your use of PyMuPDF in a pipeline. If you're doing it, it needs to be supported.

Cool, thanks a lot for taking this use case into consideration. I'll follow up with a new PR in the meantime.

Perhaps we could control where logs/diagnostics go using environment variables, as i can't see a single solution that will address the different concerns that we've been discussing here.

The way that core Python libraries like urllib3 are handling this is via the logging infrastructure. By default nothing is logged, but if the user wants to see log messages, they can enable them via:

requests_log = logging.getLogger("requests.packages.urllib3")                                                                                                                                                                         
requests_log.setLevel(logging.DEBUG)                                                                                                                                                                                                  
requests_log.propagate = True

(this snippet actually targets requests, which uses urllib3 under the hood)

If the user wants even more logging (e.g., request/response bodies), they can further enable them with a switch:

http.client.HTTPConnection.debuglevel = 1

Just sharing those two ways in case they help.

apyrgio added a commit to apyrgio/PyMuPDF that referenced this issue Mar 7, 2024
Comment out a debug statement, so that PyMuPDF does not write to stdout
whenever a user initializes a Pixmap object.

Refs pymupdf#3135
julian-smith-artifex-com pushed a commit to apyrgio/PyMuPDF that referenced this issue Mar 8, 2024
Comment out a debug statement, so that PyMuPDF does not write to stdout
whenever a user initializes a Pixmap object.

Refs pymupdf#3135
julian-smith-artifex-com pushed a commit that referenced this issue Mar 8, 2024
Comment out a debug statement, so that PyMuPDF does not write to stdout
whenever a user initializes a Pixmap object.

Refs #3135
@julian-smith-artifex-com
Copy link
Collaborator

I've just pushed a change to branch main that should allow normal diagnostics ("messages") and debug/dev diagnostics ("logs") to be separately sent to file descriptors or files using environmental variables.

# PYMUPDF_MESSAGE controls the destination of user messages (the `message()`
# function).
#
# PYMUPDF_LOG controls the destination of internal development logging (the
# `log()` function).
#
# Each should be either `fd:<int>` to set to a file descriptor (e.g. `fd:1`
# for stdout, `fd:2` for stderr), `path:<string>` to write to a file or
# `path+:<string>` to append to a file. If not specified, the default is
# stdout.

Most of the time i'd expect only PYMUPDF_MESSAGE to be used. For example, setting it to fd:2 will send all user messages to stderr.

Regarding the logging module, i want to like it because it comes with Python, but i really struggle with its design and complexity; most of it seems to be stuff that i'd never want to use. PyMuPDF has simple messages which have worked for ages and unless we end up having a need for more control, we don't really need logging i think.

@apyrgio
Copy link
Contributor Author

apyrgio commented Mar 13, 2024

That's a great development, thanks for giving us this option Julian. I'll check it out once the new PyMuPDF release is out. In the meantime, feel free to close this issue.

@julian-smith-artifex-com
Copy link
Collaborator

Fixed in 1.24.0.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants