Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix stacklevel in logging to log the actual user call site (instead of the call site inside the logger wrapper) of log functions #2730

Merged
merged 2 commits into from
May 6, 2024

Conversation

luowyang
Copy link
Contributor

@luowyang luowyang commented May 1, 2024

Fix stacklevel in logging

Fixes #2114

Currently, accelerate.logging does not correctly log the user call site of log functions such as .log() and .info(). Consider the following example modified from issue #2114:

import logging

from accelerate import Accelerator
from accelerate.logging import get_logger
from rich.logging import RichHandler

logger = get_logger(__name__)

if __name__ == "__main__":
    accelerator = Accelerator()
    logging.basicConfig(
        format="%(filename)s:%(name)s:%(lineno)s - %(message)s",
        datefmt="%m/%d %H:%M:%S",
        level=logging.INFO,
        handlers=[RichHandler(show_time=True, show_level=True, show_path=True)],
    )
    logger.info("Test")

In its current state, the above script shows something like

05/01 19:26:22 INFO     logging.py:__main__:61 - Test

The calling site of the logging function .info() is incorrectly shown as line 61 of accelerate.logging.MultiProcessAdapter.log, which is the calling site inside the logger wrapper MultiProcessAdapter. This happens because the stacklevel argument of the logging function is set to its default value 1, which shows the calling site of the wrapped logger's .log(). See the docs of Python logging for more details of what stacklevel means.

There exists a simple fix to the above issue: modify stacklevel in MultiProcessAdapter.log to 2 (skip MultiProcessAdapter.log in the calling stack), unless the users explicitly provide their own stacklevel. This fix is summarized as one line of additional code:

kwargs.setdefault("stacklevel", 2)

After applying the above fix, the script mentioned earlier prints the correct message (test_logging.py is the script's name):

05/01 19:27:00 INFO     test_logging.py:__main__:17 - Test

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings. No. This PR is a bug fix and contains no API change.
  • Did you write any new necessary tests? No. This PR is a simple fix that does not change any expected behavior.

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@kiyoon @BenjaminBossan

@muellerzr muellerzr requested a review from BenjaminBossan May 1, 2024 12:20
@BenjaminBossan
Copy link
Member

BenjaminBossan commented May 2, 2024

Sounds good, thanks for the fix. Would you be so kind to add a test for this bug? Your example should be a good template for testing.

The failing doc CI should be fixed if you merge with/rebase on main.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@luowyang
Copy link
Contributor Author

luowyang commented May 2, 2024

Sounds good, thanks for the fix. Would you be so kind to add a test for this bug? Your example should be a good template for testing.

The failing doc CI should be fixed if you merge with/rebase on main.

During writing tests for logging, I notice another possible bug: the processes hang if one passes main_process_only=True and in_order=True to the logging functions (and there is more than one process). An example is the following script:

import logging

from accelerate import Accelerator
from accelerate.logging import get_logger
from rich.logging import RichHandler

logger = get_logger(__name__)

if __name__ == "__main__":
    accelerator = Accelerator()
    logging.basicConfig(
        format="%(filename)s:%(name)s:%(lineno)s - %(message)s",
        datefmt="%m/%d %H:%M:%S",
        level=logging.INFO,
        handlers=[RichHandler(show_time=True, show_level=True, show_path=True)],
    )
    logger.info(f"process index: {accelerator.process_index}", in_order=True, main_process_only=True)

The reason is that all processes other than the main process wait for the main process in this case, but the main process goes into another conditional branch that never waits the remaining processes.

Since this PR is specific to stacklevel, I will leave this issue now, and omit it in the tests. But I think it should be fixed some time later.

@luowyang
Copy link
Contributor Author

luowyang commented May 2, 2024

Sounds good, thanks for the fix. Would you be so kind to add a test for this bug? Your example should be a good template for testing.

The failing doc CI should be fixed if you merge with/rebase on main.

@BenjaminBossan Tests are added for logging. However, note that some bugs exist with in_order as mentioned above, so I omit tests for it.

Copy link
Member

@BenjaminBossan BenjaminBossan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding tests to this PR. In fact, you added quite a lot of tests, since there are no existing dedicated tests for the logger (at least that I could find, ping @muellerzr).

However, I feel like this extensive testing detracts from the core issue that we want to check in this PR and I would suggest to create a separate PR to add general tests for the logger (and potentially solve the bug you found).

For this specific PR, I think a very simple, focused test should be sufficient, for instance:

import logging
import pytest

from accelerate import Accelerator
from accelerate.logging import MultiProcessAdapter, get_logger


@pytest.fixture(scope="module")
def accelerator():
    return Accelerator()


@pytest.mark.usefixtures("accelerator")
def test_log_stack(caplog):
    logger = get_logger(__name__)
    logging.basicConfig(
        format="%(filename)s:%(name)s:%(lineno)s - %(message)s",
        datefmt="%m/%d %H:%M:%S",
    )
    logger.warning("Test")
    assert len(caplog.records) == 1
    rec = caplog.records[0]
    assert rec.filename == <name of test file>
    # ... more attributes if so desired

Note that I use pytest to simplify some stuff and if you indeed want to create a separate PR to add general tests, I would recommend to take a similar approach. As is, I find the tests hard to understand, as they use a lot of custom classes and use the logger in a very different way from what an accelerate user would typically do.

)


def assert_equals(value, expected, prefix=""):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need this, pytest already generates readable error messages.

Copy link
Contributor Author

@luowyang luowyang May 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not quite an expert in testing, thanks for your comments. I have re-written the tests into two simple tests: 1) check the logged message is correct in a normal setting; and 2) check the logged message is correct when the user overrides stacklevel in a custom logger wrapper. These tests are most relevant to this PR, and hopefully they can serve as a starting point for future tests.

@luowyang luowyang force-pushed the fix-logging-stacklevel branch 2 times, most recently from f3bfc2c to 0667340 Compare May 3, 2024 19:03
@luowyang luowyang force-pushed the fix-logging-stacklevel branch from 0667340 to 17a8f45 Compare May 3, 2024 19:38
@luowyang luowyang requested a review from BenjaminBossan May 4, 2024 14:37
Copy link
Member

@BenjaminBossan BenjaminBossan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for simplifying the tests, they are much easier to understand now.

Something I like to do with tests for bugfixes is to add a comment that references the original issue to quickly find out why a test exists, but it's not a must have.

@muellerzr please check if this can be merged.

@BenjaminBossan BenjaminBossan requested a review from muellerzr May 6, 2024 10:44
Copy link
Collaborator

@muellerzr muellerzr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for doing this and nicely done with the tests!

@muellerzr muellerzr merged commit 0f1b91b into huggingface:main May 6, 2024
23 checks passed
@luowyang luowyang deleted the fix-logging-stacklevel branch May 6, 2024 13:39
muellerzr pushed a commit that referenced this pull request May 10, 2024
… of the call site inside the logger wrapper) of log functions (#2730)

* fix stacklevel in logging to log info about the actual user callsite

* Add two tests for stacklevel in logging

---------

Co-authored-by: luowyang <luowyang@github.com>
muellerzr pushed a commit that referenced this pull request May 10, 2024
… of the call site inside the logger wrapper) of log functions (#2730)

* fix stacklevel in logging to log info about the actual user callsite

* Add two tests for stacklevel in logging

---------

Co-authored-by: luowyang <luowyang@github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Logging does not respect the "%(lineno)s" and "%(filename)s".
4 participants