
feat(logging): Distinguish logs from different models #1302

Merged

Conversation

Contributor

@vtaskow vtaskow commented Jul 17, 2023

Why

When inspecting logs from different ML models, it is difficult to know which log line comes from which model. Currently, all such logs are grouped under the same logger, mlserver.

How

Generally, there are several ways to accomplish this behaviour (e.g. using a built-in adapter, formatter, or filter). Using a custom logger formatter together with context variables (already used in the metrics package via a context manager) is a simple solution to this problem. Furthermore, no external libraries or custom config files are needed.

Outcome

Within the context scope, where the model name and version are set via the context vars, all logs follow this format: [mlserver][model_name:model_version] LOGLEVEL MSG. E.g.
2023-07-17 14:23:10,102 [mlserver][mushroom-xgboost:v0.1.0] INFO - Loaded model 'mushroom-xgboost' successfully.
Or, if the model_version is not present:
2023-07-17 14:23:10,102 [mlserver][mushroom-xgboost] INFO - Loaded model 'mushroom-xgboost' successfully.
Furthermore, the model context is enabled for loading, unloading, and reloading models, as well as for prediction.
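A minimal sketch of the approach described above (names are illustrative, not the exact mlserver implementation; defaults are added here so logging outside a model context still works):

```python
import logging
from contextlib import contextmanager
from contextvars import ContextVar

# Hypothetical context variables, mirroring the PR's model_name_var/model_version_var
model_name_var: ContextVar[str] = ContextVar("model_name", default="")
model_version_var: ContextVar[str] = ContextVar("model_version", default="")


class ModelContextFormatter(logging.Formatter):
    """Inject [model_name:model_version] into each record when set."""

    def format(self, record: logging.LogRecord) -> str:
        name = model_name_var.get()
        version = model_version_var.get()
        if not name:
            record.model_fmt = ""
        elif not version:
            record.model_fmt = f"[{name}]"
        else:
            record.model_fmt = f"[{name}:{version}]"
        return super().format(record)


@contextmanager
def model_context(name: str, version: str = ""):
    """Scope within which logs are tagged with the model's name/version."""
    name_token = model_name_var.set(name)
    version_token = model_version_var.set(version)
    try:
        yield
    finally:
        model_name_var.reset(name_token)
        model_version_var.reset(version_token)
```

A handler configured with the format string `"%(asctime)s [mlserver]%(model_fmt)s %(levelname)s - %(message)s"` then produces log lines like the examples above.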

Resolves #602

@vtaskow vtaskow marked this pull request as ready for review July 17, 2023 15:20
mlserver/metrics/__init__.py (outdated review thread, resolved)
@adriangonz adriangonz self-requested a review July 17, 2023 15:24
@adriangonz adriangonz requested a review from agrski July 18, 2023 08:21
Contributor

@adriangonz adriangonz left a comment


Great work @vtaskow !! I've added a small comment, but this should be ready to go. 🚀

mlserver/logging.py (outdated review thread, resolved)
@adriangonz adriangonz merged commit 0258a64 into SeldonIO:master Jul 19, 2023
Contributor

@agrski agrski left a comment


Reviewing after the fact, but leaving a few suggestions that I think are worth discussing.

These are nice, simple changes - nice work @vtaskow 😄

I'm personally a big fan of structured, contextual logging, although I normally prefer it to be explicit, passing a configured logger into the appropriate logic. That's purely personal preference and a thought for offline discussion, rather than a comment on the changes in this PR.


model_name_var: ContextVar[str] = ContextVar("model_name")
model_version_var: ContextVar[str] = ContextVar("model_version")
from ..context import model_name_var, model_version_var

SELDON_MODEL_NAME_LABEL = "model_name"
Contributor


💭 Given that the name and version are used in both logging and metrics, it'd be nice to have constants (as far as Python allows them) to ensure the same labels are available in both contexts. If the context vars are the source of truth, that would be var.name and var.get(), which I think is pretty straightforward.
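A short sketch of what the reviewer's suggestion might look like, with the context vars as the single source of truth (the constant names below echo the diff, but the derivation is illustrative):

```python
from contextvars import ContextVar

# The vars themselves are the source of truth for both logging and metrics
model_name_var: ContextVar[str] = ContextVar("model_name")
model_version_var: ContextVar[str] = ContextVar("model_version")

# Hypothetical label constants derived from the vars, so the label keys
# used by the metrics package can never drift from the context vars
SELDON_MODEL_NAME_LABEL = model_name_var.name        # "model_name"
SELDON_MODEL_VERSION_LABEL = model_version_var.name  # "model_version"
```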

Comment on lines +41 to +42
A logging formatter that uses context variables to inject
the model name and version in the log message.
Contributor


I thought the general pattern for Python docstrings (PEP 257) is that the summary should be a single line, even for multi-line docstrings. E.g.

"""Log formatter incorporating model details, e.g. name and version."""

Contributor Author


I wasn't aware of this PEP rule. Good spot, I will refactor 👍

Comment on lines +47 to +49
if not name:
return ""
model_fmt = f"{name}:{version}" if version else name
Contributor


🙃 Personally I find this slightly hard to read, even if it's concise. Having three explicit cases would be much more immediately obvious to me. Assuming Python 3.10+, this could use match-case:

def fmt(name, version):
    match name, version:
        case n, _ if not n:
            return ""
        case n, v if not v:
            return f"[{n}]"
        case n, v:
            return f"[{n}:{v}]"

This also only needs one format string per case, which is a minor performance improvement and makes it slightly easier to see what's output.

For older Python versions, this could be:

if not name:
    return ""
elif not version:
    return f"[{name}]"
else:
    return f"[{name}:{version}]"

This is arguably the most legible and broadly compatible definition, thanks to the concise use of Boolean (truthiness) contexts.

Contributor Author


Oh, the pattern matching looks very elegant. Thanks, Alex! I will refactor it



@pytest.mark.parametrize(
"name, version, expected_fmt",
Contributor


🔧 The implementation of the model logger allows the model name to be empty. This case should either be captured by a test case or it shouldn't be permitted because it indicates something has gone wrong (or hasn't been wired up correctly).
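A sketch of how the empty-name case could be captured in the existing parametrized test (the `fmt` helper here is illustrative, mirroring the formatter logic discussed above, not the actual mlserver function):

```python
import pytest


# Illustrative helper mirroring the formatter logic discussed above
def fmt(name: str, version: str) -> str:
    if not name:
        return ""
    elif not version:
        return f"[{name}]"
    return f"[{name}:{version}]"


@pytest.mark.parametrize(
    "name, version, expected_fmt",
    [
        ("", "", ""),                  # empty name: no model prefix at all
        ("", "v0.1.0", ""),            # a version without a name is ignored
        ("mushroom-xgboost", "", "[mushroom-xgboost]"),
        ("mushroom-xgboost", "v0.1.0", "[mushroom-xgboost:v0.1.0]"),
    ],
)
def test_model_fmt(name, version, expected_fmt):
    assert fmt(name, version) == expected_fmt
```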

Contributor Author


Good point! I've missed that. Will fix in the incoming PR.

@dtpryce
Contributor

dtpryce commented Dec 14, 2023

I was just testing the master branch (1.4.0.dev3 or similar) and found that the model name does appear in logs for model loading, but not during inference. This is because the REST and gRPC servers inherit their own loggers, which are not configured through the mlserver logger. If you want the model name and version in those log lines too, the mlserver.rest and mlserver.grpc loggers need to use the same configuration.
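The fix suggested above could look roughly like this (a sketch under the assumption that a context-aware formatter is built elsewhere; logger names are taken from the comment):

```python
import logging


def configure_model_logging(formatter: logging.Formatter) -> None:
    """Apply one shared formatter across mlserver's loggers.

    The REST and gRPC servers log via their own child loggers
    (mlserver.rest, mlserver.grpc), so their handlers need the
    same context-aware formatter as the root mlserver logger.
    """
    for logger_name in ("mlserver", "mlserver.rest", "mlserver.grpc"):
        for handler in logging.getLogger(logger_name).handlers:
            handler.setFormatter(formatter)
```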

Successfully merging this pull request may close these issues.

Extend configurability of the built-in logger to account for runtime/model names