-
-
Notifications
You must be signed in to change notification settings - Fork 231
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improving stdlib integration #395
Comments
name should be easy, but isn't the second problem almost there with https://www.structlog.org/en/stable/standard-library.html#rendering-using-structlog-based-formatters-within-logging? Might be just a matter of documentation? 🤔 |
Hm, close. All of these terminate in a stdlib handler. It's also making the flow complex - the last diagram is My thought was that if we write a So my code would be |
Yeah I guess it could be kept simpler. Just |
We’ve been using structlog for a few projects since several years now and I’ve been wanting to write an article about this for some time. Our requirements (wich are quite similar to @Tinche’s:
Here is a (nearly) minimal example for how we setup logging: import logging
import functools
import sys
import logging.config
from datetime import UTC, datetime
import orjson
import structlog
import structlog._log_levels
from structlog.processors import _json_fallback_handler as orjson_fallback_handler
from structlog.tracebacks import ExceptionDictTransformer
from structlog.types import EventDict, Processor, WrappedLogger
import structlog
import structlog._log_levels
def render_orjson(
dict_tracebacks: structlog.processors.ExceptionRenderer,
/,
logger: WrappedLogger,
name: str,
event_dict: EventDict,
) -> str:
"""
Render log records as JSON with ``orjson`` and use dict tracebacks.
This function must be wrapped with :func:`functools.partial()` and the
*dict_tracebacks* function has to be passed to it::
"processor": functools.partial(render_orjson, dict_tracebacks),
We cannot use :class:`structlog.processors.JSONRenderer` because it returns bytes
and the :class:`structlog.stdlib.LoggerFactory` needs strings. There is also the
:class:`structlog.BytesLoggerFactory` but we cannot combine them.
We also do the dict traceback formatting here, which allows us to simplify the
stdlib log hander config::
"()": structlog.stdlib.ProcessorFormatter,
"processor": functools.partial(render_orjson, dict_tracebacks),
vs::
"()": structlog.stdlib.ProcessorFormatter,
"processors": [
structlog.stdlib.ProcessorFormatter.remove_processors_meta,
structlog.processors.dict_tracebacks,
render_orjson,
],
"""
event_dict = dict_tracebacks(logger, name, event_dict)
if event_dict.get("exception"):
e = event_dict["exception"][0]
event_dict["exc_type"] = e.get("exc_type", "?")
event_dict["exc_value"] = e.get("exc_value", "?")
try:
# Create a new event dict with improved key sorting
# because even JSON logs will also be viewed by humans. ;-)
sorted_dict: EventDict = {
"timestamp": event_dict["timestamp"],
"level": event_dict["level"],
"event": event_dict["event"],
**event_dict,
}
except KeyError:
# pytest-structlog removes the "timestamp" key.
# In that case, just use the original event dict.
sorted_dict = event_dict
return orjson.dumps(sorted_dict, default=orjson_fallback_handler).decode()
def add_extra_fields(
logger: logging.Logger, method_name: str, event_dict: EventDict
) -> EventDict:
"""
A structlog processor that adds extra info fields to the event dict.
This stems largely from structlog but this way, it is more customizable and maybe
a bit faster, too.
"""
# ISO UTC tiemstamp
# structlog.processors.TimeStamper(fmt="iso", utc=True),
event_dict["timestamp"] = datetime.now(tz=UTC).isoformat().replace("+00:00", "Z")
# Logger name
# structlog.stdlib.add_logger_name,
record = event_dict.get("_record")
if record is None:
event_dict["logger"] = logger.name
else:
event_dict["logger"] = record.name
# Log level in UPPER CASE
# structlog.stdlib.add_log_level,
if method_name == "warn":
# The stdlib has an alias
method_name = "warning"
elif method_name == "exception":
# exception("") method is the same as error("", exc_info=True)
method_name = "error"
event_dict["level"] = method_name.upper()
# Log level number
# structlog.stdlib.add_log_level_number, # not needed right now
# event_dict["level_number"] = structlog._log_levels._NAME_TO_LEVEL[method_name]
return event_dict
def setup_logging() -> None:
shared_processors: list[Processor] = [
add_extra_fields,
structlog.stdlib.ExtraAdder(),
]
# Setup structlog
#
# The performance guide (https://www.structlog.org/en/stable/performance.html)
# suggest to not pump structlog log entries through stdlib and use
# a BytesLoggerFactory instead.
#
# However, that would not work for us, because structlog would than only use single
# logger instance in all libs and apps. But we want to be able to configure
# different log levels for our app and the various libs. This only works if we
# use the stdlib logger factory that can create different loggers with different
# names.
pre_chain: list[Processor] = [
structlog.stdlib.filter_by_level,
structlog.contextvars.merge_contextvars,
]
post_chain: list[Processor] = [
structlog.stdlib.PositionalArgumentsFormatter(),
structlog.processors.StackInfoRenderer(),
structlog.stdlib.ProcessorFormatter.wrap_for_formatter,
]
structlog.configure(
cache_logger_on_first_use=True,
# The filtering_bound_logger is fast, but it does not allow us to change the
# log level at a later stage since the filtering is hard code in the logger
# instances themselves.
# In addition, we'd have to compute the minimum log level for all loggers and
# use this, which might drastically reduce the positive performance impact.
# wrapper_class=structlog.make_filtering_bound_logger(logging.INFO),
wrapper_class=structlog.stdlib.BoundLogger,
processors=pre_chain + shared_processors + post_chain,
context_class=dict,
logger_factory=structlog.stdlib.LoggerFactory(),
)
dict_tracebacks = structlog.processors.ExceptionRenderer(
ExceptionDictTransformer(
# ... # Config from settings
)
)
rich_tracebacks = structlog.dev.RichTracebackFormatter(
# ... # Config from settings
)
# Setup stdlib
cfg = {
"version": 1,
"disable_existing_loggers": False,
"formatters": {
"json": {
"()": structlog.stdlib.ProcessorFormatter,
"processor": functools.partial(render_orjson, dict_tracebacks),
"foreign_pre_chain": shared_processors,
},
"colored": {
"()": structlog.stdlib.ProcessorFormatter,
"processor": structlog.dev.ConsoleRenderer(
colors=True, exception_formatter=rich_tracebacks
),
"foreign_pre_chain": shared_processors,
},
},
"handlers": {
"stream": {
"formatter": "colored" if sys.stdout.isatty() else "json",
"class": "logging.StreamHandler",
"stream": "ext://sys.stderr",
},
},
"root": {
"handlers": ["stream"],
"level": logging.INFO, # Acutally loaded from settings
},
# Configure log levels for individual loggers from settings
# "loggers": {name: {"level": level.value} for name, level in levels.items()},
}
logging.config.dictConfig(cfg)
def main() -> None:
setup_logging()
logging.getLogger("app").info("logging")
structlog.get_logger("app").info("structlog")
if __name__ == "__main__":
main() |
This is more of a discussion than an issue to be fixed.
I have recently refactored how our services log. Other folks might choose to do so too (and we could do some marketing to nudge them), so it might be useful to polish up this use case.
All our modules use structlog to perform logging calls. Before, the logs would pass through structlog and then enter the standard library logging system, where they would be printed out by a handler.
The flow was:
structlog -> logging -> stdout
.Now, we still use structlog, but the messages never touch the standard library - they get emitted by the PrintLogger. As you yourself noted, it's faster and there's 0 reason for us to go through the stdlib.
So now the flow is:
structlog -> stdout
.While doing this, I encountered two issues. One is that the standard library loggers support
name
, and thePrintLogger
doesn't. I solved that one by subclassing and sticking a name attribute on it. I think logger names are very useful and maybe we could improve the situation somehow. Not necessarily adding a name attribute toPrintLogger
, maybe we can be more clever somehow (maybe inject the 'name' key into a context somewhere when creating the logger? Not sure). But this required a little boilerplate and it got done.The second issue is: other libraries still use the standard library to log. This was actually a problem before as well, but I just realized it as I was doing the transition. There's nothing we can really do about that. But I would like those logs to honor my structlog configuration.
So I was thinking we should write a
StructlogHandler
class that users can register with the stdlib logging system. It'd essentially just forward the logs into structlog. I want my exceptions to be a nice little JSON line instead of a multiline stack trace like it is by default.The flow for 3rd party logs would be:
logging -> structlog -> stdout
.The main benefit would be configuring the logging output only once.
The text was updated successfully, but these errors were encountered: