
Propagate logs to stdout when in k8s executor pod #28440

Merged

Conversation

@dstandish dstandish commented Dec 18, 2022

Currently, live logging (viewing logs from the webserver while the task is running) doesn't work for k8s executor pods.

The cause has a few components, all stemming from things we do during task run:

  1. We remove the console handler from the root logger, so any logs that make it to root don't get emitted.
  2. Even if we keep the console handler at root, it is designed to respect stdout redirection.
  3. Even if we make it not respect stdout redirection in this context, the task logger does not propagate, so anything going through the task logger never makes it to the root logger.

The reason for (3) is that we copy the task handler to the root logger instead of moving it, which requires that we disable propagation on the task logger.

This PR fixes all of these things and, ultimately, simplifies the setup.

  1. Instead of keeping the task handler in two places during execution -- airflow.task and root -- we keep it only at root. This means we can let the airflow.task logger propagate, which we now do always.
  2. When running in a k8s executor pod, we keep the console handler at root and set it to not respect redirection, so that even though stdout gets redirected to the task logger, once a record makes it to the root logger it is still emitted to stdout. (A sketch of this arrangement follows below.)
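
For illustration only, here is a minimal stdlib-logging sketch of the arrangement described above; the in_k8s_executor_pod flag and the "console" handler name are stand-ins for whatever Airflow actually checks and configures, not the PR's real code.

    import logging

    def configure_task_run_logging(in_k8s_executor_pod: bool) -> None:
        """Sketch: move task handlers to root; keep the console handler only in k8s pods."""
        task_logger = logging.getLogger("airflow.task")
        root_logger = logging.getLogger()

        # Remember the console handler that normally lives on the root logger.
        console_handler = next(
            (h for h in root_logger.handlers if h.name == "console"), None
        )

        # Move (not copy) the task handlers from airflow.task to root, so the
        # task logger can simply propagate and each record is handled exactly once.
        root_logger.setLevel(task_logger.level)
        root_logger.handlers[:] = task_logger.handlers
        task_logger.handlers.clear()
        task_logger.propagate = True

        if in_k8s_executor_pod and console_handler is not None:
            # Keep emitting to the pod's stdout even though the task handlers
            # took over the root logger; the webserver reads logs from the pod's stdout.
            root_logger.addHandler(console_handler)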

@dstandish dstandish force-pushed the enable-stdout-when-k8s-executor-worker branch 3 times, most recently from 9195ee5 to c90e631 on December 24, 2022 07:21
malthe commented Dec 29, 2022

@dstandish I think an abstraction point in the executor framework would be better than the if-statements in this changeset. That is, we should make the executor pluggable in terms of supporting this logging capability.

In terms of actual implementation, rather than adjusting log handlers at runtime, perhaps an internal pub/sub mechanism could be devised.

dstandish commented Dec 29, 2022

> @dstandish I think an abstraction point in the executor framework would be better rather than the if-statements in this changeset. That is, we should make the executor pluggable in terms of supporting this logging capability.
>
> In terms of actual implementation, rather than adjusting log handlers at runtime, perhaps an internal pub/sub mechanism could be devised.

We already adjust handlers; this fixes a deficiency.

This does not add meaningful backcompat surface area; it is basically a small adjustment to what we already do. Your suggestion may be a good one, but that's not what this PR does, and this PR does not in any way make your approach more difficult, less likely, or less practical to implement. Meanwhile, logging in the k8s executor is broken, and this fixes it without, as far as I can tell, much in the way of downside.

Comment on lines -298 to -304
    orig_level = root_logger.level
    root_logger.setLevel(task_logger.level)
    orig_handlers = root_logger.handlers.copy()
    root_logger.handlers[:] = task_logger.handlers
dstandish (Contributor, Author):

@malthe observe, we already copy (temporarily) to the root logger the handlers from airflow.task.

This causes problems because we leave the handlers on airflow.task as well, which forces us to have complicated propagation rules on that logger.

While my solution here is a few more lines (and a lot more comments), it's not very complicated. There are two parts:

  1. Instead of copying the handlers to root, I move them to root. We don't need them at airflow.task if they are already at root. This could ultimately allow for simplification of our propagation logic.
  2. Previously we removed our console handler from root at run time. Now, if we're in a k8s executor pod, I keep the console handler there.

Contributor:

I didn't quite understand what "stdout redirection" means – is that within Python, or is that something happening outside of Python (in a shell script, perhaps)?

When I think about this problem, it seems there is an orthogonal concern in the logging setup of Airflow, which is whether or not the task logger's output (during execution) should be emitted to the stdout stream.

That's something you could want in any situation, not just K8S.

dstandish (Contributor, Author):

> I didn't quite understand what "stdout redirection" means – is that within Python or is that something happening outside of Python (in a shell script perhaps) – ?

I am not sure which comment you refer to, but what this means is that there are stdlib helpers to take things that would ordinarily go to stdout (e.g. print) and send them to some other stream.

See from contextlib import redirect_stdout -- it just monkey-patches sys.stdout temporarily.

We use it (for better or worse) to redirect stdout to the task logs.
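
As a self-contained illustration of what that redirection does (the LogWriter class below is a hypothetical stand-in for Airflow's actual stream-to-logger shim, not its real name):

    import logging
    from contextlib import redirect_stdout

    class LogWriter:
        """Minimal file-like object that forwards writes to a logger."""

        def __init__(self, logger: logging.Logger) -> None:
            self.logger = logger

        def write(self, message: str) -> None:
            # Skip the bare newlines that print() emits as separate writes.
            if message.strip():
                self.logger.info(message.rstrip())

        def flush(self) -> None:
            pass

    logging.basicConfig(level=logging.INFO)
    task_logger = logging.getLogger("airflow.task")

    # redirect_stdout temporarily swaps sys.stdout for our writer, so plain
    # print() calls become records on the task logger instead of console output.
    with redirect_stdout(LogWriter(task_logger)):
        print("this line ends up in the task log, not on the terminal")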

Our log read logic while a task is running is usually to read from the flask log server. Most task logging is redirected to the log (and therefore to the file), and our "console" handler respects this redirection, for obvious reasons -- otherwise the celery worker log or local executor log would get unreasonably chatty. But in the k8s executor context there is no log server on the worker, there is no problem with keeping the stdout output, and, importantly, our log read logic assumes everything will be forwarded to stdout -- and that is what's broken and what I am fixing here.
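
For what it's worth, one way to get a console handler that ignores the redirection (a rough sketch of the intent only, not necessarily how this PR implements it) is to bind the handler to the interpreter's original stdout instead of resolving sys.stdout at emit time:

    import logging
    import sys

    # sys.__stdout__ is the stream the interpreter started with, so it is not
    # affected when redirect_stdout() monkey-patches sys.stdout. Records that
    # propagate up to the root logger are still written to the pod's stdout.
    console_to_pod_stdout = logging.StreamHandler(stream=sys.__stdout__)
    console_to_pod_stdout.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s - %(message)s")
    )
    logging.getLogger().addHandler(console_to_pod_stdout)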

@dstandish

OK, updated the description, so hopefully it's a little clearer what's going on.

@dstandish dstandish force-pushed the enable-stdout-when-k8s-executor-worker branch from 1f50899 to 735a395 on December 30, 2022 06:35
@dstandish dstandish force-pushed the enable-stdout-when-k8s-executor-worker branch from 29239b0 to 2a514f1 on December 31, 2022 07:41
@dstandish dstandish force-pushed the enable-stdout-when-k8s-executor-worker branch from 1d8461a to 776b819 on January 4, 2023 05:51
@ashb ashb left a comment


A few small nits, but looks good to me overall.

    # processing hundreds of simultaneous tasks.
    # this should be last thing before running, to reduce likelihood of an open session
    # which can cause trouble if running process in a fork.
    settings.reconfigure_orm(disable_connection_pool=True)
Member:

Is this related to this logging PR in any way?

dstandish (Contributor, Author):

I don't think anything will fail if we remove this change. You could call it a drive-by, and I can remove it if you like.

The task runner tests failed with this change. Then I was like, "oh, why is it just throwing away the whole log config, let's try not to do that". After leaving the FTH (FileTaskHandler) intact for that test, I discovered there was an extraneous session being created in _render_filename. As part of troubleshooting the failures and discovering that, though, I also saw that this reconfigure happens earlier than it probably ought to; it should be done as late as possible, I think.

Outdated review threads (resolved): airflow/utils/log/file_task_handler.py, tests/cli/commands/test_task_command.py (two threads), tests/task/task_runner/test_standard_task_runner.py
dstandish and others added 6 commits January 9, 2023 09:20
Co-authored-by: Ash Berlin-Taylor <ash_github@firemirror.com>
Co-authored-by: Ash Berlin-Taylor <ash_github@firemirror.com>
Co-authored-by: Ash Berlin-Taylor <ash_github@firemirror.com>
@dstandish dstandish merged commit 3ececb2 into apache:main Jan 9, 2023
@dstandish dstandish deleted the enable-stdout-when-k8s-executor-worker branch January 9, 2023 20:27
@pierrejeambrun pierrejeambrun added the type:misc/internal label (Changelog: Misc changes that should appear in change log) Jan 9, 2023
dstandish added a commit to astronomer/airflow that referenced this pull request Jan 27, 2023
…text

After apache#28440, instead of having a task handler both at `airflow.task` and on the root logger, we only have it on the root logger. This means we can remove the logic that sets propagate to False, because there's no longer a risk of a record being processed by the FTH twice. It also means we can remove the logic that walks up the logger hierarchy and sets context, because we don't need to hit both airflow.task and root -- there will only ever be one such handler instance. So, in effect, we deprecate the MAINTAIN_PROPAGATE logic and no longer set propagate=False by default.

While we could probably remove the "DISABLE_PROPAGATE" logic too (it's only used by the file processor), it doesn't really hurt to leave it.

(cherry picked from commit b9ed441f9127503f55e338f728e68f10bc77f3df)
Labels: area:CLI, area:logging, type:misc/internal