Member

@ashb ashb commented Jun 19, 2025

When we are running normally (without impersonation), the supervisor sets up a new socketpair for logging before forking, and the task process then configures structlog in the forked process to send logs over that socket. This all works because forking gives the new process a copy of all open file descriptors.
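As a rough illustration of that normal path (not the actual supervisor code), here is a minimal sketch: the socketpair is created before `os.fork()`, so the child inherits its end and can write a JSON log line to it. The function name is made up for this example, and plain `json.dumps` stands in for the structlog configuration.

```python
import json
import os
import socket


def run_child_with_log_socket(message: str) -> dict:
    """Fork a child that sends one JSON log line over an inherited socket."""
    # Create the pair *before* forking so the child gets a copy of its end.
    parent_end, child_end = socket.socketpair()

    pid = os.fork()
    if pid == 0:
        # Child: fork copied both descriptors; keep only our own end.
        try:
            parent_end.close()
            line = json.dumps({"event": message, "level": "info"}) + "\n"
            child_end.sendall(line.encode())
            child_end.close()
        finally:
            os._exit(0)

    # Parent (the "supervisor" in this sketch): drop the child's end and read.
    child_end.close()
    with parent_end, parent_end.makefile("r") as reader:
        received = reader.readline()
    os.waitpid(pid, 0)
    return json.loads(received)
```

Calling `run_child_with_log_socket("hello")` returns the decoded record that the child wrote, which only works because the descriptor survived the fork.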

However, sudo by default closes all open file descriptors other than stdin, stdout and stderr, so our logs socket gets closed too (sockets, like files, are file descriptors).
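The effect can be reproduced without sudo. The sketch below is only an analogy: Python's `subprocess` also closes every descriptor above stderr before exec unless it is listed in `pass_fds`, which is similar to what sudo does by default. The helper name and the child script are invented for this example.

```python
import socket
import subprocess
import sys

# Child program: report whether the given descriptor number is an open socket.
CHECK_FD = """
import os, stat, sys
try:
    mode = os.fstat(int(sys.argv[1])).st_mode
    print("open" if stat.S_ISSOCK(mode) else "closed")
except OSError:
    print("closed")
"""


def log_socket_state_after_exec(keep_open: bool) -> str:
    """Start a new interpreter and ask it whether our log socket survived."""
    supervisor_end, task_end = socket.socketpair()
    with supervisor_end, task_end:
        fd = task_end.fileno()
        result = subprocess.run(
            [sys.executable, "-c", CHECK_FD, str(fd)],
            # An empty pass_fds closes everything above stderr in the child,
            # much like sudo's default behaviour.
            pass_fds=(fd,) if keep_open else (),
            capture_output=True,
            text=True,
            check=True,
        )
    return result.stdout.strip()
```

With `keep_open=False` the child reports the descriptor as closed; with `keep_open=True` it is still an open socket.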

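We could ask people to change their sudoers config file to enable closefrom_override (https://linux.die.net/man/5/sudoers#:~:text=on%20by%20default.-,closefrom_override,is%20off%20by%20default.,-compress_io'%20If%20set) and invoke sudo -C <logfd>; however, many people either might not have access to do this, or might not feel comfortable making this change.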

There is, however, another option open to us: on both Unix and Windows it is possible to pass open file descriptors (remember, sockets are file descriptors) between two processes!
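On Unix, one way to do this from Python is `socket.send_fds` / `socket.recv_fds` (Python 3.9+, Unix only), which wrap `SCM_RIGHTS` ancillary data. A minimal sketch, with an invented function name, that hands a pipe opened after the fork to the child:

```python
import os
import socket


def pass_pipe_to_child(payload: bytes) -> bytes:
    """Send the write end of a pipe to a forked child over a Unix socket."""
    control_parent, control_child = socket.socketpair()

    pid = os.fork()
    if pid == 0:
        try:
            control_parent.close()
            # Receive one descriptor as ancillary data (SCM_RIGHTS).
            _msg, fds, _flags, _addr = socket.recv_fds(control_child, 1024, 1)
            os.write(fds[0], payload)
            os.close(fds[0])
        finally:
            os._exit(0)

    control_child.close()
    # Opened *after* the fork, so the child cannot have inherited it.
    read_fd, write_fd = os.pipe()
    socket.send_fds(control_parent, [b"fd"], [write_fd])
    os.close(write_fd)  # the in-flight copy keeps the pipe open for the child

    with os.fdopen(read_fd, "rb") as reader:
        data = reader.read()  # returns once the child closes its copy
    os.waitpid(pid, 0)
    control_parent.close()
    return data
```

The child writes to a descriptor it never inherited, which is the property that matters once sudo has closed everything else.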

So this PR introduces a new Request and Response pair, and customizes the send+receive code to send a new FD that is configured to receive and handle JSON logs. A new FD is needed because we have already closed the child end during normal start-up, before we knew the task was actually going to run as another user; we can't get it back, so we just open another.
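To make the shape of that exchange concrete, here is a hedged sketch. The message names `ResendLogFD` and `SentLogFD`, the JSON framing, and all function names are hypothetical and are not taken from the PR; `os.fork()` stands in for the sudo-launched task process, and `socket.send_fds` requires Python 3.9+ on Unix.

```python
import json
import os
import socket
from dataclasses import asdict, dataclass


@dataclass
class ResendLogFD:  # hypothetical request name
    type: str = "ResendLogFD"


@dataclass
class SentLogFD:  # hypothetical response name
    type: str = "SentLogFD"


def supervisor_handle(comms: socket.socket) -> socket.socket:
    """Answer one request with a fresh log socket; return the end we read."""
    request = json.loads(comms.recv(1024))
    if request["type"] != "ResendLogFD":
        raise ValueError(f"unexpected request: {request!r}")
    read_end, child_end = socket.socketpair()
    response = json.dumps(asdict(SentLogFD())).encode()
    # The response body and the new descriptor travel in one message.
    socket.send_fds(comms, [response], [child_end.fileno()])
    child_end.close()
    return read_end


def task_request_log_socket(comms: socket.socket) -> socket.socket:
    """Ask the supervisor for a new log socket and wrap the received FD."""
    comms.sendall(json.dumps(asdict(ResendLogFD())).encode())
    data, fds, _flags, _addr = socket.recv_fds(comms, 1024, 1)
    if json.loads(data)["type"] != "SentLogFD":
        raise ValueError("unexpected response")
    return socket.socket(fileno=fds[0])


def demo() -> dict:
    supervisor_comms, task_comms = socket.socketpair()
    pid = os.fork()
    if pid == 0:
        try:
            supervisor_comms.close()
            log_sock = task_request_log_socket(task_comms)
            record = {"event": "running as another user"}
            log_sock.sendall(json.dumps(record).encode() + b"\n")
            log_sock.close()
        finally:
            os._exit(0)

    task_comms.close()
    log_reader = supervisor_handle(supervisor_comms)
    # Supervisor side: read and decode one JSON log line from the new socket.
    with log_reader, log_reader.makefile("r") as lines:
        received = json.loads(lines.readline())
    os.waitpid(pid, 0)
    supervisor_comms.close()
    return received
```

`demo()` returns the record the task wrote over the socket it received on request, rather than one it inherited.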

Relates to #51780

Enough with the words, let's see what it does.

Logging from a task

Before

[Screenshot: before-logging-in-task]

Note the doubled timestamp, level, etc. (one copy formatted nicely by the UI, the other embedded in the log message).

After

[Screenshot: after-logging-in-task]

The two highlighted sections come from the same source in each task run.

Unhandled exception

Now on to the real "ick". An unhandled exception in an operator/task:

Before

[Screenshot: before-uncaught-exc]

Ick. Not helped by rich at all here

After

[Screenshot: after-uncaught-exc]

Member Author

ashb commented Jun 19, 2025

This is currently draft as I have added zero unit tests, but I need to step away from my desk

Done.

@ashb ashb force-pushed the user-impersonation-log-handling branch from 0247226 to c445203 Compare June 19, 2025 12:22
…xt-over-stdout

@ashb ashb force-pushed the user-impersonation-log-handling branch from c445203 to a9eccd3 Compare June 19, 2025 13:24
@ashb ashb marked this pull request as ready for review June 19, 2025 13:25
@ashb ashb requested review from amoghrajesh and kaxil as code owners June 19, 2025 13:25
Contributor

@amoghrajesh amoghrajesh left a comment

Thanks, looks good

@ashb ashb merged commit 348b292 into apache:main Jun 19, 2025
159 of 162 checks passed
@ashb ashb deleted the user-impersonation-log-handling branch June 19, 2025 15:39
RoyLee1224 pushed a commit to RoyLee1224/airflow that referenced this pull request Jun 21, 2025
…xt-over-stdout (apache#51934)

@kaxil kaxil added this to the Airflow 3.0.3 milestone Jul 2, 2025
kaxil pushed a commit that referenced this pull request Jul 2, 2025
…xt-over-stdout (#51934)


(cherry picked from commit 348b292)