-
Notifications
You must be signed in to change notification settings - Fork 14.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CLI airflow scheduler -D --pid <PIDFile>
fails silently if PIDFile given is a relative path
#13200
Comments
Fixing this issue is straigth-forward: when a PID file path is provided, always extend it to a full absolute path first. But I would like to have someone to help confirm they can reproduce the error first, meanwhile I will find out the root-cause as well of course. |
Confirmed. The reason is that default "working_directory" is used in in DaemonContext
By default the "working_directory" is set to "/" which means that the pid file is attempted to be created as "/" - which for non-root user will fail. This works for example in Breeze as we use root user there, but in production setup with non-root user it will fail. I think we should assume (extending to the code in Line 222 in 97eee35
|
Thanks @potiuk for helping identify the root cause. I have different opinion: if it's a relative path, it should be relative to current directory where the user is running the command. I.e., as I mentioned above, if it's relative path X, extend it to CURRENT_DIR/X (rather than assuming it to be AORFLOW_HOME) Let me know if you agree? |
I think when you deamonize a process, you actually want to run it in a "specific" directory. The whole idea about running process as a daemon in *NIX is to detach it completely from the originating process/command (that's why for example if you want to make a really detached daemon you should fork it twice https://stackoverflow.com/questions/881388/what-is-the-reason-for-performing-a-double-fork-when-creating-a-daemon to avoid zombies (making sure that the daemon process is parented by init process). That's why also the "DaemonContext' has the "/" as working directory set. This is not a mistake - it is pretty deliberate behaviour. For example such daemonized process should survive and work if someone decides do delete the whole "current directory" whatever it was. The current working directory is pretty much accidental, so creating an output of "daemon" there is not really what I'd expect. |
I'm more thinking from an "expectation" perspective: Say I'm in directory "/home/xd" (my AIRFLOW_HOME may be On the other hand, when we daemonize this process, the only thing to do with path/directory is this file where PID is stored. We don't run a process in a "specific" directory. |
So maybe it's better to require a non-relative file and fail with error if it is not ? |
Haha, I agree that's a "compromise" which can make both of us happy. Why not😉 I will create a PR shortly tonight for this. |
I view it as the opposite. Every* unix program that deals with filenames on the command line treats a non-absolute path input as relative to the current directory. We should behave the same. (The fact the program is deamonized is an implementation detail.) * every one I can think of |
I think we should take https://linux.die.net/man/1/daemonize as the canonical behaviour. It supports both - changing the working directory (default as "/") and specifying pidfile at the command line. It is not really specified in the man page what happens if you specify relative path, but maybe we can check and do the same. |
But for our case, it may not be a good idea to further expose "working directory" or allow changing it. Making it simple should be better. I prefer to my very first proposal (if the path is relative, expand it with the current directory) which aligns with @ashb 's suggestion. Meanwhile I'm open to different solution if we come to an agreement by majority. |
I am not sure we should reinvent the wheel. I just wonder what "deamonize" does when you do not specify the working dir. I think we should do exactly the same. Not sure how relative paths are treated by it, but I believe we should do exactly what it does. |
What deamonize does is not relevant for the behaviour we want. We are accepting a file input from the command line: what is the most common pattern, and therefor least friction/most expected to a user. |
I start to doubt if we are discussing on the same thing now :-) |
Since you asked though - daemonize(1) creates the pid file, then changes directory -- i.e. relative pid file would be relative to the cwd. https://github.com/bmc/daemonize/blob/master/daemonize.c#L461-L500 |
No strong opinion there :). If you want to use relative from cwd, I am fine with it :). |
Yes please - let's do that then @XD-DENG |
…r, kerberos, etc) Closes apache#13200. Currently, if the PID file path provided is relative, the process silently fails to launch. This fix ensures the PID path specified to be a normalized absolutized path eventually (expand with cwd), no matter the given value is relative or absolute.
…r, kerberos, etc) (apache#13232) Closes apache#13200. Currently, if the PID file path provided is relative, the process silently fails to launch. This fix ensures the PID path specified to be a normalized absolutized path eventually (expand with cwd), no matter the given value is relative or absolute. (cherry picked from commit 93e4787)
…r, kerberos, etc) (#13232) Closes #13200. Currently, if the PID file path provided is relative, the process silently fails to launch. This fix ensures the PID path specified to be a normalized absolutized path eventually (expand with cwd), no matter the given value is relative or absolute. (cherry picked from commit 93e4787)
Apache Airflow version: 2.0.0
Environment: Linux & MacOS, venv
uname -a
):What happened:
Say I'm in my home dir, running command
airflow scheduler -D --pid test.pid
(test.pid
is a relative path) is supposed to start the scheduler in daemon mode, and the PID will be stored in the filetest.pid
(if it doesn't exist, it should be created).However, the scheduler is NOT started. This can be validated by running
ps aux | grep airflow | grep scheduler
(no process is shown). In the whole process, I don't see any error message.However, if I change the pid file path to an absolute path, i.e.
airflow scheduler -D --pid ${PWD}/test.pid
, it successfully start the scheduler in daemon mode (can be validated via the method above).What you expected to happen:
Even if the PID file path provided is a relative path, the scheduler should be started properly as well.
How to reproduce it:
Described above
Anything else we need to know:
The text was updated successfully, but these errors were encountered: