-
Notifications
You must be signed in to change notification settings - Fork 133
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fsmonitor/darwin: fix hangs for submodules #1802
fsmonitor/darwin: fix hangs for submodules #1802
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's awesome to see someone working on FSMonitor fixes.
By the way, you can ignore the osx-*
failures: They are happening due to a known issue with libcurl v8.10.0 and will go away once the macos-13
runners are updated to include v8.10.1 instead (I expected this to be happening on Sep 23rd or soon thereafter, but it seems that there are delays).
builtin/fsmonitor--daemon.c
Outdated
buf = state.path_gitdir_watch.buf; | ||
len = state.path_gitdir_watch.len; | ||
if (len >= 2 && buf[len - 2] == '/' && buf[len - 1] == '.') | ||
strbuf_setlen(&state.path_gitdir_watch, len - 2); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about using strbuf_strip_suffix()
instead?
strbuf_strip_suffix(&state.path_gitdir_watch, "/.");
Also, did you want to add tests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you! I'll update the commit.
142c518
to
1889cbb
Compare
/preview |
Preview email sent as pull.1802.git.1727576141637.gitgitgadget@gmail.com |
Preview email sent as pull.1802.git.1727576180808.gitgitgadget@gmail.com |
/submit |
Submitted as pull.1802.git.1727577690390.gitgitgadget@gmail.com To fetch this version into
To fetch this version to local tag
|
On the Git mailing list, Junio C Hamano wrote (reply to this): "Koji Nakamaru via GitGitGadget" <gitgitgadget@gmail.com> writes:
> From: Koji Nakamaru <koji.nakamaru@gree.net>
>
> fsmonitor_classify_path_absolute() expects state->path_gitdir_watch.buf
> has no trailing '/' or '.' For a submodule, fsmonitor_run_daemon() sets
> the value with trailing "/." (as repo_get_git_dir(the_repository) on
> Darwin returns ".") so that fsmonitor_classify_path_absolute() returns
> IS_OUTSIDE_CONE.
>
> In this case, fsevent_callback() doesn't update cookie_list so that
> fsmonitor_publish() does nothing and with_lock__mark_cookies_seen() is
> not invoked.
>
> As with_lock__wait_for_cookie() infinitely waits for state->cookies_cond
> that with_lock__mark_cookies_seen() should unlock, the whole daemon
> hangs.
The above very nicely describes the cause, the mechansim that leads
to the end-user observable effect, and the (bad) effect the bug has.
I wish everybody wrote their proposed commit messages like this ;-)
> Remove trailing "/." from state->path_gitdir_watch.buf for submodules
> and add a corresponding test in t7527-builtin-fsmonitor.sh.
>
> Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> Signed-off-by: Koji Nakamaru <koji.nakamaru@gree.net>
> ---
> fsmonitor/darwin: fix hangs for submodules
> diff --git a/t/t7527-builtin-fsmonitor.sh b/t/t7527-builtin-fsmonitor.sh
> index 730f3c7f810..7acd074a97f 100755
> --- a/t/t7527-builtin-fsmonitor.sh
> +++ b/t/t7527-builtin-fsmonitor.sh
> @@ -82,6 +82,28 @@ have_t2_data_event () {
> grep -e '"event":"data".*"category":"'"$c"'".*"key":"'"$k"'"'
> }
>
> +start_git_in_background () {
> + git "$@" &
> + git_pid=$!
> + nr_tries_left=10
> + while true
> + do
> + if test $nr_tries_left -eq 0
> + then
> + kill $git_pid
> + exit 1
> + fi
> + sleep 1
> + nr_tries_left=$(($nr_tries_left - 1))
> + done > /dev/null 2>&1 &
So, the command is allowed to run for 10 seconds and then a signal
is sent to the process (by the way, we do not write the SP between
">" and "/dev/null").
> + watchdog_pid=$!
> + wait $git_pid
And the process to ensure the command gets killed in 10 seconds is
called the "watchdog". We let the command run for completion (and
we'd be happy if it did without watchdog needing to forcibly kill
it).
Which means that even after the test finishes normally (e.g., the
command completes without getting killed by the watchdog, because it
is on a fast box and finishes in 0.5 second), we have leftover
watchdog process hanging around for 10 seconds, which might interfere
with the removal of the $TRASH_DIRECTORY at the end of the test.
There is a helper function to kill both (below), which probably is
used to avoid it. Let's keep reading.
> +}
> +
> +stop_git_and_watchdog () {
> + kill $git_pid $watchdog_pid
> +}
This sends a signal and let the process die. Without waiting to
make sure they indeed died, at which point we can safely remove the
$TRASH_DIRECTORY on filesystems that refuse to remove a directory
when a process still has it as its current working directory.
Shouldn't it loop, like
for pid in $git_pid $watchdog_pid
do
until kill -0 $pid
do
kill $pid
done
done
or something? Or is there a mechanism already to ensure that we
return after they get killed that I am failing to find?
> test_expect_success 'explicit daemon start and stop' '
> test_when_finished "stop_daemon_delete_repo test_explicit" &&
>
> @@ -907,6 +929,23 @@ test_expect_success "submodule absorbgitdirs implicitly starts daemon" '
> test_subcommand git fsmonitor--daemon start <super-sub.trace
> '
>
> +test_expect_success "submodule implicitly starts daemon by pull" '
> + test_atexit "stop_git_and_watchdog" &&
Hmph, this is _atexit and not _when_finished because...?
> + test_when_finished "rm -rf cloned; \
> + rm -rf super; \
> + rm -rf sub" &&
Makes me wonder why it is not written like so:
test_when_finished "rm -rf cloned super sub" &&
which is short enough to still fit on a line. Is there something I
am missing that these directories must be removed separately and in
this order?
> + create_super super &&
> + create_sub sub &&
> +
> + git -C super submodule add ../sub ./dir_1/dir_2/sub &&
> + git -C super commit -m "add sub" &&
> + git clone --recurse-submodules super cloned &&
> +
> + git -C cloned/dir_1/dir_2/sub config core.fsmonitor true &&
> + start_git_in_background -C cloned pull --recurse-submodules
> +'
Other than that, very nicely done.
Thanks.
> # On a case-insensitive file system, confirm that the daemon
> # notices when the .git directory is moved/renamed/deleted
> # regardless of how it is spelled in the FS event.
>
> base-commit: 3857aae53f3633b7de63ad640737c657387ae0c6 |
On the Git mailing list, Koji Nakamaru wrote (reply to this): Thank you very much for carefully checking the patch and suggesting
better ways. I'll later revise it and submit a new one.
> > +}
> > +
> > +stop_git_and_watchdog () {
> > + kill $git_pid $watchdog_pid
> > +}
>
> This sends a signal and let the process die. Without waiting to
> make sure they indeed died, at which point we can safely remove the
> $TRASH_DIRECTORY on filesystems that refuse to remove a directory
> when a process still has it as its current working directory.
>
> Shouldn't it loop, like
>
> for pid in $git_pid $watchdog_pid
> do
> until kill -0 $pid
> do
> kill $pid
> done
> done
>
> or something? Or is there a mechanism already to ensure that we
> return after they get killed that I am failing to find?
I agree that we have to wait for pids. I also realized that we should
run git in another process group and kill the group for killing all git
child processes. I'll fix the code.
> > test_expect_success 'explicit daemon start and stop' '
> > test_when_finished "stop_daemon_delete_repo test_explicit" &&
> >
> > @@ -907,6 +929,23 @@ test_expect_success "submodule absorbgitdirs implicitly starts daemon" '
> > test_subcommand git fsmonitor--daemon start <super-sub.trace
> > '
> >
> > +test_expect_success "submodule implicitly starts daemon by pull" '
> > + test_atexit "stop_git_and_watchdog" &&
>
> Hmph, this is _atexit and not _when_finished because...?
This is because README describes _atexit to run unconditionally to clean
up before the test script exits, e.g. to stop (kill) a daemon. More
appropriately, we should kill git before "rm -rf cloned super sub" in
_when_finished and kill watchdog in _atexit. I'll adjust the code.
> > + test_when_finished "rm -rf cloned; \
> > + rm -rf super; \
> > + rm -rf sub" &&
>
> Makes me wonder why it is not written like so:
>
> test_when_finished "rm -rf cloned super sub" &&
>
> which is short enough to still fit on a line. Is there something I
> am missing that these directories must be removed separately and in
> this order?
There is no special reason, I simply followed the style used in
t7527-builtin-fsmonitor.sh. I'll fix this part.
Koji Nakamaru |
1889cbb
to
decf684
Compare
/submit |
Submitted as pull.1802.v2.git.1727759371110.gitgitgadget@gmail.com To fetch this version into
To fetch this version to local tag
|
This patch series was integrated into seen via git@f8f9e71. |
On the Git mailing list, Junio C Hamano wrote (reply to this): "Koji Nakamaru via GitGitGadget" <gitgitgadget@gmail.com> writes:
> From: Koji Nakamaru <koji.nakamaru@gree.net>
>
> fsmonitor_classify_path_absolute() expects state->path_gitdir_watch.buf
> has no trailing '/' or '.' For a submodule, fsmonitor_run_daemon() sets
> the value with trailing "/." (as repo_get_git_dir(the_repository) on
> Darwin returns ".") so that fsmonitor_classify_path_absolute() returns
> IS_OUTSIDE_CONE.
>
> In this case, fsevent_callback() doesn't update cookie_list so that
> fsmonitor_publish() does nothing and with_lock__mark_cookies_seen() is
> not invoked.
>
> As with_lock__wait_for_cookie() infinitely waits for state->cookies_cond
> that with_lock__mark_cookies_seen() should unlock, the whole daemon
> hangs.
>
> Remove trailing "/." from state->path_gitdir_watch.buf for submodules
> and add a corresponding test in t7527-builtin-fsmonitor.sh.
>
> Suggested-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> Suggested-by: Junio C Hamano <gitster@pobox.com>
In none of the changes described above, I have any input to deserve
such credit, though.
> +start_git_in_background () {
> + git "$@" &
> + git_pid=$!
> + git_pgid=$(ps -o pgid= -p $git_pid)
> + nr_tries_left=10
> + while true
> + do
> + if test $nr_tries_left -eq 0
> + then
> + kill -- -$git_pgid
> + exit 1
> + fi
> + sleep 1
> + nr_tries_left=$(($nr_tries_left - 1))
> + done >/dev/null 2>&1 &
> + watchdog_pid=$!
> + wait $git_pid
> +}
> +
> +stop_git () {
> + while kill -0 -- -$git_pgid
> + do
> + kill -- -$git_pgid
> + sleep 1
> + done
> +}
On the "git" side you use process group because you expect that
"git" would spawn subprocesses and you want to catch all of them,
...
> +stop_watchdog () {
> + while kill -0 $watchdog_pid
> + do
> + kill $watchdog_pid
> + sleep 1
> + done
> +}
... but "watchdog" you know is a single process, so you'd only need
a single process id, is that the idea?
What is the motivation behind the change in this iteration to use
process group? Was it observed that leftover processes hang around
if we killed only the $git_pid, or something?
> +test_expect_success "submodule implicitly starts daemon by pull" '
> + test_atexit "stop_watchdog" &&
> + test_when_finished "stop_git && rm -rf cloned super sub" &&
If stop_git ever returns with non-zero status, "rm -rf" will be
skipped, which I am not sure is a good idea.
The whole test_when_finished would fail in such a case, so you would
notice the problem right away, which is a plus, though.
> + create_super super &&
> + create_sub sub &&
> +
> + git -C super submodule add ../sub ./dir_1/dir_2/sub &&
> + git -C super commit -m "add sub" &&
> + git clone --recurse-submodules super cloned &&
> +
> + git -C cloned/dir_1/dir_2/sub config core.fsmonitor true &&
> + set -m &&
I have to wonder how portable (and necessary) this is.
POSIX says it shall be supported if the implementation supports the
User Portability Utilities option. It also says that it was added
to apply only to the UPE because it applies primarily to interactive
use, not shell script applications. And our test scripts are of
course not interactive.
Thanks. |
On the Git mailing list, Koji Nakamaru wrote (reply to this): On Tue, Oct 1, 2024 at 8:57 PM Junio C Hamano <gitster@pobox.com> wrote:
> > fsmonitor_classify_path_absolute() expects state->path_gitdir_watch.buf
> > has no trailing '/' or '.' For a submodule, fsmonitor_run_daemon() sets
> > the value with trailing "/." (as repo_get_git_dir(the_repository) on
> > Darwin returns ".") so that fsmonitor_classify_path_absolute() returns
> > IS_OUTSIDE_CONE.
> >
> > In this case, fsevent_callback() doesn't update cookie_list so that
> > fsmonitor_publish() does nothing and with_lock__mark_cookies_seen() is
> > not invoked.
> >
> > As with_lock__wait_for_cookie() infinitely waits for state->cookies_cond
> > that with_lock__mark_cookies_seen() should unlock, the whole daemon
> > hangs.
> >
> > Remove trailing "/." from state->path_gitdir_watch.buf for submodules
> > and add a corresponding test in t7527-builtin-fsmonitor.sh.
> >
> > Suggested-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> > Suggested-by: Junio C Hamano <gitster@pobox.com>
>
> In none of the changes described above, I have any input to deserve
> such credit, though.
Your points are very helpful :)
> > +start_git_in_background () {
> > + git "$@" &
> > + git_pid=$!
> > + git_pgid=$(ps -o pgid= -p $git_pid)
> > + nr_tries_left=10
> > + while true
> > + do
> > + if test $nr_tries_left -eq 0
> > + then
> > + kill -- -$git_pgid
> > + exit 1
> > + fi
> > + sleep 1
> > + nr_tries_left=$(($nr_tries_left - 1))
> > + done >/dev/null 2>&1 &
> > + watchdog_pid=$!
> > + wait $git_pid
> > +}
> > +
> > +stop_git () {
> > + while kill -0 -- -$git_pgid
> > + do
> > + kill -- -$git_pgid
> > + sleep 1
> > + done
> > +}
>
> On the "git" side you use process group because you expect that
> "git" would spawn subprocesses and you want to catch all of them,
> ...
>
> > +stop_watchdog () {
> > + while kill -0 $watchdog_pid
> > + do
> > + kill $watchdog_pid
> > + sleep 1
> > + done
> > +}
>
> ... but "watchdog" you know is a single process, so you'd only need
> a single process id, is that the idea?
Yes, that is the idea.
> What is the motivation behind the change in this iteration to use
> process group? Was it observed that leftover processes hang around
> if we killed only the $git_pid, or something?
Yes, if the issue occurs, three processes remains:
git fetch --update-head-ok --recurse-submodules=on
git fetch --no-prune --no-prune-tags --update-head-ok
--recurse-submodules --recurse-submodules-default yes
--submodule-prefix=dir_1/dir_2/sub/
git fsmonitor--daemon run --detach --ipc-threads=8
If there is no issue, only the fsmonitor process remains.
> > +test_expect_success "submodule implicitly starts daemon by pull" '
> > + test_atexit "stop_watchdog" &&
> > + test_when_finished "stop_git && rm -rf cloned super sub" &&
>
> If stop_git ever returns with non-zero status, "rm -rf" will be
> skipped, which I am not sure is a good idea.
>
> The whole test_when_finished would fail in such a case, so you would
> notice the problem right away, which is a plus, though.
t/README discusses that test_when_finished and test_atexit differ about
the "--immediate" option. As git and its subprocesses are the test
target, I moved stop_git to the current place. This might be however
confusing when someone later reads this test. Should we simply put
stop_git and stop_watchdong in test_atexit?
> > + create_super super &&
> > + create_sub sub &&
> > +
> > + git -C super submodule add ../sub ./dir_1/dir_2/sub &&
> > + git -C super commit -m "add sub" &&
> > + git clone --recurse-submodules super cloned &&
> > +
> > + git -C cloned/dir_1/dir_2/sub config core.fsmonitor true &&
> > + set -m &&
>
> I have to wonder how portable (and necessary) this is.
>
> POSIX says it shall be supported if the implementation supports the
> User Portability Utilities option. It also says that it was added
> to apply only to the UPE because it applies primarily to interactive
> use, not shell script applications. And our test scripts are of
> course not interactive.
How about the following modification? It still utilizes $git_pgid to
filter processes, but avoids "set -m".
diff --git a/t/t7527-builtin-fsmonitor.sh b/t/t7527-builtin-fsmonitor.sh
index 2dd1ca1a7b..23d9a7c953 100755
--- a/t/t7527-builtin-fsmonitor.sh
+++ b/t/t7527-builtin-fsmonitor.sh
@@ -916,7 +916,7 @@ start_git_in_background () {
do
if test $nr_tries_left -eq 0
then
- kill -- -$git_pgid
+ kill $git_pid
exit 1
fi
sleep 1
@@ -927,10 +927,13 @@ start_git_in_background () {
}
stop_git () {
- while kill -0 -- -$git_pgid
+ for p in $(ps -o pgid=,pid=,comm= | grep "^$git_pgid .*git"
| sed 's/^[0-9][0-9]* \([0-9][0-9]*\) .*/\1/')
do
- kill -- -$git_pgid
- sleep 1
+ while kill -0 $p
+ do
+ kill $p
+ sleep 1
+ done
done
}
@@ -954,7 +957,6 @@ test_expect_success "submodule implicitly starts
daemon by pull" '
git clone --recurse-submodules super cloned &&
git -C cloned/dir_1/dir_2/sub config core.fsmonitor true &&
- set -m &&
start_git_in_background -C cloned pull --recurse-submodules
'
Koji Nakamaru |
On the Git mailing list, Junio C Hamano wrote (reply to this): Koji Nakamaru <koji.nakamaru@gree.net> writes:
>> > +test_expect_success "submodule implicitly starts daemon by pull" '
>> > + test_atexit "stop_watchdog" &&
>> > + test_when_finished "stop_git && rm -rf cloned super sub" &&
>>
>> If stop_git ever returns with non-zero status, "rm -rf" will be
>> skipped, which I am not sure is a good idea.
>>
>> The whole test_when_finished would fail in such a case, so you would
>> notice the problem right away, which is a plus, though.
>
> t/README discusses that test_when_finished and test_atexit differ about
> the "--immediate" option. As git and its subprocesses are the test
> target, I moved stop_git to the current place. This might be however
> confusing when someone later reads this test. Should we simply put
> stop_git and stop_watchdong in test_atexit?
That is not what I meant.
I was merely questioning the &&-chaining that stops "rm -fr" from
running if stop_git ever fails (and your earlier iteration you had
multiple "rm -fr" ;-chained, not &&-chained---not using && is often
more appropriate in a when_finished handler).
>> > + set -m &&
>>
>> I have to wonder how portable (and necessary) this is.
>>
>> POSIX says it shall be supported if the implementation supports the
>> User Portability Utilities option. It also says that it was added
>> to apply only to the UPE because it applies primarily to interactive
>> use, not shell script applications. And our test scripts are of
>> course not interactive.
>
> How about the following modification? It still utilizes $git_pgid to
> filter processes, but avoids "set -m".
Nah, your original reads much better, and the code is grabbing and
using the process group information anyway (and my question about
"-m" was more about "should we be relying on process group features
in this test to kill them all?").
I am OK with the idea that we can assume, at least among the
platforms that support fsmonitor, that sending a signal to a process
group would cause the signal delivered to the member processes just
as we expect.
Thanks. |
On the Git mailing list, Koji Nakamaru wrote (reply to this): On Wed, Oct 2, 2024 at 3:04 AM Junio C Hamano <gitster@pobox.com> wrote:
>>> > +test_expect_success "submodule implicitly starts daemon by pull" '
>>> > + test_atexit "stop_watchdog" &&
>>> > + test_when_finished "stop_git && rm -rf cloned super sub" &&
>>>
>>> If stop_git ever returns with non-zero status, "rm -rf" will be
>>> skipped, which I am not sure is a good idea.
>>>
>>> The whole test_when_finished would fail in such a case, so you would
>>> notice the problem right away, which is a plus, though.
>>
>> t/README discusses that test_when_finished and test_atexit differ about
>> the "--immediate" option. As git and its subprocesses are the test
>> target, I moved stop_git to the current place. This might be however
>> confusing when someone later reads this test. Should we simply put
>> stop_git and stop_watchdong in test_atexit?
>
> That is not what I meant.
>
> I was merely questioning the &&-chaining that stops "rm -fr" from
> running if stop_git ever fails (and your earlier iteration you had
> multiple "rm -fr" ;-chained, not &&-chained---not using && is often
> more appropriate in a when_finished handler).
I see. I'll fix this part.
>>> > + set -m &&
>>>
>>> I have to wonder how portable (and necessary) this is.
>>>
>>> POSIX says it shall be supported if the implementation supports the
>>> User Portability Utilities option. It also says that it was added
>>> to apply only to the UPE because it applies primarily to interactive
>>> use, not shell script applications. And our test scripts are of
>>> course not interactive.
>>
>> How about the following modification? It still utilizes $git_pgid to
>> filter processes, but avoids "set -m".
>
> Nah, your original reads much better, and the code is grabbing and
> using the process group information anyway (and my question about
> "-m" was more about "should we be relying on process group features
> in this test to kill them all?").
>
> I am OK with the idea that we can assume, at least among the
> platforms that support fsmonitor, that sending a signal to a process
> group would cause the signal delivered to the member processes just
> as we expect.
Thank you for the clarification and the support.
Koji Nakamaru |
decf684
to
aabc1c2
Compare
/submit |
Submitted as pull.1802.v3.git.1727810964571.gitgitgadget@gmail.com To fetch this version into
To fetch this version to local tag
|
This patch series was integrated into seen via git@923a789. |
On the Git mailing list, Koji Nakamaru wrote (reply to this): Although I submitted [PATCH v3], it was incomplete about the following:
On Wed, Oct 2, 2024 at 3:04 AM Junio C Hamano <gitster@pobox.com> wrote:
> I am OK with the idea that we can assume, at least among the
> platforms that support fsmonitor, that sending a signal to a process
> group would cause the signal delivered to the member processes just
> as we expect.
On windows, there is no process group so the test cannot run
correctly. As hangs corrected with the patch occur only for darwin, I
would like to skip MINGW in the test. Is it okay?
Koji Nakamaru |
This branch is now known as |
This patch series was integrated into seen via git@99031f8. |
There was a status update in the "Cooking" section about the branch macOS with fsmonitor daemon can hang forever when a submodule is involved, which has been corrected. Expecting a reroll. cf. <CAOTNsDygwBkNdFX4K_ixL5kP-AvDxWWVXkvfkmV4Kh04ozdYkA@mail.gmail.com> source: <pull.1802.v3.git.1727810964571.gitgitgadget@gmail.com> |
On the Git mailing list, Junio C Hamano wrote (reply to this): Koji Nakamaru <koji.nakamaru@gree.net> writes:
> On windows, there is no process group so the test cannot run
> correctly. As hangs corrected with the patch occur only for darwin, I
> would like to skip MINGW in the test. Is it okay?
Surely. But can we do so without spelling MINGW or WINDOWS out?
That is, if your test requires process group features available, can
we come up with a lazy prerequisite and use that to decide if we
skip the test?
Earlier in the discussion, you said who are left behind if we did so
on systems with process groups, but I wonder what happens when we
throw a signal at the top-level "git" process on Windows, and if it
behaves better, perhaps we can implement stop_git differently where
process groups are not available, instead of skipping the tests
altogether?
Thanks.
|
On the Git mailing list, Koji Nakamaru wrote (reply to this): On Thu, Oct 3, 2024 at 3:14 AM Junio C Hamano <gitster@pobox.com> wrote:
>> On windows, there is no process group so the test cannot run
>> correctly. As hangs corrected with the patch occur only for darwin, I
>> would like to skip MINGW in the test. Is it okay?
>
> Surely. But can we do so without spelling MINGW or WINDOWS out?
>
> That is, if your test requires process group features available, can
> we come up with a lazy prerequisite and use that to decide if we
> skip the test?
I tweaked fsm-listen-win32.c to cause hangs and tested on windows. I'm
sorry that simply saying "there is no process group" was not quite
correct. Each mingw process has a process group and its win32
subprocesses can be killed by "kill -- -$pgid"
For example, when a hang occurs, the following processes remain.
PID PPID PGID WINPID TTY UID STIME
COMMAND
56782 40923 56782 9484 pty0 1052296 16:23:22
/mingw64/bin/git
# mingw git process
78100 0 0 12564 ? 0 16:23:23
C:\git-sdk-64\mingw64\libexec\git-core\git.exe
# win32 process
86108 0 0 20572 ? 0 16:23:23
C:\git-sdk-64\mingw64\libexec\git-core\git.exe
# win32 subprocess
73328 0 0 7792 ? 0 16:23:23
C:\git-sdk-64\mingw64\libexec\git-core\git.exe
# win32 fsmonitor
> Earlier in the discussion, you said who are left behind if we did so
> on systems with process groups, but I wonder what happens when we
> throw a signal at the top-level "git" process on Windows, and if it
> behaves better, perhaps we can implement stop_git differently where
> process groups are not available, instead of skipping the tests
> altogether?
If we do "kill 56782" or "kill -- -56782" for the above example, most of
processes are terminated except the win32 fsmonitor. This is because the
win32 fsmonitor is detached by FreeConsole().
I also tried "git fsmonitor--daemon stop". It was able to communicate
with the win32 fsmonitor and the internal status of the win32 fsmonitor
changed, but the win32 daemon didn't terminate.
Because it's getting complicated, how about the following:
* specify MINGW
* note in the commit log:
The test is disabled for MINGW because hangs treated with this patch
occur only for Darwin and there is no simple way to terminate the
win32 fsmonitor daemon that hangs.
Koji Nakamaru |
fsmonitor_classify_path_absolute() expects state->path_gitdir_watch.buf has no trailing '/' or '.' For a submodule, fsmonitor_run_daemon() sets the value with trailing "/." (as repo_get_git_dir(the_repository) on Darwin returns ".") so that fsmonitor_classify_path_absolute() returns IS_OUTSIDE_CONE. In this case, fsevent_callback() doesn't update cookie_list so that fsmonitor_publish() does nothing and with_lock__mark_cookies_seen() is not invoked. As with_lock__wait_for_cookie() infinitely waits for state->cookies_cond that with_lock__mark_cookies_seen() should unlock, the whole daemon hangs. Remove trailing "/." from state->path_gitdir_watch.buf for submodules and add a corresponding test in t7527-builtin-fsmonitor.sh. The test is disabled for MINGW because hangs treated with this patch occur only for Darwin and there is no simple way to terminate the win32 fsmonitor daemon that hangs. Suggested-by: Johannes Schindelin <johannes.schindelin@gmx.de> Suggested-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Koji Nakamaru <koji.nakamaru@gree.net>
On the Git mailing list, Junio C Hamano wrote (reply to this): Koji Nakamaru <koji.nakamaru@gree.net> writes:
> Because it's getting complicated, how about the following:
>
> * specify MINGW
> * note in the commit log:
> The test is disabled for MINGW because hangs treated with this patch
> occur only for Darwin and there is no simple way to terminate the
> win32 fsmonitor daemon that hangs.
Sounds good to me.
Thanks. |
This patch series was integrated into seen via git@951056a. |
aabc1c2
to
7b7224e
Compare
/submit |
Submitted as pull.1802.v4.git.1728000433326.gitgitgadget@gmail.com To fetch this version into
To fetch this version to local tag
|
On the Git mailing list, Koji Nakamaru wrote (reply to this): On Fri, Oct 4, 2024 at 2:44 AM Junio C Hamano <gitster@pobox.com> wrote:
> > Because it's getting complicated, how about the following:
> >
> > * specify MINGW
> > * note in the commit log:
> > The test is disabled for MINGW because hangs treated with this patch
> > occur only for Darwin and there is no simple way to terminate the
> > win32 fsmonitor daemon that hangs.
>
> Sounds good to me.
Thank you. I've submitted [PATCH v4].
Koji Nakamaru |
On the Git mailing list, Junio C Hamano wrote (reply to this): "Koji Nakamaru via GitGitGadget" <gitgitgadget@gmail.com> writes:
> ... The test is
> disabled for MINGW because hangs treated with this patch occur only for
> Darwin and there is no simple way to terminate the win32 fsmonitor
> daemon that hangs.
> ...
> @@ t/t7527-builtin-fsmonitor.sh: test_expect_success "submodule absorbgitdirs impli
> + done
> +}
> +
> -+test_expect_success "submodule implicitly starts daemon by pull" '
> ++test_expect_success !MINGW "submodule implicitly starts daemon by pull" '
> + test_atexit "stop_watchdog" &&
> + test_when_finished "stop_git; rm -rf cloned super sub" &&
> +
Let me update !MINGW to !WINDOWS while queuing. |
This patch series was integrated into seen via git@41ed0ab. |
On the Git mailing list, Ramsay Jones wrote (reply to this): On 04/10/2024 18:44, Junio C Hamano wrote:
> "Koji Nakamaru via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
>> ... The test is
>> disabled for MINGW because hangs treated with this patch occur only for
>> Darwin and there is no simple way to terminate the win32 fsmonitor
>> daemon that hangs.
>> ...
>> @@ t/t7527-builtin-fsmonitor.sh: test_expect_success "submodule absorbgitdirs impli
>> + done
>> +}
>> +
>> -+test_expect_success "submodule implicitly starts daemon by pull" '
>> ++test_expect_success !MINGW "submodule implicitly starts daemon by pull" '
>> + test_atexit "stop_watchdog" &&
>> + test_when_finished "stop_git; rm -rf cloned super sub" &&
>> +
>
> Let me update !MINGW to !WINDOWS while queuing.
>
While this won't hurt, this test file is skipped on cygwin:
[23:19:33] t7527-builtin-fsmonitor.sh ......................... skipped: fsmonitor--daemon is not supported on this platform
(my eternal TODO list has an 'fsmonitor on cygwin?' item ...)
Thanks.
ATB,
Ramsay Jones
|
User |
On the Git mailing list, Junio C Hamano wrote (reply to this): Ramsay Jones <ramsay@ramsayjones.plus.com> writes:
> While this won't hurt, this test file is skipped on cygwin:
>
> [23:19:33] t7527-builtin-fsmonitor.sh ......................... skipped: fsmonitor--daemon is not supported on this platform
>
> (my eternal TODO list has an 'fsmonitor on cygwin?' item ...)
Thanks, then let's not bother.
|
This patch series was integrated into seen via git@e8b19c8. |
This patch series was integrated into next via git@5a9a877. |
This patch series was integrated into seen via git@2ab53b5. |
This patch series was integrated into master via git@2ab53b5. |
This patch series was integrated into next via git@2ab53b5. |
Closed via 2ab53b5. |
On the Git mailing list, Koji Nakamaru wrote (reply to this): On Sat, Oct 5, 2024 at 2:44 AM Junio C Hamano <gitster@pobox.com> wrote:
> > ... The test is
> > disabled for MINGW because hangs treated with this patch occur only for
> > Darwin and there is no simple way to terminate the win32 fsmonitor
> > daemon that hangs.
> > ...
> > @@ t/t7527-builtin-fsmonitor.sh: test_expect_success "submodule absorbgitdirs impli
> > + done
> > +}
> > +
> > -+test_expect_success "submodule implicitly starts daemon by pull" '
> > ++test_expect_success !MINGW "submodule implicitly starts daemon by pull" '
> > + test_atexit "stop_watchdog" &&
> > + test_when_finished "stop_git; rm -rf cloned super sub" &&
> > +
>
> Let me update !MINGW to !WINDOWS while queuing.
I see, Thank you.
Koji Nakamaru |
cc: Ramsay Jones ramsay@ramsayjones.plus.com