This repository has been archived by the owner on Apr 29, 2021. It is now read-only.

Stuck on the last test of each file #80

Open
oleiade opened this issue Nov 20, 2014 · 12 comments

Comments

@oleiade

oleiade commented Nov 20, 2014

Hi,

I'm using the bats testing framework extensively in trousseau and have written a bunch of tests. Everything works fine on my laptop (OS X) and in Vagrant boxes (Ubuntu 14.04). However, when I run the tests in a CI environment such as Codeship or Travis, bats always ends up stuck on the last test of each file.

For example:

~/bats/bin/bats -t tests/create.bats
1..6
ok 1 create store with one valid recipient succeeds
ok 2 create symmetric store succeeds
ok 3 create generates a file in 0600 mode
ok 4 create store with one invalid recipient fails
ok 5 create store without a recipient fails
ok 6 create store with one valid recipient and one invalid recipient fails
# Hangs here, forever

Any ideas what could cause this? A wait4 syscall that never returns? A SIGCHLD that is never received? An improper redirection of stdout/stderr hiding the real problem?

Thanks for your help :-)
Theo

@Sylvain303
Contributor

Could you isolate the failing test in a short example?

I have seen some hangs too, though I can't reproduce them yet. It may be related to IFS; see #89.

Could you also give your bash version?

@oleiade
Author

oleiade commented Jan 29, 2015

Unfortunately, I can't isolate any specific test failing. The only thing common to all the hang ups is that they always happen on the last test of the first test file evaluated.

After reading it, I'm not so sure it is related to #89 (I actually don't know what this IFS is either).

@Sylvain303
Contributor

$IFS is an internal bash variable used for word splitting. I've pushed to my fork and am preparing the pull request.

from man bash:

IFS    The  Internal  Field  Separator  that is used for word splitting
       after expansion and to split lines  into  words  with  the  read
       builtin  command.   The  default  value  is ``<space><tab><newline>''.
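
To make the man page excerpt concrete, here is a minimal sketch of IFS-driven splitting with `read`. The sample record and the variable names are illustrative assumptions, not taken from bats or from #89.

```shell
# Illustrative record; the field names below are made up for this sketch.
record="alice:x:1000"

# Setting IFS for this one read splits the line on colons.
IFS=: read -r user pass uid <<EOF
$record
EOF

# With the default IFS (space, tab, newline) the line contains no
# separator, so the whole record lands in the first variable.
read -r first rest <<EOF
$record
EOF

echo "user=$user uid=$uid first=$first"
```

Because IFS is assigned only as a prefix to `read`, the shell's own IFS is left unchanged for the rest of the script.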

@oleiade
Author

oleiade commented Jan 29, 2015

Thanks for the info :-)

Gonna try with your fork to see if it fixes things, you never know...

@filex

filex commented Mar 8, 2015

I might have the same issue: In one of my tests I call an /etc/init.d script. As long as I do this, bats doesn't return after the last check.

This is a simple way to reproduce it:

#!/usr/bin/env bats

@test "fork something" {
        sleep 5 &
        echo 'forked'
}

@test "hanging check" {
        echo 'hang'
}

This stalls bats until the 5s are over. This version may come closer to my init-script case:

#!/usr/bin/env bats

@test "fork something" {
        (while true; do sleep 1; done) &
        echo 'forked'
}

@test "hanging check" {
        echo 'hang'
}

You have to kill the wait with ctrl-c, then the check mark appears next to "hanging check" and bats terminates.

I see the same behaviour in master-branch and in tags/v0.4.0.

I'm new to bats, so this is a blind guess: does bats look after forked processes and wait for them to terminate?

@filex

filex commented Mar 8, 2015

In my init-script example I can terminate the hang by stopping the started daemon from another shell.

Furthermore, I have found a race condition in my test. One check started the Apache httpd and another was supposed to stop it. With nothing else to test or do between start and stop, I believe the stop test was triggered before the forked httpd was ready to accept shutdown signals.

With more tests and some tactical sleeps I can make sure the forked processes are stopped before the end of the test suite. Then, bats terminates as expected.

@oleiade, do you have any "asymmetric" forks in your test, too?

@Sylvain303
Contributor

@filex why not wait for the PID? Maybe start the process with nohup or something similar and write the PID to a file. You don't start a process in one test and check the result in another, do you?

http://stackoverflow.com/questions/356100/how-to-wait-in-bash-for-several-subprocesses-to-finish-and-return-exit-code-0

I think you have to detach the process that loops forever in order to finish the test.

@filex

filex commented Mar 10, 2015

Yes, @Sylvain303, I start a daemon, run several checks and stop it at the end. This is split up into several tests to have somewhat meaningful test names for everything. My problem was the race condition between start and stop. Everything is working fine now.

But apparently bats does some kind of waiting itself. Do you know how and why?

When I call an init script (Apache and my own software) in a shell, the processes are forked and vanish from the shell's job list. Thus waiting does not work and returns immediately.

However, bats manages to wait until the forked processes are shut down. I wouldn't exactly call that a bug, but the behaviour was unexpected.

@Sylvain303
Contributor

Hmm, no idea for the moment. I stepped through the bats code for a bug: it parses the .bats file, generates a valid shell script in a temporary location and executes each function. I'm going to try your example to see whether I can find where it hangs. I will probably have to write similar tests. But don't expect a fast reply from me ;)…

@myoung34

myoung34 commented Jun 8, 2015

I ran into this and tried nohup, &, pid catching, all of those.
The only way to solve it was to write an init.d script to do it for me....

@esiegerman

It's not stuck on the last test, but after it...

The problem is that bats uses file descriptor 3 as a communication channel among its internal processes. Your background sleep process inherits FD#3 from bats, as the write end of a pipe. Since your process is holding FD#3 open, the pipe doesn't report EOF to its reader, and bats, which is waiting for that to happen, hangs. When the sleep process exits, its FD#3 gets closed, and since that was the last writer, the pipe finally reports EOF.
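
The effect described above can be reproduced without bats. The following is a minimal sketch, assuming a plain pipe read by `cat` stands in for bats' internal reader; the 2-second sleep and the variable names are illustrative.

```shell
# Time how long the reader (cat) waits for EOF on a pipe that the writer
# side sees as fd 3, similar to the channel described above.
start=$(date +%s)
{ sleep 2 & echo "test done" >&3; } 3>&1 >/dev/null | cat
inherited=$(( $(date +%s) - start ))   # background sleep inherited fd 3

start=$(date +%s)
{ sleep 2 3>&- & echo "test done" >&3; } 3>&1 >/dev/null | cat
closed=$(( $(date +%s) - start ))      # background sleep started with fd 3 closed

echo "fd 3 inherited: ${inherited}s, fd 3 closed: ${closed}s"
```

In the first pipeline `cat` cannot return until the background `sleep` exits, because `sleep` still holds the write end of the pipe. In the second, `cat` sees EOF as soon as the brace group finishes.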

I've tried to come up with a patch for bats, but haven't succeeded. I'd have expected this to work:

diff --git a/libexec/bats-exec-test b/libexec/bats-exec-test
index 044b2c3..0ef541f 100755
--- a/libexec/bats-exec-test
+++ b/libexec/bats-exec-test
@@ -321,7 +321,7 @@ bats_perform_test() {
     trap "bats_debug_trap \"\$BASH_SOURCE\"" debug
     trap "bats_error_trap" err
     trap "bats_teardown_trap" exit
-    "$BATS_TEST_NAME" >>"$BATS_OUT" 2>&1
+    "$BATS_TEST_NAME" >>"$BATS_OUT" 2>&1 3>&-
     BATS_TEST_COMPLETED=1

   else

but it doesn't; with it, "pretty output" only contains the checkmarks, no test names! I don't know why.

As a workaround, try changing the sleep in your test to:

sleep 5 3>&- &

stuart-c added a commit to stuart-c/gomplate that referenced this issue Aug 7, 2017
mbland added a commit to mbland/go-script-bash that referenced this issue Dec 5, 2017
Closes #226. From the comment within `run_in_background`:

Bats duplicates standard output as file descriptor 3 so that output from
its framework functions isn't captured along with any output from the
code under test. If the code under test contains a `sleep` or other
blocking operation, this file descriptor will be held open until the
process becomes unblocked, preventing Bats from exiting. Hence, we
explicitly close file descriptor 3.

Any other code running under Bats that opens a background process should
close this file descriptor as well. See:
sstephenson/bats#80

Much thanks to @marascio for discovering and researching the problem,
and proposing the actual fix.
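
A helper in the spirit of the commit message above might look like the following sketch. It is not the go-script-bash implementation: the function body and the `BACKGROUND_PID` variable are hypothetical, written only to illustrate closing file descriptor 3 when starting a background process.

```shell
# Hypothetical helper (not the go-script-bash code): run a command in the
# background with fd 3 closed so it cannot hold the Bats pipe open.
run_in_background() {
  "$@" 3>&- &
  BACKGROUND_PID=$!
}

# Usage: start a short-lived process, then wait for it explicitly.
run_in_background sleep 1
wait "$BACKGROUND_PID"
background_status=$?
echo "background process exited with status $background_status"
```

Recording the PID lets a later test or `teardown` stop the process explicitly instead of relying on it to exit on its own.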
yarikoptic pushed a commit to neurodebian/bats that referenced this issue Aug 6, 2019
Closes sstephenson#72 and sstephenson#81. Both of those issues deal with the `bats_error_trap`
executing with an incorrect value for `$?`. Bats exited with an error,
but provided no stack trace information to pinpoint its source.

* sstephenson#72 deals with the inconsistent failure-mode behavior of the `source`
  builtin between Bash versions. I believe it's feasible to update
  `load` with robust error checking around `source` and strongly
  recommend its use in Bats test files instead of using `source`
  directly. I've already opened sstephenson#80 to track this.

* sstephenson#81 deals with inconsistent failure-mode behavior of `set -u`,
  particularly when code within a Bats test file itself references an
  unbound variable.

With credit to sstephenson, Bats was smart enough to know and to report
that an error occurred, even when it couldn't tell exactly where the
error came from.

Building on that existing mechanism, this change produces output for
these failure cases even when `bats_error_trap` isn't called with the
correct value for `$?`. It passes all the existing tests under Bash
3.2.57(1)-release and 4.4.19(1)-release. I also timed the original
suites with and without the change to ensure the runtime cost was
negligible.

A couple more notes:

* This change obsoletes the suggestion in sstephenson#81 that a test case to
  validate testing code running under `set -u` is necessary. `run()`
  disables `set -e` for the code under test to begin with, sidestepping
  the problem with `set -eu` interactions in Bash versions prior to
  4.1.0.

* I've noticed that this mechanism does _not_ work when the problematic
  behavior is in `teardown()`. The change preserves existing behavior,
  the remedy for that issue requires further thinking (and a new issue).
@Dentrax

Dentrax commented Nov 23, 2020

@esiegerman Thanks for your workaround. Can you please explain what 3>&- does? I could not find it on Google, and this is the first time I've seen this usage. Looks interesting!
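
For readers with the same question: in POSIX shells, the redirection `n>&-` closes file descriptor `n` for the command it is attached to. A minimal sketch of the difference, with illustrative variable names:

```shell
# With fd 3 duplicated from stdout, writing to it works.
open_result=$( { echo "via fd 3" >&3; } 3>&1 )

# With fd 3 closed for the command, the same write fails.
if ( echo "via fd 3" >&3 ) 3>&- 2>/dev/null; then
  closed_result="write succeeded"
else
  closed_result="write failed"
fi

echo "$open_result / $closed_result"
```

A background process started with fd 3 closed never receives the descriptor at all, so it cannot keep a pipe attached to it open.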
