child_processes_test.test_fails_nonzero_with_bad_exec is flaky on CI #102

Closed
chriskuehl opened this issue Jul 26, 2016 · 2 comments

@chriskuehl
Contributor

This happened on CI during make test (not inside Docker):

21:49:33 py27 runtests: commands[0] | python -m pytest
21:49:33 ============================= test session starts ==============================
21:49:33 platform linux2 -- Python 2.7.6, pytest-2.9.2, py-1.4.31, pluggy-0.3.1
21:49:33 rootdir: /ephemeral/jenkins_prod_slave/workspace/mirrors-Yelp-dumb-init, inifile: pytest.ini
21:49:33 plugins: timeout-1.0.0
21:49:33 collected 173 items
21:49:33 
21:49:39 tests/child_processes_test.py ....................F...
21:49:39 tests/cli_test.py ......................................................................
21:49:39 tests/exit_status_test.py ................................................
21:49:40 tests/proxies_signals_test.py ..........................
21:49:41 tests/shell_background_test.py ....
21:49:41 tests/tty_test.py .
21:49:41 
21:49:41 =================================== FAILURES ===================================
21:49:41 _________________ test_fails_nonzero_with_bad_exec[0-0-args0] __________________
21:49:41 
21:49:41 args = ('/doesnotexist',)
21:49:41 
21:49:41     @pytest.mark.parametrize('args', [
21:49:41         ('/doesnotexist',),
21:49:41         ('--', '/doesnotexist'),
21:49:41         ('-c', '/doesnotexist'),
21:49:41         ('--single-child', '--', '/doesnotexist'),
21:49:41     ])
21:49:41     @pytest.mark.usefixtures('both_debug_modes', 'both_setsid_modes')
21:49:41     def test_fails_nonzero_with_bad_exec(args):
21:49:41         """If dumb-init can't exec as requested, it should exit nonzero."""
21:49:41         proc = Popen(('dumb-init',) + args, stderr=PIPE)
21:49:41 >       _, stderr = proc.communicate()
21:49:41 
21:49:41 tests/child_processes_test.py:141: 
21:49:41 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
21:49:41 /usr/lib/python2.7/subprocess.py:793: in communicate
21:49:41     stderr = _eintr_retry_call(self.stderr.read)
21:49:41 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
21:49:41 
21:49:41 func = <built-in method read of file object at 0x2a56f60>, args = ()
21:49:41 
21:49:41     def _eintr_retry_call(func, *args):
21:49:41         while True:
21:49:41             try:
21:49:41 >               return func(*args)
21:49:41 E               Failed: Timeout >5s
21:49:41 
21:49:41 /usr/lib/python2.7/subprocess.py:476: Failed
21:49:41 ===================== 1 failed, 172 passed in 7.75 seconds =====================

Not quite sure what could have caused this...

@chriskuehl
Contributor Author

I was able to reproduce this locally after running the test several thousand times. dumb-init (the parent) was alive and responding to signals, but the child was a zombie and not being reaped.

I then ran dumb-init in a shell loop:

while :; do echo =========== && ./dumb-init -v /asdf; done

and again, it eventually hangs:

===========
[dumb-init] setsid complete.
[dumb-init] /asdf: No such file or directory
[dumb-init] Child spawned with PID 72249.
ckuehl     3427  0.4  0.0  40644  5456 pts/149  Ss   22:13   0:02  \_ -zsh
ckuehl    72248  0.0  0.0   1104     4 pts/149  S+   22:22   0:00  |   \_ ./dumb-init -v /asdf
ckuehl    72249  0.0  0.0      0     0 ?        Zs   22:22   0:00  |       \_ [dumb-init] <defunct>
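
For anyone who wants the loop to stop on its own at the first hang, here is a minimal sketch of the same idea in Python. It assumes Python 3 (for `subprocess.run` with `timeout`); the helper name `run_until_hang` and its defaults are made up for illustration and are not part of dumb-init or its test suite.

```python
import subprocess
import sys


def run_until_hang(cmd, attempts=10000, timeout=5.0):
    """Run cmd repeatedly.

    Returns the 0-based index of the first attempt that did not finish
    within `timeout` seconds, or None if every attempt finished in time.
    """
    for attempt in range(attempts):
        try:
            # subprocess.run kills the process it started and waits for it
            # when the timeout expires, then raises TimeoutExpired.
            subprocess.run(cmd, stderr=subprocess.DEVNULL, timeout=timeout)
        except subprocess.TimeoutExpired:
            return attempt
    return None
```

Calling `run_until_hang(['./dumb-init', '-v', '/asdf'])` should be roughly equivalent to the shell loop, except that it reports which attempt hung instead of blocking forever. The 5 second timeout mirrors the pytest timeout in the CI log.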

@chriskuehl chriskuehl added bug and removed tests labels Jul 26, 2016
@chriskuehl
Contributor Author

I believe the problem is that we don't set the signal mask until after the fork in the parent, so if the child exits quickly enough, the SIGCHLD might be missed. Preparing a PR now.
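
To illustrate the ordering I mean, here is a rough sketch of the general technique: block SIGCHLD before forking, so the signal stays pending even if the child exits immediately. This is not the actual patch. It is written in Python rather than dumb-init's C, assumes Python 3 on Linux (`signal.sigtimedwait` is not available everywhere) and a single-threaded process, and the name `spawn_and_reap` is hypothetical.

```python
import os
import signal


def spawn_and_reap(child_exit_code=7, timeout=5.0):
    """Fork a child that exits immediately and reap it via SIGCHLD."""
    # Block SIGCHLD *before* forking. If the child exits before the parent
    # starts waiting, the signal is left pending instead of being lost.
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGCHLD})
    try:
        pid = os.fork()
        if pid == 0:
            # Child: restore the original mask and exit right away, which is
            # the "child exits quickly" case from this issue.
            signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
            os._exit(child_exit_code)

        # Parent: wait for the (possibly already pending) SIGCHLD.
        info = signal.sigtimedwait({signal.SIGCHLD}, timeout)
        if info is None:
            raise RuntimeError('timed out waiting for SIGCHLD')

        # The child has changed state, so this waitpid reaps it.
        _, status = os.waitpid(pid, 0)
        return os.WEXITSTATUS(status)
    finally:
        # Put the parent's signal mask back the way we found it.
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
```

If the `SIG_BLOCK` call were moved to after `os.fork()` in the parent, a fast child could exit in the gap, and the parent would then wait for a SIGCHLD that was already delivered and discarded, which matches the zombie seen in the `ps` output.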

chriskuehl added a commit to chriskuehl/dumb-init that referenced this issue Jul 26, 2016