bpo-31173: Rewrite WSTOPSIG test of test_subprocess #3055

vstinner · 2017-08-10T10:04:14Z

The current test_child_terminated_in_stopped_state() function test
creates a child process which calls ptrace(PTRACE_TRACEME, 0, 0) and
then crash (SIGSEGV). The problem is that calling os.waitpid() in the
parent process is not enough to close the process: the child process
remains alive and so the unit test leaks a child process in a
strange state. Closing the child process requires non-trivial code,
maybe platform specific.

Remove the functional test and replaces it with an unit test which
mocks os.waitpid() using a new _testcapi.W_STOPCODE() function to
test the WIFSTOPPED() path.

https://bugs.python.org/issue31173

The current test_child_terminated_in_stopped_state() function test creates a child process which calls ptrace(PTRACE_TRACEME, 0, 0) and then crash (SIGSEGV). The problem is that calling os.waitpid() in the parent process is not enough to close the process: the child process remains alive and so the unit test leaks a child process in a strange state. Closing the child process requires non-trivial code, maybe platform specific. Remove the functional test and replaces it with an unit test which mocks os.waitpid() using a new _testcapi.W_STOPCODE() function to test the WIFSTOPPED() path.

gpshead · 2017-08-10T22:19:36Z

I think your mocking test makes sense here. LGTM.

Somewhat unfortunate to lose the integration test that triggers the actual OS behavior that led to the bug and confirms it is fixed, but if it is fragile and leaking resources that is not good.

Is there no way to unwedge the stuck killed but stopped for tracing process using a ptrace API?

vstinner · 2017-08-10T22:49:39Z

Somewhat unfortunate to lose the integration test that triggers the actual OS behavior that led to the bug and confirms it is fixed, but if it is fragile and leaking resources that is not good.

The leaking resource is a blocker point for me. When I ran "./python -m test -m test_child_terminated_in_stopped_state -F test_subprocess", my laptop started to die: the total memory usage was close to 100%!

Is there no way to unwedge the stuck killed but stopped for tracing process using a ptrace API?

I hate ptrace() because it makes processes enter very weird states. For example, sending SIGKILL to a traced process is not enough to kill it... "traced process": a process which just called ptrace(PTRACE_ME) with no other process attached to it ...

I spent some time working on a debugger written in pure Python: https://github.com/haypo/python-ptrace/

I was always amazed by the weird ptrace bugs I had. Sometimes, I just had to reboot my computer. It was impossible to kill some processes, my computer was stuck.

So yeah, usually I prefer functional tests over unit tests, but here I think that's an unit test is an acceptable compromise.

What do you think?

--

To be honest, I didn't try hard to workaround the weird zombie process. I tried two additional os.waitpid() on macOS, but I had another behaviour on Linux... So I just decided to give up and try something completely different ;-)

bedevere-bot · 2017-08-11T00:15:21Z

GH-3070 is a backport of this pull request to the 3.6 branch.

gpshead · 2017-08-11T00:33:37Z

btw for reference, what platform (os and version) were you on where you saw the problem?

vstinner · 2017-08-11T00:36:15Z

btw for reference, what platform (os and version) were you on where you saw the problem?

The increasing memory and https://bugs.python.org/issue31173#msg300066 message was on Linux.

My attempt to use additional os.waitpid() to workaround the issue was tried on macOS.

The current test_child_terminated_in_stopped_state() function test creates a child process which calls ptrace(PTRACE_TRACEME, 0, 0) and then crash (SIGSEGV). The problem is that calling os.waitpid() in the parent process is not enough to close the process: the child process remains alive and so the unit test leaks a child process in a strange state. Closing the child process requires non-trivial code, maybe platform specific. Remove the functional test and replaces it with an unit test which mocks os.waitpid() using a new _testcapi.W_STOPCODE() function to test the WIFSTOPPED() path. (cherry picked from commit 7b7c6dc)

vstinner requested a review from gpshead August 10, 2017 10:04

the-knights-who-say-ni added the CLA signed label Aug 10, 2017

vstinner merged commit 7b7c6dc into python:master Aug 10, 2017

vstinner deleted the stopped branch August 10, 2017 10:37

gpshead self-assigned this Aug 10, 2017

vstinner added the needs backport to 3.6 label Aug 11, 2017

bedevere-bot removed the needs backport to 3.6 label Aug 11, 2017

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

bpo-31173: Rewrite WSTOPSIG test of test_subprocess #3055

bpo-31173: Rewrite WSTOPSIG test of test_subprocess #3055

vstinner commented Aug 10, 2017 •

edited by bedevere-bot

Loading

gpshead commented Aug 10, 2017

vstinner commented Aug 10, 2017

bedevere-bot commented Aug 11, 2017

gpshead commented Aug 11, 2017

vstinner commented Aug 11, 2017

bpo-31173: Rewrite WSTOPSIG test of test_subprocess #3055

bpo-31173: Rewrite WSTOPSIG test of test_subprocess #3055

Conversation

vstinner commented Aug 10, 2017 • edited by bedevere-bot Loading

gpshead commented Aug 10, 2017

vstinner commented Aug 10, 2017

bedevere-bot commented Aug 11, 2017

gpshead commented Aug 11, 2017

vstinner commented Aug 11, 2017

vstinner commented Aug 10, 2017 •

edited by bedevere-bot

Loading