Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpo-31173: Rewrite WSTOPSIG test of test_subprocess #3055

Merged
merged 1 commit into from
Aug 10, 2017
Merged

bpo-31173: Rewrite WSTOPSIG test of test_subprocess #3055

merged 1 commit into from
Aug 10, 2017

Conversation

vstinner
Copy link
Member

@vstinner vstinner commented Aug 10, 2017

The current test_child_terminated_in_stopped_state() function test
creates a child process which calls ptrace(PTRACE_TRACEME, 0, 0) and
then crash (SIGSEGV). The problem is that calling os.waitpid() in the
parent process is not enough to close the process: the child process
remains alive and so the unit test leaks a child process in a
strange state. Closing the child process requires non-trivial code,
maybe platform specific.

Remove the functional test and replaces it with an unit test which
mocks os.waitpid() using a new _testcapi.W_STOPCODE() function to
test the WIFSTOPPED() path.

https://bugs.python.org/issue31173

The current test_child_terminated_in_stopped_state() function test
creates a child process which calls ptrace(PTRACE_TRACEME, 0, 0) and
then crash (SIGSEGV). The problem is that calling os.waitpid() in the
parent process is not enough to close the process: the child process
remains alive and so the unit test leaks a child process in a
strange state. Closing the child process requires non-trivial code,
maybe platform specific.

Remove the functional test and replaces it with an unit test which
mocks os.waitpid() using a new _testcapi.W_STOPCODE() function to
test the WIFSTOPPED() path.
@vstinner vstinner requested a review from gpshead August 10, 2017 10:04
@vstinner vstinner merged commit 7b7c6dc into python:master Aug 10, 2017
@vstinner vstinner deleted the stopped branch August 10, 2017 10:37
@gpshead
Copy link
Member

gpshead commented Aug 10, 2017

I think your mocking test makes sense here. LGTM.

Somewhat unfortunate to lose the integration test that triggers the actual OS behavior that led to the bug and confirms it is fixed, but if it is fragile and leaking resources that is not good.

Is there no way to unwedge the stuck killed but stopped for tracing process using a ptrace API?

@gpshead gpshead self-assigned this Aug 10, 2017
@vstinner
Copy link
Member Author

Somewhat unfortunate to lose the integration test that triggers the actual OS behavior that led to the bug and confirms it is fixed, but if it is fragile and leaking resources that is not good.

The leaking resource is a blocker point for me. When I ran "./python -m test -m test_child_terminated_in_stopped_state -F test_subprocess", my laptop started to die: the total memory usage was close to 100%!

Is there no way to unwedge the stuck killed but stopped for tracing process using a ptrace API?

I hate ptrace() because it makes processes enter very weird states. For example, sending SIGKILL to a traced process is not enough to kill it... "traced process": a process which just called ptrace(PTRACE_ME) with no other process attached to it ...

I spent some time working on a debugger written in pure Python: https://github.com/haypo/python-ptrace/

I was always amazed by the weird ptrace bugs I had. Sometimes, I just had to reboot my computer. It was impossible to kill some processes, my computer was stuck.

So yeah, usually I prefer functional tests over unit tests, but here I think that's an unit test is an acceptable compromise.

What do you think?

--

To be honest, I didn't try hard to workaround the weird zombie process. I tried two additional os.waitpid() on macOS, but I had another behaviour on Linux... So I just decided to give up and try something completely different ;-)

@bedevere-bot
Copy link

GH-3070 is a backport of this pull request to the 3.6 branch.

@gpshead
Copy link
Member

gpshead commented Aug 11, 2017

btw for reference, what platform (os and version) were you on where you saw the problem?

@vstinner
Copy link
Member Author

btw for reference, what platform (os and version) were you on where you saw the problem?

The increasing memory and https://bugs.python.org/issue31173#msg300066 message was on Linux.

My attempt to use additional os.waitpid() to workaround the issue was tried on macOS.

vstinner added a commit that referenced this pull request Aug 11, 2017
The current test_child_terminated_in_stopped_state() function test
creates a child process which calls ptrace(PTRACE_TRACEME, 0, 0) and
then crash (SIGSEGV). The problem is that calling os.waitpid() in the
parent process is not enough to close the process: the child process
remains alive and so the unit test leaks a child process in a
strange state. Closing the child process requires non-trivial code,
maybe platform specific.

Remove the functional test and replaces it with an unit test which
mocks os.waitpid() using a new _testcapi.W_STOPCODE() function to
test the WIFSTOPPED() path.
(cherry picked from commit 7b7c6dc)
vstinner added a commit that referenced this pull request Aug 11, 2017
The current test_child_terminated_in_stopped_state() function test
creates a child process which calls ptrace(PTRACE_TRACEME, 0, 0) and
then crash (SIGSEGV). The problem is that calling os.waitpid() in the
parent process is not enough to close the process: the child process
remains alive and so the unit test leaks a child process in a
strange state. Closing the child process requires non-trivial code,
maybe platform specific.

Remove the functional test and replaces it with an unit test which
mocks os.waitpid() using a new _testcapi.W_STOPCODE() function to
test the WIFSTOPPED() path.
(cherry picked from commit 7b7c6dc)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants