Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't setpgid() on actors #3347

Merged
merged 3 commits into from
Nov 20, 2018
Merged

Don't setpgid() on actors #3347

merged 3 commits into from
Nov 20, 2018

Conversation

ericl
Copy link
Contributor

@ericl ericl commented Nov 18, 2018

What do these changes do?

Don't call setpgid() on actors (introduced in #3297)
pros: Ctrl-C won't leak actor processes
cons: actors can more easily leak subprocesses (but you can manage these via atexit hooks)

Related issue number

Closes #3345

@@ -976,7 +972,7 @@ def _wait_for_and_process_task(self, task):
driver_id, function_id.id()) == execution_info.max_calls)
if reached_max_executions:
self.local_scheduler_client.disconnect()
os._exit(0)
sys.exit(0)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@robertnishihara this doesn't have to be an os._exit right? That prevents exit hooks from running.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't need to be anything in particular. Just needs to exit the process.

@@ -29,12 +29,13 @@ def __init__(self, config):
self.action_space = Discrete(2)
self.observation_space = Discrete(2)
# Subprocess that should be cleaned up
self.subproc = subprocess.Popen(UNIQUE_CMD, shell=True)
self.subproc = subprocess.Popen(UNIQUE_CMD.split(" "), shell=False)
Copy link
Contributor Author

@ericl ericl Nov 18, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't launch in a shell now, since we don't have process group containment. Launching it in a shell causes the process to escape.

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/9430/
Test FAILed.

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/9429/
Test FAILed.

@robertnishihara
Copy link
Collaborator

You seem to be reverting the code change from #3297 but leaving the test in place, is that right? What's the point of the test if we revert the pgid change?

@ericl
Copy link
Contributor Author

ericl commented Nov 18, 2018 via email

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/9431/
Test FAILed.

@ericl ericl added the tests-ok The tagger certifies test failures are unrelated and assumes personal liability. label Nov 19, 2018
@robertnishihara robertnishihara merged commit afc48d7 into ray-project:master Nov 20, 2018
@robertnishihara robertnishihara deleted the no-pgid branch November 20, 2018 01:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
tests-ok The tagger certifies test failures are unrelated and assumes personal liability.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants