Skip to content

Add new API that can correctly wait for termination of processes forked with exec on Windows. #80

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 29 commits into from
Jan 30, 2017

Conversation

Mistuke
Copy link
Contributor

@Mistuke Mistuke commented Dec 11, 2016

See #77

In the end I had to make a new API, since the handle returned by the Job should never be waited on.
Existing code would probably call the old wait function on it and then block indefinitely (or until a timeout if any was specified).

Instead I have created a new API and replaced the internals functions that do a fork/exec/wait with the new api.

I have also fixed the issue with the exit code from processes forked with exec not correctly being returned.

-- ---------------------------------------------------------------------------
-- executeAndWait

-- | Create a new process and wait for it's termination.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"its"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really should stop making this typo...

-- ---------------------------------------------------------------------------
-- executeAndWait

-- | Create a new process and wait for it's termination.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please explain how this is different from createProcess followed by waitForProcess. It's quite worrying to have two APIs for the same thing that have subtly different behaviour, but I didn't follow all the details of the problem. Is there really no way to make waitForProcess do the right thing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, so the basic problem is this:
Say you have two programs A and B. Now A calls B. There are two ways to do this.

  1. You can use the normal CreateProcess API, which is what normal Windows code do. Using this approach, the current waitForProcess works absolutely fine.
  2. You can call the emulated POSIX function _exec, which of course is supposed to allow the child process to replace the parent.

With approach 2) waitForProcess falls apart because the Win32's process model does not allow this the same way as linux. _exec is emulated by first making a call to CreateProcess to spawn B and then immediately exiting from A. So you have two different processes.

waitForProcess is waiting on the termination of A. Because A is immediately killed, waitForProcess will return even though B is still running. This is why for instance the testsuite on Windows had lots of file locked errors.

This approach creates a new Job and assigned A to the job, but also all future processes spawned by A. This allows us to listen in on events, such as, when all processes in the job are finished, but also allows us to propagate exit codes from _exec calls.

The only reason we need this at all is because we don't interact with just actual native code on Windows, and instead have a lot of ported POSIX code.

I could add a new constructor to ProcessHandle and make the decision on what to do internally in the functions. This would probably generate warnings in user code that deconstruct ProcessHandle though.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to confirm my understanding: with the current API on Linux, if we have:

  1. Process A calls rawSystem "B" []
  2. Process B calls createProcess $ proc "C" [] and then exits

Then the rawSystem call in A will return when B exits, even if C is still running. With this new API on Windows, A would block until both B and C exit. Is this correct?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Mistuke that's a really good explanation, thanks. (it would be great to include this as a comment in the code)

I could add a new constructor to ProcessHandle

But how would you know when to do that? Would this entail creating the Job for every process we spawn?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@snoyberg By default, yes. However CreateProcess has a flag CREATE_BREAKAWAY_FROM_JOB that allows a child process to break away from any parent job. So you can get the same behaviour as on linux by B using this flag when it spawns C. Though I agree, it is subtle.. (There's also a flag to prevent any children from breaking away, so it goes both ways :))

@simonmar I'll update the comment.
So there are two alternate approaches that could be taken here:

  1. always use a job. In which case, if a user hasn't been deconstructing ProcessHandle then it should work transparently.
  2. Combine the new function with the old one instead of having two calls now, and make the behaviour implicit on flag. But this would not be backwards compatible, since of course there's a new parameter.

I went with separate functions to avoid always using a job and breaking backwards compatibility.

If we go for 1) then, according to some very statistically unsound testing, the added overhead is about 0.0021 ms.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should note, there's also one useful thing jobs give you: it allows you to get and set detailed resource limits on processes. e.g. not use more than x memory, don't run longer than y ms, etc.

Which is why I wanted to expose the job handle back to the user as well. Since while not provided here, the changes you would need to set these should be rather trivial now, this is what they are in general used for along with just monitoring processes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. always use a job. In which case, if a user hasn't been deconstructing ProcessHandle then it should work transparently.
    ...
    If we go for 1) then, according to some very statistically unsound testing, the added overhead is about 0.0021 ms.

Seems like a no-brainer to me. 2 microseconds per process creation, and only on Windows (where process creation is already many milliseconds, IIRC).

ProcessHandle is already abstract in the basic System.Process API, you have to import System.Process.Internals to get it, so I can't imagine it would break much code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm so both of you seem to favor different options. How about a middle ground. I keep two separate functions for spawning, but only have one for waiting?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, @simonmar and I do seem to be at odds. Simon: AFAICT, switching createProcess to have the new job-based behavior is a significantly breaking semantic change on Windows only, which is a huge recipe for disaster. Do you disagree with this assessment?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I see. What I'm concerned about is adding a new API that duplicates existing functionality but with subtly different behaviour on one platform. That's also a recipe for disaster.

What is using exec on Windows?

@Mistuke
Copy link
Contributor Author

Mistuke commented Dec 13, 2016

Also, I found a bug with this version when doing a GHC bootstrapping build. I'll fix that soon.

@Mistuke
Copy link
Contributor Author

Mistuke commented Dec 14, 2016 via email

@snoyberg
Copy link
Collaborator

What about the alternative of having a setting in CreateProcess that switches to the job-based behavior, with the clear documentation that:

  • It only affects Windows, and has no impact on other OSes
  • It does not guarantee identical behavior to other OSes, but instead will have slightly different exit code behavior as we've discussed here

I think that will address the API proliferation issue (which I agree is a problem), and should hopefully be about as easy to work with from a user perspective.

@Mistuke
Copy link
Contributor Author

Mistuke commented Dec 14, 2016 via email

@snoyberg
Copy link
Collaborator

That's my preference. There's been a move over the past few releases to make CreateProcess an easier-to-use part of the API, and I'd recommend continuing in that directly such that it's fairly easy to replace rawSystem with some code that sets the appropriate option.

@simonmar
Copy link
Member

Ok,sounds good to me too. Thanks!

@snoyberg
Copy link
Collaborator

BTW @Mistuke, I have not yet looked at the actual code on this at all, I figured we should hash out this design bit first. I'm assuming I should hold off on an actual code review until the PR is updated to reflect these decisions, do you agree?

@Mistuke
Copy link
Contributor Author

Mistuke commented Dec 14, 2016 via email

@Mistuke Mistuke force-pushed the gh-77-create-process-hook branch from f5027af to 03d5d37 Compare January 7, 2017 19:38
@Mistuke
Copy link
Contributor Author

Mistuke commented Jan 10, 2017

Sorry for the delay, I'm almost done, just have to figure out a weird issue with stage-2 when using this version to build gcc.

@Mistuke Mistuke force-pushed the gh-77-create-process-hook branch 2 times, most recently from f5027af to c4d9a7a Compare January 16, 2017 02:48
@Mistuke
Copy link
Contributor Author

Mistuke commented Jan 16, 2017

Fixed the issue, now ready for review. Thanks!

@Mistuke
Copy link
Contributor Author

Mistuke commented Jan 17, 2017

Oops, forgot to add the note, I'll do that today.

@Mistuke
Copy link
Contributor Author

Mistuke commented Jan 27, 2017

Ping

@snoyberg
Copy link
Collaborator

snoyberg commented Jan 28, 2017 via email

@Mistuke
Copy link
Contributor Author

Mistuke commented Jan 28, 2017

Cool, no rush, just wasn't sure you knew it was updated :)

@snoyberg
Copy link
Collaborator

snoyberg commented Jan 28, 2017 via email

@@ -604,7 +607,14 @@ waitForProcess ph@(ProcessHandle _ delegating_ctlc) = do
when delegating_ctlc $
endDelegateControlC e
return e

OpenExtHandle _ job iocp -> do
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor note: looks like the do is unnecessary

@@ -639,6 +650,10 @@ getProcessExitCode ph@(ProcessHandle _ delegating_ctlc) = do
Just e | was_open && delegating_ctlc -> endDelegateControlC e
_ -> return ()
return m_e
where getHandle :: ProcessHandle__ -> PHANDLE
getHandle (OpenHandle h) = h
getHandle (ClosedHandle _) = error "getHandle: handle closed."
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a hard requirement, but I'd prefer to avoid the partiality here. I was thinking something like:

case getHandle p_ of
  Left e -> return (p_, (Just e, False))
  Right h -> do ...
where
  getHandle (OpenHandle h) = Right h
  getHandle (ClosedHandle e) = Left e
  getHandle (OpenExtHandle h _ _) = Right h

--
-- Default: @Nothing@
--
-- @since 1.4.0.0
use_process_jobs :: Bool -- ^ On Windows systems this flag indicates that we should wait for the entire process tree
-- to finish before unblocking. On POSIX system this flag is ignored.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/system/systems

--
-- Default: @False@
--
-- @since 1.x.x.x
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should call this 1.5.0.0

-}
data ProcessHandle__ = OpenHandle PHANDLE | ClosedHandle ExitCode
data ProcessHandle__ = OpenHandle PHANDLE
| OpenExtHandle PHANDLE PHANDLE PHANDLE
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to make these fields strict? It does not seem like laziness would ever be a virtue here. (I realize that will make it mismatched with the other constructors, still asking if it's the right thing to do.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm don't think so in this case, since the only way they're used is with a foreign value. I don't think you can ever get a lazy value from an FFI call right?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My question is really: would we ever want the laziness here? I would generally make fields strict if there's no reason to do otherwise, since it excludes a potential source of bugs and confusion. Also, in this case, I think it can (slightly) help performance by allowing unpacking of a primitive type, but that's a minor concern. Either way, this isn't vital.

#ifndef WINDOWS
pPrPr_disableITimers, c_execvpe,
ignoreSignal, defaultSignal,
#else
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For clarity: could you switch the conditional for #ifdef WINDOWS? It's slightly easier to read that way.

@@ -66,6 +70,18 @@ import System.Process.Posix
-- * This function takes an extra @String@ argument to be used in creating
-- error messages.
--
-- * 'use_process_jobs' can set in CreateProcess since 1.4.?.? in order to create
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably "can be set", and again let's use 1.5.0.0.

@@ -1,4 +1,5 @@
{-# LANGUAGE CPP #-}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to review this file too, but my review will be much less authoritative. Basically: I'm assuming you know what's going on for Windows better than I do, which is a really safe assumption 😄.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lol, fair enough :)

@@ -9,6 +9,8 @@

* New exposed `withCreateProcess`
* Derive `Show` and `Eq` for `CreateProcess`, `CmdSpec`, and `StdStream`
* Add support for monitoring process tree for termination with the parameter `use_process_jobs`
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should go into "Unreleased changes"

@@ -1,5 +1,5 @@
name: process
version: 1.4.3.0
version: 1.4.3.1
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned above, this would be 1.5.0.0

@Mistuke Mistuke force-pushed the gh-77-create-process-hook branch from 129dd48 to f8b53d8 Compare January 29, 2017 21:43
@Mistuke
Copy link
Contributor Author

Mistuke commented Jan 29, 2017

Seems the appveyor build is having trouble with sourceforge.

@snoyberg
Copy link
Collaborator

I've restarted the AppVeyor build, which now passed.

@snoyberg snoyberg merged commit d3d637d into haskell:master Jan 30, 2017
@snoyberg
Copy link
Collaborator

Thanks! What are your thoughts on roll-out of a new release for this? Do you want to get more testing of this with GHC before we release?

@Mistuke
Copy link
Contributor Author

Mistuke commented Jan 30, 2017 via email

@snoyberg
Copy link
Collaborator

snoyberg commented Jan 30, 2017 via email

@Mistuke
Copy link
Contributor Author

Mistuke commented Feb 2, 2017

Ok, I think we're good. Whenever you want to make the release is fine with me! After you do I can take care of updating the rest of the dependencies to get it in the GHC tree if it's easier for you!

@snoyberg
Copy link
Collaborator

snoyberg commented Feb 6, 2017

I've made the release to Hackage. If you could take care of the updates on the GHC side that would be great, much appreciated!

@Mistuke
Copy link
Contributor Author

Mistuke commented Feb 7, 2017 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants