Skip to content

Add SDL_Process subsystem #10618

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Sep 13, 2024
Merged

Add SDL_Process subsystem #10618

merged 2 commits into from
Sep 13, 2024

Conversation

Semphriss
Copy link
Contributor

Description

I've created basic support for spawning processes, establishing communication with std{in,out,err}, setting environment variables and waiting/killing processes.

Progress

As of opening the PR, only Unix is supported. I would prefer to let someone else create the Windows version, since I'm not really familiar with Win32 and might end up doing rookie mistakes.

For the API, I went for the easiest interface I could make. It's probably riddled with Unix-isms; I suppose the implementation of the Windows version will expose that. Naturally, I'll remain attentive to the difficulties encountered on other platforms and I'll adapt the code to be convenient for all.

Generally speaking, I organized my code in the simplest way I could, but everything is open to discussion, whether it be enumeration names, API design decisions, choice of functions, supported features, and so on. I've taken great inspiration from the Dialog subsystem to design the Process one, but I'd like to let someone more familiar than me with SDL's internals review the design to make sure I didn't miss anything.

API

#include <SDL3/SDL.h>

int main(void)
{
  SDL_Init(0);

  const char * const args[] = { "/usr/bin/ls", "-al", NULL };
  const char * const env[] = { "LC_ALL=C", NULL };

  /* Create a process with the paren't environment */
  SDL_Process process = SDL_CreateProcess(args, NULL, 0);

  /* Create a process with a custom environment (does not inherit the paren't env) */
  SDL_Process process = SDL_CreateProcess(args, env, 0);

  /* Create a process with stdin, stdout and stderr support */
  SDL_Process process = SDL_CreateProcess(args, NULL, SDL_PROCESS_STDIN | SDL_PROCESS_STDOUT | SDL_PROCESS_STDERR);

  char buf[64];
  int size = 64;

  /* Read from the process' stdout (use SDL_ReadErrProcess for stderr) */
  size = SDL_ReadProcess(process, buf, size);

  /* Write to the process' stdin */
  size = SDL_WriteProcess(process, buf, size);

  /* Get the process' status (0 = still running, 1 = stopped, -1 = error) */
  SDL_WaitProcess(process, false);

  /* Block and wait for the process to finish running */
  SDL_WaitProcess(process, true);

  /* Stop the process */
  SDL_KillProcess(process, false);

  /* Forcibly stop the process, without giving it time to free its resources */
  SDL_KillProcess(process, true);

  SDL_DestroyProcess(process);

  SDL_Quit();
  return 0;
}

Existing Issue(s)

Closes #10599

@madebr
Copy link
Contributor

madebr commented Aug 29, 2024

Would it make sense to use SDL_IOStream for reading/writing from subprocesses?

@Semphriss
Copy link
Contributor Author

It would be possible to create a wrapper for it, but some functions wouldn't be meaningful (size, seek) and others would be hiding options which can vary from case to case (read from stdout, stderr, or a merge of both; close sending SIGTERM (which some apps choose to ignore) or SIGKILL; etc.), so I preferred to let users implement the interface themselves if they wanted to use processes as IOStreams, as they could make each of those decisions themselves.

I can try to make a default IOStream interface if we can decide how to handle these cases by default, but it would be preferable to document those decisions to avoid surprises and questions downstream.

@madebr
Copy link
Contributor

madebr commented Aug 29, 2024

I can try to make a default IOStream interface if we can decide how to handle these cases by default, but it would be preferable to document those decisions to avoid surprises and questions downstream.

SDL_GetIOSize, SDL_SeekIO and SDL_TellIO returning a "not supported" error would be expected.
Your api looks very similar to the python's subprocess.
There, doing proc.stdout.seek(0) throws a io.UnsupportedOperation exception with message "underlying stream is not seekable".
An advantage of re-using SDL_IOStreams is that you would also benefit from the various SDL_IO* helper functions, e.g. SDL_IOprintf.

@slouken slouken added this to the 3.2.0 milestone Aug 29, 2024
@namandixit
Copy link
Contributor

namandixit commented Aug 29, 2024

One suggestion: The env parameter provided in sample above should maybe have the format:

const char * const env[] = { "LC_ALL", "C", NULL };

or even:

struct Env { char *key, *val; };
Env env[] = { {"LC_ALL", "C"}, NULL };

Given that user will have separate strings for keys and values (that might come from some external configuration file), creating string like "LC_ALL=C" will require string processing in C, which is never a fun thing to do. It will also add requirements for quoting '=' character in keys, etc.

@madebr
Copy link
Contributor

madebr commented Aug 30, 2024

Another advantage of using SDL_IOStream for I/O, is that you can send EOF, or close the subprocess, by executing SDL_CloseIO on the streams.
We should also think about how the subprocess subsystem integrates with async I/O (#10605).

Perhaps we can use the SDL_Properties api to get the I/O streams from a SDL_Process pointer?
The SDL_Properties api can also be used to configure the environment variables of child processes.

So I suggest to create a SDL_CreateProcessFromProperties function, and implement SDL_CreateProcess on top of that.

@Semphriss
Copy link
Contributor Author

One suggestion: The env parameter provided in sample above should maybe have the format:

I did think about doing something like this, but I noticed that the Win32 API also uses the NAME=VALUE format, so I thought this wouldn't be a problem for cross-platform compatibility.

Also, from my habit, whenever environment variables are hardcoded or written in a file like .env, they're already stored in this form; the only case I can see where building the environment string would be required is when the key name or value itself would need to be built, in which case I assumed building the whole line wouldn't be too much extra burden. OTOH, having separate strings for keys and values would require users which already have NAME=VALUE format to split those strings, which would add on them the burden we're trying to avoid.

@namandixit
Copy link
Contributor

namandixit commented Aug 30, 2024

@Semphriss Sure, I don't have strong feelings about either; although

would require users which already have NAME=VALUE format to split

Splitting is just a byte swap (replace '=' by '\0') after an iteration to find the '=' character, whereas creating a combined string is two strlen, one malloc, two memcpy and one free.

@Semphriss
Copy link
Contributor Author

Splitting is just a byte swap (replace '=' by '\0'), whereas creating a combined string is two strlen, one malloc, two memcpy and one free.

If the strings are const, it would be one strchr(), one strndup(), one free(), and one pointer to the original string after the =, keeping in mind the differences in memory management. (Or two strndup(), or one strndup() and one strdup(), and two free() for those who prefer the simplicity in memory management)

I'm currently stuck on const stuff like this, because I accept a full-const list of strings for arguments, but exec*() expects a const list of mutable strings :)

@namandixit
Copy link
Contributor

Well, this nonsense is why I don't use const :)
But sure, as I said, it's not a hill worth dying on, this API will be used only a few times in some isolated part of code, so whatever.

@madebr
Copy link
Contributor

madebr commented Sep 2, 2024

@Semphriss
I created a PR to your fork with a sketch how I see interprocess communcation using SDL_IOStreams
Semphriss#1

Copy link
Contributor

@smcv smcv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on what I've seen in GLib's GSpawn low-level interface and GSubprocess high-level interface, this is the sort of topic where all the OS-level APIs are entirely made out of sharp edges, and it's possible to write useful code that doesn't have undefined behaviour, but only by making heroic efforts to do so.

The devil is in the details with this stuff, and I would suggest letting it take as long as it takes - I suspect that rushing this to get something into SDL 3.2 will only make it hard to fix API issues later.

Windows and Unix have very different models for process management, so trying to stabilize the API for this sort of thing without first getting it working on Windows is probably going to be a bad idea.

I'll leave some comments on implementation issues, but I'll also leave some comments on higher-level design stuff after.

@smcv
Copy link
Contributor

smcv commented Sep 5, 2024

A design decision: on Unix, if the parent process has any file descriptors open that do not have FD_CLOEXEC set (unfortunately, the "bad" state is the default), this is going to "leak" them all into the child process, which will result in those files/sockets/resources being held open unless it takes steps to close them. Is this considered an acceptable limitation?

On modern Linux and FreeBSD, you can close fds with close_range(3, ~0U, 0), or set them all close-on-exec with close_range(3, ~0U, CLOSE_RANGE_CLOEXEC) (recent Linux). But the fallback paths if your OS is too old, or is not Linux or FreeBSD, are really difficult to write correctly while taking into account the limitation that you can only safely use async-signal-safe functions. Take a look at GLib's GSpawn (which is not under a license suitable for copying into SDL) and you'll see what I mean.

@slouken slouken modified the milestones: 3.2.0, 3.x Sep 5, 2024
@smcv
Copy link
Contributor

smcv commented Sep 5, 2024

A design decision: on Unix, because of copy-on-write semantics, if your process has a lot of memory in use (I forget whether it's virtual or physical memory that matters), fork() will briefly allocate twice that much. This can result in out-of-memory conditions that make it impossible to fork(), especially on 32-bit systems. GNOME Shell (which allocates lots of virtual memory for window surfaces, because it's a compositor) suffered from this in the past.

GLib's GSpawn interface eventually solved this for GNOME Shell by adding a fast-path that uses posix_spawn(), which is a more limited kernel interface that doesn't allow arbitrary things to happen between fork() and exec(), which means the copying can be skipped. The down side is that you can't do arbitrary things, like reporting errors more clearly or ensuring that all "leaked" fds are closed or close-on-exec.

It might be worthwhile to look at the posix_spawn() interface and see whether that would be sufficient for SDL's needs.

@smcv
Copy link
Contributor

smcv commented Sep 5, 2024

A design decision: waitpid() interoperates badly with any other part of the same process that might waitpid() for the same pid, either explicitly or by using a wildcard like waitpid (-1, ...). Only the first caller to wait for a given pid will get a correct result. For the second caller, the information has already been lost and it's unrecoverable.

Is this something that SDL needs to try to detect and handle somewhat gracefully?

I think this should at least be documented - see the intimidating long list in https://docs.gtk.org/glib/func.child_watch_source_new.html for the sort of things I'm concerned about.

@Semphriss
Copy link
Contributor Author

I've merged @madebr's changed in this PR, I'll check to apply the suggestions in a while.

@Semphriss
Copy link
Contributor Author

Regarding copy-on-write, does a fail in exec*() immediately cause a copy, given that it sets errno? My best research couldn't yield any meaningful information about it.

@smcv
Copy link
Contributor

smcv commented Sep 9, 2024

Regarding copy-on-write, does a fail in exec*() immediately cause a copy, given that it sets errno?

In the GNOME Shell issue I was referring to, it never even got this far: according to https://gitlab.gnome.org/GNOME/glib/-/merge_requests/95 , fork() was failing with ENOMEM, because the kernel pessimistically assumed that it might need to duplicate all of the GNOME Shell process's memory.

@Semphriss
Copy link
Contributor Author

After searching, I found as alternatives for fork the functions vfork (anything other than a call to exec*() or _exit() is UB, no good here) and posix_spawn.

posix_spawn seems to be enough for our purposes here, but I can't find any documentation about its availability. Would it be a good option to implement and if so, depending on how widespread posix_spawn is, should I keep fork/exec as a documented fallback?

@madebr
Copy link
Contributor

madebr commented Sep 9, 2024

posix_spawn seems to be enough for our purposes here, but I can't find any documentation about its availability. Would it be a good option to implement and if so, depending on how widespread posix_spawn is, should I keep fork/exec as a documented fallback?

The cmake script can do tests for availability of symbols. You'll need to test for availability of (v)fork either ways.

@smcv
Copy link
Contributor

smcv commented Sep 9, 2024

posix_spawn itself has been part of POSIX since 2001, so any reasonable Unix platform ought to have it (and I personally think it would be OK for a Unix implementation of a subprocess API to be a stub that always fails if it isn't available). It isn't expected to exist on Windows or other non-Unixy OSs, which will need their own implementation of a subprocess API (for example using , but those probably don't have fork() anyway.

Individual functions like posix_spawn_file_actions_adddup2 might not be as portable as posix_spawn itself.

The main reason I brought up posix_spawn is that it's a more restrictive API, so there are three possible things that could be done with it, requiring a design decision:

  1. ignore it as not-useful-enough
  2. use it where possible but fall back to fork/exec if the user requests functionality that it can't implement (this is what GLib does, but that makes it more complicated)
  3. intentionally limit your subprocess API so that it only does things that are implementable in terms of posix_spawn

@Semphriss
Copy link
Contributor Author

You'll need to test for availability of (v)fork either ways.

It's worth noting that the Zenity implementation of dialogs uses fork/exec, so if the symbols aren't already detected for those, it might be good to check if they need to be added. I'm not sure if it was an oversight, or if the code is in some special situation where it doesn't need to detect fork/exec beforehand.

@madebr
Copy link
Contributor

madebr commented Sep 9, 2024

You'll need to test for availability of (v)fork either ways.

It's worth noting that the Zenity implementation of dialogs uses fork/exec, so if the symbols aren't already detected for those, it might be good to check if they need to be added. I'm not sure if it was an oversight, or if the code is in some special situation where it doesn't need to detect fork/exec beforehand.

Good point.
Maybe we can rebase the zenity dialog system on top of this SDL process system?

@smcv
Copy link
Contributor

smcv commented Sep 9, 2024

if the code is in some special situation where it doesn't need to detect fork/exec beforehand

The special situation is "runs on Unix" - any Unix OS worthy of the name has had fork() available since time immemorial.

@slouken
Copy link
Collaborator

slouken commented Sep 13, 2024

Okay, I'm expanding the output options into an enum and adding process and file redirection.

@slouken
Copy link
Collaborator

slouken commented Sep 13, 2024

I think this is in a good place to merge. @Semphriss, @icculus, @madebr, @smcv, any feedback before that happens?

@slouken slouken force-pushed the process branch 2 times, most recently from e22d75a to 64d395f Compare September 13, 2024 19:51
@Semphriss
Copy link
Contributor Author

I think this is in a good place to merge. Semphriss, icculus, madebr, smcv, any feedback before that happens?

I'll read the latest version, there's been quite a few changes since I lasted checked completely so it might take a while, I'll try to be as fast as I can!

Copy link
Contributor Author

@Semphriss Semphriss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've done a quick review. I haven't checked in-depth for bugs, memory issues and the like; those can be fixed after merging when they will be found.

* - `SDL_PROP_PROCESS_CREATE_STDERR_POINTER`: an SDL_IOStream pointer used for standard error when `SDL_PROP_PROCESS_CREATE_STDERR_NUMBER` is set to `SDL_PROCESS_STDIO_REDIRECT`.
* - `SDL_PROP_PROCESS_CREATE_STDERR_TO_STDOUT_BOOLEAN`: true if the error output of the process should be redirected into the standard output of the process. This property has no effect if `SDL_PROP_PROCESS_CREATE_STDERR_NUMBER` is set.
*
* On POSIX platforms, wait() and waitpid(-1, ...) should not be called, and SIGCHLD should not be ignored or handled because those would prevent SDL from properly tracking the lifetime of the underlying process. You should use SDL_WaitProcess() instead.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be specified for SDL_CreateProcess() as well?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a bunch of details here that SDL_CreateProcess() just says go look at SDL_CreateProcessWithProperties() to find out. That's the simplified interface, with simplified documentation.

Also added SDL_PROP_IOSTREAM_FILE_DESCRIPTOR_NUMBER and refactored the internal API to be able to create SDL_IOStream objects from native file handles.
@icculus
Copy link
Collaborator

icculus commented Sep 13, 2024

Oh man, can we get from a flush to an fdatasync on iostreams? That'll solve a pending async i/o issue.

@slouken
Copy link
Collaborator

slouken commented Sep 13, 2024

Okay, I've flattened this to 2 commits for merging. If everything looks good in CI, I'll go ahead and merge!

@slouken slouken force-pushed the process branch 2 times, most recently from f71e45d to 35163f6 Compare September 13, 2024 21:50
@slouken slouken merged commit 9eea823 into libsdl-org:main Sep 13, 2024
37 of 38 checks passed
@slouken
Copy link
Collaborator

slouken commented Sep 13, 2024

Huzzah! Finally merged! Thank you all for your hard work and feedback on this.

@sezero
Copy link
Contributor

sezero commented Sep 13, 2024

Just hit this testprocess test failure in my fork:

https://github.com/sezero/SDL/actions/runs/10856995459/job/30132725225

@slouken
Copy link
Collaborator

slouken commented Sep 13, 2024

Just hit this testprocess test failure in my fork:

https://github.com/sezero/SDL/actions/runs/10856995459/job/30132725225

I'm working on environment changes now, we can come back to this after those go in, if it's still an issue.

@sezero
Copy link
Contributor

sezero commented Sep 14, 2024

Just hit this testprocess test failure in my fork:
https://github.com/sezero/SDL/actions/runs/10856995459/job/30132725225

I'm working on environment changes now, we can come back to this after those go in, if it's still an issue.

Note that there is not a failure in this PR's runs or in mainstream libsdl-org repo's CI runs -- i.e. I don't know how reproducible the failure is.

@slouken
Copy link
Collaborator

slouken commented Sep 14, 2024

Just hit this testprocess test failure in my fork:
https://github.com/sezero/SDL/actions/runs/10856995459/job/30132725225

I'm working on environment changes now, we can come back to this after those go in, if it's still an issue.

Note that there is not a failure in this PR's runs or in mainstream libsdl-org repo's CI runs -- i.e. I don't know how reproducible the failure is.

The environment stuff has been reworked, please let me know if you run into this again!

@sezero
Copy link
Contributor

sezero commented Sep 15, 2024

Just hit this testprocess test failure in my fork:
https://github.com/sezero/SDL/actions/runs/10856995459/job/30132725225

I'm working on environment changes now, we can come back to this after those go in, if it's still an issue.

Note that there is not a failure in this PR's runs or in mainstream libsdl-org repo's CI runs -- i.e. I don't know how reproducible the failure is.

The environment stuff has been reworked, please let me know if you run into this again!

Here it goes again: https://github.com/sezero/SDL/actions/runs/10873602064/job/30169991920#step:25:47729

@slouken
Copy link
Collaborator

slouken commented Sep 16, 2024

Just hit this testprocess test failure in my fork:
https://github.com/sezero/SDL/actions/runs/10856995459/job/30132725225

I'm working on environment changes now, we can come back to this after those go in, if it's still an issue.

Note that there is not a failure in this PR's runs or in mainstream libsdl-org repo's CI runs -- i.e. I don't know how reproducible the failure is.

The environment stuff has been reworked, please let me know if you run into this again!

Here it goes again: https://github.com/sezero/SDL/actions/runs/10873602064/job/30169991920#step:25:47729

Yep, not sure what is happening there. I'll add some more logging, maybe we can see what's up.

@sezero
Copy link
Contributor

sezero commented Sep 16, 2024

And here it shows its face in mainstream, meaning I'm not cursed:
https://github.com/libsdl-org/SDL/actions/runs/10886143592/job/30205557427

@madebr
Copy link
Contributor

madebr commented Sep 16, 2024

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

SDL3 Request: Process Spawning
8 participants