Skip to content

Support async signals#683

Open
CvvT wants to merge 23 commits intomainfrom
weiteng/support_async_signals
Open

Support async signals#683
CvvT wants to merge 23 commits intomainfrom
weiteng/support_async_signals

Conversation

@CvvT
Copy link
Contributor

@CvvT CvvT commented Feb 25, 2026

This PR supports SIGINT/SIGALRM and syscall alarm.

On Linux, the workflow is the following:

  1. when boot, register SIGINT/SIGALRM handlers --> Ctrl-C arrives --> record signal in TLS
  2. about to switch to guest (i.e., prepare_to_run_guest) --> check TLS and queue signals --> process signal
  3. any blocking operations --> before commit to wait, check_for_interrupt --> check TLS and queue signals --> process signal

For shim to be able to retrieve pending signals from platform, add a new SignalProvider with the interface fn take_pending_signals(&self, mut f: impl FnMut(litebox::shim::Signal)).

Additionally, to support alarm, we need a mechanism to set up a timer with callback. I also added a new interface fn schedule_interrupt(&self, thread: Option<&Self::ThreadHandle>, delay: core::time::Duration) to ThreadProvider. For platform that does not implement it, the fallback solution is to check alarm just before switching back to the guest (which causes delay in delivering signals or never delivering signals at all if the program is being blocked and waiting for a signal).

One tricky problem is that when a signal arrives, the running thread may be about to be blocked or blocked already (i.e., calling futex wait), and we don't interrupt a thread if it is running inside LiteBox code. In which case, we never have the chance to process that signal. It might be possible to examine the RIP of the thread in the signal handler and divert the execution to skip futex wait call it if RIP is close to it (e.g., use assembly code to build a critical region), but this may not work on Windows (where we don't call syscall instruction directly). The current solution I have is based on the observation that for all blocking operations (e.g., futex wait, sleep), they all boil down to RawMutex::block_or_timeout with the per-thread WaitState and expected value ThreadState::WAITING. WaitState stores the thread status like RUNNING_IN_HOST, WAITING, WOKEN, and etc. Thus, in the signal handler, besides recording the signal, we can update the thread state from WAITING to WOKEN and then calls futex wake. If the thread is about to call futex wait, futex call would return immediately because of the change of the value. If the thread is being blocked, it would wake up and return. To access WaitState in the signal handler, I added two new interfaces on_wait_start and on_wait_end to RawMutex so that platform call set and clear WaitState in its TLS.

The signal handlers for SIGNINT/SIGNALRM can be called on any threads. However, we don't want non-guest threads (e.g., background network worker, additional test thread spawned by cargo test) to be called upon. For background network worker, we can block those signals. For test thread, the signal handler would check if it has our custom TLS. If not, it blocks those signals and raise them again.

On Windows, the implementation is almost the same except that signal handler is called on a kernel thread (and thus cannot access litebox thread's TLS easily). I have to use a global variable to record all active threads.

I added some end-to-end C tests and Rust unit tests. To make Rust unit test work, I also introduced a new interface fn run_test_thread<R>(f: impl FnOnce() -> R) -> R to ThreadProvider. This is because spawning a guest thread (via sys_clone) and a regular thread in test (via std::thread::spawn) have different routes and require different setup. For example, while running in test, there is no guest code, and thus we don't need to do fs/gs swap (hence no setup for one of fs/gs). To make the code work for both guest thread and non-guest thread, we can set gs == fs before running the test thread.

@CvvT CvvT marked this pull request as ready for review February 25, 2026 02:11
@wdcui
Copy link
Member

wdcui commented Feb 25, 2026

@jaybosamiya-ms , it would be good to have you review the interface-level changes in this PR.

@github-actions
Copy link

🤖 SemverChecks 🤖 No breaking API changes detected

Note: this does not mean API is unchanged, or even that there are no breaking changes; simply, none of the detections triggered.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants