runtime: debug call injection tests contain many unsafe write barriers #49169
Labels: NeedsInvestigation, Testing
It's possible for the debug call injection tests to segfault, because `(*debugCallHandler).inject` and `(*debugCallHandler).handle` have write barriers but they're called in a signal handler. Normally write barriers would be prevented here via their caller, `sighandler`, which has `go:nowritebarrierrec`, but unfortunately because they're called indirectly via the `testSigtrap` global variable, the analysis doesn't plumb through.

The failure is very rare; a GC has to be actively running during the injection. But it is possible: I have reproduced it locally, and it is observable on trybots if you run the runtime tests enough times.
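To make the shape of the problem concrete, here is a minimal sketch outside the runtime. All names in it (`hook`, `handler`, `storePointer`, `node`) are hypothetical stand-ins, not the runtime's code, and the `go:nowritebarrierrec` directive is only described in comments because the compiler accepts it only inside the runtime package. The point is just that the callee behind a function-valued global is not known statically, so a check that follows direct calls cannot see the pointer store.

```go
package main

import "fmt"

type node struct{ next *node }

// head is a global pointer, so stores to it get a compiler-inserted
// write barrier that is active while the GC is marking.
var head *node

// hook is a hypothetical stand-in for the runtime's testSigtrap variable.
var hook func() bool

// storePointer plays the role of inject/handle: it writes a pointer
// into GC-visible memory.
func storePointer() bool {
	head = &node{next: head}
	return true
}

// handler plays the role of sighandler. In the runtime the real function
// is annotated nowritebarrierrec, which covers functions it calls directly.
func handler() bool {
	if hook != nil {
		// Indirect call: the target is only known at run time, so the
		// static check has nothing to follow here.
		return hook()
	}
	return false
}

func main() {
	hook = storePointer
	fmt.Println(handler(), head != nil) // prints: true true
}
```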
This is also only a test problem. Real debug call injection implementations will usually be coordinated by a separate process; the tricky bit here is we're doing all the machinery in the signal handler of the same process.
I tried to do the simple thing and mark the two methods above as `go:nowritebarrierrec`, but there are enough write barriers in each function that it's tough to capture them all without the code being mostly fragile comments about why it's safe to elide all these write barriers. I think this test code just needs a rethink, so that the primary driver is not on the thread getting all the signals.

The trouble with that is dealing with the result of a panicking injected function. The panic value it returns may be the only reference to that value, so the signal handler needs to put it somewhere the GC can see. That's at odds with the fact that any such store would require a write barrier. There may be something clever here, but I haven't thought of it yet.
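As a rough illustration of the "driver not on the signal thread" idea, and not a proposed fix, the sketch below has the handler side publish only scalar event codes, which need no write barrier, while an ordinary-context driver does all the pointer-bearing work. Every name here (`event`, `handlerSide`, `driverStep`, the `ev*` codes) is an assumption made for the example. It deliberately does not address the panic value: passing that pointer through a scalar slot would hide it from the GC, which is exactly the unresolved part.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Hypothetical event codes the handler side is allowed to publish.
const (
	evNone uint32 = iota
	evTrap
	evDone
)

// event holds a scalar, so storing to it never needs a GC write barrier.
var event atomic.Uint32

// handlerSide stands in for code running in the signal handler. It only
// performs a scalar atomic store and never writes a pointer to memory.
func handlerSide(ev uint32) {
	event.Store(ev)
}

// driverStep stands in for the test's primary driver, running where
// pointer writes are fine. It consumes at most one pending event, records
// it, and reports whether the injection sequence has finished.
func driverStep(seen []string) ([]string, bool) {
	switch event.Swap(evNone) {
	case evTrap:
		return append(seen, "trap"), false
	case evDone:
		return append(seen, "done"), true
	}
	return seen, false
}

func main() {
	var seen []string
	handlerSide(evTrap)
	seen, _ = driverStep(seen)
	handlerSide(evDone)
	seen, done := driverStep(seen)
	fmt.Println(seen, done) // prints: [trap done] true
}
```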
CC @prattmic @aclements