Open
Labels: feature/resources, robustness/interrupts (Related to (lack of) robustness against interrupts)
Description
I need to run some machine learning algorithms on a remote GPU server. It is likely that heavy training is already running there and that all CPU and GPU resources are occupied. In that case, both starting a remote process and communicating with it are likely to hang. I therefore need an external timeout (rather than a remote call wrapped in withTimeout, as suggested in #169) to bound the time the call takes, so that other measures can be taken if the timeout occurs.
library(future)
v <- R.utils::withTimeout({
  p <- remote({
    Sys.sleep(10)
    1
  }, workers = "<remote-ip>", user = "<remote-user>", persistent = FALSE, earlySignal = TRUE)
  value(p)
}, timeout = 1)

However, the behavior is inconsistent when I run this repeatedly. In some cases it works as expected, sometimes it returns 1 immediately without sleeping, and in other cases it fails with the following error:
Error: Unexpected result (of class ‘NULL’ != ‘FutureResult’) retrieved for ClusterFuture future (label = ‘<none>’, expression = ‘{; Sys.sleep(10); 1; }’):
This suggests that the internal state of the future is somehow corrupted.