Skip to content

Conversation

@OliverNChalk
Copy link

Problem

  • The scheduler controller will shutdown if the upstream sender disconnects. This is seen as an "unexpected exit" by the manager thread and the scheduler is restarted infinitely (until the shutdown signal reaches the manager).

Summary of Changes

  • If we shutdown the scheduler due to an upstream disconnect, then we shutdown he manager too as there is no point in spawning additional schedulers.

@OliverNChalk OliverNChalk marked this pull request as ready for review December 4, 2025 21:34
@codecov-commenter
Copy link

codecov-commenter commented Dec 4, 2025

Codecov Report

❌ Patch coverage is 78.26087% with 5 lines in your changes missing coverage. Please review.
✅ Project coverage is 82.5%. Comparing base (37f6e9e) to head (e091a3b).
⚠️ Report is 11 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff            @@
##           master    #9417     +/-   ##
=========================================
- Coverage    82.5%    82.5%   -0.1%     
=========================================
  Files         895      895             
  Lines      322398   322410     +12     
=========================================
- Hits       266108   266098     -10     
- Misses      56290    56312     +22     
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

tao-stones
tao-stones previously approved these changes Dec 5, 2025
Copy link

@tao-stones tao-stones left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, assume you find way to test it?

@OliverNChalk OliverNChalk force-pushed the fix/banking-shutdown-log-spam branch from 4298e26 to 216272c Compare December 5, 2025 20:29
@OliverNChalk
Copy link
Author

lgtm, assume you find way to test it?

Yep, tested manually on devnet by starting & stopping an rpc

},
time::Instant,
},
tokio_util::sync::CancellationToken,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this suitable outside async contexts?

Copy link
Author

@OliverNChalk OliverNChalk Dec 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pretty sure, it's just either an atomic operation if the runtime isnt parked or a syscall to unpark the runtime (should be a write syscall for our single threaded runtime).

Fairly sure this is where we get to on our sync caller side:

    pub fn wake(self) {
        // The actual wakeup call is delegated through a virtual function call
        // to the implementation which is defined by the executor.

        // Don't call `drop` -- the waker will be consumed by `wake`.
        let this = ManuallyDrop::new(self);

        // SAFETY: This is safe because `Waker::from_raw` is the only way
        // to initialize `wake` and `data` requiring the user to acknowledge
        // that the contract of `RawWaker` is upheld.
        unsafe { (this.waker.vtable.wake)(this.waker.data) };
    }

After this the queued waker runs which is type erased. Most likely this is the wake method on the other side:

    pub(crate) fn wake(&self) -> io::Result<()> {
        // The epoll emulation on some illumos systems currently requires
        // the eventfd to be read before an edge-triggered read event is
        // generated.
        // See https://www.illumos.org/issues/16700.
        #[cfg(target_os = "illumos")]
        self.reset()?;

        let buf: [u8; 8] = 1u64.to_ne_bytes();
        match (&self.fd).write(&buf) {
            Ok(_) => Ok(()),
            Err(ref err) if err.kind() == io::ErrorKind::WouldBlock => {
                // Writing only blocks if the counter is going to overflow.
                // So we'll reset the counter to 0 and wake it again.
                self.reset()?;
                self.wake()
            }
            Err(err) => Err(err),
        }
    }

@OliverNChalk OliverNChalk force-pushed the fix/banking-shutdown-log-spam branch from 35069b9 to e091a3b Compare December 5, 2025 21:28
Copy link

@tao-stones tao-stones left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for sorting it out for both internal and external paths.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants