-
Notifications
You must be signed in to change notification settings - Fork 48.4k
Detect and prevent render starvation, per lane #18864
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
If an update is CPU-bound for longer than expected according to its priority, we assume it's being starved by other work on the main thread. To detect this, we keep track of the elapsed time using a fixed-size array where each slot corresponds to a lane. What we actually store is the event time when the lane first became CPU-bound. Then, when receiving a new update or yielding to the main thread, we check how long each lane has been pending. If the time exceeds a threshold constant corresponding to its priority, we mark it as expired to force it to synchronously finish. We don't want to mistake time elapsed while an update is IO-bound (waiting for data to resolve) for time when it is CPU-bound. So when a lane suspends, we clear its associated event time from the array. When it receives a signal to try again, either a ping or an update, we assign a new event time to restart the clock.
This pull request is automatically built and testable in CodeSandbox. To see build info of the built libraries, click here or the icon next to each commit SHA. Latest deployment of this branch, based on commit fc843d6:
|
Details of bundled changes.Comparing: fb3f0ac...fc843d6 react-dom
ReactDOM: size: 0.0%, gzip: 0.0% Size changes (stable) |
scheduleUpdateOnFiber(current, attemptHydrationAtLane); | ||
// TODO: The event time is used to prevent starvation if a lane takes | ||
// too long to finish rendering. Ideally this would be the time at which | ||
// we started rendering. For spawned work, it's not a big deal, though, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No big deal 😎
priority >= InputContinuousLanePriority ? 1000 : 5000; | ||
if (eventTime + expirationHeuristic <= currentTime) { | ||
// This lane expired | ||
root.expiredLanes |= lane; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are the implications for entangled updates if we only mark a subset of lanes as expired?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Entanglement would happen in getNextLanes
. First we'd detect that a lane had expired, so we'd choose it to work on. Then we'd combine it with the entangled lanes before returning.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense!
@@ -555,7 +596,15 @@ export function higherPriorityLane(a: Lane, b: Lane) { | |||
return a !== NoLane && a < b ? a : b; | |||
} | |||
|
|||
export function markRootUpdated(root: FiberRoot, updateLane: Lane) { | |||
export function createLaneMap<T>(initial: T): LaneMap<T> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why T
instead of a more concrete type? Do we expect to have other types of lane maps in the future?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most of the ones I have in mind are numbers, but not necessarily. Like the interactions map, for example.
while (lanes > 0) { | ||
const index = ctrz(lanes); | ||
|
||
// Assumes event times are monotonically increasing. Last one wins. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, maybe this is me being dumb because it's early, but I'm not sure I understand this comment. Isn't the only assumption this code is making is that eventTime
is positive? What does "last one wins" mean in this context?
Was this comment more applicable to old expiration times implementation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, I meant "first one wins"
The idea is that the times always increase so there's no need to check if an existing time is earlier than the new time. The existing time will always be earlier.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Haha, ok that makes a lot more sense. I even wondered if maybe that's what you meant but discarded it.
while (lanes > 0) { | ||
const index = ctrz(lanes); | ||
|
||
eventTimes[index] = -1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Curious why we use an explicit constant for e.g. NoLane = 0
but we pass around a naked -1
in so many places. Should we use a constant for this too (to make it easier to grep for later if nothing else)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I can do that. It won't add any type safety, unfortunately, but it does make it more obvious.
} | ||
|
||
// Count trailing zeros. Only used on lanes, so assume input is an integer. | ||
function ctrz(lanes) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Why no Flow type?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typing with Lanes won't add type safety because inside this module, Lane is only an alias for number. If it were exported, however, then adding Lanes would prevent someone from passing a number that's not of type Lanes. I'll add it anyway, though, in case this ever gets exported (though I don't think we would).
I originally stored the start time because I thought I could use this in the future to also measure Suspense timeouts. (Event times are currently stored on each update object for this purpose.) But that won't work because in the case of expiration times, we reset the clock whenever the update becomes IO-bound. So to replace the per-update field, I'm going to have to track those on the room separately from expiration times.
* Detect and prevent render starvation, per lane If an update is CPU-bound for longer than expected according to its priority, we assume it's being starved by other work on the main thread. To detect this, we keep track of the elapsed time using a fixed-size array where each slot corresponds to a lane. What we actually store is the event time when the lane first became CPU-bound. Then, when receiving a new update or yielding to the main thread, we check how long each lane has been pending. If the time exceeds a threshold constant corresponding to its priority, we mark it as expired to force it to synchronously finish. We don't want to mistake time elapsed while an update is IO-bound (waiting for data to resolve) for time when it is CPU-bound. So when a lane suspends, we clear its associated event time from the array. When it receives a signal to try again, either a ping or an update, we assign a new event time to restart the clock. * Store as expiration time, not start time I originally stored the start time because I thought I could use this in the future to also measure Suspense timeouts. (Event times are currently stored on each update object for this purpose.) But that won't work because in the case of expiration times, we reset the clock whenever the update becomes IO-bound. So to replace the per-update field, I'm going to have to track those on the room separately from expiration times.
If an update is CPU-bound for longer than expected according to its priority, we assume it's being starved by other work on the main thread.
To detect this, we keep track of the elapsed time using a fixed-size array where each slot corresponds to a lane.
What we actually store is the event time when the lane first became CPU-bound.Changed it to store the expiration time, not the start time. See second commit for explanation.Then, when receiving a new update or yielding to the main thread, we check how long each lane has been pending. If the time exceeds a threshold constant corresponding to its priority, we mark it as expired to force it to synchronously finish, instead of being interrupted again.
We don't want to mistake time elapsed while an update is IO-bound (waiting for data to resolve) for time when it is CPU-bound. So when a lane suspends, we clear its associated event time from the array. When it receives a signal to try again, either a ping or an update, we assign a new event time to restart the clock.