-
Notifications
You must be signed in to change notification settings - Fork 48.8k
[Flight] Track Awaits on I/O as Debug Info #33388
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
450d81d
to
a7ee51b
Compare
packages/react-server/src/__tests__/ReactFlightAsyncDebugInfo-test.js
Outdated
Show resolved
Hide resolved
a187226
to
7bee1d6
Compare
@@ -293,3 +293,18 @@ if (process.env.REACT_CLASS_EQUIVALENCE_TEST) { | |||
return require('internal-test-utils/ReactJSDOM.js'); | |||
}); | |||
} | |||
|
|||
// We mock createHook so that we can automatically clean it up. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is needed because in a jest.resetModules() environments will keep installing a new hook every time it gets reset. Those compound to each one collecting the data independently which runs out of memory.
In a real environment we expect that the module is only initialized once but you’d have to be careful with something like HMR on the server.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It could also be an issue if someone else runs RSC rendering in a jest set up that resets modules without restarting the node runtime.
2cce761
to
a906e38
Compare
}); | ||
|
||
it('can track the start of I/O when no native promise is used', async () => { | ||
function Component() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because the snapshots above can change the line number of these tests, you might have to rerun the snapshots twice when updating them.
This lets us track both the thing we're awaiting and the path that led us to awaiting in the first place.
Track end time of I/O only if we need to
ReactAsyncInfo represents an "await" where as the ReactIOInfo represents when the start of that I/O was requested. Also make sure we don't try to encode debugStack or debugTask.
We'll use this as a heuristic to exclude awaits that happened before this.
…space stack frames
This conveniently ignores our own scheduling tasks.
We're not going to use this.
Just for tests.
…anceTrack We rely on enableComponentPerformanceTrack flag for the timing tracking in general but then we additionally need the async debug info flag to use the async tracking.
a6a71e6
to
dddd08a
Compare
Stacked on #33388. This encodes the I/O entries as their own row type (`"J"`). This makes it possible to parse them directly without first parsing the debug info for each component. E.g. if you're just interested in logging the I/O without all the places it was awaited. This is not strictly necessary since the debug info is also readily available without parsing the actual trees. (That's how the Server Components Performance Track works.) However, we might want to exclude this information in profiling builds while retaining some limited form of I/O tracking. It also allows for logging side-effects that are not awaited if we wanted to.
Stacked on facebook#33388. This encodes the I/O entries as their own row type (`"J"`). This makes it possible to parse them directly without first parsing the debug info for each component. E.g. if you're just interested in logging the I/O without all the places it was awaited. This is not strictly necessary since the debug info is also readily available without parsing the actual trees. (That's how the Server Components Performance Track works.) However, we might want to exclude this information in profiling builds while retaining some limited form of I/O tracking. It also allows for logging side-effects that are not awaited if we wanted to. DiffTrain build for [3fb17d1](facebook@3fb17d1)
Stacked on facebook#33388. This encodes the I/O entries as their own row type (`"J"`). This makes it possible to parse them directly without first parsing the debug info for each component. E.g. if you're just interested in logging the I/O without all the places it was awaited. This is not strictly necessary since the debug info is also readily available without parsing the actual trees. (That's how the Server Components Performance Track works.) However, we might want to exclude this information in profiling builds while retaining some limited form of I/O tracking. It also allows for logging side-effects that are not awaited if we wanted to. DiffTrain build for [3fb17d1](facebook@3fb17d1)
We highly recommend using Node Streams in Node.js because it's much faster and it is less likely to cause issues when chained in things like compression algorithms that need explicit flushing which the Web Streams ecosystem doesn't have a good solution for. However, that said, people want to be able to use the worse option for various reasons. The `.edge` builds aren't technically intended for Node.js. A Node.js environments needs to be patched in various ways to support it. It's also less optimal since it can't use [Node.js exclusive features](#33388) and have to use [the lowest common denominator](#27399) such as JS implementations instead of native. This adds a Web Streams build of Fizz but exclusively for Node.js so that in it we can rely on Node.js modules. The main difference compared to Edge is that SSR now uses `createHash` from the `"crypto"` module and imports `TextEncoder` from `"util"`. We use `setImmediate` instead of `setTimeout`. The public API is just `react-dom/server` which in Node.js automatically imports `react-dom/server.node` which re-exports the legacy bundle, Node Streams bundle and Node Web Streams bundle. The main downside is if your bundler isn't smart to DCE this barrel file. With Flight the difference is larger but that's a bigger lift.
This lets us track what data each Server Component depended on. This will be used by Performance Track and React DevTools.
We use Node.js
async_hooks
. This has a number of downside. It is Node.js specific so this feature is not available in other runtimes until something equivalent becomes available. It's discouraged by Node.js docs. It's also slow which makes this approach only really viable in development mode. At least with stack traces. However, it's really the only solution that gives us the data that we need.The Diagnostic Channel API is not sufficient. Not only is many Node.js built-in APIs missing but all libraries like databases are also missing. Were as
async_hooks
covers pretty much anything async in the Node.js ecosystem.However, even if coverage was wider it's not actually showing the information we want. It's not enough to show the low level I/O that is happening because that doesn't provide the context. We need the stack trace in user space code where it was initiated and where it was awaited. It's also not each low level socket operation that we want to surface but some higher level concept which can span a sequence of I/O operations but as far as user space is concerned.
Therefore this solution is anchored on stack traces and ignore listing to determine what the interesting span is. It is somewhat Promise-centric (and in particular async/await) because it allows us to model an abstract span instead of just random I/O. Async/await points are also especially useful because this allows Async Stacks to show the full sequence which is not supported by random callbacks. However, if no Promises are involved we still to our best to show the stack causing plain I/O callbacks.
Additionally, we don't want to track all possible I/O. For example, side-effects like logging that doesn't affect the rendering performance doesn't need to be included. We only want to include things that actually block the rendering output. We also need to track which data blocks each component so that we can track which data caused a particular subtree to suspend.
We can do this using
async_hooks
because we can track the graph of what resolved what and then spawned what.To track what suspended what, something has to resolve. Therefore it needs to run to completion before we can show what it was suspended on. So something that never resolves, won't be tracked for example.
We use the
async_hooks
inReactFlightServerConfigDebugNode
to build up anReactFlightAsyncSequence
graph that collects the stack traces for basically all I/O and Promises allocated in the whole app. This is pretty heavy, especially the stack traces, but it's because we don't know which ones we'll need until they resolve. We don't materialize the stacks until we need them though.Once they end up pinging the Flight runtime, we collect which current executing task that pinged the runtime and then log the sequence that led up until that runtime into the RSC protocol. Currently we only include things that weren't already resolved before we started rendering this task/component, so that we don't log the entire history each time.
Each operation is split into two parts. First a
ReactIOInfo
which represents an I/O operation and its start/end time. Basically the start point where it was start. This is basically represents where you callednew Promise()
or when entering anasync function
which has an implied Promise. It can be started in a different component than where it's awaited and it can be awaited in multiple places. Therefore this is global information and not associated with a specific Component.The second part is
ReactAsyncInfo
. This represents where this I/O wasawait
:ed or.then()
called. This is associated with a point in the tree (usually the Promise that's a direct child of a Component). Since you can have multiple different I/O awaited in a sequence technically it forms a dependency graph but to simplify the model these awaits as flattened into theReactDebugInfo
list. Basically it contains each await in a sequence that affected this part from unblocking.This means that the same
ReactAsyncInfo
can appear in mutliple components if they all await the sameReactIOInfo
but the same Promise only appears once.Promises that are only resolved by other Promises or immediately are not considered here. Only if they're resolved by an I/O operation. We pick the Promise basically on the border between user space code and ignored listed code (
node_modules
) to pick the most specific span but abstract enough to not give too much detail irrelevant to the current audience. Similarly, the deepestawait
in user space is marked as the relevantawait
point.This feature is only available in the
node
builds of React. Not if you use theedge
builds inside of Node.js.