-
Notifications
You must be signed in to change notification settings - Fork 29.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
src: register external references for source code #47055
Conversation
Review requested:
|
OK, PTAL. The big change is that we now only have one map by moving the complexity into |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks much nicer, thanks, a couple of remaining suggestions, but over all looks good
Thanks for the comments! They should all be addressed now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks
Can someone please re-run the failing Jenkins build for me? I want to get this merged in, it's been languishing for a while |
Sorry for the delay. Adding to my watch list to land. |
Landed in 0736d0b |
Currently we use external strings for internalized builtin source code. However when a snapshot is taken, any external string whose resource is not registered is flattened into a SeqString (see ref). The result is that module source code stored in the snapshot does not use external strings after deserialization. This patch registers an external string resource for each internalized builtin's source. The savings are substantial: ~1.9 MB of heap memory per isolate, or ~44% of an otherwise empty isolate's heap usage: ```console $ node --expose-gc -p 'gc(),process.memoryUsage().heapUsed' 4190968 $ ./node --expose-gc -p 'gc(),process.memoryUsage().heapUsed' 2327536 ``` The savings can be even higher for user snapshots which may include more internal modules. The existing UnionBytes implementation was ill-suited, because it only created an external string resource when ToStringChecked was called, but we need to register the external string resources before the isolate even exists. We change UnionBytes to no longer own the data, and shift ownership of the data to a new external resource class called StaticExternalByteResource. StaticExternalByteResource are either statically allocated (for internalized builtin code) or owned by the static `externalized_builtin_sources` map, so they will only be destructed when static resources are destructed. We change JS2C to emit statements to register a string resource for each internalized builtin. Refs: https://github.com/v8/v8/blob/d2c8fbe9ccd1a6ce5591bb7dd319c3c00d6bf489/src/snapshot/serializer.cc#L633 PR-URL: #47055 Reviewed-By: Joyee Cheung <joyeec9h3@gmail.com> Reviewed-By: Chengzhong Wu <legendecas@gmail.com>
} | ||
// OK to get the raw pointer, since externalized_builtin_sources owns |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kvakil Are you sure that v8::String::ExternalStringResourceBase
doesn't expect to own the allocation? I have another PR that is affected by this and in my case v8::String::ExternalStringResourceBase::Dispose
calls free
on the received string when destroying the isolate?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mmomtchev this is V8's documentation/default implementation of v8::String::ExternalStringResourceBase::Dispose
:
/**
* Internally V8 will call this Dispose method when the external string
* resource is no longer needed. The default implementation will use the
* delete operator. This method can be overridden in subclasses to
* control how allocated external string resources are disposed.
*/
virtual void Dispose() { delete this; }
Since this diff has StaticExternalByteResource
override Dispose
to be empty, we never execute delete this
:
void Dispose() override {
// We ignore Dispose calls from V8, even if we "own" a resource via
// owning_ptr_. All instantiations of this class are static or owned by a
// static map, and will be destructed when static variables are destructed.
}
so I believe this is correct.
Currently we use external strings for internalized builtin source code. However when a snapshot is taken, any external string whose resource is not registered is flattened into a SeqString (see ref). The result is that any module source code stored in the snapshot does not use external strings after deserialization. This patch registers an external string resource for each internalized builtin's source. The savings are substantial: ~1.9 MB of heap memory per isolate, or ~44% of an otherwise empty isolate's heap usage:
The savings can be even higher for user snapshots which may include more internal modules.
Doing this with the existing UnionBytes abstraction was tricky, because UnionBytes only creates an external string resource when ToStringChecked is called. We change UnionBytes to no longer own the data, and shift ownership of the data (in the case where the data is not static) to a new external resource class. We can then register any resource whose data is static, since we know those are precisely the internalized builtins.
Refs: https://github.com/v8/v8/blob/d2c8fbe9ccd1a6ce5591bb7dd319c3c00d6bf489/src/snapshot/serializer.cc#L633