-
Notifications
You must be signed in to change notification settings - Fork 29.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[v8.x-backport] n-api: improve runtime perf of n-api func call #21733
[v8.x-backport] n-api: improve runtime perf of n-api func call #21733
Conversation
a6299d6
to
90cde52
Compare
@gabrielschulhof I assume this did not land cleanly so a merge was required. Is there a better way to review than looking at all of the changes. Hopeing there is a smaller subset that needs a review. |
@mhdawson I was actually proactive about this one, because it's performance related. We definitely want this backported, and it didn't land cleanly. I may have to rebase because these other commits are definitely not intended to be part of the backport. |
Added a new struct CallbackBundle to eliminate all GetInternalField() calls. The principle is to store all required data inside a C++ struct, and then store the pointer in the JavaScript object. Before this change, the required data are stored in the JavaScript object in 3 or 4 seperate pointers. For every napi fun call, 3 of them have to be fetched out, which are 3 GetInternalField() calls; after this change, the C++ struct will be directly fetched out by using v8::External::Value(), which is faster. Profiling data show that GetInternalField() is slow. On an i7-4770K (3.50GHz) box, a C++ V8-binding fun call is 8 ns, before this change, napi fun call is 36 ns; after this change, napi fun call is 20 ns. The above data are measured using a modified benchmark in 'benchmark/misc/function_call'. The modification adds an indicator of the average time of a "chatty" napi fun call (max 50M runs). This change will speed up chatty case 1.8x (overall), and will cut down the delay of napi mechanism to approx. 0.5x. Background: a simple C++ binding function (e.g. receiving little from JS, doing little and returning little to JS) is called 'chatty' case for JS<-->C++ fun call routine. This improvement also applies to getter/setter fun calls. PR-URL: nodejs#21072 Reviewed-By: Anna Henningsen <anna@addaleax.net> Reviewed-By: Gabriel Schulhof <gabriel.schulhof@intel.com>
76a237c
to
9a3cd9e
Compare
@mhdawson rebased – the delta is much smaller now. |
Added a new struct CallbackBundle to eliminate all GetInternalField() calls. The principle is to store all required data inside a C++ struct, and then store the pointer in the JavaScript object. Before this change, the required data are stored in the JavaScript object in 3 or 4 seperate pointers. For every napi fun call, 3 of them have to be fetched out, which are 3 GetInternalField() calls; after this change, the C++ struct will be directly fetched out by using v8::External::Value(), which is faster. Profiling data show that GetInternalField() is slow. On an i7-4770K (3.50GHz) box, a C++ V8-binding fun call is 8 ns, before this change, napi fun call is 36 ns; after this change, napi fun call is 20 ns. The above data are measured using a modified benchmark in 'benchmark/misc/function_call'. The modification adds an indicator of the average time of a "chatty" napi fun call (max 50M runs). This change will speed up chatty case 1.8x (overall), and will cut down the delay of napi mechanism to approx. 0.5x. Background: a simple C++ binding function (e.g. receiving little from JS, doing little and returning little to JS) is called 'chatty' case for JS<-->C++ fun call routine. This improvement also applies to getter/setter fun calls. Backport-PR-URL: #21733 PR-URL: #21072 Reviewed-By: Anna Henningsen <anna@addaleax.net> Reviewed-By: Gabriel Schulhof <gabriel.schulhof@intel.com>
landing in 5026ab4 |
Added a new struct CallbackBundle to eliminate all GetInternalField() calls. The principle is to store all required data inside a C++ struct, and then store the pointer in the JavaScript object. Before this change, the required data are stored in the JavaScript object in 3 or 4 seperate pointers. For every napi fun call, 3 of them have to be fetched out, which are 3 GetInternalField() calls; after this change, the C++ struct will be directly fetched out by using v8::External::Value(), which is faster. Profiling data show that GetInternalField() is slow. On an i7-4770K (3.50GHz) box, a C++ V8-binding fun call is 8 ns, before this change, napi fun call is 36 ns; after this change, napi fun call is 20 ns. The above data are measured using a modified benchmark in 'benchmark/misc/function_call'. The modification adds an indicator of the average time of a "chatty" napi fun call (max 50M runs). This change will speed up chatty case 1.8x (overall), and will cut down the delay of napi mechanism to approx. 0.5x. Background: a simple C++ binding function (e.g. receiving little from JS, doing little and returning little to JS) is called 'chatty' case for JS<-->C++ fun call routine. This improvement also applies to getter/setter fun calls. Backport-PR-URL: #21733 PR-URL: #21072 Reviewed-By: Anna Henningsen <anna@addaleax.net> Reviewed-By: Gabriel Schulhof <gabriel.schulhof@intel.com>
Added a new struct CallbackBundle to eliminate all
GetInternalField() calls.
The principle is to store all required data inside a C++ struct,
and then store the pointer in the JavaScript object. Before this
change, the required data are stored in the JavaScript object in
3 or 4 seperate pointers. For every napi fun call, 3 of them
have to be fetched out, which are 3 GetInternalField() calls;
after this change, the C++ struct will be directly fetched out
by using v8::External::Value(), which is faster.
Profiling data show that GetInternalField() is slow.
On an i7-4770K (3.50GHz) box, a C++ V8-binding fun call is 8 ns,
before this change, napi fun call is 36 ns; after this change,
napi fun call is 20 ns.
The above data are measured using a modified benchmark in
'benchmark/misc/function_call'. The modification adds an indicator
of the average time of a "chatty" napi fun call (max 50M runs).
This change will speed up chatty case 1.8x (overall), and will cut
down the delay of napi mechanism to approx. 0.5x.
Background: a simple C++ binding function (e.g. receiving little
from JS, doing little and returning little to JS) is called
'chatty' case for JS<-->C++ fun call routine.
This improvement also applies to getter/setter fun calls.
PR-URL: #21072
Checklist
make -j4 test
(UNIX), orvcbuild test
(Windows) passes