Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] trace_events: V8 Intrinsic (do not land) #20618

Closed
wants to merge 2 commits into from

Conversation

jasnell
Copy link
Member

@jasnell jasnell commented May 9, 2018

WIP: Do not land

@ofrobots @eugeneo @hashseed @bmeurer @mcollina @nodejs/diagnostics @nodejs/chakracore ...

Here is the current work in progress prototype for a V8 builtin for emitting trace events. This is the first working iteration so it likely has much room for improvement.

There are two builtins installed on extras_binding: IsTraceCategoryEnabled and Trace. These are access by Node.js using context->GetExtrasBindingObject() and exposed to the JS side via process.binding('trace_events').

IsTraceCategoryEnabled(category) is simple enough... it takes a category group string and returns a boolean true or false.

Trace(phase, category, name, id, data) emits the actual trace event. If data is null or undefined, no additional data is emitted with the trace event. Otherwise data must either be a primitive value or an Array of primitive values, which are then serialized out as part of the trace event. A TypeError is thrown if non-primitive values are passed.

A benchmark compares the performance difference between this approach and the existing emit function in node_trace_events.cc. The intrinsics are faster and more flexible than the current functions in node_trace_events.cc but there are some issues to look at.

This would need to be upstreamed into V8 before it can land here. I'm opening this PR only to show the current status and to solicit input on the implementation. Once it's done, this will be my first V8 PR so I'm quite certain there is lots of room for improvement here.

Example use:

const { trace, isTraceCategoryEnabled } = process.binding('trace_events');

isTraceCategoryEnabled('foo');

trace('b'.charCodeAt(0), 'foo', 'name', 123, 'hello world');
trace('b'.charCodeAt(0), 'foo', 'name', 123, 1);
trace('b'.charCodeAt(0), 'foo', 'name', 123, true);
trace('b'.charCodeAt(0), 'foo', 'name', 123, 1.2);
trace('b'.charCodeAt(0), 'foo', 'name', 123, ['hello world', 1, true, 1.2]);
Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • tests and/or benchmarks are included
  • documentation is changed or added
  • commit message follows commit guidelines

@nodejs-github-bot nodejs-github-bot added the lib / src Issues and PRs related to general changes in the lib or src directory. label May 9, 2018
@AndreasMadsen
Copy link
Member

This is awesome. How is the data serialized. As { ..., "args": { "data": data }}?

@mscdex mscdex added the wip Issues and PRs that are still a work in progress. label May 9, 2018
@jasnell
Copy link
Member Author

jasnell commented May 9, 2018

Yeah, the extra data is serialized like...

{..., "args": {"data": "hello world"}}
{..., "args": {"data": 1}}
{..., "args": {"data": true}}
{..., "args": {"data": 1.2||
{..., "args": {"data": {"values": ["hello world", 1, true, 1.2]}}}

the extra "values" is there because of a limitation in the current implementation of v8::tracing::TracedValue. I'll likely include an update to that that would allow more direct serialization of the array without the additional level of depth.

@jasnell
Copy link
Member Author

jasnell commented May 10, 2018

Ok, now that it's working, I'm working on making it faster. Current perf compared to the existing methods in node_trace_events are looking good:

// builtin APIS: trace and isTraceCategoryEnabled
// node_trace_events: emit and categoryGroupEnabled

misc/trace.js method="trace" n=100000: 542,540.8652494889
misc/trace.js method="emit" n=100000: 495,038.2170988715
misc/trace.js method="isTraceCategoryEnabled" n=100000: 4,827,670.0418839
misc/trace.js method="categoryGroupEnabled" n=100000: 1,799,450.9407322581

misc/trace.js method="trace" n=100000: 578,128.2987639398
misc/trace.js method="emit" n=100000: 463,073.0392833426
misc/trace.js method="isTraceCategoryEnabled" n=100000: 4,673,457.792810941
misc/trace.js method="categoryGroupEnabled" n=100000: 1,759,463.0963172326

misc/trace.js method="trace" n=100000: 610,435.0763075894
misc/trace.js method="emit" n=100000: 458,415.3616051197
misc/trace.js method="isTraceCategoryEnabled" n=100000: 4,592,245.983472415
misc/trace.js method="categoryGroupEnabled" n=100000: 1,933,579.0835005299

misc/trace.js method="trace" n=100000: 545,688.9089193098
misc/trace.js method="emit" n=100000: 427,857.19128140854
misc/trace.js method="isTraceCategoryEnabled" n=100000: 4,978,736.315071959
misc/trace.js method="categoryGroupEnabled" n=100000: 1,870,799.5522278687

@jasnell
Copy link
Member Author

jasnell commented May 10, 2018

Unfortunately, in extremely hot paths the intrinsics are still a bottleneck. In a local experiment, adding trace events to EventEmitter#emit to mark the start and end of a span results in a 50-75% loss in performance in the benchmark/events/ee-emit micro-benchmarks... and that's without the tracing category being enabled... Basically, that's the cost of adding a single const trace_enabled = isTraceCategoryEnabled('node.eventemitter') to the start of the emit function plus a couple if (trace_enabled) {} branches.

Granted, the ee-emit.js micro-benchmark is far from being a realistic case, but still...

@devsnek
Copy link
Member

devsnek commented May 10, 2018

it might make sense to use something besides the extras object as v8 wants to deprecate it, and why not upstream all these changes to v8? otherwise this looks pretty cool.

@jasnell
Copy link
Member Author

jasnell commented May 10, 2018

Using the extras object was @hashseed's suggestion :-)

Also, these will be upstreamed to v8... I opened this work-in-progress PR to put the work in context and solicit review. This PR should not land in core as is.

Copy link
Member

@bmeurer bmeurer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good so far. First round of comments.

namespace v8 {
namespace internal {

enum Result { NODATA, SUCCESS, FAILED };
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As per chat discussion: all these helpers should be inside an anonymous namespace.

@@ -0,0 +1,430 @@
// Copyright 2016 the V8 project authors. All rights reserved.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Copyright years.

HandleScope scope(isolate);
Handle<Object> category = args.atOrUndefined(isolate, 1);
Handle<String> string;
ASSIGN_RETURN_FAILURE_ON_EXCEPTION(isolate, string,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ToString comes with potential side effects, which are usually unwanted. As per chat discussion: Let's just throw here, and below if the category is not already a string.


const uint8_t* GetCategoryGroupEnabled(Handle<String> string) {
const uint8_t* category_group_enabled;
if (string->IsOneByteRepresentation()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should definitely use one of the existing abstractions instead of peaking into the string directly. But that can be done when we upstream to V8.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could do it now, but either way works. Any pointer on which of the existing abstractions to use ;-)

Handle<JSValue> object,
uint8_t* arg_type,
uint64_t* arg_value) {
Object* raw = object->value();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about just ToPrimitive for JSValues?

@jasnell
Copy link
Member Author

jasnell commented May 10, 2018

Ok... current numbers are looking even better...

misc/trace.js method="trace" n=100000: 516,925.19310347975
misc/trace.js method="emit" n=100000: 357,798.67015894863
misc/trace.js method="isTraceCategoryEnabled" n=100000: 4,538,007.262989864
misc/trace.js method="categoryGroupEnabled" n=100000: 1,348,183.3573047076

misc/trace.js method="trace" n=100000: 692,459.7010612693
misc/trace.js method="emit" n=100000: 483,000.75229782175
misc/trace.js method="isTraceCategoryEnabled" n=100000: 6,252,794.999364716
misc/trace.js method="categoryGroupEnabled" n=100000: 2,458,368.996072903

misc/trace.js method="trace" n=100000: 619,335.0430035905
misc/trace.js method="emit" n=100000: 502,719.3573058779
misc/trace.js method="isTraceCategoryEnabled" n=100000: 6,994,996.409118594
misc/trace.js method="categoryGroupEnabled" n=100000: 2,026,069.1480786472

One side note (/cc @AndreasMadsen and @mafintosh) ... moving to this would change the argument syntax for things like the async hooks trace events emitted by internal/trace_events_async_hooks.js. We would need to decide if we want to keep the variable-number-of-arguments syntax, which could have an impact on the optimization path.

@AndreasMadsen
Copy link
Member

We would need to decide if we want to keep the variable-number-of-arguments syntax, which could have an impact on the optimization path.

Why is it the array is the most performant? And could we not implement flat objects with something like:

trace(phase, category, name, id, keys, values);
trace('b', 'node.async_hooks', 'example', 0, ['executeAsyncId', 'triggerAsyncId'], [1, 2]);

Which would output:

args: {
  executeAsyncId: 1,
  triggerAsyncId: 2
}

The { data: Array } is kinda itchy as it always a bad option from an API backward compatibility perspective.

@jasnell
Copy link
Member Author

jasnell commented May 11, 2018

Not really a fan of the ['name1', 'name2'], ['val1', 'val2']` pattern from an API ergonomics point of view.

I just added a commit that adds basic support for an object passed in as data ... e.g.

trace('e'.charCodeAt(0), 'foo', 'name', 0, {a: 12, b: "hello"})

Would produce:

{..., "args":{"data":{"a":12,"b":"hello"}}}

Not as flat, but better ergonomically than just the Array option.

@AndreasMadsen
Copy link
Member

Not really a fan of the ['name1', 'name2'], ['val1', 'val2']` pattern from an API ergonomics point of view.

It was more from the consideration that I would be more type consistent and thus perhaps easier to optimize. Making a friendly API on top would then be trivial. However, if a classical object is not an issue that is preferable.

@jasnell
Copy link
Member Author

jasnell commented May 11, 2018

... would be more type consistent

Yep, that's definitely a consideration. @bmeurer ... which pattern would make this easiest to optimize, or will it matter? The fast path here keys off whether category is enabled or not.

@jasnell
Copy link
Member Author

jasnell commented May 11, 2018

Cleaned up the code a bit and added the ability to get the data from a function that will only be called if the category is enabled... e.g.

trace('e'.charCodeAt(0), 'foo', 'name', 0, () => "construct expensive trace data here.");

@bmeurer
Copy link
Member

bmeurer commented May 14, 2018

@jasnell I'd highly recommend to stick to objects for ergonomic reasons. I doubt that the performance impact is significant.

This commit will be dropped once this lands upstream. It will be
replaced with a backport commit from the upstream
@hashseed
Copy link
Member

Finally had a chance to take look at the V8 part of this change.

In general this looks pretty good.

->GetChars());
} else {
data_ = nullptr;
Local<v8::String> local = Utils::ToLocal(string);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would not do this round trip through the V8 API. Also one-byte strings can include Latin1 characters, which are not UTF8 encoding?

@jasnell
Copy link
Member Author

jasnell commented May 31, 2018

Awesome, thank you @hashseed ... I was going to open the v8 pr on monday this week but had some other bits come up. I'll be trying to get that opened by end of the day tomorrow and we can work through whatever code changes are necessary on the v8 side.

@jasnell
Copy link
Member Author

jasnell commented Jun 29, 2018

Status update on this... the PR in V8 is getting closer to landing (https://chromium-review.googlesource.com/c/v8/v8/+/1103294) ... once it does, I will update this PR with a backport of the commits that land there and the additional pieces on top.

@jasnell
Copy link
Member Author

jasnell commented Jul 20, 2018

Superseded by #21899

@jasnell jasnell closed this Jul 20, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
lib / src Issues and PRs related to general changes in the lib or src directory. wip Issues and PRs that are still a work in progress.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants