
feat: add bindings to run microtasks from Isolate #2793

Merged · 10 commits · Sep 12, 2019

Conversation

bartlomieju (Member)

No description provided.

@bartlomieju bartlomieju force-pushed the feat-explicit_microtask_policy branch from fb57c54 to 838e811 Compare August 19, 2019 19:59
core/libdeno/deno.h (outdated, resolved)
core/isolate.rs (outdated, resolved)
core/isolate.rs (outdated)
@@ -690,6 +695,7 @@ impl Future for Isolate {
drop(locker);
}

self.run_microtasks()?;
(Member)

Might need to be inside a locker scope.

@afinch7 (Contributor) — Aug 20, 2019

It shouldn't. The lock is already acquired within libdeno::deno_run_microtasks. We only need it for respond, because we don't create a lock in libdeno::deno_respond. It might make sense to remove LockerScope and just add a lock to the async part of libdeno::deno_respond (we shouldn't need it for sync responses).

core/libdeno/api.cc (outdated, resolved)
@bartlomieju (Member, Author)

Results

This PR

13:41 $ third_party/wrk/mac/wrk -d 10s --latency http://127.0.0.1:4555
Running 10s test @ http://127.0.0.1:4555
  2 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   204.46us  465.60us  10.14ms   95.44%
    Req/Sec    38.68k     3.56k   42.98k    89.60%
  Latency Distribution
     50%  111.00us
     75%  117.00us
     90%  168.00us
     99%    2.67ms
  777553 requests in 10.10s, 37.82MB read
Requests/sec:  76974.46
Transfer/sec:      3.74MB

Latest release:

13:47 $ third_party/wrk/mac/wrk -d 10s --latency http://127.0.0.1:4555
Running 10s test @ http://127.0.0.1:4555
  2 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   183.57us  372.93us   5.53ms   95.52%
    Req/Sec    40.72k     2.25k   44.34k    75.25%
  Latency Distribution
     50%  106.00us
     75%  111.00us
     90%  157.00us
     99%    2.46ms
  818606 requests in 10.10s, 39.81MB read
Requests/sec:  81050.03
Transfer/sec:      3.94MB

I'll profile V8 as well in the evening.

@bartlomieju (Member, Author)

With check_last_exception removed, the results are more promising:

third_party/wrk/mac/wrk -d 10s --latency http://127.0.0.1:4555
Running 10s test @ http://127.0.0.1:4555
  2 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   192.76us  444.83us  12.51ms   95.73%
    Req/Sec    39.87k     2.52k   43.02k    87.00%
  Latency Distribution
     50%  109.00us
     75%  114.00us
     90%  149.00us
     99%    2.55ms
  793274 requests in 10.00s, 38.58MB read
Requests/sec:  79319.90
Transfer/sec:      3.86MB

@bartlomieju (Member, Author)

I'm not sure about this PR anymore... I still can't get the tests to pass even though I tried calling run_microtasks after each dynamic import operation, and deno_tcp still has a lower score than on the master branch. We should get a better perf increase by addressing #2758, and this PR introduces another bit of subtle complexity. Opinions @ry @afinch7?

@bartlomieju (Member, Author)

I managed to figure out the problems 😅 I feel a bit ashamed that I couldn't figure it out before.

Here are the benchmarks:

master

"req_per_sec": {
    "deno_tcp": 49435, 
    "deno_core_single": 57750, 
    "node_proxy_tcp": 21056, 
    "deno_proxy_tcp": 17669, 
    "deno_core_multi": 60796, 
    "deno_tcp_current_thread": 44114, 
    "hyper": 57055, 
    "node_tcp": 59013, 
    "deno_http": 18755, 
    "node_http": 23919, 
    "deno_proxy": 1004, 
    "node_proxy": 2785
  }, 
  "max_latency": {
    "deno_tcp": 8.7, 
    "deno_core_single": 0.359, 
    "node_proxy_tcp": 1.12, 
    "deno_proxy_tcp": 5.22, 
    "deno_core_multi": 1.15, 
    "deno_tcp_current_thread": 8.91, 
    "hyper": 3.19, 
    "node_tcp": 0.543, 
    "deno_http": 2.63, 
    "node_http": 1.97, 
    "deno_proxy": 17.71, 
    "node_proxy": 9.95
  }, 

this PR

  "req_per_sec": {
    "deno_tcp": 46789, 
    "deno_core_single": 48533, 
    "node_proxy_tcp": 19905, 
    "deno_proxy_tcp": 16999, 
    "deno_core_multi": 50461, 
    "deno_tcp_current_thread": 44352, 
    "hyper": 59689, 
    "node_tcp": 59066, 
    "deno_http": 13950, 
    "node_http": 23674, 
    "deno_proxy": 970, 
    "node_proxy": 2641
  }, 
  "max_latency": {
    "deno_tcp": 8.72, 
    "deno_core_single": 0.41, 
    "node_proxy_tcp": 1.19, 
    "deno_proxy_tcp": 5.66, 
    "deno_core_multi": 0.89, 
    "deno_tcp_current_thread": 8.98, 
    "hyper": 0.88, 
    "node_tcp": 0.529, 
    "deno_http": 2.89, 
    "node_http": 1.78, 
    "deno_proxy": 17.93, 
    "node_proxy": 9.97
  } 

@bartlomieju (Member, Author) — Sep 3, 2019

Flamegraphs:

master: deno_tcp_master (flamegraph image)

this PR: deno_tcp (flamegraph image)

@ry (Member) — Sep 4, 2019

@bartlomieju so given the fix in #2858, is this an improvement?

@bartlomieju (Member, Author)

@ry I just checked benchmark:

third_party/wrk/mac/wrk -d 10s --latency http://127.0.0.1:4544
Running 10s test @ http://127.0.0.1:4544
  2 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   301.58us    0.94ms  21.50ms   96.02%
    Req/Sec    31.80k     5.71k   41.25k    62.87%
  Latency Distribution
     50%  124.00us
     75%  156.00us
     90%  268.00us
     99%    4.69ms
  639117 requests in 10.10s, 31.08MB read
Requests/sec:  63266.79
Transfer/sec:      3.08MB

That's significantly slower; I'd assume microtasks are now run too rarely.
I still want to experiment a bit with running microtasks only in libdeno code. I'll get back to you.

@bartlomieju (Member, Author)

@ry I also removed the calls to LockerScope as suggested by @afinch7 here, and moved the calls to RunMicrotasks into deno_respond.

Performance looks comparable to the master branch, so I'm not entirely sure microtasks are the bottleneck in our performance problems.

@bartlomieju (Member, Author) — Sep 5, 2019

Latest benchmarks

v0.17.0

  "req_per_sec": {
    "deno_tcp": 50392, 
    "deno_core_single": 51854, 
    "node_proxy_tcp": 18498, 
    "deno_proxy_tcp": 17372, 
    "deno_core_multi": 57736, 
    "deno_tcp_current_thread": 48354, 
    "hyper": 60346, 
    "node_tcp": 50332, 
    "deno_http": 17544, 
    "node_http": 19701, 
    "deno_proxy": 984, 
    "node_proxy": 2546
  }, 
  "max_latency": {
    "deno_tcp": 9.55, 
    "deno_core_single": 0.401, 
    "node_proxy_tcp": 1.14, 
    "deno_proxy_tcp": 6.17, 
    "deno_core_multi": 1.23, 
    "deno_tcp_current_thread": 10.05, 
    "hyper": 0.776, 
    "node_tcp": 0.618, 
    "deno_http": 2.95, 
    "node_http": 2.22, 
    "deno_proxy": 18.16, 
    "node_proxy": 10.43
  }, 

this PR

  "req_per_sec": {
    "deno_tcp": 51200, 
    "deno_core_single": 50504, 
    "node_proxy_tcp": 20413, 
    "deno_proxy_tcp": 18359, 
    "deno_core_multi": 57639, 
    "deno_tcp_current_thread": 47770, 
    "hyper": 60629, 
    "node_tcp": 59573, 
    "deno_http": 18560, 
    "node_http": 24048, 
    "deno_proxy": 959, 
    "node_proxy": 2746
  }, 
  "max_latency": {
    "deno_tcp": 8.75, 
    "deno_core_single": 0.455, 
    "node_proxy_tcp": 1.05, 
    "deno_proxy_tcp": 5.66, 
    "deno_core_multi": 1.18, 
    "deno_tcp_current_thread": 9.29, 
    "hyper": 0.732, 
    "node_tcp": 0.544, 
    "deno_http": 2.7, 
    "node_http": 2.04, 
    "deno_proxy": 18.0, 
    "node_proxy": 9.99
  }, 

Side note: the benchmarks fluctuate a lot; maybe we should run each benchmark 3-5 times and average the results to smooth out the charts?

core/isolate.rs Outdated
}

// self.run_microtasks();
@ry (Member) — Sep 5, 2019

You're not calling run_microtasks... I would think this would cause test failures.

@bartlomieju (Member, Author) — Sep 5, 2019

@ry I call isolate->RunMicrotasks() in libdeno code. Those commented bits need to be removed.

@@ -198,6 +200,7 @@ void deno_respond(Deno* d_, void* user_data, deno_op_id op_id, deno_buf buf) {
}

auto v = recv_->Call(context, context->Global(), argc, args);
d->isolate_->RunMicrotasks();
(Member)

I think this is the same as kAuto... I benchmarked it and there's no change

master

~/src/deno> wrk http://127.0.0.1:4500/
Running 10s test @ http://127.0.0.1:4500/
  2 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   291.26us  494.34us  11.83ms   96.74%
    Req/Sec    20.33k     2.10k   32.32k    82.50%
  404493 requests in 10.00s, 19.67MB read
Requests/sec:  40447.21
Transfer/sec:      1.97MB
~/src/deno> wrk http://127.0.0.1:4500/
Running 10s test @ http://127.0.0.1:4500/
  2 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   272.21us  386.03us   6.74ms   97.01%
    Req/Sec    20.61k     1.59k   24.46k    68.50%
  410139 requests in 10.00s, 19.95MB read
Requests/sec:  41001.46
Transfer/sec:      1.99MB

this branch

~/src/deno> wrk http://127.0.0.1:4500/
Running 10s test @ http://127.0.0.1:4500/
  2 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   272.63us  402.47us   6.44ms   97.03%
    Req/Sec    20.81k     1.43k   28.58k    83.17%
  418073 requests in 10.10s, 20.33MB read
Requests/sec:  41393.01
Transfer/sec:      2.01MB
~/src/deno> wrk http://127.0.0.1:4500/
Running 10s test @ http://127.0.0.1:4500/
  2 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   271.93us  400.53us   6.75ms   97.05%
    Req/Sec    20.84k     1.22k   26.85k    80.69%
  418670 requests in 10.10s, 20.36MB read
Requests/sec:  41454.11
Transfer/sec:      2.02MB

(Member, Author)

That's my experience as well; calling run_microtasks inside Isolate::poll gives about 10-15% fewer req/sec.

@bartlomieju (Member, Author)

Ping @ry @piscisaureus do you have any thoughts about this PR?

There's no perf gain from this PR and it introduces additional complexity, so IMHO it's not really worth landing.

@ry (Member) — Sep 12, 2019

@bartlomieju I'd like to have the libdeno binding available for future experimentation, but leave it in kAuto for now. I'd be happy to land that.

@bartlomieju (Member, Author)

> @bartlomieju I'd like to have the libdeno binding available for future experimentation, but left it in kAuto for now. I'd be happy to land that.

Sure, I'll update the PR

@bartlomieju bartlomieju force-pushed the feat-explicit_microtask_policy branch from ef53cf0 to 2fc6aa5 Compare September 12, 2019 16:36
@bartlomieju bartlomieju changed the title feat: use v8::MicrotaskPolicy::kExplicit feat: add bindings to run microtasks from Isolate Sep 12, 2019
@ry (Member) left a comment:

LGTM - thanks - I’m sure this will come in handy soon.

@ry ry merged commit c03cdcc into denoland:master Sep 12, 2019
3 participants