benchmark: let refactor #28867
In the benchmark/url directory, this changes `for` loops using `var` to `let` where it applies, for consistency.
In the benchmark/util directory, this changes `for` loops using `var` to `let` where it applies, for consistency.
In the benchmark/buffers directory, this changes `for` loops using `var` to `let` where it applies, for consistency.
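For illustration, this is the shape of the change applied throughout these directories (a hypothetical loop, not taken from any specific benchmark file):

```js
const n = 4;
const fn = (i) => console.log(i);

// Before: the counter is declared with function-scoped `var`,
// so `i` remains visible after the loop ends.
for (var i = 0; i < n; i++) {
  fn(i);
}

// After: `let` scopes the counter to the loop itself; for simple
// counting loops like this one, the behavior is identical.
for (let j = 0; j < n; j++) {
  fn(j);
}
```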
LGTM
Hopefully I did this right, but here's a CI job that runs the benchmark tests (skipped in our usual CI): https://ci.nodejs.org/view/All/job/node-test-commit-custom-suites-freestyle/8280/
Before landing this, we'll want to do a few comparison benchmark runs on the same builds to check whether there is still a significant performance difference between using `var` and `let`.
lgtm
Benchmark comparisons against master:
- buffer: https://ci.nodejs.org/view/Node.js%20benchmark/job/benchmark-node-micro-benchmarks/411/
- util: https://ci.nodejs.org/view/Node.js%20benchmark/job/benchmark-node-micro-benchmarks/412/ (queued; won't exist until the one above finishes)
- url: https://ci.nodejs.org/view/Node.js%20benchmark/job/benchmark-node-micro-benchmarks/413/ (queued; won't exist until the two above finish)
LGTM if the benchmark comparisons against master are OK.
Note that the benchmark results are sometimes demonstrably erroneous (see nodejs/build#1793), so we'll want to confirm that any significant differences aren't simply that.
Buffer benchmark results (which, again, can be erroneous, so we will need to confirm by running a benchmark on CI with identical binaries, which I'll set up) show some statistically significant performance changes in both directions:

buffers/buffer-indexof.js n=50000 type='buffer' encoding='ucs2' search='aaaaaaaaaaaaaaaaa' *** 7.07 % ±3.88% ±5.17% ±6.74%
buffers/buffer-creation.js n=600000 len=1024 type='slow-allocUnsafe' ** 9.44 % ±6.82% ±9.11% ±11.92%
buffers/buffer-from.js n=800000 len=100 source='buffer' ** 6.79 % ±4.55% ±6.06% ±7.89%
buffers/buffer-read-with-byteLength.js byteLength=3 n=1000000 type='IntBE' buffer='fast' ** 15.17 % ±10.67% ±14.32% ±18.89%
buffers/buffer-tostring.js n=1000000 len=1024 args=1 encoding='ascii' ** -1.61 % ±1.11% ±1.48% ±1.93%
buffers/buffer-tostring.js n=1000000 len=1024 args=3 encoding='ascii' ** -1.80 % ±1.31% ±1.75% ±2.27%
buffers/buffer-compare-instance-method.js n=1000000 args=5 size=16 * -1.96 % ±1.57% ±2.08% ±2.71%
buffers/buffer-compare-instance-method.js n=1000000 args=5 size=16386 * -5.19 % ±4.24% ±5.69% ±7.49%
buffers/buffer-compare-offset.js n=1000000 size=512 method='slice' * -2.11 % ±1.75% ±2.33% ±3.03%
buffers/buffer-concat.js n=800000 withTotalLength=1 pieceSize=256 pieces=16 * -16.64 % ±13.98% ±18.60% ±24.21%
buffers/buffer-indexof.js n=50000 type='buffer' encoding='ucs2' search='Gryphon' * -4.24 % ±3.33% ±4.45% ±5.84%
buffers/buffer-indexof.js n=50000 type='buffer' encoding='ucs2' search='venture to go near the house till she had brought herself down to' * 3.34 % ±2.56% ±3.44% ±4.52%
buffers/buffer-indexof.js n=50000 type='buffer' encoding='utf8' search='SQ' * 4.88 % ±4.33% ±5.76% ±7.50%
buffers/buffer-indexof.js n=50000 type='string' encoding='ucs2' search='Gryphon' * -6.29 % ±5.75% ±7.74% ±10.24%
buffers/buffer-iterate.js n=1000 method='for' type='fast' size=16386 * -5.80 % ±4.96% ±6.60% ±8.60%
buffers/buffer-normalize-encoding.js n=1000000 encoding='utf-16le' * -6.39 % ±5.28% ±7.04% ±9.20%
buffers/buffer-swap.js n=1000000 len=768 method='swap32' aligned='true' * -8.25 % ±7.05% ±9.50% ±12.61%
buffers/buffer-tostring.js n=1000000 len=1 args=1 encoding='hex' * 1.98 % ±1.96% ±2.61% ±3.41%
buffers/buffer-tostring.js n=1000000 len=64 args=1 encoding='hex' * -1.58 % ±1.39% ±1.86% ±2.41%
buffers/buffer-tostring.js n=1000000 len=64 args=3 encoding='hex' * 1.79 % ±1.55% ±2.06% ±2.69%
buffers/buffer-write-string.js n=1000000 len=2048 args='offset' encoding='ascii' * -5.20 % ±4.70% ±6.32% ±8.35%
buffers/buffer-write-string.js n=1000000 len=2048 args='offset' encoding='latin1' * -5.20 % ±4.25% ±5.70% ±7.51%
buffers/buffer-write-string.js n=1000000 len=2048 args='offset+length' encoding='' * 6.13 % ±5.50% ±7.41% ±9.83%
buffers/dataview-set.js n=1000000 type='Uint32LE' * -6.12 % ±5.60% ±7.46% ±9.72%
Validation run for the buffer benchmarks (comparing two builds of the same code base, so any significant differences indicate a problem in CI or in the benchmarks themselves rather than an actual performance difference): https://ci.nodejs.org/view/Node.js%20benchmark/job/benchmark-node-micro-benchmarks/414/
Util benchmark results that show statistically significant changes. As with the buffer results, they will need to be validated.

util/inspect.js option='colors' method='Array' n=20000 * 2.22 % ±1.95% ±2.60% ±3.39%
util/inspect.js option='showHidden' method='Object' n=20000 * -5.56 % ±5.06% ±6.74% ±8.79%
util/inspect.js option='showHidden' method='TypedArray' n=20000 * 4.23 % ±3.79% ±5.07% ±6.65%
util/normalize-encoding.js n=100000 input='base64' ** 8.52 % ±6.28% ±8.38% ±10.97%
util/normalize-encoding.js n=100000 input='group_misc' * -5.49 % ±5.01% ±6.67% ±8.69%
util/type-check.js n=100000 argument='false-primitive' version='native' type='Uint8Array' ** -6.23 % ±4.66% ±6.22% ±8.13%
Validation run for the util benchmarks (comparing two builds of the same code base, so any significant differences indicate a problem in CI or in the benchmarks themselves rather than an actual performance difference): https://ci.nodejs.org/view/Node.js%20benchmark/job/benchmark-node-micro-benchmarks/415/
URL benchmark results that show statistically significant changes:

19:03:40 url/legacy-vs-whatwg-url-parse.js method='legacy' e=1 type='file' withBase='false' ** -9.95 % ±6.57% ±8.74% ±11.38%
19:03:40 url/legacy-vs-whatwg-url-parse.js method='legacy' e=1 type='percent' withBase='false' * -9.46 % ±8.22% ±10.96% ±14.30%
19:03:40 url/legacy-vs-whatwg-url-searchparams-parse.js n=1000000 method='legacy' searchParam='altspaces' * 2.13 % ±2.13% ±2.84% ±3.69%
19:03:40 url/legacy-vs-whatwg-url-searchparams-serialize.js n=1000000 method='legacy' searchParam='noencode' * 4.98 % ±4.47% ±6.01% ±7.93%
19:03:40 url/url-resolve.js n=100000 path='withscheme' href='javascript' * 5.85 % ±5.29% ±7.09% ±9.32%
19:03:40 url/whatwg-url-properties.js prop='pathname' e=1 type='wpt' withBase='true' * -8.04 % ±7.54% ±10.03% ±13.06%
Welp... running buffer benchmarks on CI using two identical code bases yields statistically significant differences. There is something really wrong (and has been for a while) with either the CI benchmark machine or the benchmarks themselves.

22:59:52 buffers/buffer-compare-offset.js n=1000000 size=16386 method='offset' * -1.93 % ±1.83% ±2.43% ±3.17%
22:59:52 buffers/buffer-fill.js n=20000 size=65536 type='fill(400)' * 4.22 % ±3.52% ±4.68% ±6.10%
22:59:52 buffers/buffer-from.js n=800000 len=2048 source='buffer' * 21.66 % ±16.63% ±22.12% ±28.79%
22:59:52 buffers/buffer-indexof.js n=50000 type='buffer' encoding='ucs2' search='neighbouring pool' ** -2.66 % ±1.89% ±2.52% ±3.29%
22:59:52 buffers/buffer-iterate.js n=1000 method='iterator' type='fast' size=4096 * 7.92 % ±6.27% ±8.36% ±10.90%
22:59:52 buffers/buffer-read.js n=1000000 type='BigUInt64BE' buffer='fast' * 4.69 % ±3.74% ±4.98% ±6.49%
22:59:52 buffers/buffer-read.js n=1000000 type='UInt32BE' buffer='fast' * -5.54 % ±4.88% ±6.50% ±8.48%
22:59:52 buffers/buffer-read-with-byteLength.js byteLength=4 n=1000000 type='UIntLE' buffer='fast' * -6.11 % ±5.55% ±7.38% ±9.61%
22:59:52 buffers/buffer-swap.js n=1000000 len=2056 method='swap16' aligned='false' *** 1.31 % ±0.31% ±0.41% ±0.55%
22:59:52 buffers/buffer-swap.js n=1000000 len=2056 method='swap64' aligned='false' *** 6.99 % ±3.22% ±4.30% ±5.64%
22:59:52 buffers/buffer-tostring.js n=1000000 len=1024 args=1 encoding='ascii' ** -1.70 % ±1.21% ±1.61% ±2.10%
22:59:52 buffers/buffer-tostring.js n=1000000 len=1 args=1 encoding='UCS-2' * -2.61 % ±2.20% ±2.93% ±3.82%
22:59:52 buffers/buffer-write.js n=1000000 type='BigInt64BE' buffer='fast' * -5.06 % ±5.01% ±6.74% ±8.91%
22:59:53 buffers/buffer-write-string.js n=1000000 len=2048 args='' encoding='hex' * 2.08 % ±2.04% ±2.72% ±3.56%
22:59:53 buffers/buffer-write-string.js n=1000000 len=2048 args='offset' encoding='ascii' ** -6.90 % ±5.13% ±6.86% ±8.99%
22:59:53 buffers/buffer-write-string.js n=1000000 len=2048 args='offset' encoding='latin1' *** -4.48 % ±1.89% ±2.51% ±3.27%
22:59:53 buffers/buffer-zero.js type='string' n=1000000 * -4.02 % ±3.81% ±5.08% ±6.62%
22:59:53 buffers/dataview-set.js n=1000000 type='Float32BE' * -7.94 % ±5.95% ±7.96% ±10.44%
@Trott how many different benchmarks were tested in total during this run? Edit: I assume you pasted only the statistically significant ones. How many lines were there in total?
@targos Good point! A certain number of false positives should be expected. I've preserved the complete output of the R script at https://gist.github.com/Trott/a4e02d6d17ff0107ff0bef9220d750aa. At 0.1% confidence, 0.36 false positives are expected but there are 3. At 1% confidence, 3.63 false positives are expected but there are 6. On the other hand, at 5% confidence, 18.15 false positives are expected and there are 18.
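The expected counts above are just the number of benchmark configurations multiplied by the significance level. A quick sketch of that arithmetic (the total of 363 is inferred here from 3.63 / 0.01; the exact figure is in the linked gist, not stated in this thread):

```js
// Assumed total number of benchmark configurations in the run.
const total = 363;

for (const alpha of [0.001, 0.01, 0.05]) {
  // Under the null hypothesis, each comparison has probability `alpha`
  // of being flagged, so the expected false-positive count is N × α.
  console.log(`${alpha * 100}% level: ~${(total * alpha).toFixed(2)} expected false positives`);
}
// 0.1% level: ~0.36, 1% level: ~3.63, 5% level: ~18.15
```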
Landing...
Thanks. I wonder if we should use a method like the Benjamini–Hochberg procedure instead of computing a theoretical number of expected false positives...
If anyone else needs a more detailed explanation that assumes less prior knowledge, I found https://www.statisticshowto.datasciencecentral.com/benjamini-hochberg-procedure/ helpful. It would be interesting to see what kind of results it would give on, for example, the buffer run we talked about above.
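For the curious, here is a minimal sketch of the Benjamini–Hochberg procedure. This is a hypothetical helper written for illustration, not part of the Node.js benchmark tooling, and it assumes raw p-values are available rather than the starred output above:

```js
// Benjamini–Hochberg: given p-values and a target false-discovery
// rate q, find the largest rank k (1-based, over the sorted p-values)
// with p(k) <= (k / m) * q, then declare everything at or below that
// p-value significant.
function benjaminiHochberg(pValues, q = 0.05) {
  const m = pValues.length;
  const sorted = [...pValues].sort((a, b) => a - b);
  let cutoff = -1;
  for (let k = m; k >= 1; k--) {
    if (sorted[k - 1] <= (k / m) * q) {
      cutoff = sorted[k - 1];
      break;
    }
  }
  return cutoff < 0 ? [] : pValues.filter((p) => p <= cutoff);
}

// Example with five hypothetical p-values, controlling FDR at 5%:
console.log(benjaminiHochberg([0.001, 0.008, 0.039, 0.041, 0.6]));
// => [ 0.001, 0.008 ]
```

Unlike a fixed per-test threshold, the cutoff adapts to the whole set of results, which is the appeal for a run with hundreds of benchmark configurations.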
Landed in 3d56e89...8be3766
In the benchmark/url directory, this changes `for` loops using `var` to `let` where it applies, for consistency.

PR-URL: #28867
Reviewed-By: Jan Krems <jan.krems@gmail.com>
Reviewed-By: Khaidi Chu <i@2333.moe>
Reviewed-By: Rich Trott <rtrott@gmail.com>

In the benchmark/util directory, this changes `for` loops using `var` to `let` where it applies, for consistency.

PR-URL: #28867
Reviewed-By: Jan Krems <jan.krems@gmail.com>
Reviewed-By: Khaidi Chu <i@2333.moe>
Reviewed-By: Rich Trott <rtrott@gmail.com>

In the benchmark/buffers directory, this changes `for` loops using `var` to `let` where it applies, for consistency.

PR-URL: #28867
Reviewed-By: Jan Krems <jan.krems@gmail.com>
Reviewed-By: Khaidi Chu <i@2333.moe>
Reviewed-By: Rich Trott <rtrott@gmail.com>
In the `benchmark/url`, `benchmark/util`, and `benchmark/buffers` directories, this changes `for` loops using `var` to `let` where it applies, for consistency.

I didn't touch, for instance, the loop in `benchmark/buffers/buffer-iterate.js`, because `let` has block scoping and I think it might cause issues in previous versions with nested `for` loops, but I am not 100% sure about it. If this is OK, I will do it for the other directories in benchmark with single commits so it's easy to review a few files per commit.
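To illustrate the scoping difference behind that caution, here is a generic sketch (not the actual `buffer-iterate.js` code): with `var` the loop variable is function-scoped and shared across iterations, while `let` gives each iteration a fresh binding, which matters whenever the loop body creates closures.

```js
// With `var`, every callback closes over the same function-scoped
// binding, so all of them observe its final value.
const withVar = [];
for (var i = 0; i < 3; i++) {
  withVar.push(() => i);
}
console.log(withVar.map((f) => f())); // [ 3, 3, 3 ]

// With `let`, each iteration gets its own binding.
const withLet = [];
for (let j = 0; j < 3; j++) {
  withLet.push(() => j);
}
console.log(withLet.map((f) => f())); // [ 0, 1, 2 ]
```

For plain counting loops with no closures, like the ones changed in this PR, the two behave the same.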
Look at this for reference: #28791
`make -j4 test` (UNIX)