I've opened websockets/utf-8-validate#109 to use is_utf8 in utf-8-validate.
When running the same benchmarks used in websockets/utf-8-validate#101, I noticed a significant performance drop compared to simdutf::validate_utf8().
$ npx envinfo --system
System:
OS: macOS 13.1
CPU: (16) x64 Intel(R) Xeon(R) W-2140B CPU @ 3.20GHz
Memory: 21.54 GB / 32.00 GB
Shell: 5.2.15 - /usr/local/bin/bash
is_utf8()
$ node bench.js
Loading https://en.wikipedia.org/wiki/Main_Page ...
utf-8-validate (5.0.10, C++) x 106,808 ops/sec ±0.27% (95 runs sampled)
utf-8-validate (is_utf8, C++) x 105,159 ops/sec ±0.11% (95 runs sampled)
------------------------------------------------------------
Loading https://ro.wikipedia.org/wiki/Pagina_principală ...
utf-8-validate (5.0.10, C++) x 25,410 ops/sec ±0.09% (97 runs sampled)
utf-8-validate (is_utf8, C++) x 54,815 ops/sec ±0.09% (96 runs sampled)
------------------------------------------------------------
Loading https://ru.wikipedia.org/wiki/Заглавная_страница ...
utf-8-validate (5.0.10, C++) x 15,160 ops/sec ±0.10% (98 runs sampled)
utf-8-validate (is_utf8, C++) x 63,985 ops/sec ±0.09% (99 runs sampled)
------------------------------------------------------------
Loading https://ar.wikipedia.org/wiki/الصفحة_الرئيسية ...
utf-8-validate (5.0.10, C++) x 12,766 ops/sec ±0.08% (98 runs sampled)
utf-8-validate (is_utf8, C++) x 57,442 ops/sec ±0.08% (98 runs sampled)
------------------------------------------------------------
Loading https://ja.wikipedia.org/wiki/メインページ ...
utf-8-validate (5.0.10, C++) x 23,306 ops/sec ±0.07% (96 runs sampled)
utf-8-validate (is_utf8, C++) x 79,199 ops/sec ±0.10% (95 runs sampled)
------------------------------------------------------------
Loading https://www.cl.cam.ac.uk/~mgk25/ucs/examples/UTF-8-demo.txt ...
utf-8-validate (5.0.10, C++) x 66,890 ops/sec ±0.10% (99 runs sampled)
utf-8-validate (is_utf8, C++) x 622,514 ops/sec ±0.10% (99 runs sampled)
simdutf::validate_utf8()
$ node bench.js
Loading https://en.wikipedia.org/wiki/Main_Page ...
utf-8-validate (5.0.10, C++) x 107,373 ops/sec ±0.08% (99 runs sampled)
utf-8-validate (simdutf, C++) x 749,966 ops/sec ±0.20% (96 runs sampled)
------------------------------------------------------------
Loading https://ro.wikipedia.org/wiki/Pagina_principală ...
utf-8-validate (5.0.10, C++) x 25,413 ops/sec ±0.08% (98 runs sampled)
utf-8-validate (simdutf, C++) x 144,119 ops/sec ±0.19% (95 runs sampled)
------------------------------------------------------------
Loading https://ru.wikipedia.org/wiki/Заглавная_страница ...
utf-8-validate (5.0.10, C++) x 15,164 ops/sec ±0.08% (98 runs sampled)
utf-8-validate (simdutf, C++) x 176,840 ops/sec ±0.18% (94 runs sampled)
------------------------------------------------------------
Loading https://ar.wikipedia.org/wiki/الصفحة_الرئيسية ...
utf-8-validate (5.0.10, C++) x 12,781 ops/sec ±0.08% (100 runs sampled)
utf-8-validate (simdutf, C++) x 152,366 ops/sec ±0.16% (97 runs sampled)
------------------------------------------------------------
Loading https://ja.wikipedia.org/wiki/メインページ ...
utf-8-validate (5.0.10, C++) x 23,298 ops/sec ±0.09% (99 runs sampled)
utf-8-validate (simdutf, C++) x 222,036 ops/sec ±0.18% (95 runs sampled)
------------------------------------------------------------
Loading https://www.cl.cam.ac.uk/~mgk25/ucs/examples/UTF-8-demo.txt ...
utf-8-validate (5.0.10, C++) x 66,875 ops/sec ±0.11% (95 runs sampled)
utf-8-validate (simdutf, C++) x 1,051,967 ops/sec ±0.09% (100 runs sampled)
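To make the gap between the two runs easier to read, here is a small sketch that computes the ratio of simdutf throughput to is_utf8 throughput for each input. The ops/sec figures are copied from the output above; the short keys (en, ro, ...) and the speedups() helper are names made up for this sketch and are not part of bench.js.

```javascript
// Throughput figures (ops/sec) copied from the two benchmark runs above.
// The short keys (en, ro, ...) are labels chosen for this sketch only.
const results = {
  en: { isUtf8: 105159, simdutf: 749966 },
  ro: { isUtf8: 54815, simdutf: 144119 },
  ru: { isUtf8: 63985, simdutf: 176840 },
  ar: { isUtf8: 57442, simdutf: 152366 },
  ja: { isUtf8: 79199, simdutf: 222036 },
  demo: { isUtf8: 622514, simdutf: 1051967 },
};

// Hypothetical helper: ratio of simdutf throughput to is_utf8 throughput
// for each input.
function speedups(data) {
  const out = {};
  for (const [name, { isUtf8, simdutf }] of Object.entries(data)) {
    out[name] = simdutf / isUtf8;
  }
  return out;
}

const ratios = speedups(results);
for (const [name, ratio] of Object.entries(ratios)) {
  console.log(`${name}: simdutf is ${ratio.toFixed(2)}x is_utf8`);
}
```

On these numbers the gap is largest for the English page (about 7x) and smaller for the other inputs (roughly 1.7x to 2.8x).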
I did not investigate, but I think this is because there is no AVX-512 implementation in is_utf8(), right?