inspector: simplify dispatchProtocolMessage#49780
inspector: simplify dispatchProtocolMessage#49780nodejs-github-bot merged 2 commits intonodejs:mainfrom lemire:simplifies_dispatchprotocolmessage
Conversation
|
cc @nodejs/inspector @nodejs/cpp-reviewers |
|
Nit: the commit message does not follow our guidelines. Please change it to begin with an imperative verb after the subsystem prefix. |
|
I think at this point we can confidently say there will never be a binary message in inspector protocol 😀 (it is pretty much set in stone at this point). Why not just delete |
|
PR message seems misleading. Proposed change does not seem to touch any UTF8/UTF16 shenanigans unless I am missing something. |
I changed the commit and the PR name. |
It does. And that's a key point. We are saving some non-trivial processing. The current code calls std::unique_ptr<Value> parseJSON(const std::string_view string) {
if (string.empty())
return nullptr;
size_t expected_utf16_length =
simdutf::utf16_length_from_utf8(string.data(), string.length());
MaybeStackBuffer<char16_t> buffer(expected_utf16_length);
// simdutf::convert_utf8_to_utf16 returns zero in case of error.
size_t utf16_length = simdutf::convert_utf8_to_utf16(
string.data(), string.length(), buffer.out());
// We have that utf16_length == expected_utf16_length if and only
// if the input was a valid UTF-8 string.
if (utf16_length == 0) return nullptr; // We had an invalid UTF-8 input.
CHECK_EQ(expected_utf16_length, utf16_length);
return parseJSONCharacters(reinterpret_cast<const uint16_t*>(buffer.out()),
utf16_length);
}Observe how it converts to UTF-16. Instead, we call directly std::unique_ptr<Value> parseJSON(v8_inspector::StringView string) {
if (string.length() == 0)
return nullptr;
if (string.is8Bit())
return parseJSONCharacters(string.characters8(), string.length());
return parseJSONCharacters(string.characters16(), string.length());
}See how we are bypassing We are possibly skipping some allocation and some transcoding... It is not going to make a huge difference unless you are typing really fast on your keyboard, but... code simplification is always a positive, I would think. |
Yes. Let me try. |
|
@eugeneo Last commit removes parseMessage entirely since it is seemingly not needed. |
|
@addaleax @alexkozy Can you help? This PR get the following issue under macOS in CI: If it is indeed an issue introduced by the current PR, I would like to be able to reproduce it. Running the script with Node 20 (public release), I get... On my macBook, building the software, I get... and So everything is fine, as far as I can see. This code is built with the patch from the current PR, e.g., we have void dispatchProtocolMessage(const StringView& message) {
std::string raw_message = protocol::StringUtil::StringViewToUtf8(message);
per_process::Debug(DebugCategory::INSPECTOR_SERVER,
"[inspector received] %s\n",
raw_message);
std::unique_ptr<protocol::DictionaryValue> value =
protocol::DictionaryValue::cast(
protocol::StringUtil::parseJSON(message)); |
|
The macOS test seems to pass now. :-/ |
|
Landed in 2990390 |
PR-URL: nodejs#49780 Reviewed-By: Yagiz Nizipli <yagiz@nizipli.com>
PR-URL: nodejs#49780 Reviewed-By: Yagiz Nizipli <yagiz@nizipli.com>
PR-URL: #49780 Reviewed-By: Yagiz Nizipli <yagiz@nizipli.com>
PR-URL: nodejs#49780 Reviewed-By: Yagiz Nizipli <yagiz@nizipli.com>
This is in relation with bug #47457 where we found that under Windows, there seems to be a problem when converting from UTF-8 to UTF-16 a JSON message generated by key presses in the REPL.
Tracing it back, I found evidence that in common cases, each time you hit a key in the Node REPL, a call to
dispatchProtocolMessageis made, passing an UTF-16 JSON message. The message is converted to UTF-8 into the local variableraw_message, and then we callparseMessagewhich converts backraw_messageinto UTF-16 and callsparseJSONon it.This PR avoids the conversion from UTF-8 back into UTF-16. This should save some processing that is currently done with each key press.
It is possible that it is a Chesterton's fence and that the complexity is somehow necessary. Nevertheless, if this PR is correct, it would be an interesting optimization for the REPL.