-
-
Notifications
You must be signed in to change notification settings - Fork 31.9k
src: move cpSync dir copy logic completely to C++ #58624
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
src: move cpSync dir copy logic completely to C++ #58624
Conversation
7081997
to
055b84b
Compare
src/node_file.cc
Outdated
Local<Value> argv[] = { | ||
String::NewFromUtf8( | ||
env->isolate(), src.c_str(), v8::NewStringType::kNormal) | ||
.ToLocalChecked(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we avoid tolocalchecked() calls since they are unsafe. Also you can use ToV8Value method which reduces the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding the safeness of the code, I did run this with different filter functions, I also tried not returning anything or throwing within the filter function itself and every case seems totally fine with ToLocalChecked
🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The key challenge with ToLocalChecked()
is that it causes the node.js process to crash even on otherwise recoverable errors. It really is better to use the ToLocal(...)
APIs instead and we've been going through and actively converting away from using ToLocalChecked
as much as possible. I agree with @anonrig that we should avoid it here also.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If string creation fails due to external factors or simply due a bug in v8, this will directly crash process
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, thanks a lot for the clarification both 🫶, I'll swap this with ToLocal
👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hope this works for you? 🙂 🙏
src/node_file.cc
Outdated
if (args[7]->IsFunction()) { | ||
Local<v8::Function> args_filter_fn = args[7].As<v8::Function>(); | ||
|
||
filter_fn = [env, args_filter_fn](std::string src, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you use std::string view you will improve performance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added
semver-major
(notice the addition of I do like the extra |
src/node_file.cc
Outdated
.ToLocalChecked(), | ||
String::NewFromUtf8( | ||
env->isolate(), dest.data(), v8::NewStringType::kNormal) | ||
.ToLocalChecked()}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One additional difference here that will be problematic is that this is passing the src
and dest
to the filter callback as a string always. With the currrent implementation, however, if we pass the path in as a Buffer or URL then the filter callback receives the Buffer and URL... I notice that this appears to be undocumented in api docs sadly)... We have to preserve that because this implementation tries to interpret all src and dest paths as UTF8 sequences which they quite likely won't be (the reason we accept Buffer in the first place is that many filenames in older legacy systems use legacy text encodings).
fs.cpSync(Buffer.from('a.js'), Buffer.from('b.js'), { filter(...args) { console.log(...args); return true } });
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah ok I see
Thanks a lot for pointing this out, I'll address it 🙏
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jasnell Are you sure about this statement?
if we pass the path in as a Buffer or URL then the filter callback receives the Buffer and URL...
Because after giving it a quick try that does not seem to be the case for me 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've also commented this here: #58627 (review)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jasnell Given our conversation in #58627 and also #58634, what do you think it means for this PR? Do you think we can proceed with my C++ implementation even though it is not converting the values? (it is not making the issue any worse, I guess? 🤔)
Or should this PR be blocked by #58634 as well? 🤔
(PS: I'd be pretty hesitant addressing #58634 here since this PR is only for cpSync
, whilst the issue is both for cp
and cpSync
🤔)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like us to see if we can figure out a fix for #58634 first
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah ok 😓
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason is simple enough... I'd like to make sure that the fix to the bug doesn't conflict with the changes you're making here... that is, I want to make sure landing this doesn't make fixing the bug more difficult / convoluted / etc or make sure we don't end up having to revert this in part when fixing the issue. Even if we just can get a sense of how we want to fix the bug it should be enough to unblock this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah ok I understand, yeah that totally makes sense to me, thanks for the info 🙏
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #58624 +/- ##
==========================================
- Coverage 90.17% 89.63% -0.55%
==========================================
Files 637 637
Lines 188023 187995 -28
Branches 36887 36591 -296
==========================================
- Hits 169552 168506 -1046
- Misses 11223 12166 +943
- Partials 7248 7323 +75
🚀 New features to boost your workflow:
|
May I assume this change is for making cpSync 'faster'? If so, wouldn't that require a benchmark CI? |
Hey @geeksilva97 👋
Yes, as mentioned in the PR's description 😃
Sorry, I am not completely sure what you're asking 🙂 Are you asking me to create a new benchmark in file in the benchmark folder such as benchmark/fs |
Thanks for replying, @dario-piotrowicz . About the benchmark, I was asking if this PR needs a benchmark ci run - that one we add with |
Of course! it's my pleasure 😄
Ah ok, sorry I wasn't familiar with that tag 🙂 We could add it but since I don't see any benchmarks in But if you want I can add such a benchmark and then the tag 🤔 But either way, it sounds like this PR can be blocked for a while 🥹 (#58624 (comment)) |
f9efe3c
to
a3b912b
Compare
prior to these changes cpSync would copy directories using C++ logic only if the user didn't provide a filtering function to the cpSync call, the changes here make it so that instead C++ is used to copy directories even when a filtering function is provided
use std::optional
use std:string_view
Co-authored-by: Yagiz Nizipli <yagiz@nizipli.com>
a3b912b
to
d95e989
Compare
Co-authored-by: Anna Henningsen <github@addaleax.net>
prior to these changes cpSync would copy directories using C++ logic only if the user didn't provide a filtering function to the cpSync call, the changes here make it so that instead C++ is used to copy directories even when a filtering function is provided
Some benchmarking I performed locally:

Note
The perf gain is nice (although not huge), the biggest benefit of this change for me is code improvement, since with these changes we no longer need to have the same exact functionality implemented in both JS and C++, also this should help moving more cpSync code over C++ 🙂