Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: use jsdiff instead of 'patch' to reduce dependency on OS binaries #355

Merged
merged 1 commit into from
Feb 17, 2019

Conversation

adrukh
Copy link
Contributor

@adrukh adrukh commented Feb 1, 2019

  • Ready for review
  • Follows CONTRIBUTING rules
  • Reviewed by Snyk internal team

What does this PR do?

Applying patches on snyk protect requires the patch binary to be available. This is problematic especially on Windows envs where GNU binaries are not installed by default.

With the help of @saschanaz in #335 we're giving jsdiff another go, with the intent to remove the dependency on the patch binary, and apply patches with pure JS.

@adrukh adrukh self-assigned this Feb 1, 2019
@adrukh adrukh force-pushed the feat/jsdiff branch 2 times, most recently from 37e9ac6 to 655537e Compare February 1, 2019 09:00
@adrukh
Copy link
Contributor Author

adrukh commented Feb 1, 2019

I tested this manually by applying the patch and the jsdiff versions against https://github.com/snyk/goof, comparing the resulting node_modules folder. The only difference has been the content of the 'patch flag' files, which is expected (they contain the timestamp of patching).

We will huddle internally on how to best move forward in the least risky way. Really excited about this opportunity, thanks again @saschanaz !

@miiila
Copy link
Contributor

miiila commented Feb 12, 2019

Hi @saschanaz, I just wanna let you know we haven't forgotten your awesome contribution and we've proceed a bit more with internal testing and we hope to merge this PR soon.

Thank you again for your help!

@@ -1,6 +1,7 @@
module.exports = applyPatch;

var debug = require('debug')('snyk');
var diff = require('diff');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we added the @types pkg for it, so why not .ts it? 😉

package.json Outdated
@@ -51,11 +51,13 @@
"dependencies": {
"@snyk/dep-graph": "1.1.2",
"@snyk/gemfile": "1.1.0",
"@types/diff": "^3.5.2",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why this, and the other types are actually here and not in devDependencies ? 🤔

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep - lodash & node types are in devDependencies so all these are here by mistake

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll move this.

if (error || out.indexOf('FAILED') !== -1) {
debug('patch command failed', relative, error, out);
return patchError(error, out, relative, vuln).catch(reject);
function jsDiff(patchContent, relative, live) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is live ? I see we had such a var before, but maybe it's a chance to improve the naming?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

live is a boolean that decides whether the patch result should be written to the file system or not.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, thanks. Would suggest renaming it to isDryRun (and reversing the logic checks 😅).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This goes way up into the flow (https://github.com/snyk/snyk/blob/master/src/cli/commands/protect/index.js#L78), used in wizard flow too. Suggest to change this on a separate chore PR.

diff.applyPatches(patchContent, {
loadFile: function (index, callback) {
try {
var fileName = stripFirstSlash(index.oldFileName);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would suggest add comment with a sample path(s), to illustrate why we do this.

debug('patch command failed', relative, error, out);
return patchError(error, out, relative, vuln).catch(reject);
function jsDiff(patchContent, relative, live) {
const files = {};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe rename to patchedFiles?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doing this.

@adrukh
Copy link
Contributor Author

adrukh commented Feb 17, 2019

Force-pushed after rebasing over master and attending to the majority of the comments.
@saschanaz please see last two comments re diff operations and how we can document them.

@adrukh
Copy link
Contributor Author

adrukh commented Feb 17, 2019

@michael-go can you please ✅ so I can squash the commits and merge this? ✨

@adrukh adrukh merged commit 90b0830 into master Feb 17, 2019
@adrukh adrukh deleted the feat/jsdiff branch February 17, 2019 11:38
@snyksec
Copy link

snyksec commented Feb 17, 2019

🎉 This PR is included in version 1.131.0 🎉

The release is available on:

Your semantic-release bot 📦🚀

@adrukh adrukh restored the feat/jsdiff branch February 17, 2019 15:16
@adrukh adrukh deleted the feat/jsdiff branch February 17, 2019 18:54
@adrukh
Copy link
Contributor Author

adrukh commented Feb 18, 2019

@saschanaz 👋
We've been running with this change since Sunday, monitoring our success vs failure rates. We are seeing some rare cases where the patching logic is not behaving properly. We had quite limited visibility into patching failures until this change, so it's hard to tell whether this is a new case, or simply deeper visibility into an older issue.

I was wondering if you could shed some light on the situation as we see it. Sometimes, we end up processing an empty hunk. This code runs - https://github.com/snyk/snyk/blob/49dc318db8a5e46bb324c10909239d954cdd3217/src/lib/protect/apply-patch.js#L46-L59 - with the index var being just { "hunks": [] }. We bail out as the index.oldFileName is undefined.

This is happening for our users here and there, but we are not able to reproduce the issue locally. Any thoughts about this would be greatly appreciated 🙏

@saschanaz
Copy link
Contributor

saschanaz commented Feb 18, 2019

@adrukh

https://github.com/kpdecker/jsdiff/blob/2c1a29c0309fef22219bcf2cbb475f20508a955b/src/patch/parse.js#L55-L69

Looks like it may happen when a patch file has no header. Is there an headerless patch on snyk?

@adrukh
Copy link
Contributor Author

adrukh commented Feb 19, 2019

Hmmm

Not only do we not have headerless patches, but this is also not persistent at all in terms of reproduction.

Could this be a race condition when we have yet created the patch file locally, and are already trying to read it? Will check!

if (patchedFiles[fileName] === null) {
fs.unlinkSync(path.resolve(relative, fileName));
}
fs.writeFileSync(path.resolve(relative, fileName), patchedFiles[fileName]);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line looks like a potential bug generator where null would trigger removal and then rewrite.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This area writes the patched dependency source back to the disk, seems ok :)

Copy link
Contributor

@saschanaz saschanaz Feb 19, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The value null was intended to delete the file (hence the unlink) but here it instead overwrites with the text null.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aha! I see your point. Will make this safer.

Copy link
Contributor Author

@adrukh adrukh Feb 19, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See 908bc3a

@adrukh
Copy link
Contributor Author

adrukh commented Feb 19, 2019

@saschanaz thanks for your assistance here! I think we found the culprit - indeed a race condition where we resolve fetching the patch once the data has been downloaded, but not once it has been written to disk. See https://github.com/snyk/snyk/blob/master/src/lib/protect/fetch-patch.js#L33-L35

We're going to fix this now, have been carrying this problem with us for a very long time! 😌

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants