-
Notifications
You must be signed in to change notification settings - Fork 48
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Isaacs/explicit caching #75
Conversation
I'm having trouble getting the desired behavior in my bootleg development environment
but the gist of the code LGTM, digging into a bit as to why it's not working |
lib/fetcher.js
Outdated
) | ||
// best-effort. cache write errors should not crash the fetch. | ||
cstream.on('error', (er) => {}) | ||
stream.pipe(cstream) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the last piped stream needs to be returned here to have the caching actually run.
This fixes the problems I was seeing
stream.pipe(cstream) | |
stream.pipe(cstream) | |
return cstream |
Alternatively:
stream.pipe(cstream) | |
return stream.pipe(cstream) |
Although you might want to just keep reassigning stream
to the latest stream eg stream = stream.pipe(istream)
& stream = stream.pipe(cstream)
, return stream
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Returning cstream
makes all the pacote tests fail, because the cacache.put.stream
writable doesn't pass through its data. Returning cstream
is definitely not the right way to go.
If we pipe istream
into cstream
then all the data gets consumed before the caller can get access to it. So as a result, you always get an empty tarball stream.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's what I'm getting with this patch, using pacote directly via its bin:
$ node lib/bin.js tarball file:pacote-11.3.1.tgz copy.tgz --cache=$PWD/pacote-cache
{
integrity: 'sha512-B5R+b6BX2U6exQ46KdYlHZNBsZL0RDe9TYwC6xbOgoU9JQ957m8zUxDAWwUuuCKb1JFPW+8KB+YCZRj4oBzzJQ==',
resolved: '/Users/isaacs/dev/npm/pacote/pacote-11.3.1.tgz',
from: 'file:pacote-11.3.1.tgz'
}
$ shasum -a 512 pacote-11.3.1.tgz
07947e6fa057d94e9ec50e3a29d6251d9341b192f44437bd4d8c02eb16ce82853d250f79ee6f335310c05b052eb8229bd4914f5bef0a07e6026518f8a01cf325 pacote-11.3.1.tgz
$ shasum -a 512 copy.tgz
07947e6fa057d94e9ec50e3a29d6251d9341b192f44437bd4d8c02eb16ce82853d250f79ee6f335310c05b052eb8229bd4914f5bef0a07e6026518f8a01cf325 copy.tgz
$ shasum -a 512 pacote-cache/content-v2/sha512/07/94/7e6fa057d94e9ec50e3a29d6251d9341b192f44437bd4d8c02eb16ce82853d250f79ee6f335310c05b052eb8229bd4914f5bef0a07e6026518f8a01cf325
07947e6fa057d94e9ec50e3a29d6251d9341b192f44437bd4d8c02eb16ce82853d250f79ee6f335310c05b052eb8229bd4914f5bef0a07e6026518f8a01cf325 pacote-cache/content-v2/sha512/07/94/7e6fa057d94e9ec50e3a29d6251d9341b192f44437bd4d8c02eb16ce82853d250f79ee6f335310c05b052eb8229bd4914f5bef0a07e6026518f8a01cf325
$ tree pacote-cache
pacote-cache
├── content-v2
│ └── sha512
│ └── 07
│ └── 94
│ └── 7e6fa057d94e9ec50e3a29d6251d9341b192f44437bd4d8c02eb16ce82853d250f79ee6f335310c05b052eb8229bd4914f5bef0a07e6026518f8a01cf325
├── index-v5
│ └── ee
│ └── 0d
│ └── 4c4d45587af9c1af774f9c59d3fc9b91b91764c81d2f3adb080948187863
└── tmp```
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Within npm/cli:
$ npm i npm/pacote#pull/75
changed 1 package, and audited 907 packages in 3s
128 packages are looking for funding
run `npm fund` for details
found 0 vulnerabilities
$ node . cache add file:npm-7.9.0.tgz --cache=test-cache
$ tree test-cache/
test-cache
└── _cacache
├── content-v2
│ └── sha512
│ └── c7
│ └── ed
│ └── 81d48a1de77d860a67af21f50f9dd65f1038ec8291b3b95a6eb665ad3c175d52a989fa10da3eddf2ed694c75dd53f4b6decdf2d43a9508982d846435246c
├── index-v5
│ └── af
│ └── 03
│ └── 5c781820370e585dc2323edbbc80669bf714da5b47d56510c7d0bd7521ee
└── tmp
└── b1d942f0
$ shasum -a 512 test-cache/_cacache/content-v2/sha512/c7/ed/81d48a1de77d860a67af21f50f9dd65f1038ec8291b3b95a6eb665ad3c175d52a989fa10da3eddf2ed694c75dd53f4b6decdf2d43a9508982d846435246c
c7ed81d48a1de77d860a67af21f50f9dd65f1038ec8291b3b95a6eb665ad3c175d52a989fa10da3eddf2ed694c75dd53f4b6decdf2d43a9508982d846435246c test-cache/_cacache/content-v2/sha512/c7/ed/81d48a1de77d860a67af21f50f9dd65f1038ec8291b3b95a6eb665ad3c175d52a989fa10da3eddf2ed694c75dd53f4b6decdf2d43a9508982d846435246c
$ shasum -a 512 npm-7.9.0.tgz
801b08dda0a4334e6643b666285d16f0e317c4d89bc0a5da1a0dd28306aa0b702f307b10fb270e230e364307bdb645ff16e88ac7611ef9702bf5b1bc926ef7b8 npm-7.9.0.tgz
Seems like it works?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah! Spoke too soon. The usage in npm/cli is breaking somehow.
lib/fetcher.js
Outdated
) | ||
// best-effort. cache write errors should not crash the fetch. | ||
cstream.on('error', (er) => {}) | ||
stream.pipe(cstream) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, not sure how streams work, but is istream
inserted in between stream
& cstream
during the stream.pipe(cstream)
. Maybe
stream.pipe(cstream) | |
istream.pipe(cstream) |
would be clearer if it does the same thing?
I'm working all of these off of my initial stream.pipe(istream).pipe(cstream)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we pipe the istream to the cstream, then the caller never gets the bytes, because they go into the cache writer and black-hole there.
We need the stream itself to pipe to both so that the bytes go to both locations, but the user can read from the istream.
Yeah, so, here's what's happening:
So we need to wait for the cstream end event to be called before ending the istream. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Blocking so we don't accidentally land; Based on @isaacs's feedback that changes are going to be made before reviewing/landing
ec60d33
to
ca170cf
Compare
Ok, this is working in the CLI now that it's not resolving the fetch stream promise until it's fully written to cache. PTAL |
Tell `fetch` not to cache it's downloaded tarballs, leave that to us Also see npm/cli#2976 for my first attempt & some discussions as to how we got here. Fix: #73 Fix: npm/cli#2160 Close: npm/cli#2976 PR-URL: #74 Credit: @mjsir911 Close: #74 Reviewed-by: @isaacs EDIT(@isaacs): Updated to add test, allow make-fetch-happen to use the cache, pipe into cache properly, and avoid cache double-writes.
ca170cf
to
1e737c5
Compare
LGTM @isaacs and I can confirm the caching works with Thanks for all your work & thanks for picking this up! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for all the help here @mjsir911
Fixed up and tested version of #74.
Updated to use tap 15, because I got tired of having it try to run the tests in git checkouts when I
^C
out of a previous test run. (Tap 15 is smart enough to avoid folders created byt.testdir()
.)