"ipfs get" using the Filestore #3981

Open
progval opened this issue Jun 14, 2017 · 18 comments
Labels
effort/weeks Estimated to take multiple weeks exp/expert Having worked on the specific codebase is important kind/enhancement A net-new feature or improvement to an existing feature need/maintainers-input Needs input from the current maintainer(s) P3 Low: Not priority right now

Comments

@progval

progval commented Jun 14, 2017

Type:

Enhancement

Severity:

Medium

Description:

ipfs add now has an experimental --nocopy option, which makes a file or directory known to IPFS and serves it to other peers from its original location on disk.

Could you add a similar feature to ipfs get that writes files to the filesystem without also adding them to ~/.ipfs (assuming they are not already there)?

Thanks!

@whyrusleeping
Member

This should be doable, though it will be difficult. We could potentially do this by adding some sort of signaling method to the filestore that says "when you get a block by this hash, you should add it to the filestore with this filepath and offset".

Thoughts @kevina ?
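
The signaling idea could be sketched roughly as follows. This is a toy Python model and not go-ipfs code: `PlacementRegistry`, `expect`, and `on_block` are hypothetical names, SHA-256 hex digests stand in for real CIDs, and a dict stands in for the filestore index.

```python
import hashlib
import os
import tempfile


class PlacementRegistry:
    """Toy model: maps an expected block hash to a (file path, offset) target."""

    def __init__(self):
        self.expected = {}   # block hash -> (path, offset), registered up front
        self.filestore = {}  # block hash -> (path, offset, size), filled on arrival

    def expect(self, block_hash, path, offset):
        """Signal: when this block arrives, write it at `offset` in `path`."""
        self.expected[block_hash] = (path, offset)

    def on_block(self, data):
        """Handle a fetched block. Returns False if no placement was registered."""
        block_hash = hashlib.sha256(data).hexdigest()
        placement = self.expected.pop(block_hash, None)
        if placement is None:
            return False  # a real node would fall back to the normal blockstore
        path, offset = placement
        # Open without truncating so blocks can arrive in any order.
        mode = "r+b" if os.path.exists(path) else "w+b"
        with open(path, mode) as f:
            f.seek(offset)
            f.write(data)
        self.filestore[block_hash] = (path, offset, len(data))
        return True


# Usage: register placements, then deliver the blocks out of order.
blocks = [b"hello ", b"world"]
path = os.path.join(tempfile.mkdtemp(), "out.bin")
registry = PlacementRegistry()
offset = 0
for block in blocks:
    registry.expect(hashlib.sha256(block).hexdigest(), path, offset)
    offset += len(block)
for block in reversed(blocks):
    registry.on_block(block)
with open(path, "rb") as f:
    result = f.read()
```

In this sketch the block data never lands in a blockstore; only the (path, offset, size) reference is recorded.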

@kevina
Contributor

kevina commented Jun 15, 2017

Bypassing the "cache" would be difficult. Copying something from the cache to disk without rehashing is fairly straightforward; at least it was in my implementation, where I already did it: https://github.com/ipfs-filestore/go-ipfs/blob/master/filestore/util/move.go. Since the current implementation differs from mine, I am not 100% sure how useful my code is.

@whyrusleeping
Member

@kevina the difference is that it's not being moved to disk after the fact; we want to write it to the file as it's being fetched, avoiding the blockstore entirely.

@progval
Author

progval commented Jun 15, 2017

@whyrusleeping What I had in mind was to write to the blockstore until the file is complete, then write the file out without recomputing its hash, and clear the blockstore.
Your idea is a great feature, but as you said, it is much harder to implement.
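
A minimal sketch of that simpler approach, in Python rather than Go and under stated assumptions: the blockstore is a plain dict keyed by SHA-256 hex digest, and `move_to_file` and `read_block` are hypothetical helpers, not existing IPFS functions.

```python
import hashlib
import os
import tempfile


def move_to_file(blockstore, ordered_hashes, path):
    """Write already-fetched blocks to `path` and return filestore-style refs.

    Blocks are looked up by the hash they were stored under, so nothing is
    rehashed. Blocks leave `blockstore` only after the file is fully written.
    """
    refs = {}
    offset = 0
    with open(path, "wb") as f:
        for block_hash in ordered_hashes:
            data = blockstore[block_hash]
            f.write(data)
            refs[block_hash] = (path, offset, len(data))
            offset += len(data)
    for block_hash in ordered_hashes:
        blockstore.pop(block_hash, None)  # assumption: no pin check is done here
    return refs


def read_block(refs, block_hash):
    """Serve a block straight from its recorded file position."""
    path, offset, size = refs[block_hash]
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(size)


# Usage: three leaf blocks of one file, already sitting in the blockstore.
chunks = [b"aaaa", b"bbbb", b"cc"]
order = [hashlib.sha256(c).hexdigest() for c in chunks]
blockstore = dict(zip(order, chunks))
path = os.path.join(tempfile.mkdtemp(), "file.bin")
refs = move_to_file(blockstore, order, path)
second = read_block(refs, order[1])
```

The sketch deletes the blocks unconditionally; a real implementation would first have to make sure no other pin still needs them.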

@whyrusleeping
Member

@progval ah, that could be done by first fetching the file with ipfs get, then running a gc (to clear the blocks from your blockstore), then re-adding it with a no-copy add. Of course, that could all be automated behind a flag on get.

The problem there, though, is that running that gc has the unintended side effect of cleaning out the entire blockstore, which you don't necessarily want. Removing blocks from the blockstore after the download is just as difficult (it requires computing the entire pinset to make sure the blocks we want to remove aren't already pinned by other things). I'm actually thinking that the best option (maybe not the easiest, but the least problematic) might be to stream from bitswap straight into the filestore, bypassing the blockstore as I suggested above.

@kevina
Contributor

kevina commented Jun 15, 2017

@whyrusleeping there is the ipfs block rm command, which is already implemented; you can batch-remove blocks so the pinset is only computed once.
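
The point about computing the pinset once per batch can be illustrated with a toy model. This is a Python sketch and not the actual `ipfs block rm` implementation: `pinned_closure` and `batch_remove` are hypothetical names, and the DAG is a plain dict of hash to child hashes.

```python
def pinned_closure(pin_roots, links):
    """Return every block reachable from any pinned root (the 'pinset')."""
    seen = set()
    stack = list(pin_roots)
    while stack:
        block_hash = stack.pop()
        if block_hash in seen:
            continue
        seen.add(block_hash)
        stack.extend(links.get(block_hash, ()))
    return seen


def batch_remove(blockstore, candidates, pin_roots, links):
    """Remove the candidates that no pin protects, traversing the pins once."""
    pinned = pinned_closure(pin_roots, links)  # one traversal for the whole batch
    removed, kept = [], []
    for block_hash in candidates:
        if block_hash in pinned:
            kept.append(block_hash)
        elif blockstore.pop(block_hash, None) is not None:
            removed.append(block_hash)
    return removed, kept


# Usage: rootA was just downloaded; rootB is pinned and shares block b2.
links = {"rootA": ["b1", "b2"], "rootB": ["b2", "b3"]}
blockstore = {h: b"data" for h in ["rootA", "rootB", "b1", "b2", "b3"]}
removed, kept = batch_remove(blockstore, ["rootA", "b1", "b2"], {"rootB"}, links)
```

Here `b2` survives because the pinned `rootB` still references it, which is the case a naive delete after download would get wrong.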

@kevina
Contributor

kevina commented Jun 15, 2017

ipfs filestore mv would be a first step toward the (better) approach you described. Getting a block from the network without storing it in the blockstore would be the next step.

@whyrusleeping
Member

I think ipfs filestore mv should be the first actionable item on this issue.

@whyrusleeping whyrusleeping added the help wanted Seeking public contribution on this issue label Aug 31, 2017
@kevina
Contributor

kevina commented Aug 31, 2017

@whyrusleeping ipfs filestore mv should be fairly easy. I created a new issue for it: #4193

@whyrusleeping
Member

whyrusleeping commented Aug 31, 2017 via email

@schomatis
Contributor

schomatis commented Feb 26, 2018

@kevina While you are moving forward with ipfs filestore mv, is there something I can contribute towards the ipfs get --no-copy functionality?

@ildar

ildar commented Jun 5, 2020

Is the work on this stalled yet?

@Stebalien
Member

It looks like there was a start in #4261 but we'd also need something to then remove the file from the blockstore.

@ildar

ildar commented Jun 8, 2020 via email

@Stebalien
Member

Yes? But how would one do that?

This was referenced Jul 23, 2020
@jl452

jl452 commented Jan 22, 2021

"ipfs add" can be run with or without --chunker or --raw-leaves,
and depending on those flags the resulting hash will differ.
Which hash will I have after "ipfs get --no-copy"?

  1. Can I get the same hash as the person who added the file?

PS:
I see this hash issue in IPFS, but it is "officially documented" (#6891, from #4318):
"Finally, a note on hash determinism. While not guaranteed, adding the same
file/directory with the same flags will almost always result in the same output
hash. However, almost all of the flags provided by this command (other than pin,
only-hash, and progress/status related flags) will change the final hash."

  2. Is there any type of hash in IPFS that combines several variants of hashes leading to the same file? For example, default_add_hash + raw-leaves_hash + default_chunker_hash in one.
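
To illustrate why the flags matter for the first question, here is a toy Python sketch. It is not how IPFS builds UnixFS DAGs or CIDs: `toy_root` is a hypothetical function that hashes the concatenated SHA-256 digests of fixed-size chunks, which is enough to show that the root depends on the chunk layout and not only on the file bytes.

```python
import hashlib


def chunk(data, size):
    """Split data into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def toy_root(data, chunk_size):
    """Toy root hash: SHA-256 over the concatenated SHA-256 of each chunk."""
    leaf_hashes = [hashlib.sha256(c).digest() for c in chunk(data, chunk_size)]
    return hashlib.sha256(b"".join(leaf_hashes)).hexdigest()


data = bytes(range(256)) * 16      # 4096 bytes of sample content
root_a = toy_root(data, 1024)      # "adder" uses 1024-byte chunks
root_b = toy_root(data, 1024)      # same content, same settings
root_c = toy_root(data, 512)       # same content, different chunk size
```

With identical settings the two roots match, while changing only the chunk size yields a different root for the same bytes. Under that assumption, a no-copy get would need to record the layout used by the original adder to end up with the same hash.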

@Stebalien Stebalien changed the title enhancement: "ipfs get" using the Filestore "ipfs get" using the Filestore Mar 11, 2021
@Stebalien Stebalien added exp/expert Having worked on the specific codebase is important effort/days Estimated to take multiple days, but less than a week kind/enhancement A net-new feature or improvement to an existing feature need/maintainers-input Needs input from the current maintainer(s) and removed help wanted Seeking public contribution on this issue effort/days Estimated to take multiple days, but less than a week labels Mar 11, 2021
@Stebalien Stebalien added effort/weeks Estimated to take multiple weeks P3 Low: Not priority right now labels Mar 11, 2021
@Artoria2e5

Another thing this can provide is a --pin option on ipfs get, making the help-with-seeding-but-also-look-at-files job somewhat more similar to other P2P applications.

@markg85
Contributor

markg85 commented Jul 12, 2023

Just a little "me too" reply, but with a reason why I want this feature.
Granted, #8201 would be ideal, but that seems to be long-term.

In my use case I'm syncing data between two IPFS nodes (using syncthing). The files are large (up to tens of gigabytes each), and I only want each one on my HDD once.
Doing ipfs get <cid> -o <whatever> leaves a copy of the blocks in the IPFS blockstore.
To fix that, one would have to do:

ipfs repo gc
ipfs add --nocopy <file>

While that solution works, it's not sane, because you have to gc your IPFS repo and then re-add the file.
Being able to do ipfs get <cid> --nocopy -o <somefile> would solve this issue.

9 participants