Skip to content
This repository has been archived by the owner on Feb 12, 2024. It is now read-only.

OOM when importing large directories #1736

Closed
achingbrain opened this issue Nov 28, 2018 · 4 comments
Closed

OOM when importing large directories #1736

achingbrain opened this issue Nov 28, 2018 · 4 comments
Assignees
Labels
exp/wizard Extensive knowledge (implications, ramifications) required kind/bug A bug in existing code (including security flaws) kind/resolved-in-helia P2 Medium: Good to have, but can wait until someone steps up status/blocked Unable to be worked further until needs are met status/deferred Conscious decision to pause or backlog

Comments

@achingbrain
Copy link
Member

This call to get-folder-size can cause OOMs on large directories as the get-folder-size module keeps a list of all the files it's seen to make sure it doesn't double-count symlinked files (for example).

We only calculate the directory size to show a progress indicator - I propose we switch to an indeterminate indicator if the number of files being imported is unreasonable (say, 10k).

What do you think?

@alanshaw
Copy link
Member

What if we switch to files added / total files for > 10K file adds? I'm assuming you're going to need to count the total files to determine if you're over 10K or not...this would give the user at least some indication of progress.

@alanshaw alanshaw added kind/bug A bug in existing code (including security flaws) exp/novice Someone with a little familiarity can pick up status/ready Ready to be worked P2 Medium: Good to have, but can wait until someone steps up labels Nov 29, 2018
@achingbrain
Copy link
Member Author

This is a little harder than it seems - the way all these file counter modules seem to work is to call fs.readdir which returns a list of files. If the directory is too big, it'll explode.

There's an issue on the node.js repo about an iterative fs.readdir but it needs support added to libuv first, which looks like is happening, but slowly.

@alanshaw alanshaw added status/deferred Conscious decision to pause or backlog status/blocked Unable to be worked further until needs are met and removed status/ready Ready to be worked labels Dec 4, 2018
@daviddias daviddias added status/ready Ready to be worked status/deferred Conscious decision to pause or backlog exp/expert Having worked on the specific codebase is important exp/wizard Extensive knowledge (implications, ramifications) required and removed status/deferred Conscious decision to pause or backlog status/ready Ready to be worked exp/novice Someone with a little familiarity can pick up exp/expert Having worked on the specific codebase is important labels Dec 9, 2018
@alanshaw
Copy link
Member

I'm hoping that #1777 might help a little - if you call ipfs add --progress=false then get-folder-size won't be used anymore.

@achingbrain
Copy link
Member Author

js-ipfs is being deprecated in favor of Helia. You can #4336 and read the migration guide.

@helia/unixfs doesn't notify on progress in the same way, the user can keep track of the number of files/folders imported if they wish to show feedback.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
exp/wizard Extensive knowledge (implications, ramifications) required kind/bug A bug in existing code (including security flaws) kind/resolved-in-helia P2 Medium: Good to have, but can wait until someone steps up status/blocked Unable to be worked further until needs are met status/deferred Conscious decision to pause or backlog
Projects
No open projects
Status: Done
Development

No branches or pull requests

4 participants