Description
Specification
Some operations are "unscalable". For example to delete a file, one has to iterate over the entire file blocks and batch up all the delete operations and then execute them. This is due to the structure of how we are storing each file block as a separate key value entry in a sublevel.
If the file is large, this can take a long time. Furthermore deletion of files should be a lot faster than this. Real filesystems do not need to read the entire file just to delete it.
In order to solve this problem one way is to use lazy garbage collection. When deleting a file, it is marked as no longer accessible. The is already possible since file accessibility is determined by hardlinks which are persisted in the directory inodes.
However what is going to actually delete the key value entries? Real filesystems can simply mark filespace as reusable. Unfortunately this is not possible with a key value structure, and it would just get quite complex.
Therefore, we may instead of a garbage collection process that runs as a stream/loop and quietly deletes un-used entries in the background continuously as the EFS is being used. Right now we have something similar to this in order to remove zombie inodes in our gc sublevel.
Recommend researching the garbage collection techniques like in https://github.com/kenfox/gc-viz and seeing how we can apply it.
Deleting a file isn't the only cause where this can occur. I'm sure there may be other operations that are unscalable. Identify all of these in this issue.
Additional context
- https://github.com/kenfox/gc-viz
- https://spin.atomicobject.com/2014/09/03/visualizing-garbage-collection-algorithms/ - discussion about the above repo
Tasks
- - Identify all unscalable operations and whether they apply to this situation
- - Research the garbage collection techniques above
- - Integrate the technique above in a lazy way so that it doesn't hog the CPU nor does it block other operations, because JS async contexts cannot be interrupted, then they must be yielded, the implementation must be as asynchronous as possible and coroutine/cooperative multitasking friendly, perhaps by the use of
setTimeout
yields (implemented as sleep promises). - - Consider if it would be possible to multi-thread the leveldb usage in worker threads and end up with parallel lazy GC, note that leveldb currently is not "multi-process" safe, but there is a multilevel adapter for multiprocesses, experiment with worker threads as it is using threads and not processes