-
Notifications
You must be signed in to change notification settings - Fork 370
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GC+/ Milestone 2 execution plan #4801
Conversation
## Milestone 1 | ||
The first beta version that was released included: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
## Milestone 1 | |
The first beta version that was released included: | |
## Milestone 1 | |
The first beta version that was released included: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
1. Implementation of the clean run flow for old and new repository structures (without optimizations) | ||
2. Mark & Sweep | ||
3. Integration tests | ||
4. Backup & Restore |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minimal support using rclone
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
## Milestone 2 | ||
|
||
### Goals | ||
1. Removing the limitation of a read-only lakeFS during the job run |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
job run == gc?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes
2. Performance improvements - better parallelization of the storage namespace traversal | ||
3. Implementing Optimized Run |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2 and 3 are performance and optimizations - the first item is self explained, I didn't understand the second one.
Do you mean incremental uncommitted gc?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nopcoder Optimized run means incremental
@eden-ohana I think we should focus this milestone on optimization of the full run and adding more testing (unit / integration) - and create a separate milestone for the incremental/optimized run
* [lakeFSFS renameObject method](https://github.com/treeverse/lakeFS/issues/4479) | ||
* [Track copied objects in ref-store](https://github.com/treeverse/lakeFS/issues/4562) | ||
|
||
3. Performance improvements: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are all uncommitted GC? maybe we can specify it as there are lakefs, common code and etc.
3. Performance improvements: | |
2. Performance improvements: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
3. Performance improvements: | ||
* [Optimaized listing on old repository structure](https://github.com/treeverse/lakeFS/issues/4620) | ||
* [Efficient listing on committed entries](https://github.com/treeverse/lakeFS/issues/4600) | ||
* [Optimized listing on slices](https://github.com/treeverse/lakeFS/issues/4614) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thought we already doing this one
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nopcoder Yes you did :) I think we can close this issue
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed
3. Implementing Optimized Run | ||
|
||
### Non-Goals | ||
1. Support for non-S3 repositories |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a big bullet - I think it should be itemized by storage providers. Also I really think we should do it incremental.
For example - for this milestone, starting with Azure - and afterwards working on GCS
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@eden-ohana Thanks for creating this plan.
It looks well put through - I think we learned a lot from the first milestone, especially about our expectations and we should put this into this plan.
There is a lot of work hidden inside the milestone goals lines in my opinion.
I believe we should focus on improving our current solution for the 2nd milestone - structuring it, creating more tests and add optimizations.
We should also start to create the ground for incremental runs but I think the client implementation itself should be planned for milestone 3
@ozkatz can you give context from a product perspective? Is there any requirement to prioritize the GC+ for Azure? |
@eden-ohana I was under the impression that since the "Sweep" part of GC+ is re-used (i.e. it's the same "sweep" that is already implemented for Azure) we should support this ootb? There are active, production installations that we know of that are running on Azure, with compliance requirements that are very hard to meet without GC+.. |
The sweep part is reused but we added more functionalities that weren't tesed on Azure |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! It feels like a very-stressed milestone. I think we should settle for the "safe-write" lakeFS goal and leave the rest for the next milestones. The timeline itself is too short for all three, I think...
|
||
* marks dependency | ||
|
||
1. Required changes by lakeFS: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we're missing a non-trivial test issue here. I'm not sure how to test that lakeFS is safe while GC is running. Feels like we should plan for that here too.
* [Optimaized listing on old repository structure](https://github.com/treeverse/lakeFS/issues/4620) | ||
* [Efficient listing on committed entries](https://github.com/treeverse/lakeFS/issues/4600) | ||
* Benchmarks - verify uncommitted GC [performance requirements](https://github.com/treeverse/lakeFS/blob/e316cafe7717bb3203e4018837a41415aa61f74b/design/accepted/gc_plus/uncommitted-gc.md?plain=1#L185) are kept | ||
2. [Implement optimized run flow](https://github.com/treeverse/lakeFS/issues/4489) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we implementing the incremental run? I thought we decided to do it in the next milestone
3. Support for non-S3 repositories | ||
* Azure | ||
* GCP | ||
4. Incorporation of committed & uncommitted GC into a single job |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here - I thought we were going to push it towards the end of development
|
||
By the end of this milestone, we will release a new beta version that includes all the additions. | ||
|
||
**Due date: 15/01/2023** |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO, this due date as a bit ambitious. We still left a lot of work inside this milestone
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It isn't anymore :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍🙏
closes #4802