[FEATURE] Share global shard checkpoints across Index Transform #1135

Open
sarthakaggarwal97 opened this issue Mar 18, 2024 · 1 comment
Labels
enhancement New request

Comments

@sarthakaggarwal97
Contributor

Is your feature request related to a problem?
Currently, each transform job is independent of the others. There is no way for them to interact or share information.

However, there are scenarios where we would want a transform job to share its already-processed shard checkpoints with other transform jobs.

For example, we may want to split the current transform job (which may process multiple indices at once) into new transform jobs that each process an individual index. Right now, if we create the new transform jobs, they would re-process the buckets already computed by the old transform job.

There is currently no way to continue the work of the old/parent transform job.

What solution would you like?
This issue tracks the ability to share the global checkpoints across transform jobs in order to continue the work done by the old/parent transform job.

Transform metadata internally maintains the global shard checkpoints to track which documents it needs to process on each run. If we are able to share this metadata from one transform job to another, we should be able to continue or split the work of the old transform job into new ones without worrying about data duplication or inconsistency.
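
To make the idea concrete, here is a minimal sketch of how a parent job's checkpoints could seed per-index child jobs. It is illustrative only and is not the plugin's actual code: `TransformMetadata`, `split_by_index`, `docs_to_process`, and the `(index, shard)` key layout are all hypothetical names and assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Assumed key layout: (index name, shard number) -> last global checkpoint processed.
ShardKey = Tuple[str, int]


@dataclass
class TransformMetadata:
    """Hypothetical stand-in for a transform job's metadata."""
    transform_id: str
    source_indices: List[str]
    shard_checkpoints: Dict[ShardKey, int] = field(default_factory=dict)


def split_by_index(parent: TransformMetadata) -> List[TransformMetadata]:
    """Create one child job per source index, seeded with the parent's checkpoints."""
    children = []
    for index in parent.source_indices:
        # Each child inherits only the checkpoints of shards in its own index.
        inherited = {
            key: checkpoint
            for key, checkpoint in parent.shard_checkpoints.items()
            if key[0] == index
        }
        children.append(
            TransformMetadata(
                transform_id=f"{parent.transform_id}-{index}",
                source_indices=[index],
                shard_checkpoints=inherited,
            )
        )
    return children


def docs_to_process(
    meta: TransformMetadata, current: Dict[ShardKey, int]
) -> Dict[ShardKey, Tuple[int, int]]:
    """Return, per shard, the inclusive sequence-number range not yet processed."""
    pending = {}
    for key, latest in current.items():
        if key[0] not in meta.source_indices:
            continue
        # A shard with no stored checkpoint is treated as fully unprocessed.
        done = meta.shard_checkpoints.get(key, -1)
        if latest > done:
            pending[key] = (done + 1, latest)
    return pending
```

With this seeding, a child job for a single index would only pick up documents written after the parent's last checkpoint on each shard, rather than re-processing buckets the parent already computed.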

@sarthakaggarwal97 sarthakaggarwal97 changed the title [FEATURE] Share global shard checkpoints across Transform Jobs [FEATURE] Share global shard checkpoints across Index Transform Mar 18, 2024
@dblock dblock removed the untriaged label Jun 17, 2024
@dblock
Member

dblock commented Jun 17, 2024

Catch All Triage - 1 2 3 4 5
