Description
When calling _stop against a running transform, there should be an option to wait_for_checkpoint. This will allow for better data consistency.
The overall requirements for this flag should probably be as follows:
wait_for_checkpoint should be its own flag, separate from force and wait_for_completion.
wait_for_completion: can be used with any other flag and indicates whether we wait for the task to go away before returning to the listener. Essentially sync vs. async.
force: has to be used if the task is failed.
wait_for_checkpoint: cannot be used on a failed task. Since the indexer cannot continue, this flag makes no sense for a failed task, so its value should just be ignored there.
As for the default value for each of them, I think the following makes sense:
force: false
wait_for_completion: false
wait_for_checkpoint: true
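
To make the proposed semantics concrete, here is a minimal sketch of how the three flags and their defaults could be modeled. The class StopTransformFlags and its methods are hypothetical names invented for illustration. They are not the actual Elasticsearch request classes, and the parsing is deliberately simplified.

```java
import java.util.Map;

/** Hypothetical holder for the three _stop flags; not an actual Elasticsearch class. */
final class StopTransformFlags {
    final boolean force;
    final boolean waitForCompletion;
    final boolean waitForCheckpoint;

    StopTransformFlags(boolean force, boolean waitForCompletion, boolean waitForCheckpoint) {
        this.force = force;
        this.waitForCompletion = waitForCompletion;
        this.waitForCheckpoint = waitForCheckpoint;
    }

    /** Reads the flags from query parameters, applying the defaults proposed above. */
    static StopTransformFlags fromParams(Map<String, String> params) {
        return new StopTransformFlags(
            Boolean.parseBoolean(params.getOrDefault("force", "false")),
            Boolean.parseBoolean(params.getOrDefault("wait_for_completion", "false")),
            Boolean.parseBoolean(params.getOrDefault("wait_for_checkpoint", "true")));
    }

    /** A failed task can only be stopped when force is set. */
    boolean canStop(boolean taskFailed) {
        return !taskFailed || force;
    }

    /** wait_for_checkpoint is ignored on a failed task, since the indexer cannot continue. */
    boolean effectiveWaitForCheckpoint(boolean taskFailed) {
        return !taskFailed && waitForCheckpoint;
    }
}
```

With these defaults, a plain _stop on a healthy transform waits for the current checkpoint but returns to the caller without waiting for the task to go away.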
This means that if a user wants to stop a transform but has noticed that it has stayed in the STOPPING state for a long time, they can use _stop?wait_for_checkpoint=false to make it stop right away.
This will most likely require a new DataFrameTransformTaskState state of STOPPING so that the transform can signal a stop when ClientDataFrameIndexer#onFinish is called.
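
The state handling described above could look roughly like the following sketch. SketchTaskState and SketchTransformTask are hypothetical stand-ins and not the real DataFrameTransformTaskState or task classes. Only the transitions relevant to this proposal are modeled, and onCheckpointFinished stands in for the point where ClientDataFrameIndexer#onFinish would be invoked.

```java
/** Hypothetical, simplified stand-in for the task state; other real states are not modeled. */
enum SketchTaskState { STARTED, STOPPING, STOPPED }

/** Hypothetical task that tracks only the stop-related transitions. */
final class SketchTransformTask {
    private SketchTaskState state = SketchTaskState.STARTED;

    SketchTaskState getState() {
        return state;
    }

    /** Handles a _stop call for a task that has not failed. */
    void requestStop(boolean waitForCheckpoint) {
        if (state == SketchTaskState.STOPPED) {
            return; // nothing left to do
        }
        // With wait_for_checkpoint=true the task only records the intent to stop.
        // With wait_for_checkpoint=false it stops right away, even if already STOPPING.
        state = waitForCheckpoint ? SketchTaskState.STOPPING : SketchTaskState.STOPPED;
    }

    /** Stand-in for the indexer's onFinish callback at the end of a checkpoint. */
    void onCheckpointFinished() {
        if (state == SketchTaskState.STOPPING) {
            state = SketchTaskState.STOPPED;
        }
    }
}
```

In this sketch, a second _stop call with wait_for_checkpoint=false moves a task that is stuck in STOPPING directly to STOPPED, which matches the escape hatch described earlier.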