[ML-DataFrame] Combine task_state and indexer_state in stats #45201

Closed
@droberts195

Description

#43767 moved indexer_state into checkpointing.next. As has been pointed out, this means it is only available while a checkpoint is in progress.

From an end-user perspective, the difference between task_state and indexer_state is an internal implementation detail. For debugging purposes, though, we might want to see indexer_state even when there isn't a checkpoint in progress. If we move it to the top level, then as an end user I'm back to wondering which of task_state and indexer_state I should be taking notice of, and why there are two states in the first place. A better alternative is to have just one top-level state that combines the two, like anomaly detection jobs and datafeeds have. It can be defined as:

  • failed if what's currently reported as task_state is failed
  • stopped if there is no persistent task
  • Otherwise what's currently reported as indexer_state
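
The three rules above can be sketched as a small function. This is only an illustration of the proposed precedence, not the actual Elasticsearch implementation: the function name `combined_state`, its parameters, and the example state strings are hypothetical, and the sketch assumes indexer_state is known whenever a persistent task exists.

```python
from typing import Optional


def combined_state(
    task_state: Optional[str],
    indexer_state: Optional[str],
    has_persistent_task: bool,
) -> Optional[str]:
    """Combine task_state and indexer_state into one top-level state.

    Hypothetical helper: names and signature are illustrative only.
    """
    # Rule 1: a failed task always reports as failed.
    if task_state == "failed":
        return "failed"
    # Rule 2: with no persistent task, the transform is stopped.
    if not has_persistent_task:
        return "stopped"
    # Rule 3: otherwise report whatever the indexer is doing.
    # Assumption: indexer_state is populated whenever a task exists.
    return indexer_state
```

The checks run in the same order as the list, so a failed task takes precedence over everything else and indexer_state is only consulted when a persistent task exists and has not failed.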

To avoid multiple breaking changes to the stats format in consecutive versions, and the complex BWC handling that would come with them, this change should be made for 7.4.
