[7.x][ML] Improve resuming a DFA job stopped during inference (#67623) #67669

dimitris-athanasiou
Contributor

If a DFA job is stopped during the inference phase, resuming it
should start inference immediately. Currently it does not: inference
is tied into `AnalyticsProcessManager`, so on resume we start a process,
load data, restore state, etc., before inference finally begins.

This commit gets rid of this unnecessary delay by factoring inference
out as an independent step and ensuring we can resume straight from
that phase upon restarting a job.
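
As a rough illustration of the idea (not the code in this PR), the sketch below models a job as an ordered list of independent steps and, on restart, skips straight to the step recorded when the job was stopped. The class `ResumeSketch`, the `Phase` enum with its four values, and `stepsToRun` are all hypothetical names invented for this example; the real phase names and step classes may differ.

```java
import java.util.ArrayList;
import java.util.List;

public class ResumeSketch {

    // Hypothetical, simplified phases of a DFA job, in execution order.
    enum Phase { REINDEXING, ANALYSIS, INFERENCE, FINAL }

    // Returns the phases that still need to run when a job restarts,
    // beginning at the phase that was recorded when the job was stopped.
    // Earlier phases (starting a process, loading data, ...) are skipped.
    static List<Phase> stepsToRun(Phase lastRecorded) {
        List<Phase> steps = new ArrayList<>();
        for (Phase p : Phase.values()) {
            if (p.compareTo(lastRecorded) >= 0) {
                steps.add(p);
            }
        }
        return steps;
    }

    public static void main(String[] args) {
        // A job stopped during inference resumes directly at INFERENCE.
        System.out.println(stepsToRun(Phase.INFERENCE)); // prints [INFERENCE, FINAL]
    }
}
```

Because inference is its own step here, a job that stopped in that phase never re-enters the earlier ones, which is the delay this change aims to remove.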

Backport of #67623

@elasticmachine
Collaborator

Pinging @elastic/ml-core (:ml)

@dimitris-athanasiou dimitris-athanasiou merged commit a64997d into elastic:7.x Jan 18, 2021
@dimitris-athanasiou dimitris-athanasiou deleted the improve-resume-dfa-job-in-inference-phase-7x branch January 18, 2021 17:59