[ML] Add functionality to upgrade ML model state to 7.x format before upgrade to 8.0 #64154

Currently the `autodetect` process has to know how to load model state formats going right back to version 5.5. In 8.x we would like to drop support for loading model states that predate 7.0. This means that during 7.x we must offer a way for users to easily upgrade model snapshots in 5.x or 6.x formats into the latest 7.x format.

Work was done in elastic/ml-cpp#1460 to add the necessary building block on the C++ side. What is required now is the Java code to make use of it.

The Java work consists of:

* Add a new "upgrade model snapshot" API.
  * To match the other model snapshot APIs this should have job ID and snapshot ID as path arguments.
  * If the referenced snapshot exists and has a `min_version` earlier than 7.0 (or doesn't have a `min_version` field at all) then:
    * The `autodetect` process should be started for the job, and passed the supplied model snapshot (in preference to the one that would normally be restored).
    * A `w` control message should be sent to the `autodetect` process, supplying the same snapshot ID, snapshot timestamp and snapshot description that were on the original snapshot. This will cause it to overwrite the original snapshot documents with replacement documents in the latest format.
    * The `autodetect` process should be gracefully stopped by closing its input stream - it will not persist state again as no data was sent to it.
* Changes to the deprecations API to report old model snapshots.
  * When the deprecations API is called, it should report a critical issue, one that must be fixed before upgrading, for every model snapshot whose `min_version` is older than 7.0 (or that has no `min_version` field).
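
To make the steps above concrete, here is a minimal sketch of the eligibility check and the restore / `w` message / close sequence. Everything named here (`SnapshotUpgradeSketch`, `AutodetectHandle`, `Snapshot`, and their methods) is hypothetical and is not an existing Elasticsearch class. It also assumes `min_version` is a dotted version string such as `6.4.0`.

```java
import java.util.ArrayList;
import java.util.List;

final class SnapshotUpgradeSketch {

    /** Hypothetical stand-in for a running autodetect process (not a real Elasticsearch type). */
    interface AutodetectHandle {
        void restoreState(String snapshotId);
        void writePersistControlMessage(String snapshotId, long timestampMs, String description);
        void closeInput();
    }

    /** Hypothetical minimal view of a model snapshot document. */
    static final class Snapshot {
        final String snapshotId;
        final String minVersion; // null when the field is absent
        final long timestampMs;
        final String description;

        Snapshot(String snapshotId, String minVersion, long timestampMs, String description) {
            this.snapshotId = snapshotId;
            this.minVersion = minVersion;
            this.timestampMs = timestampMs;
            this.description = description;
        }
    }

    /** True when min_version is missing or its major version is below 7. */
    static boolean needsUpgrade(String minVersion) {
        if (minVersion == null || minVersion.isEmpty()) {
            return true;
        }
        int major = Integer.parseInt(minVersion.split("\\.")[0]);
        return major < 7;
    }

    /** IDs of snapshots the deprecations check would flag as critical. */
    static List<String> snapshotsBlockingUpgrade(List<Snapshot> snapshots) {
        List<String> blocking = new ArrayList<>();
        for (Snapshot s : snapshots) {
            if (needsUpgrade(s.minVersion)) {
                blocking.add(s.snapshotId);
            }
        }
        return blocking;
    }

    /** Runs restore, then the persist ("w") control message, then a graceful close. */
    static boolean upgrade(Snapshot snapshot, AutodetectHandle process) {
        if (!needsUpgrade(snapshot.minVersion)) {
            return false; // already in 7.x format, nothing to do
        }
        process.restoreState(snapshot.snapshotId);
        // Reuse the original ID, timestamp and description so the documents are overwritten in place.
        process.writePersistControlMessage(snapshot.snapshotId, snapshot.timestampMs, snapshot.description);
        // No data was sent, so closing the input should not trigger another persist.
        process.closeInput();
        return true;
    }
}
```

Sharing `needsUpgrade` between the upgrade API and the deprecations check would keep the two in agreement about which snapshots count as old.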

There are many tricky details to work through with the seemingly simple "start and stop `autodetect`" portion of the work. Will we reuse the same persistent task that we use to open the job for normal operation? If so, how will we prevent data being sent to the process? And would that mean the job would have to be closed during the upgrade (not ideal, as it could be inconvenient)? But if we don't reuse the same persistent task, how will we account for its memory requirement and enforce that it runs on an ML node? And how will we avoid named pipe name clashes? Additionally, when the model snapshot gets persisted by one of these special upgrade invocations of `autodetect`, we need to persist the upgraded model snapshot but _not_ set it as the active one for the job, so the results handling code will need a tweak too.
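
As a rough illustration of the last point, the results handling tweak could look something like the sketch below. The class `UpgradeAwareResultsSketch`, its in-memory map and the `upgradeRun` flag are all hypothetical, made up for this example. The real results handling code is structured differently and would write to an index rather than a map.

```java
import java.util.HashMap;
import java.util.Map;

final class UpgradeAwareResultsSketch {

    // Hypothetical in-memory stand-in for the snapshot documents (snapshot ID -> min_version).
    private final Map<String, String> snapshotDocs = new HashMap<>();
    private String activeSnapshotId;

    UpgradeAwareResultsSketch(String activeSnapshotId) {
        this.activeSnapshotId = activeSnapshotId;
    }

    /** Called when autodetect reports that a model snapshot was persisted. */
    void onSnapshotPersisted(String snapshotId, String minVersion, boolean upgradeRun) {
        // Always store (or overwrite) the snapshot document.
        snapshotDocs.put(snapshotId, minVersion);
        // Only a normal run moves the job's active snapshot; an upgrade run leaves it alone.
        if (!upgradeRun) {
            activeSnapshotId = snapshotId;
        }
    }

    String activeSnapshotId() {
        return activeSnapshotId;
    }

    String storedMinVersion(String snapshotId) {
        return snapshotDocs.get(snapshotId);
    }
}
```

The flag would need to come from whatever mechanism ends up launching the upgrade invocation, which is one of the open questions above.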
