Skip to content

[ML] Consider validating jobs outside of the builder #34899

Open
@davidkyle

Description

@davidkyle

A general issue but the changes in the Job in Index feature branch make it more relevant.

Whenever a job is parsed from xContent it is parsed into a builder then built wherein the build() method validates the config. This has the satisfying result that all job objects must be valid as they can only be constructed by the builder. Job config is stored in the clusterstate as Job objects rather than Job.Builders however, when the clusterstate is parsed (e.g after a node restart) the builder is used and validation will occur potentially throwing an exception.

For job index documents using the builder to parse the document means the job has to be validated every time it is read, in practice this means lots of code like this:

try {
   // need to wrap this is a try in case it throws
    job = builder.build();
} catch (e) {
    // hmm what should I do now?
}

If the validation did change or some bug was introduced and the build method throws then it is not possible to view the actual job config from the GET jobs API.

Job configuration must be validated on PUT, update and when the job is opened but is it not necessary at GET. I'd like to remove the need for all those try { build()..} statements by either not validating in the builder or parsing the document directly into a job. The job will have to be revalidated when it is opened.

I prefer the idea of having a non-throwing Builder.build() as it makes the code easier to reason about and there are a small number of places where the job config must absolutely be validated.

See also #34858

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions