
[ML] Improve model_memory_limit UX for data frame analytics jobs #44699

Closed
@droberts195

Description

The current workflow for memory limits is:

  1. The user specifies a memory limit up front, with no guidance as to what is sensible/realistic
  2. They start the analysis
  3. The analysis chugs away for a while, reindexing the source index into the destination index
  4. The C++ process is started and checks whether the memory limit is sufficient to do the analysis
  5. If it is not, the process exits, reporting in the error message how much memory was required
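For reference, the limit is set in step 1 when the job is created, with nothing to base the value on. An illustrative create request (index names, analysis type, and the limit value are placeholders):

```
PUT _ml/data_frame/analytics/my-analytics-job
{
  "source": { "index": "my-source-index" },
  "dest": { "index": "my-dest-index" },
  "analysis": { "outlier_detection": {} },
  "model_memory_limit": "50mb"
}
```

If `50mb` turns out to be too small, the user only finds out after the reindex has already run.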

Two possible ways to solve this are:

  1. We duplicate the logic that calculates the memory requirement from the C++ code into the Java and UI code
  2. We add a mode of operation to the C++ process where you supply just the job spec and, instead of actually running the analysis, it tells you:
    i. What you would have to set the memory limit to in order to run the analysis entirely in RAM
    ii. What the minimum memory limit is that will allow the analysis to run at all (using disk)

Although option 2 is a lot of work, the memory calculations done by the C++ code are now so complex that it is impractical to duplicate them. Therefore we should implement option 2. The work will be something along the lines of:
