Find spec/node_type of Kepler node for model selection #231

@sunya-ch

What would you like to be added?

A flow that links the specification of the node where Kepler is deployed to model selection from the Kepler model DB.

Why is this needed?

Problem description

Previously, the pipeline had only a single node_type, so we always appended _1 to the trainer name to get the model name. However, with SPECPower and AWS instances, we can now train models for multiple node_type values.
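The naming convention described above can be sketched as follows. This is only an illustration: the helper name `model_name` and the trainer name `ExampleTrainer` are hypothetical, and the only thing taken from this issue is that the model name is the trainer name followed by `_` and the node_type.

```python
def model_name(trainer_name: str, node_type: int = 1) -> str:
    """Build a model name as '<trainer_name>_<node_type>'.

    Hypothetical helper: the default node_type=1 mirrors the old behavior
    of always appending '_1' when only one node_type existed.
    """
    return f"{trainer_name}_{node_type}"


# Old behavior: single node_type, suffix is always _1.
single = model_name("ExampleTrainer")
# New behavior: the suffix depends on the node_type of the Kepler node.
multi = model_name("ExampleTrainer", node_type=3)
```

With more than one node_type, the suffix can no longer be hard-coded, which is why Kepler needs a way to learn its own node_type.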

Currently, kepler-model-server has a Python function, generate_spec, that generates the machine spec.

Idea

The goal is to let Kepler determine its node_type.
The logic of generate_spec does not necessarily need to be merged into Kepler itself.
It can run in an init container that generates the spec and saves it to a file, which is then mounted. The server API may need to be updated to allow the machine spec to be included in the request used to select the model.
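A minimal sketch of that flow, under stated assumptions: the functions `collect_spec`, `write_spec`, and `build_model_request`, the spec fields (`processor`, `cores`), and the request keys (`source`, `spec`) are all hypothetical. They are not the actual generate_spec output or the current server API; the real field set would come from generate_spec.

```python
import json
import os
import platform


def collect_spec() -> dict:
    """Collect a small machine spec (hypothetical fields, stdlib only)."""
    return {
        # platform.processor() can be empty on some systems; fall back to the arch.
        "processor": platform.processor() or platform.machine(),
        "cores": os.cpu_count(),
    }


def write_spec(path: str) -> dict:
    """Init-container step: write the spec to a file on a shared mount."""
    spec = collect_spec()
    with open(path, "w") as f:
        json.dump(spec, f)
    return spec


def build_model_request(spec_path: str, power_source: str) -> dict:
    """Kepler-side step: read the mounted spec and attach it to the request.

    The payload shape is an assumption, not the existing request schema.
    """
    with open(spec_path) as f:
        spec = json.load(f)
    return {"source": power_source, "spec": spec}
```

In this sketch the init container would call `write_spec` on a path in a shared volume, and Kepler would later call `build_model_request` with the same path, so generate_spec-style logic stays outside Kepler itself.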

Note that:

  • node_type is defined per pipeline and is determined by node_type_index.json inside the pipeline folder.
  • We can set the default pipeline to spec_benchmark for the acpi value and to aws_instance_pipeline for the rapl value.
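The two notes above could translate into a lookup along these lines. This is a sketch only: the layout of node_type_index.json is assumed here to be a mapping from a node_type index to a dict of spec attributes, which this issue does not confirm, and the helper names are hypothetical. The default pipeline names are the ones proposed above.

```python
import json
from pathlib import Path

# Default pipeline per value, as proposed in this issue.
DEFAULT_PIPELINES = {"acpi": "spec_benchmark", "rapl": "aws_instance_pipeline"}


def default_pipeline(source: str) -> str:
    """Return the default pipeline name for 'acpi' or 'rapl'."""
    return DEFAULT_PIPELINES[source]


def load_node_type_index(pipeline_dir: str) -> dict:
    """Load node_type_index.json from the pipeline folder."""
    with open(Path(pipeline_dir) / "node_type_index.json") as f:
        return json.load(f)


def find_node_type(index: dict, spec: dict):
    """Return the first node_type whose attributes all match the spec.

    Assumes index looks like {"1": {"processor": ..., "cores": ...}, ...}.
    Returns None when no entry matches.
    """
    for node_type, attrs in index.items():
        if all(spec.get(key) == value for key, value in attrs.items()):
            return int(node_type)
    return None
```

Exact matching is the simplest choice for a sketch; a real implementation might need a fallback, such as the closest match, when a node's spec is not in the index.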
