Closed
Labels
kind/feature: New feature or request
Description
What would you like to be added?
A flow that links the specification of the node where Kepler is deployed to model selection from the Kepler model DB.
Why is this needed?
Problem description
Previously, the pipeline had only a single node_type, so we always appended _1 to the trainer name to get the model name. With SPECPower and AWS instances, we can now train models for multiple node_type values.
Currently, the machine spec is generated by a function generate_spec, implemented in Python in kepler-model-server.
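The naming convention can be sketched as follows. This is a minimal illustration, not code from kepler-model-server: the helper name `model_name_for` is hypothetical, and it assumes the suffix is simply the node_type value appended after an underscore.

```python
def model_name_for(trainer_name: str, node_type: int = 1) -> str:
    """Build a model name as '<trainer_name>_<node_type>' (assumed convention)."""
    return f"{trainer_name}_{node_type}"


# With a single node_type, every model name ends in _1.
single = model_name_for("SomeTrainer")
# With multiple node_type values, the suffix has to come from the node itself.
other = model_name_for("SomeTrainer", node_type=3)
```

The point of the sketch is that the hard-coded default of 1 stops being valid once more than one node_type is trained.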
Idea
The goal is to let Kepler determine its own node_type.
The generate_spec logic does not necessarily need to be merged into Kepler itself.
It can run in an init container that generates the spec and saves it to a file, which is then mounted into the Kepler container. The server API may need to be updated so that the machine spec can be included in the request used to select the model.
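A rough sketch of that flow, under stated assumptions: the spec fields (`processor`, `cores`), the file name `machine_spec.json`, and the request keys (`energy_source`, `machine_spec`) are all hypothetical placeholders, and `collect_machine_spec` is a stand-in rather than the actual generate_spec implementation.

```python
import json
import os
import platform
import tempfile


def collect_machine_spec():
    """Collect a few node attributes (illustrative fields only)."""
    return {
        "processor": platform.processor() or platform.machine(),
        "cores": os.cpu_count(),
    }


def save_spec(spec, path):
    """Init-container step: write the spec to a file on a shared volume."""
    with open(path, "w") as f:
        json.dump(spec, f)


def build_model_request(spec_path, energy_source):
    """Kepler-side step: read the mounted spec file and attach it to the request."""
    with open(spec_path) as f:
        spec = json.load(f)
    return {"energy_source": energy_source, "machine_spec": spec}


if __name__ == "__main__":
    # A temporary directory stands in for the volume shared by both containers.
    with tempfile.TemporaryDirectory() as shared_dir:
        spec_path = os.path.join(shared_dir, "machine_spec.json")
        save_spec(collect_machine_spec(), spec_path)
        print(build_model_request(spec_path, "rapl"))
```

Keeping spec generation in the init container means Kepler only needs to read a JSON file, so the Python logic does not have to be ported.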
Note that:
- node_type is per pipeline, determined by node_type_index.json inside the pipeline folder.
- we can set the default pipeline to spec_benchmark for the acpi value and aws_instance_pipeline for the rapl value.
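The two notes above could translate into a lookup like the one below. This is only a sketch: the layout assumed for node_type_index.json (a mapping from node_type to the spec attributes that identify it) is a guess for illustration and the real schema may differ, and the helper names `default_pipeline` and `find_node_type` are hypothetical.

```python
import json

# Default pipeline per energy source value, as proposed in this issue.
DEFAULT_PIPELINES = {
    "acpi": "spec_benchmark",
    "rapl": "aws_instance_pipeline",
}


def default_pipeline(energy_source):
    """Return the default pipeline name for a source value, or None if unknown."""
    return DEFAULT_PIPELINES.get(energy_source)


def find_node_type(machine_spec, node_type_index):
    """Return the first node_type whose known attributes all match the machine spec.

    node_type_index is assumed to look like {"1": {"processor": "...", "cores": 8}}.
    """
    for node_type, known_spec in node_type_index.items():
        if all(machine_spec.get(key) == value for key, value in known_spec.items()):
            return int(node_type)
    return None


# Example index content, parsed the same way the file would be.
example_index = json.loads(
    '{"1": {"processor": "cpu_a", "cores": 8}, "2": {"processor": "cpu_b", "cores": 16}}'
)
matched = find_node_type({"processor": "cpu_b", "cores": 16, "memory_gb": 64}, example_index)
```

Because node_type is scoped to a pipeline, the pipeline has to be chosen first (here via the default for the source value) before the index is consulted.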