Current working design -
- SlurmCtlD
  - Receive the information from all the SlurmD processes and mark any completed jobs.
  - Send the pending jobs and node information to the scheduler function.
  - The scheduler returns a node-to-job mapping.
  - The master node sends the jobs to be run to their assigned nodes.
- SlurmD
  - Send the available free resources to SlurmCtlD (currently the only resource considered is the number of free CPUs).
  - Receive the job subset (compute volume and the number of CPUs on which it needs to run).
  - Asynchronously execute the compute volume on the free CPUs.
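The SlurmCtlD steps above can be sketched in a few lines of Python. This is only an illustration of the control flow, not the repository's code: the names `Job`, `first_fit`, and `control_round`, and the shape of the `reports` dictionary, are all hypothetical, and the sketch assumes for simplicity that each job is placed entirely on one node.

```python
from dataclasses import dataclass


@dataclass
class Job:
    job_id: int
    cpus: int              # number of CPUs the job needs
    compute_volume: float  # amount of work to execute


def first_fit(pending, free_cpus):
    """Hypothetical scheduler: map each job to the first node with enough free CPUs."""
    free = dict(free_cpus)  # work on a copy so the caller's view is unchanged
    mapping = {}
    for job in pending:
        for node, cpus in free.items():
            if cpus >= job.cpus:
                mapping[job.job_id] = node
                free[node] = cpus - job.cpus
                break
    return mapping


def control_round(pending, reports, scheduler):
    """One SlurmCtlD round: collect reports, call the scheduler, dispatch jobs."""
    # 1. Receive information from every SlurmD and note completed jobs.
    free_cpus = {node: r["free_cpus"] for node, r in reports.items()}
    completed = [job_id for r in reports.values() for job_id in r["completed"]]
    # 2-3. Hand pending jobs and node information to the scheduler,
    #      which returns a job-to-node mapping.
    mapping = scheduler(pending, free_cpus)
    # 4. Group the jobs to send by their assigned node.
    dispatched = {}
    for job in pending:
        if job.job_id in mapping:
            dispatched.setdefault(mapping[job.job_id], []).append(job)
    still_pending = [job for job in pending if job.job_id not in mapping]
    return completed, dispatched, still_pending


# Assumed example data: two nodes reporting free CPUs, three pending jobs.
reports = {
    "node0": {"free_cpus": 4, "completed": [7]},
    "node1": {"free_cpus": 2, "completed": []},
}
pending = [Job(1, 3, 1e9), Job(2, 2, 5e8), Job(3, 4, 2e9)]
completed, dispatched, still_pending = control_round(pending, reports, first_fit)
```

In this example job 1 goes to `node0`, job 2 goes to `node1`, and job 3 stays pending because no node has 4 free CPUs left. In the actual model these exchanges happen as messages between SimGrid actors rather than as plain function calls.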
Installation.
- Ubuntu 22.04 or later
- Prerequisites to run the model
  - simgrid library
    sudo apt install simgrid pajeng cmake g++
  - tensorflow
    - Install pip
      sudo apt install python3-pip
    - Install tensorflow (newer versions of pip require installation in a venv)
      pip install tensorflow
- Clone the repository
  git clone https://github.com/gautamMeeshi/Simgrid-HPC-simulation.git
How to run the model?
- Compile the source files
  make compile
- Run the model
  make run SCHED=<scheduler_type> JOB_FILE=<job_file_name>
  scheduler_type = easy_backfill/fcfs/naive_backfill/remote_nn
  e.g. make run SCHED=fcfs JOB_FILE=jobs1.csv
  Job files are located at ./input/jobs
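To give a feel for how two of the scheduler types differ, here is a minimal Python sketch of the textbook versions of FCFS and a naive backfill. It is an assumption about the general idea only; the repository's `fcfs` and `naive_backfill` implementations may differ in detail. The queue format (a list of `(job_id, cpus)` tuples) and the single pool of free CPUs are hypothetical simplifications.

```python
def fcfs(queue, free_cpus):
    """Start jobs strictly in queue order; stop at the first job that does not fit."""
    started = []
    for job_id, cpus in queue:
        if cpus > free_cpus:
            break  # the head job blocks everything behind it
        started.append(job_id)
        free_cpus -= cpus
    return started


def naive_backfill(queue, free_cpus):
    """Start every job that fits, skipping over jobs that are too large right now."""
    started = []
    for job_id, cpus in queue:
        if cpus <= free_cpus:
            started.append(job_id)
            free_cpus -= cpus
    return started


# Assumed example queue: (job id, CPUs requested), oldest job first.
queue = [("A", 2), ("B", 8), ("C", 1), ("D", 3)]

print(fcfs(queue, 6))            # ['A']
print(naive_backfill(queue, 6))  # ['A', 'C', 'D']
```

With 6 free CPUs, FCFS starts only job A because job B does not fit and blocks the queue, while the naive backfill skips B and also starts C and D. EASY backfilling, as usually described, additionally keeps a reservation for the job at the head of the queue so that backfilled jobs cannot delay it; that bookkeeping is left out of this sketch.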
Important links for reference -
- SimGrid host energy consumption plugin.
- This model is inspired by SimGrid-master-worker example.