Based on the work of Hongzi Mao et al., HotNets'16: http://people.csail.mit.edu/hongzi/content/publications/DeepRM-HotNets16.pdf
This repository makes improvements on DeepRM: http://github.com/hongzimao/deeprm
File: pg_network.py (build_small_conv_pg_network)
Network structure:
- Input: convolutional layer with 16 filters of size 2x2
- Output: fully connected layer with one output per action

Major improvement. Improved the convergence rate (by ??? --> To Do).
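The structure above can be sketched as a plain forward pass: one 2x2 convolution with 16 filters, a ReLU, and a fully connected softmax layer with one output per action. All shapes, names, and the choice of activation here are assumptions for illustration, not values taken from pg_network.py.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv2d_valid(x, filters):
    """Valid 2-D convolution. x: (H, W); filters: (n_f, fh, fw)
    -> (n_f, H-fh+1, W-fw+1)."""
    n_f, fh, fw = filters.shape
    H, W = x.shape
    out = np.empty((n_f, H - fh + 1, W - fw + 1))
    for f in range(n_f):
        for i in range(H - fh + 1):
            for j in range(W - fw + 1):
                out[f, i, j] = np.sum(x[i:i + fh, j:j + fw] * filters[f])
    return out

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Hypothetical state-image size and action count.
H, W, n_actions = 20, 10, 6
state = rng.random((H, W))
conv_w = rng.standard_normal((16, 2, 2)) * 0.1
fc_w = rng.standard_normal((16 * (H - 1) * (W - 1), n_actions)) * 0.01

hidden = np.maximum(conv2d_valid(state, conv_w), 0.0)  # conv + ReLU
probs = softmax(hidden.ravel() @ fc_w)                 # action distribution
```

A single small conv layer keeps the parameter count low while letting the policy exploit local structure in the state image, which is plausibly what speeds up convergence.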
File: environment.py
In DeepRM, the state image was generated by stacking all matrices vertically into one tall matrix, in the following order: state matrix for resource 1, job 1's request matrix for resource 1, job 2's request matrix for resource 1, ..., job n's request matrix for resource 1, state matrix for resource 2, job 1's request matrix for resource 2, job 2's request matrix for resource 2, ..., job n's request matrix for resource 2.

I decided to put related matrices closer together, so the matrices are now stacked as follows: for each resource, stack vertically the state matrix followed by the n job request matrices, producing one tall matrix per resource; then place the two tall matrices side by side. This way, each job's request matrices for the two resources are aligned.
See the picture below for a better explanation:

[figure: original state matrix vs. reshaped state matrix]

Major improvement. Improved the average slowdown by 8.9% after 1000 epochs of training.
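The reshaping above can be sketched with NumPy on toy matrices. The dimensions, marker values, and helper name below are assumptions for illustration, not the actual values in environment.py.

```python
import numpy as np

# Toy dimensions: each matrix is (time_horizon x res_slot),
# 2 resources, n = 3 visible jobs.
time_horizon, res_slot, num_jobs = 4, 5, 3

def group(resource_id):
    """One resource's matrices: the state matrix, then each job's request."""
    state = np.full((time_horizon, res_slot), resource_id)
    requests = [np.full((time_horizon, res_slot), 10 * resource_id + j + 1)
                for j in range(num_jobs)]
    return [state] + requests

g1, g2 = group(1), group(2)

# Original layout: every matrix stacked vertically into one tall column.
original = np.vstack(g1 + g2)

# Reshaped layout: stack each resource's matrices vertically, then place
# the two tall matrices side by side, so each job's request matrices for
# the two resources line up.
reshaped = np.hstack([np.vstack(g1), np.vstack(g2)])

print(original.shape)   # (32, 5): 2 * (num_jobs + 1) * time_horizon rows
print(reshaped.shape)   # (16, 10): half as tall, twice as wide
```

The reshaped image keeps the same pixels but halves the height, so a small 2x2 convolution filter can cover corresponding matrices of both resources.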
File: parameters.py
I gave different penalty weights to jobs already scheduled (in the machine matrix), jobs in the job-slot queue, and jobs in the backlog.

Minor improvement. Improved the convergence rate.
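A minimal sketch of this weighted penalty, assuming the reward keeps DeepRM's per-step form of -sum(1/T_j) over unfinished jobs (T_j is job j's length) and only scales each term by where the job sits. The weight values and names are hypothetical, not the ones in parameters.py.

```python
# Hypothetical penalty weights for the three job locations.
PENALTY_RUNNING = 1.0   # jobs already planned in the machine matrix
PENALTY_QUEUED = 1.2    # jobs waiting in the job-slot queue
PENALTY_BACKLOG = 1.5   # jobs still in the backlog

def reward(running_lens, queued_lens, backlog_lens):
    """Weighted negative sum of 1/T_j over all unfinished jobs,
    given the lengths T_j of jobs in each location."""
    total = 0.0
    for w, lens in ((PENALTY_RUNNING, running_lens),
                    (PENALTY_QUEUED, queued_lens),
                    (PENALTY_BACKLOG, backlog_lens)):
        total -= w * sum(1.0 / t for t in lens)
    return total

r = reward([2, 4], [1], [5])  # -(1.0*(0.5 + 0.25) + 1.2*1.0 + 1.5*0.2)
```

Penalizing backlogged jobs more heavily gives the agent a stronger gradient toward draining the backlog early, which is one plausible reason for the faster convergence.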
- Added logging and checkpoint saving to record slowdown and save models (pg_re_single_core.py and pg_re.py)
- Added launcher2 for convenient launching and debugging (launcher2.py)
sudo apt-get update
sudo apt-get install python-numpy python-scipy python-dev python-pip python-nose g++ libopenblas-dev git
pip install --user Theano
pip install --user Lasagne==0.1
sudo apt-get install python-matplotlib
In the RL folder, create a data/ folder.
Use launcher.py to launch experiments.
--exp_type <type of experiment>
--num_res <number of resources>
--num_nw <number of visible new jobs>
--simu_len <simulation length>
--num_ex <number of examples>
--num_seq_per_batch <rough number of samples in one batch update>
--eps_max_len <episode maximum length (terminated at the end)>
--num_epochs <number of epochs of training>
--time_horizon <time step into future, screen height>
--res_slot <total number of resource slots, screen width>
--max_job_len <maximum new job length>
--max_job_size <maximum new job resource request>
--new_job_rate <new job arrival rate>
--dist <discount factor>
--lr_rate <learning rate>
--ba_size <batch size>
--pg_re <parameter file for pg network>
--v_re <parameter file for v network>
--q_re <parameter file for q network>
--out_freq <network output frequency>
--ofile <output file name>
--log <log file name>
--render <plot dynamics>
--unseen <generate unseen example>
The default values are defined in parameters.py.
Example:
- launch supervised learning for policy estimation
python launcher.py --exp_type=pg_su --simu_len=50 --num_ex=1000 --ofile=data/pg_su --out_freq=10
- launch policy gradient using network parameter just obtained
python launcher.py --exp_type=pg_re --pg_re=data/pg_su_net_file_20.pkl --simu_len=50 --num_ex=10 --ofile=data/pg_re
- launch a testing and comparison experiment on unseen examples with the pg agent just trained
python launcher.py --exp_type=test --simu_len=50 --num_ex=10 --pg_re=data/pg_re_1600.pkl --unseen=True