Update README.md
pemami4911 authored Oct 30, 2017
1 parent 96802e5 commit 74e3917
Showing 1 changed file with 3 additions and 1 deletion.

**WORK IN PROGRESS**

**UPDATE 10/30/17** I was unable to get the RL pretraining model with greedy decoding to learn on the TSP10 or TSP20 environments. I tried both a critic network and an exponential moving average baseline. Training on a single NVIDIA GTX 1080 for 1-2 days, the variance of the actor loss appears to remain too high even with these baselines. Please create an Issue and let me know if you get this to work.
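For reference, a minimal sketch of a REINFORCE update with an exponential moving average baseline, one of the two baselines mentioned above (function names and the `beta` value here are illustrative, not the actual code in this repo):

```python
import torch

def reinforce_step(sum_log_probs, tour_lengths, baseline, beta=0.8):
    """One REINFORCE update with an exponential moving average baseline.

    sum_log_probs: (batch,) sum of log-probs of each sampled tour
    tour_lengths:  (batch,) length of each sampled tour (reward = -length)
    baseline:      scalar EMA of past tour lengths
    """
    # EMA baseline drifts toward the current batch's mean tour length
    baseline = beta * baseline + (1.0 - beta) * tour_lengths.mean().item()
    # Shorter-than-baseline tours get negative advantage, so minimizing
    # this loss increases the probability of short tours
    advantage = tour_lengths - baseline
    loss = (advantage.detach() * sum_log_probs).mean()
    return loss, baseline
```

Even with such a baseline, the advantage term `tour_lengths - baseline` can stay noisy, which is consistent with the high actor-loss variance observed above.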

PyTorch implementation of [Neural Combinatorial Optimization with Reinforcement Learning](https://arxiv.org/abs/1611.09940).

I have implemented the basic RL pretraining model from the paper. An implementation of the supervised learning baseline model is available [here](https://github.com/pemami4911/neural-combinatorial-rl-tensorflow).

During training, my implementation uses a stochastic decoding policy in the pointer network, sampling cities via PyTorch's `torch.multinomial()`. At test time, it decodes with beam search (not yet finished; currently only 1 beam, i.e., greedy decoding, is supported). I have tried to use the same hyperparameters as in the paper but have not yet been able to replicate the TSP results.
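The stochastic (training) vs. greedy (1-beam test) decoding step can be sketched as follows; this is a hypothetical illustration, not the repo's actual decoder:

```python
import torch

def decode_step(logits, visited_mask, greedy=False):
    """Pick the next city from pointer-network logits for one decode step.

    logits:       (batch, n_cities) unnormalized scores
    visited_mask: (batch, n_cities) bool, True where a city was already visited
    """
    # Masked (visited) cities get -inf so softmax assigns them zero probability
    logits = logits.masked_fill(visited_mask, float('-inf'))
    probs = torch.softmax(logits, dim=-1)
    if greedy:
        # 1-beam "beam search": always take the most likely city
        idx = probs.argmax(dim=-1, keepdim=True)
    else:
        # Stochastic policy during training: sample from the distribution
        idx = torch.multinomial(probs, num_samples=1)
    log_prob = probs.gather(-1, idx).log()
    return idx.squeeze(-1), log_prob.squeeze(-1)
```

The sampled log-probabilities are what the REINFORCE gradient is taken with respect to; the visited mask enforces that each city appears exactly once in the tour.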
