Hi,
In a previous discussion (#16), it was said that a learning rate of 0.001 was used. When I tried both 0.001 and 0.0001, the latter gave a lower loss. Does this mean I should use a learning rate of 0.0001 instead?
Thank you!
Charles