Apparently the project implemented these two incorrectly. The momentum formula was wrong, and Nesterov was applied only during forward propagation rather than during both forward propagation and backprop.
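
For reference, here is a minimal sketch of the textbook forms of both updates, not the project's actual code. The toy loss `f(w) = 0.5 * w**2`, the function names, and the default `lr` and `mu` values are all assumptions made for illustration. The point is that in the Nesterov step the gradient is evaluated at the look-ahead weights, so a network would have to run both its forward pass and its backprop at that look-ahead point.

```python
def grad(w):
    # Gradient of the hypothetical toy loss f(w) = 0.5 * w**2.
    return w


def momentum_step(w, v, lr=0.1, mu=0.9):
    # Classical momentum: gradient taken at the current weights.
    v = mu * v - lr * grad(w)
    return w + v, v


def nesterov_step(w, v, lr=0.1, mu=0.9):
    # Nesterov momentum: first move to the look-ahead point, then take
    # the gradient there (for a network: forward AND backward at w_ahead).
    w_ahead = w + mu * v
    v = mu * v - lr * grad(w_ahead)
    return w + v, v


if __name__ == "__main__":
    w_m, v_m = 1.0, 0.0
    w_n, v_n = 1.0, 0.0
    for step in range(3):
        w_m, v_m = momentum_step(w_m, v_m)
        w_n, v_n = nesterov_step(w_n, v_n)
        print(step, w_m, w_n)
```

With zero initial velocity the two methods agree on the first step and diverge from the second step on, which makes for a quick sanity check when comparing an implementation against these formulas.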