[FIXED]: Increase in training time after each batch #3
After refactoring the neural net code to use the new tensor lib, training slowed down noticeably, and it got progressively slower after each batch. The problem was the line that updates the weights: each update is itself a tensor op, and because of how the `requires_grad` flag was implemented, those ops kept adding nodes to the computational graph. As a result, backprop after each batch had to run over a larger and larger graph, which caused the slowdown.

The issue has been fixed temporarily by doing the weight update on raw ndarrays instead of tensors (so the computational graph is not involved at all) and then assigning a new tensor from the result once all the ops are done.
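Below is a minimal sketch of the pattern described above. The `Tensor` class, its attributes (`data`, `grad`, `requires_grad`, `_prev`), and the `update_weights_*` helpers are assumptions standing in for the project's actual tensor library, not its real API; the point is only to contrast a graph-building update with the raw-ndarray workaround.

```python
import numpy as np

# Hypothetical stand-in for the project's tensor library (not its real API).
class Tensor:
    def __init__(self, data, requires_grad=False):
        self.data = np.asarray(data, dtype=np.float64)
        self.grad = None              # assumed to be filled in by backprop (not shown)
        self.requires_grad = requires_grad
        self._prev = ()               # parent nodes in the computational graph

    def __mul__(self, other):
        other = other if isinstance(other, Tensor) else Tensor(other)
        out = Tensor(self.data * other.data,
                     requires_grad=self.requires_grad or other.requires_grad)
        out._prev = (self, other)     # every op appends nodes to the graph
        return out

    def __sub__(self, other):
        other = other if isinstance(other, Tensor) else Tensor(other)
        out = Tensor(self.data - other.data,
                     requires_grad=self.requires_grad or other.requires_grad)
        out._prev = (self, other)
        return out


# Problematic pattern: the scale and subtract are tensor ops, so the "new"
# weights keep references to the old graph and the graph grows every batch.
def update_weights_slow(w, lr):
    return w - Tensor(lr) * w.grad    # update itself becomes part of the graph


# Temporary fix from this issue: do the arithmetic on raw ndarrays and wrap
# the result in a fresh Tensor, so the update creates no graph nodes at all.
def update_weights_fast(w, lr):
    new_data = w.data - lr * w.grad.data    # plain NumPy, no graph bookkeeping
    return Tensor(new_data, requires_grad=True)
```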