Hi @Tramac, I am using your code with the default settings and training is very slow. GPU utilization stays below 10%, and a single mini-batch takes about 130 seconds. How can I fix this?
Same issue. To debug, I timed each portion of training and identified the source of the largest delay: the computation of SoftmaxCrossEntropyOHEMLoss happens on the CPU.
On my workstation with my training data, training takes ~1.1 seconds/image, and the forward method of class SoftmaxCrossEntropyOHEMLoss(nn.Module) alone accounts for ~60% of that duration.
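For anyone who wants to reproduce this measurement: because CUDA kernels launch asynchronously, naive wall-clock timing attributes GPU work to whatever line happens to synchronize next. A minimal timing helper like the sketch below (the `timed` helper and the loop names `model`/`criterion` are illustrative, not from the repo) gives honest per-stage numbers:

```python
import time
import torch

def timed(label, fn, *args, **kwargs):
    """Run fn once and print its wall-clock duration.

    torch.cuda.synchronize() before and after is required so that
    pending asynchronous CUDA kernels are not billed to the wrong stage.
    """
    if torch.cuda.is_available():
        torch.cuda.synchronize()
    start = time.perf_counter()
    out = fn(*args, **kwargs)
    if torch.cuda.is_available():
        torch.cuda.synchronize()
    print(f"{label}: {time.perf_counter() - start:.3f}s")
    return out

# Inside the training loop (names are illustrative):
# outputs = timed("forward", model, images)
# loss = timed("loss", criterion, outputs, targets)
```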
Has anyone come up with a more efficient implementation of this OHEM loss, perhaps by translating the ops from numpy to torch so they can run on the GPU?
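As a starting point, here is a minimal pure-torch sketch of an OHEM cross-entropy that never leaves the logits' device. It assumes the usual OHEM rule (keep pixels whose predicted probability for the true class falls below `thresh`, with at least `min_kept` survivors); the parameter names `thresh`/`min_kept` and the exact selection rule are assumptions on my part, not taken from this repo's numpy version, so the numbers may not match exactly:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class OHEMSoftmaxCrossEntropy(nn.Module):
    """Online hard example mining cross-entropy, computed entirely in torch.

    Selects the hard pixels on the same device as the logits, so no
    GPU->CPU round trip through numpy is needed.
    """
    def __init__(self, ignore_index=-1, thresh=0.7, min_kept=100000):
        super().__init__()
        self.ignore_index = ignore_index
        self.thresh = thresh          # true-class prob below this is "hard"
        self.min_kept = min_kept      # keep at least this many pixels

    def forward(self, logits, target):
        # Per-pixel loss, flattened; ignored pixels get loss 0.
        pixel_losses = F.cross_entropy(
            logits, target, ignore_index=self.ignore_index, reduction="none"
        ).view(-1)

        # Hard-example selection needs no gradient.
        with torch.no_grad():
            probs = F.softmax(logits, dim=1)
            valid = (target != self.ignore_index).view(-1)
            # Gather the probability of the ground-truth class per pixel;
            # clamp ignore labels to 0 so gather has a valid index.
            tgt = target.clamp(min=0).unsqueeze(1)
            true_probs = probs.gather(1, tgt).squeeze(1).view(-1)
            true_probs[~valid] = 1.0  # never select ignored pixels

            # Ascending sort: hardest pixels (lowest true-class prob) first.
            sorted_probs, indices = true_probs.sort()
            threshold = self.thresh
            if sorted_probs.numel() > self.min_kept:
                # Raise the threshold if needed so >= min_kept pixels survive.
                threshold = max(sorted_probs[self.min_kept].item(), self.thresh)
            kept = indices[sorted_probs < threshold]
            if kept.numel() == 0:
                kept = indices[:1]  # fall back to the single hardest pixel

        return pixel_losses[kept].mean()
```

Swapping this in for the numpy-based loss should keep the whole step on the GPU; whether it is numerically identical to the original depends on how that version picks its hard pixels.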