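import torch
from torch import nn
from rnn import FastGRNNCUDA

# Input of shape (seq_len, batch, features), since batch_first=False
x = torch.randn((8192, 4, 512)).cuda()
h0 = torch.zeros((4, 512)).cuda()

# Both layers use the same input and hidden sizes
gru = nn.GRU(512, 512, batch_first=False).cuda()
grnn = FastGRNNCUDA(512, 512, batch_first=False).cuda()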
Timing (with proper CUDA synchronisation) gives a loop time of 0.1s for the GRU and 0.35s for the GRNN. Am I doing something wrong? Surely the GRNN should be at least on par with the GRU, since it performs fewer operations.
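
For anyone who wants to reproduce the comparison, here is a minimal sketch of a synchronised timing loop. It is not the exact script behind the numbers above. The helper name `time_forward` and its `warmup` and `iters` parameters are made up for illustration, and the `grnn(x)` call in the comments assumes that FastGRNNCUDA accepts the input tensor alone.

```python
import time


def time_forward(fn, sync=lambda: None, warmup=3, iters=10):
    """Return the mean wall-clock time in seconds of one call to fn.

    sync should block until all queued GPU work has finished, for
    example torch.cuda.synchronize. The default is a no-op (CPU only).
    """
    # Warm-up calls absorb one-off costs such as kernel selection and
    # memory allocation, so they do not count towards the measurement.
    for _ in range(warmup):
        fn()
    sync()  # make sure the warm-up work is done before starting the clock

    start = time.perf_counter()
    for _ in range(iters):
        fn()
    sync()  # wait for the timed work to finish before stopping the clock
    return (time.perf_counter() - start) / iters


# Hypothetical usage with the modules defined above:
#   gru_time = time_forward(lambda: gru(x), sync=torch.cuda.synchronize)
#   grnn_time = time_forward(lambda: grnn(x), sync=torch.cuda.synchronize)
```

Without the second `sync()` the timer would mostly measure kernel launch overhead, because CUDA calls return before the GPU has finished the work.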