❓ Questions and Help
Would appreciate it if anyone has some insight on how to use deformable convolution correctly.
Deformable convolution is tricky, as even the official implementation seems to differ from what's described in the paper: the paper describes an offset of size 2N rather than 2 x ks x ks.
Anyway, we're using the 2 x ks x ks offset here, but I always get poor performance: accuracy drops on both CIFAR10 and YOLACT. Is anything wrong with my usage?
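import torch.nn as nn
from torchvision.ops import DeformConv2d


class DConv(nn.Module):
    def __init__(self, inplanes, planes, kernel_size=3, stride=1, padding=1, bias=False):
        super(DConv, self).__init__()
        # Offset branch: predicts 2 * ks * ks channels (two offset values per kernel tap).
        self.conv1 = nn.Conv2d(inplanes, 2 * kernel_size * kernel_size, kernel_size=kernel_size,
                               stride=stride, padding=padding, bias=bias)
        # Deformable convolution that samples the input at the predicted offsets.
        self.conv2 = DeformConv2d(inplanes, planes, kernel_size=kernel_size, stride=stride,
                                  padding=padding, bias=bias)

    def forward(self, x):
        out = self.conv1(x)       # offsets, shape (N, 2 * ks * ks, H_out, W_out)
        out = self.conv2(x, out)  # deformable conv applied to x with those offsets
        return out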
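For reference, here is a small sketch of the offset shape I'm assuming DeformConv2d expects, based on the torchvision docs: (batch, 2 * offset_groups * kh * kw, out_h, out_w). The helper name `expected_offset_shape` is made up for illustration and is not part of torchvision. If N in the paper is read as the number of kernel sampling locations (ks x ks), then 2N and 2 x ks x ks would be the same count, so the channel size itself may not be the problem.

```python
def expected_offset_shape(batch, in_h, in_w, kernel_size=3, stride=1,
                          padding=1, dilation=1, offset_groups=1):
    """Hypothetical helper: shape of the offset tensor for a square kernel.

    Assumes the layout documented for torchvision.ops.DeformConv2d:
    (batch, 2 * offset_groups * kh * kw, out_h, out_w).
    """
    kh = kw = kernel_size
    # Standard convolution output-size formula, applied per spatial dimension.
    out_h = (in_h + 2 * padding - dilation * (kh - 1) - 1) // stride + 1
    out_w = (in_w + 2 * padding - dilation * (kw - 1) - 1) // stride + 1
    return (batch, 2 * offset_groups * kh * kw, out_h, out_w)


# CIFAR10-sized input with the 3x3, stride-1, padding-1 setup used in DConv.
print(expected_offset_shape(8, 32, 32))  # (8, 18, 32, 32)
```

With the settings above, conv1 in DConv produces exactly this shape, since it shares kernel_size, stride, and padding with conv2.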