Add Apple's MobileOne encoder #693
Hi, thanks for your work and contribution!
Done :)
Thanks for your contribution. I tried it on my side and could not make it work when the input images have only one channel (grayscale images). Is that a known limitation?
I honestly didn't check that. Let me investigate.
I looked into the grayscale limitation and could not make it work without making more drastic changes to Apple's code (other than passing "in_channel" through all the init functions).
I can try to take a look at it; if you already have any advice to share, it might help.
The model must be initialized with 3 channels so that the weights can be loaded.
Feel free to send me a snippet or create a PR if it works! 👍
@kevinpl07 what's the purpose of |
Monkey patching like this seems to do the trick:

    from . import _utils as utils

    def set_in_channels(self, in_channels, pretrained=True):
        """Change first convolution channels"""
        if in_channels == 3:
            return
        self._in_channels = in_channels
        self._out_channels = tuple([in_channels] + list(self._out_channels)[1:])
        utils.patch_first_conv(model=self.stage0.rbr_conv, new_in_channels=in_channels, pretrained=pretrained)
        utils.patch_first_conv(model=self.stage0.rbr_scale, new_in_channels=in_channels, pretrained=pretrained)
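For readers unfamiliar with what patching the first convolution achieves, here is a minimal pure-Python sketch of the idea for the grayscale case. It assumes the common approach of summing the pretrained RGB kernels into a single-channel kernel. `collapse_rgb_kernels` and `conv_at` are hypothetical names used only for illustration. They are not the library's `patch_first_conv`, whose exact behavior should be checked in its source.

```python
def collapse_rgb_kernels(weight):
    """Sum kernels over the input-channel axis.

    weight[o][c] is a flat list of kernel taps for output channel o and
    input channel c. The result has a single input channel per output.
    """
    collapsed = []
    for per_out in weight:
        summed = [sum(taps) for taps in zip(*per_out)]
        collapsed.append([summed])
    return collapsed


def conv_at(weight, patch):
    """Evaluate a convolution at one spatial position.

    patch[c] is the flat list of pixels under the kernel for channel c.
    """
    return [
        sum(w * p for taps, pix in zip(per_out, patch) for w, p in zip(taps, pix))
        for per_out in weight
    ]


# Toy "pretrained" weights (made-up numbers): 2 output channels,
# 3 input channels, 2x2 kernels flattened to 4 taps.
rgb_weight = [
    [[0.1, 0.2, 0.3, 0.4], [0.5, 0.6, 0.7, 0.8], [0.9, 1.0, 1.1, 1.2]],
    [[-1.0, 0.0, 1.0, 2.0], [0.5, 0.5, 0.5, 0.5], [2.0, -2.0, 2.0, -2.0]],
]
gray_weight = collapse_rgb_kernels(rgb_weight)

gray_patch = [[1.0, 2.0, 3.0, 4.0]]
rgb_patch = gray_patch * 3  # the same gray image copied into R, G and B

out_rgb = conv_at(rgb_weight, rgb_patch)
out_gray = conv_at(gray_weight, gray_patch)
```

Under this assumption, the patched layer responds to a gray image exactly as the original layer responds to that image replicated across three channels, which is why the pretrained weights stay useful after the change.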
Essentially, the multi-branch structure is beneficial for training but has drawbacks during inference. The reparameterize function takes the model after training and converts it to a plain CNN-like structure for inference. It can be called on the complete segmentation model because it checks whether individual components have a reparameterize function.
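The arithmetic behind this kind of reparameterization can be sketched in pure Python. This is an illustrative sketch, not Apple's code: it assumes each branch is a bias-free linear map followed by BatchNorm in eval mode (fixed running statistics), and it treats a 1x1 convolution at a single pixel as a matrix-vector product. All function names here are hypothetical.

```python
import math

EPS = 1e-5  # assumed BatchNorm epsilon


def linear(weight, x, bias=None):
    """Matrix-vector product, i.e. a 1x1 convolution at one pixel."""
    out = [sum(w * xi for w, xi in zip(row, x)) for row in weight]
    if bias is not None:
        out = [o + b for o, b in zip(out, bias)]
    return out


def batchnorm(y, bn):
    """Eval-mode BatchNorm with bn = (gamma, beta, running_mean, running_var)."""
    gamma, beta, mean, var = bn
    return [g * (v - m) / math.sqrt(s + EPS) + b
            for v, g, b, m, s in zip(y, gamma, beta, mean, var)]


def fold_bn(weight, bn):
    """Fold BatchNorm into the preceding weights; returns (weight, bias)."""
    gamma, beta, mean, var = bn
    fused_w, fused_b = [], []
    for row, g, b, m, s in zip(weight, gamma, beta, mean, var):
        scale = g / math.sqrt(s + EPS)
        fused_w.append([w * scale for w in row])
        fused_b.append(b - m * scale)
    return fused_w, fused_b


def merge(branches):
    """Sum the fused (weight, bias) pairs of parallel branches."""
    weights, biases = zip(*branches)
    merged_w = [[sum(ws) for ws in zip(*rows)] for rows in zip(*weights)]
    merged_b = [sum(bs) for bs in zip(*biases)]
    return merged_w, merged_b


# Two toy training-time branches with made-up parameters.
w1, bn1 = [[1.0, 2.0], [3.0, 4.0]], ([1.5, 0.5], [0.1, -0.2], [0.3, 0.4], [1.0, 2.0])
w2, bn2 = [[0.5, -1.0], [2.0, 0.0]], ([0.8, 1.2], [0.0, 0.3], [-0.1, 0.2], [0.5, 1.5])
x = [0.7, -1.3]

# Training-time forward pass: run both branches, then add their outputs.
train_out = [a + b for a, b in zip(batchnorm(linear(w1, x), bn1),
                                   batchnorm(linear(w2, x), bn2))]

# Inference-time forward pass: one fused layer.
fused_w, fused_b = merge([fold_bn(w1, bn1), fold_bn(w2, bn2)])
infer_out = linear(fused_w, x, fused_b)
```

Both passes give the same output up to floating-point error, while the fused version evaluates a single layer instead of two branches plus two normalizations.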
I'm surprised by the size of the model. I'm used to working with unet-resnet18 (depth 4), and unet-mobileone_s2 (depth 4) is still bigger in size (23 MB vs 14 MB).
I got you, but on paper resnet18 (11M) has more parameters than mobileone_s0/1/2/3.
@JulienMaille Can you check whether my last commit matches your suggestion? @qubvel Can you trigger the workflow again once Julien approves?
Looks good to me.
        mod_list.add_module("bn", nn.BatchNorm2d(num_features=self.out_channels))
        return mod_list

    def set_in_channels(self, in_channels, pretrained=True):
Probably we should move it to the MobileOne class?
Correct, my mistake -> Done.
@kevinpl07 could you please also add information about the new encoders to the docs here
Done as well :)
Thanks a lot, merged!
@kevinpl07 I gave it a try. IoU is great, but inference time on CUDA is not optimized (tried OpenCV with the CUDA backend).
Hello,
I added support for Apple's MobileOne encoder.
Paper: Link
There were very few changes I had to make to their official GitHub repo: Link
It works with all decoders and has impressive inference time for 256x256 images: