-
Hi, have you ever trained maxvit_tiny_224 (https://github.com/rwightman/pytorch-image-models/blob/master/timm/models/maxxvit.py#L1879)? I found that the GPU memory used is too large for me to train it. I used 6× 2080 Ti. What is your training hardware?
Replies: 1 comment
-
@twmht I've used a mix of TPUs and A6000s in Lambda Labs cloud for training these models; they are pretty resource intensive, yes. If you compare to other models of a somewhat similar nature (e.g. Swin), the MaxViT tiny has resource use (but also accuracy) comparable to the 'small' variants of those other models. It has a relatively high activation count compared to its param count, and that is what drives the memory use.
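The activation-count point can be sketched with a quick back-of-envelope estimate. Note the figures in the example below are illustrative assumptions, not measured values for MaxViT:

```python
def activation_memory_gb(activations_per_image, batch_size, bytes_per_element=4):
    """Rough estimate of memory held by saved activations during training.

    Frameworks keep most intermediate activations for the backward pass,
    so this term scales with (activation count x batch size) and is largely
    independent of parameter count -- which is why a model with modest
    params but many activations can still be expensive to train.
    """
    return activations_per_image * batch_size * bytes_per_element / 1024**3

# Illustrative only: a hypothetical model with 50M activations per image
# at batch size 32 in fp32 holds roughly 6 GB of activations alone,
# before weights, gradients, and optimizer state are counted.
print(round(activation_memory_gb(50_000_000, 32), 1))  # → 6.0
```

In practice, gradient checkpointing and mixed precision both attack exactly this activation term, which is why they help far more here than on param-heavy, activation-light models.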