Used GPU Memory is quite large when training maxvit #1472

Answered by rwightman
twmht asked this question in General

@twmht I've used a mix of TPU and A6000 in Lambda Labs cloud for training these models; they are pretty resource intensive, yes. If you compare against other models of a somewhat similar nature (i.e. Swin), the MaxViT tiny has comparable resource use (and accuracy) to the 'small' variants of those other models. It has a (relatively) high activation count compared to its parameter count, which is what drives the memory use.
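A minimal sketch of how you could verify this yourself, comparing parameter count against peak training-step memory for the two architectures mentioned above. This is not from the thread; it assumes a CUDA GPU, and the model names ('maxvit_tiny_tf_224', 'swin_small_patch4_window7_224') plus the batch size are illustrative and may need adjusting to your timm version and hardware.

```python
import torch
import timm

def peak_train_memory_mb(model_name, batch_size=16, img_size=224):
    """Run one forward/backward pass and report peak CUDA memory in MB."""
    model = timm.create_model(model_name, pretrained=False, num_classes=1000).cuda()
    n_params = sum(p.numel() for p in model.parameters())
    torch.cuda.reset_peak_memory_stats()
    x = torch.randn(batch_size, 3, img_size, img_size, device='cuda')
    y = torch.randint(0, 1000, (batch_size,), device='cuda')
    loss = torch.nn.functional.cross_entropy(model(x), y)
    loss.backward()
    peak_mb = torch.cuda.max_memory_allocated() / 1e6
    del model, x, y, loss
    torch.cuda.empty_cache()
    return n_params, peak_mb

for name in ('maxvit_tiny_tf_224', 'swin_small_patch4_window7_224'):
    params, peak = peak_train_memory_mb(name)
    print(f'{name}: {params / 1e6:.1f}M params, peak {peak:.0f} MB')
```

If memory is still tight, most timm models also support activation checkpointing via `model.set_grad_checkpointing()`, trading extra compute for lower activation memory.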

Answer selected by twmht