Used GPU Memory is quite large when training maxvit #1472

Answered by rwightman
twmht asked this question in General

@twmht I've used a mix of TPU and A6000 in Lambda Labs cloud for training these models; they are pretty resource intensive, yes. If you compare against other models of a somewhat similar nature (i.e. Swin), the MaxViT tiny has comparable resource use (and accuracy) to the 'small' variants of those other models. It has a (relatively) high activation count compared to its parameter count, which is what drives the memory use.
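A minimal sketch of how you could verify this yourself, comparing parameter count against peak training-step memory for the two architectures mentioned above. This is not from the thread; it assumes a CUDA GPU, and the model names ('maxvit_tiny_tf_224', 'swin_small_patch4_window7_224') plus the batch size are illustrative and may need adjusting to your timm version and hardware.

```python
import torch
import timm

def peak_train_memory_mb(model_name, batch_size=16, img_size=224):
    """Run one forward/backward pass and report peak CUDA memory in MB."""
    model = timm.create_model(model_name, pretrained=False, num_classes=1000).cuda()
    n_params = sum(p.numel() for p in model.parameters())
    torch.cuda.reset_peak_memory_stats()
    x = torch.randn(batch_size, 3, img_size, img_size, device='cuda')
    y = torch.randint(0, 1000, (batch_size,), device='cuda')
    loss = torch.nn.functional.cross_entropy(model(x), y)
    loss.backward()
    peak_mb = torch.cuda.max_memory_allocated() / 1e6
    del model, x, y, loss
    torch.cuda.empty_cache()
    return n_params, peak_mb

for name in ('maxvit_tiny_tf_224', 'swin_small_patch4_window7_224'):
    params, peak = peak_train_memory_mb(name)
    print(f'{name}: {params / 1e6:.1f}M params, peak {peak:.0f} MB')
```

If memory is still tight, most timm models also support activation checkpointing via `model.set_grad_checkpointing()`, trading extra compute for lower activation memory.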

Answer selected by twmht