Hi, when I run on four A100s in parallel with tensor_parallel_size set to 4, inference is slower (nearly twice as slow) than on a single card. Can you explain what causes this and how to solve it? Thank you.