
Rotary position embedding causes different output in different tensor parallel settings! #203

Open
marscrazy opened this issue Mar 16, 2023 · 1 comment
Labels
model-usage issues related to how models are used/loaded

Comments

@marscrazy

Thanks for your great work on LLMs.
I have tried loading llama-13b with different mp (model parallel) sizes, e.g., 2 and 4. However, the output embeddings and the generated sentence change when the mp setting changes.

My question: is this normal?

mp size = 4: [screenshot of output embedding]

mp size = 2: [screenshot of output embedding]

@marscrazy
Author

marscrazy commented Mar 16, 2023

With mp size = 4, the mean of the output embedding is -3.8359 and the std is 1.9458. Both the mean and the std change when mp size = 2.
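One plausible explanation (not confirmed in this thread) is floating-point non-associativity: a tensor-parallel linear layer splits the contraction dimension across ranks, computes partial products, and sums them in an all-reduce, so different mp sizes sum the same terms in a different order and accumulate different rounding error. The sketch below simulates this with NumPy; the sizes and the `sharded_matmul` helper are illustrative, not LLaMA's actual dimensions or Megatron's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes standing in for one hidden projection.
hidden = 512
x = rng.standard_normal(hidden).astype(np.float32)
w = rng.standard_normal((hidden, hidden)).astype(np.float32)

def sharded_matmul(x, w, mp_size):
    """Row-parallel linear layer: split the contraction dimension across
    mp_size ranks, compute partial products, then sum (mimicking all-reduce)."""
    x_shards = np.split(x, mp_size)
    w_shards = np.split(w, mp_size, axis=0)
    partials = [xs @ ws for xs, ws in zip(x_shards, w_shards)]
    return np.sum(partials, axis=0)

y2 = sharded_matmul(x, w, 2)
y4 = sharded_matmul(x, w, 4)

# The two results agree to float32 precision but the different summation
# order means they are usually not bit-identical.
print("max abs diff:", np.abs(y2 - y4).max())
```

Differences of this size are harmless per layer, but over dozens of transformer layers they can shift means/stds at the magnitude shown above and flip greedy-decoding choices, changing the generated sentence.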

@ejsd1989 ejsd1989 added the model-usage issues related to how models are used/loaded label Sep 6, 2023