I am using YaRN while implementing DeepSeek V2 models, and the current YaRN implementation does not look right to me.
@cebtenzzre Could you take a look at this? Correct me if I am wrong.
1. `theta_base *= freq_scale` is done again later in `rope_yarn`:

   ggml/src/ggml.c, line 14077 in 0cbb7c0

   In the basic case (`ext_factor` is 0), the `theta` used for `cos`/`sin` ends up scaled by `freq_scale * freq_scale`. I think this is wrong and this line should be deleted.

2. The value passed to `int64_t i0` is wrong (the data type does not match, either):

   ggml/src/ggml.c, lines 14082 to 14088 in 0cbb7c0

   I think it should be `ic` here. (Confirmed when implementing DeepSeek V2 models.)
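To make the double-scaling point concrete, here is a minimal sketch of the pattern, not the actual ggml code. All function names are hypothetical, and the helper assumes that with `ext_factor` equal to 0 the YaRN step reduces to multiplying the incoming angle by `freq_scale`.

```c
#include <math.h>

/* Hypothetical stand-in for the YaRN step when ext_factor == 0:
 * the angle used for cos/sin is assumed to be freq_scale * theta_extrap. */
static float sketch_yarn_theta(float theta_extrap, float freq_scale) {
    return freq_scale * theta_extrap;
}

/* Caller that pre-scales the base angle before the call,
 * which is the pattern reported in this issue. */
static float sketch_theta_prescaled(float theta_base, float freq_scale) {
    theta_base *= freq_scale;                          /* first scaling (caller)  */
    return sketch_yarn_theta(theta_base, freq_scale);  /* second scaling (helper) */
}

/* Caller with the pre-scaling line removed (the proposed fix). */
static float sketch_theta_fixed(float theta_base, float freq_scale) {
    return sketch_yarn_theta(theta_base, freq_scale);  /* scaled exactly once */
}
```

With `theta_base = 8.0f` and `freq_scale = 0.25f`, the pre-scaled variant yields an angle of `0.5f` (scaled by `freq_scale * freq_scale`), while the fixed variant yields `2.0f` (scaled by `freq_scale` once).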