Name and Version
$./llama-cli --version
version: 4267 (f112d19)
built with cc (Ubuntu 13.2.0-23ubuntu4) 13.2.0 for x86_64-linux-gnu
Operating systems
Linux
Which llama.cpp modules do you know to be affected?
Test code
Problem description & steps to reproduce
test-backend-ops fails when I add this test case:
test_cases.emplace_back(new test_soft_max(GGML_TYPE_F32, { 32, 4, 1, 32 }, true, 0.1f, 8.0f));
result:
SOFT_MAX(type=f32,ne=[32,4,1,32],mask=1,scale=0.100000,max_bias=8.000000): [SOFT_MAX] NMSE = 0.021202819 > 0.000001000 FAIL
Is this configuration simply unsupported, or is it a real bug?
If it is a real bug, I think the cause is the slope calculation: the CUDA code differs slightly from the C code, and I am not sure which implementation is correct.
cuda:
// h(rowx/nrows_y) ranges from 0 to ne02*ne03
const float slope = get_alibi_slope(max_bias, rowx/nrows_y, n_head_log2, m0, m1);
c:
// h ranges from 0 to ne02
const uint32_t h = (i1/ne01)%ne02; // head
const float slope = (max_bias > 0.0f) ? h < n_head_log2 ? powf(m0, h + 1) : powf(m1, 2*(h - n_head_log2) + 1) : 1.0f;
First Bad Commit
No response
Relevant log output
No response