Commit 70ef653 (1 parent: 5fc6dbd)

graph : restore same attention ops as on master

ggml-ci

File tree

1 file changed: +1 −1 lines changed


src/llama-graph.cpp

Lines changed: 1 addition & 1 deletion
@@ -1384,7 +1384,7 @@ ggml_tensor * llm_graph_context::build_attn(
     // note: storing RoPE-ed version of K in the KV cache
     ggml_build_forward_expand(gf, ggml_cpy(ctx0, k_cur, k_cache_view));
 
-    v_cur = ggml_reshape_2d(ctx0, v_cur, n_embd_v_gqa, n_tokens);
+    assert(v_cur->ne[0] == n_embd_v_gqa && v_cur->ne[1] == n_tokens);
 
     ggml_tensor * v_cache_view = nullptr;

0 commit comments