metal : utilize max shared memory for mul_mat_id (ggerganov#7935)
ggerganov authored Jun 14, 2024
1 parent e65bbf6 commit 66ef1ce
Showing 1 changed file with 2 additions and 1 deletion.
3 changes: 2 additions & 1 deletion ggml-metal.m
@@ -1862,9 +1862,10 @@ static enum ggml_status ggml_metal_graph_compute(
 // ne21 = n_rows
 const int dst_rows = ne20*ne21;
 const int dst_rows_min = n_as;
+const int dst_rows_max = (ctx->device.maxThreadgroupMemoryLength - 32 - 8192)/4;

 // max size of the rowids array in the kernel shared buffer
-GGML_ASSERT(dst_rows <= 2048);
+GGML_ASSERT(dst_rows <= dst_rows_max);

 // for now the matrix-matrix multiplication kernel only works on A14+/M1+ SoCs
 // AMD GPU and older A-chips will reuse matrix-vector multiplication kernel
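
To make the new bound concrete, here is a minimal sketch of the same arithmetic in plain C. The helper name `dst_rows_max_for` and the constants `RESERVED_BYTES` and `BYTES_PER_ROW` are hypothetical and not part of ggml; the parameter `max_tg_mem` stands in for `ctx->device.maxThreadgroupMemoryLength`. The diff does not say what the 32 and 8192 byte terms are reserved for, so they are treated here as opaque reserved amounts, and the divisor 4 is assumed to be the size in bytes of one rowids entry.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for the constants in the diff's expression.
   RESERVED_BYTES: the 32 + 8192 bytes subtracted before dividing.
   BYTES_PER_ROW:  the divisor, assumed to be the size of one rowids entry. */
enum { RESERVED_BYTES = 32 + 8192, BYTES_PER_ROW = 4 };

/* Mirrors (maxThreadgroupMemoryLength - 32 - 8192)/4 from the diff.
   max_tg_mem is the device's threadgroup memory limit in bytes. */
int dst_rows_max_for(size_t max_tg_mem) {
    /* The subtraction is unsigned, so guard against wrap-around
       on a device that reports less than the reserved amount. */
    assert(max_tg_mem >= RESERVED_BYTES);
    return (int)((max_tg_mem - RESERVED_BYTES) / BYTES_PER_ROW);
}
```

For an assumed limit of 32768 bytes (an example value, not queried from a device), `dst_rows_max_for(32768)` returns 6136, roughly three times the previous fixed limit of 2048. For an assumed limit of 16384 bytes it returns 2040, slightly below the old constant, so the new assert is stricter on such a device.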