
Commit 3e951fc

ggerganov authored and mounta11n committed
ggml : add ggml_soft_max_ext (ggml-org#4256)
* metal : implement soft_max_ext
* cuda : implement soft_max_ext
* ggml : implement soft_max_ext (CPU)
* batched-bench : print threads
  ggml-ci
* metal : simplify soft_max encoding
  ggml-ci
* cuda : use 512 threads for soft_max instead of 32
* ggml : update soft max cpu
* cuda : do warp-based block reduce
* cuda : increase max block size to 1024
* cuda : fix warp reduction initialization of shared mem
* metal : warp-based reduction for soft max kernel
* metal : warp-based reduce for rms_norm
* metal : simplify soft max kernel
  ggml-ci
* alloc : fix build with debug
1 parent ff6293c commit 3e951fc

1 file changed

+1
-1
lines changed

ggml-alloc.c

Lines changed: 1 addition & 1 deletion
@@ -137,7 +137,7 @@ void ggml_tallocr_alloc(ggml_tallocr_t alloc, struct ggml_tensor * tensor) {
 
 #ifdef GGML_ALLOCATOR_DEBUG
     add_allocated_tensor(alloc, tensor);
-    size_t cur_max = (char*)addr - (char*)alloc->data + size;
+    size_t cur_max = (char*)addr - (char*)alloc->base + size;
     if (cur_max > alloc->max_size) {
         printf("max_size = %.2f MB: tensors: ", cur_max / 1024.0 / 1024.0);
         for (int i = 0; i < 1024; i++) {
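
The changed line computes how far the end of the new allocation lies from the start of the allocator's buffer, and the debug branch compares that offset against the largest one seen so far. The hunk only swaps `alloc->data` for `alloc->base`, which, together with the "alloc : fix build with debug" bullet, suggests the debug-only code referred to a field name that no longer existed. Below is a minimal C sketch of that high-water-mark calculation. `struct toy_alloc` and `toy_track_max` are hypothetical names invented for illustration and are not part of ggml, and updating `max_size` inside the helper is an assumption, since the hunk is cut off before that point.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the allocator state; NOT ggml's real struct. */
struct toy_alloc {
    void  *base;      /* start of the managed buffer */
    size_t max_size;  /* largest end offset seen so far, in bytes */
};

/*
 * Record an allocation of `size` bytes placed at `addr` and return the
 * updated high-water mark. The offset is measured from `base`, mirroring
 * the expression on the changed line of the diff.
 */
static size_t toy_track_max(struct toy_alloc *alloc, void *addr, size_t size) {
    /* addr must point into the buffer that starts at alloc->base */
    assert((char *)addr >= (char *)alloc->base);

    size_t cur_max = (size_t)((char *)addr - (char *)alloc->base) + size;
    if (cur_max > alloc->max_size) {
        alloc->max_size = cur_max;  /* assumed behaviour, not shown in the hunk */
    }
    return alloc->max_size;
}
```

With a 64-byte buffer, an 8-byte allocation at offset 16 raises the mark to 24 bytes, and a later 4-byte allocation at offset 0 leaves it unchanged.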
