Retrieval: Fix Memory Leak in Retrieval Query Handling #8955

gtygo · 2024-08-09T17:19:48Z

I have read the contributing guidelines
Self-reported review complexity:
- Low
- Medium
- High
Description
This pull request fixes a memory leak in retrieval.cpp that occurs while the example continuously accepts query inputs. The leak comes from how the llama_batch is initialized and cleared.
Problem
The llama_batch_init function allocates heap memory for the batch. The current implementation then only calls llama_batch_clear, which resets the batch size to 0 but does not free that memory. As a result, memory usage grows continuously as the process runs.
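The distinction between clearing and freeing can be sketched with a small stand-in. Everything below (`mock_batch`, `mock_batch_init`, `mock_batch_clear`, `mock_batch_free`, `g_live_buffers`) is hypothetical and only mimics the behavior described above; it is not the llama.cpp API or its implementation.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>

// Hypothetical stand-in for llama_batch: a token count plus a heap buffer.
struct mock_batch {
    int32_t   n_tokens;  // number of tokens currently stored
    int32_t * token;     // heap buffer allocated at init time
};

// Number of buffers that are allocated and not yet freed (illustration only).
static int g_live_buffers = 0;

// Stand-in for llama_batch_init: allocates the buffer on the heap.
mock_batch mock_batch_init(int32_t capacity) {
    assert(capacity > 0);
    mock_batch b;
    b.n_tokens = 0;
    b.token    = static_cast<int32_t *>(std::malloc(sizeof(int32_t) * capacity));
    assert(b.token != nullptr);
    ++g_live_buffers;
    return b;
}

// Stand-in for llama_batch_clear: resets the count, keeps the buffer.
void mock_batch_clear(mock_batch & b) {
    b.n_tokens = 0;
}

// Stand-in for llama_batch_free: releases the buffer.
void mock_batch_free(mock_batch & b) {
    std::free(b.token);
    b.token = nullptr;
    --g_live_buffers;
}

// Returns how many buffers are still allocated right after a clear.
int live_buffers_after_clear() {
    mock_batch b = mock_batch_init(512);
    b.token[b.n_tokens++] = 42;      // pretend to add one token
    mock_batch_clear(b);             // n_tokens is 0 again, buffer untouched
    int live = g_live_buffers;       // the allocation is still outstanding here
    mock_batch_free(b);              // only this call returns the memory
    return live;
}
```

In this sketch, dropping the final `mock_batch_free` call would leave the buffer allocated with no pointer to it, which is the pattern the description attributes to the query loop.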
Solution
The solution involves ensuring that the allocated memory for llama_batch is properly freed after each query is processed. This prevents the memory leak and stabilizes the memory usage of the process.
Changes
Replaced llama_batch_clear with llama_batch_free to ensure proper memory deallocation.

ggerganov

It would be better to init and free the batch once outside the while loop, and clear it inside the loop

gtygo · 2024-08-09T17:27:13Z

It would be better to init and free the batch once outside the while loop, and clear it inside the loop

Yes, this reduces frequent memory allocations
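The suggested structure can be sketched as follows. This is a minimal illustration under stated assumptions, not the code merged in this PR: the `mock_*` functions and counters are hypothetical stand-ins for the llama_batch calls, and a vector of strings stands in for the interactive query loop.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical stand-in for llama_batch.
struct mock_batch {
    int32_t   n_tokens;  // tokens currently stored
    int32_t   capacity;  // buffer size chosen at init time
    int32_t * token;     // heap buffer
};

static int g_allocs = 0;  // total heap allocations made (illustration only)
static int g_live   = 0;  // allocations not yet freed (illustration only)

// Stand-in for llama_batch_init: one heap allocation per call.
mock_batch mock_batch_init(int32_t capacity) {
    assert(capacity > 0);
    mock_batch b;
    b.n_tokens = 0;
    b.capacity = capacity;
    b.token    = static_cast<int32_t *>(std::malloc(sizeof(int32_t) * capacity));
    assert(b.token != nullptr);
    ++g_allocs;
    ++g_live;
    return b;
}

// Stand-in for llama_batch_clear: resets the count, keeps the buffer.
void mock_batch_clear(mock_batch & b) {
    b.n_tokens = 0;
}

// Stand-in for llama_batch_free: releases the buffer.
void mock_batch_free(mock_batch & b) {
    std::free(b.token);
    b.token = nullptr;
    --g_live;
}

// Init once before the loop, clear inside it, free once after it.
int process_queries(const std::vector<std::string> & queries) {
    mock_batch batch = mock_batch_init(512);
    int processed = 0;
    for (const std::string & query : queries) {
        // Pretend tokenization: one "token" per character, bounded by capacity.
        for (char c : query) {
            if (batch.n_tokens < batch.capacity) {
                batch.token[batch.n_tokens++] = static_cast<unsigned char>(c);
            }
        }
        ++processed;
        mock_batch_clear(batch);  // reuse the same buffer for the next query
    }
    mock_batch_free(batch);       // single matching free after the loop
    return processed;
}
```

With this shape the number of allocations stays at one regardless of how many queries are processed, whereas an init and free per query would avoid the leak but still allocate on every iteration.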

* retrieval
* Reuse querybatch to reduce frequent memory allocation
* delete unused white space

retrieval

fe6dc61

ggerganov reviewed Aug 9, 2024

Reuse querybatch to reduce frequent memory allocation

88105b7

gtygo requested a review from ggerganov August 9, 2024 17:47

ggerganov approved these changes Aug 9, 2024

github-actions bot added the examples label Aug 9, 2024

mofosyne added the Review Complexity : Low and bugfix labels Aug 10, 2024

delete unused white space

804ddd7

ggerganov merged commit 4b9afbb into ggml-org:master Aug 15, 2024

arthw pushed a commit to arthw/llama.cpp that referenced this pull request Nov 15, 2024

retrieval : fix memory leak in retrieval query handling (ggml-org#8955)

eeca925

* retrieval
* Reuse querybatch to reduce frequent memory allocation
* delete unused white space

arthw pushed a commit to arthw/llama.cpp that referenced this pull request Nov 18, 2024

retrieval : fix memory leak in retrieval query handling (ggml-org#8955)

0b9740d

* retrieval
* Reuse querybatch to reduce frequent memory allocation
* delete unused white space
