Bug: task ids not removed from waiting_tasks for /v1/chat/completions call #9528

Closed
@anagri

Description

What happened?

In commit 6e7d133a5f9409dd257fad90d7f320721b07a1b2, the handling of the /v1/chat/completions endpoint was changed.

Earlier, the call sequence was:

        const int id_task = ctx_server.queue_tasks.get_new_id();

        ctx_server.queue_results.add_waiting_task_id(id_task);
        ctx_server.request_completion(id_task, -1, data, false, false);
        ...

        ctx_server.queue_results.remove_waiting_task_id(id_task);

After the changes, the ctx_server.queue_results.remove_waiting_task_id(id_task); call is missing, so server_response.waiting_task_ids grows by one entry for every call served by the above endpoint. In a long-running production instance of llama-server, memory use will keep growing, because the ids are never cleared for a few of the refactored server handlers.
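To illustrate the effect, here is a minimal standalone sketch, not the actual server code. `response_queue`, `waiting_task_guard`, `handle_leaky`, `handle_guarded` and `leaked_after` are hypothetical names made up for this example; only `add_waiting_task_id`, `remove_waiting_task_id` and `waiting_task_ids` mirror the identifiers quoted above, and the container type is an assumption. It shows how the set grows when the remove call is skipped, and one possible way (an RAII guard) to make the removal hard to forget.

```cpp
#include <cstddef>
#include <mutex>
#include <unordered_set>

// Hypothetical stand-in for the server's result queue. Only the two calls
// quoted in this issue are modelled; the container type is an assumption.
struct response_queue {
    std::mutex mutex;
    std::unordered_set<int> waiting_task_ids;

    void add_waiting_task_id(int id_task) {
        std::lock_guard<std::mutex> lock(mutex);
        waiting_task_ids.insert(id_task);
    }

    void remove_waiting_task_id(int id_task) {
        std::lock_guard<std::mutex> lock(mutex);
        waiting_task_ids.erase(id_task);
    }

    std::size_t waiting_count() {
        std::lock_guard<std::mutex> lock(mutex);
        return waiting_task_ids.size();
    }
};

// Hypothetical RAII guard: registers the id on construction and removes it
// on destruction, so every exit path of a handler cleans up.
struct waiting_task_guard {
    response_queue & queue;
    int id_task;

    waiting_task_guard(response_queue & q, int id) : queue(q), id_task(id) {
        queue.add_waiting_task_id(id_task);
    }
    ~waiting_task_guard() {
        queue.remove_waiting_task_id(id_task);
    }

    waiting_task_guard(const waiting_task_guard &) = delete;
    waiting_task_guard & operator=(const waiting_task_guard &) = delete;
};

// Mimics the behaviour described in this issue: the id is added, the
// request is served, and the id is never removed.
void handle_leaky(response_queue & q, int id_task) {
    q.add_waiting_task_id(id_task);
    // ... serve the request ...
}

// Same handler with the guard: the id is removed when the scope ends.
void handle_guarded(response_queue & q, int id_task) {
    waiting_task_guard guard(q, id_task);
    // ... serve the request ...
}

// Serves n simulated requests and returns how many ids are left behind.
std::size_t leaked_after(int n, bool guarded) {
    response_queue q;
    for (int id_task = 0; id_task < n; ++id_task) {
        if (guarded) {
            handle_guarded(q, id_task);
        } else {
            handle_leaky(q, id_task);
        }
    }
    return q.waiting_count();
}
```

In this sketch, `leaked_after(n, false)` leaves one id per request in the set, while `leaked_after(n, true)` leaves none. Whether a guard or an explicit `remove_waiting_task_id` call fits the refactored handlers better is a design choice for the maintainers.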

@ngxson, kindly provide your input. Thanks.

Name and Version

$ ./bin/llama-cli --version
version: 3609 (2f3c1466)
built with Homebrew clang version 18.1.5 for arm64-apple-darwin23.3.0

What operating system are you seeing the problem on?

Mac

Relevant log output

No response

Labels

bug-unconfirmed
medium severity: Used to report medium severity bugs in llama.cpp (e.g. malfunctioning features that are still usable)
