UPSTREAM PR #18416: fix: change llama_chat_apply_template return type from int32_t to int64_t #723
Mirrored from ggml-org/llama.cpp#18416
Summary
Changes
- `llama_chat_apply_template` return type from `int32_t` to `int64_t`
- `llama_chat_apply_template` length parameter from `int32_t` to `int64_t`
- `llm_chat_apply_template` return type from `int32_t` to `int64_t`
- Callers updated to use `int64_t` for the result variable
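For reference, a sketch of what the declaration change in `include/llama.h` looks like. The parameter names are taken from the upstream header; the exact formatting in the PR may differ:

```cpp
// Before: the return value and the buffer length are 32-bit, so a
// formatted prompt larger than INT32_MAX bytes cannot be represented.
LLAMA_API int32_t llama_chat_apply_template(
                        const char * tmpl,
   const struct llama_chat_message * chat,
                              size_t   n_msg,
                                bool   add_ass,
                                char * buf,
                             int32_t   length);

// After: both the return type and the length parameter are widened to 64 bits.
LLAMA_API int64_t llama_chat_apply_template(
                        const char * tmpl,
   const struct llama_chat_message * chat,
                              size_t   n_msg,
                                bool   add_ass,
                                char * buf,
                             int64_t   length);
```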
Root Cause

The `llm_chat_apply_template` function returns `dest.size()`, which is a `size_t`. When messages are very large (>2 GB), storing this in an `int32_t` causes integer overflow. This resulted in negative values being returned even for valid templates, triggering incorrect error messages like "this custom template is not supported" instead of proper size handling.
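A minimal standalone demonstration of the overflow described above (the 3 GB size is a hypothetical stand-in for `dest.size()`):

```cpp
#include <cstdint>
#include <cstdio>

int main() {
    // Hypothetical rendered chat buffer larger than 2 GB, standing in
    // for the size_t value returned by dest.size().
    const size_t rendered_size = size_t(3) * 1024 * 1024 * 1024;

    // Old behaviour: narrowing to int32_t wraps around and goes negative,
    // so a valid result looks like the "unsupported template" error code.
    const int32_t old_result = static_cast<int32_t>(rendered_size);

    // New behaviour: int64_t holds the full size without overflow.
    const int64_t new_result = static_cast<int64_t>(rendered_size);

    std::printf("int32_t result: %d\n", old_result);                  // -1073741824
    std::printf("int64_t result: %lld\n", (long long) new_result);    // 3221225472
    return 0;
}
```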
Test plan

Files modified
- `include/llama.h` - API declaration
- `src/llama.cpp` - Implementation
- `src/llama-chat.h` - Internal header
- `src/llama-chat.cpp` - Internal implementation
- `common/chat.cpp` - Caller updates
- `examples/simple-chat/simple-chat.cpp` - Example updates
- `tests/test-chat-template.cpp` - Test updates

🤖 Generated with Claude Code
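A hedged sketch of the caller-side pattern the updated API implies: the function returns the total number of bytes of the formatted prompt, so callers grow the buffer and retry when the result exceeds the buffer size, and with this fix a negative result once again reliably means an unsupported template. The `render_chat` helper is hypothetical; the model/template setup is omitted:

```cpp
#include <cstdint>
#include <string>
#include <vector>
#include "llama.h"

// Hypothetical helper: render a chat with the given template string,
// using the int64_t return type and length parameter from this PR.
std::string render_chat(const char * tmpl,
                        const std::vector<llama_chat_message> & msgs) {
    std::vector<char> buf(1024);
    int64_t n = llama_chat_apply_template(tmpl, msgs.data(), msgs.size(),
                                          /*add_ass=*/true,
                                          buf.data(), (int64_t) buf.size());
    if (n < 0) {
        return ""; // negative now reliably signals an unsupported template
    }
    if (n > (int64_t) buf.size()) {
        // The return value reports the required size; grow and retry.
        buf.resize((size_t) n);
        n = llama_chat_apply_template(tmpl, msgs.data(), msgs.size(),
                                      true, buf.data(), (int64_t) buf.size());
    }
    return std::string(buf.data(), (size_t) n);
}
```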