context : round n_tokens to next multiple of n_seqs when reserving #14140

Merged: 1 commit into master, Jun 12, 2025

Conversation

compilade
Collaborator

This fixes RWKV inference which otherwise fails when ubatch.n_seq_tokens is 0.

I noticed this when trying to run the command from #13834 (comment), but with RWKV instead of Mamba.

For example, with https://huggingface.co/latestissue/rwkv-6-finch-1b6-gguf/blob/main/rwkv-6-finch-1b6-Q4_0.gguf

Before:

$ ./bin/llama-parallel -m /path/to/rwkv-6-finch-1b6-Q4_0.gguf -np 5 -ns 8 --temp 0 --repeat-penalty 1.1 -ub 2 -pps
.../ggml/src/ggml.c:1593: GGML_ASSERT(view_src == NULL || data_size == 0 || data_size + view_offs <= ggml_nbytes(view_src)) failed

After:

$ ./bin/llama-parallel -m /path/to/rwkv-6-finch-1b6-Q4_0.gguf -np 5 -ns 8 --temp 0 --repeat-penalty 1.1 -ub 2 -pps
(proceeds normally)

This has been broken since #13746, because that change made the worst-case graph_reserve use n_seqs = cparams.n_seq_max instead of n_seqs = 1.

ubatch.n_seq_tokens must not be 0 for RWKV, because in some views it uses ubatch.n_seq_tokens - 1 for one of the dimensions:

ggml_view_3d(ctx0, att_norm, n_embd, n_seq_tokens - 1, n_seqs, att_norm->nb[1], att_norm->nb[2], 0),
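To make the arithmetic concrete, here is a minimal sketch of the rounding named in the title. It is not the actual llama.cpp change: `round_up_to_multiple` and `seq_tokens` are hypothetical helpers, and it assumes that n_seq_tokens is obtained as the integer division n_tokens / n_seqs. The values n_tokens = 2 (as with -ub 2) and n_seqs = 5 are chosen purely for illustration.

```cpp
#include <cstdint>

// Hypothetical helper, not the actual llama.cpp code:
// round n_tokens up to the next multiple of n_seqs (n_seqs must be > 0).
constexpr uint32_t round_up_to_multiple(uint32_t n_tokens, uint32_t n_seqs) {
    return ((n_tokens + n_seqs - 1) / n_seqs) * n_seqs;
}

// Assumed relation: tokens per sequence is the integer division n_tokens / n_seqs.
constexpr uint32_t seq_tokens(uint32_t n_tokens, uint32_t n_seqs) {
    return n_tokens / n_seqs;
}

// Without rounding, 2 tokens spread over 5 sequences gives 0 tokens per sequence,
// so a view dimension of n_seq_tokens - 1 cannot be valid.
static_assert(seq_tokens(2, 5) == 0, "unrounded case has n_seq_tokens == 0");

// With rounding, n_tokens becomes 5 and each sequence gets 1 token,
// so n_seq_tokens - 1 is 0, which describes an empty but well-formed view.
static_assert(round_up_to_multiple(2, 5) == 5, "2 rounds up to 5");
static_assert(seq_tokens(round_up_to_multiple(2, 5), 5) == 1, "n_seq_tokens is now 1");

// Sizes that are already a multiple of n_seqs are left unchanged.
static_assert(round_up_to_multiple(10, 5) == 10, "multiples are unchanged");
```

The checks are compile-time only, so the snippet verifies itself when compiled (for example with `-std=c++11 -c`). Rounding up rather than clamping n_seq_tokens to 1 keeps n_tokens equal to n_seqs * n_seq_tokens, which is the invariant the title describes.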


Hopefully there will eventually be automated tests to help detect this kind of problem; see #14139 (still in the early stages, though).



This fixes RWKV inference which fails when ubatch.n_seq_tokens is 0.
@compilade added the "bugfix" and "Review Complexity : Low" labels on Jun 12, 2025
@MollySophia
Collaborator

Thanks a lot for solving this issue!

@compilade compilade merged commit a20b2b0 into master Jun 12, 2025
47 checks passed