Throughputs of Long Sequences #12608
Unanswered
simmonssong
asked this question in
Q&A
Replies: 0 comments
Hi, I am using
https://github.com/abetlen/llama-cpp-python
to test the throughput of input sequences of different lengths. I found that throughput increases with sequence length across several different models and quantizations. Is this caused by a built-in infrastructure optimization in llama.cpp?
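For reference, here is a minimal sketch of the kind of measurement I mean. It is not my exact script: `measure_throughput` and `fake_prompt_eval` are hypothetical names, and `fake_prompt_eval` is only a stand-in workload (a fixed per-call cost plus a per-token cost, both made-up numbers) so the snippet runs without a model. In a real test, `run` would wrap the actual llama-cpp-python call and `n_tokens` would be the tokenized prompt length.

```python
import time


def measure_throughput(run, n_tokens, repeats=3):
    """Return tokens per second, using the fastest of `repeats` calls to `run()`."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        run()
        best = min(best, time.perf_counter() - start)
    return n_tokens / best


def fake_prompt_eval(n_tokens, fixed_overhead_s=0.02, per_token_s=0.0005):
    # Hypothetical stand-in for a model call, NOT llama.cpp behaviour:
    # a fixed per-call cost plus a cost proportional to the token count.
    time.sleep(fixed_overhead_s + per_token_s * n_tokens)


if __name__ == "__main__":
    for n in (16, 64, 256):
        tps = measure_throughput(lambda: fake_prompt_eval(n), n)
        print(f"{n:4d} tokens: {tps:8.1f} tokens/s")
```

With this stand-in, tokens/s rises with length simply because the fixed cost is spread over more tokens. That is a property of the fake workload only; whether something similar explains what I see with llama.cpp is exactly what I am asking.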