Is your feature request related to a problem? Please describe.
When I call this library in a loop, GPU utilisation is poor while running zephyr-7b.
Describe the solution you'd like
It would be fantastic to be able to pass a list of prompts to a method of the Transformers class and set a batch size, as you can with a Hugging Face pipeline. Batching significantly improves speed and GPU utilisation.
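To illustrate the kind of interface I mean, here is a minimal sketch rather than a proposal for the actual implementation. The names `iter_batches`, `generate_batched`, and `generate_fn` are hypothetical and not part of this library. `generate_fn` stands in for whatever batched forward pass the Transformers class could expose, and `fake_generate` is only a placeholder so the snippet runs without a model.

```python
from typing import Callable, Iterator, List, Sequence


def iter_batches(prompts: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `prompts`, each with at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(prompts), batch_size):
        yield list(prompts[start:start + batch_size])


def generate_batched(
    prompts: Sequence[str],
    generate_fn: Callable[[List[str]], List[str]],
    batch_size: int = 8,
) -> List[str]:
    """Call `generate_fn` once per batch and return outputs in prompt order.

    `generate_fn` is a hypothetical stand-in for a batched model call:
    it takes a list of prompts and returns one output per prompt.
    """
    outputs: List[str] = []
    for batch in iter_batches(prompts, batch_size):
        results = generate_fn(batch)
        if len(results) != len(batch):
            raise ValueError("generate_fn must return one output per prompt")
        outputs.extend(results)
    return outputs


def fake_generate(batch: List[str]) -> List[str]:
    """Placeholder for a real model call (assumption, for illustration only)."""
    return [prompt.upper() for prompt in batch]


if __name__ == "__main__":
    prompts = ["first prompt", "second prompt", "third prompt"]
    # Three prompts with batch_size=2 result in two calls to fake_generate.
    print(generate_batched(prompts, fake_generate, batch_size=2))
```

For comparison, a Hugging Face pipeline accepts a list of inputs together with a `batch_size` argument, e.g. `pipe(prompts, batch_size=8)`, which is the behaviour I would like to have here.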