Batch inference tool #13863
someone13574 started this conversation in Ideas
A tool for efficiently processing very large datasets would be nice. You would give it a file of items to process; it would reorder them to take advantage of things like common prefixes, and then run as many as possible in parallel, as the batch size allows. The server sort of works for this use case, but taking full advantage of parallelism still requires non-trivial client-side work (e.g. issuing multiple async requests, ordering inputs to benefit from prefix caching).
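For concreteness, here is a minimal client-side sketch of the idea, assuming an OpenAI-compatible `/v1/completions` endpoint like the one `llama-server` exposes. The server URL, concurrency limit, and payload fields are illustrative, not a proposed interface: it sorts prompts so shared prefixes end up adjacent (which makes consecutive requests more likely to hit the prefix cache) and fans them out with bounded concurrency.

```python
# Hypothetical sketch of the batch-client idea; endpoint, model settings,
# and concurrency limit are assumptions, not part of the proposal.
import asyncio
import json

import aiohttp

SERVER = "http://localhost:8080/v1/completions"  # assumed llama-server address
CONCURRENCY = 8  # assumed; ideally matched to the server's parallel slots

async def complete(session: aiohttp.ClientSession,
                   sem: asyncio.Semaphore, prompt: str) -> str:
    # Bound the number of in-flight requests with a semaphore.
    async with sem:
        payload = {"prompt": prompt, "max_tokens": 128}
        async with session.post(SERVER, json=payload) as resp:
            data = await resp.json()
            return data["choices"][0]["text"]

async def run(prompts: list[str]) -> list[str]:
    # A plain lexicographic sort clusters prompts that share a common
    # prefix, so requests dispatched back-to-back can reuse cached prefixes.
    ordered = sorted(prompts)
    sem = asyncio.Semaphore(CONCURRENCY)
    async with aiohttp.ClientSession() as session:
        return await asyncio.gather(
            *(complete(session, sem, p) for p in ordered))

if __name__ == "__main__":
    with open("inputs.txt") as f:
        prompts = [line.rstrip("\n") for line in f]
    for out in asyncio.run(run(prompts)):
        print(json.dumps({"completion": out}))
```

Even this small sketch shows why a built-in tool would help: the ordering heuristic, the concurrency bound, and error/retry handling all have to be reimplemented by every user today.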