
Conversation

@qinxuye (Contributor) commented Nov 2, 2025

Before this PR, embedding was sequential, even though users could create multiple embeddings at the same time.

After this PR, a model class can inherit BatchMixin, which provides a batch version of the method (utilizing the xoscar batch API). Incoming requests are put into a queue, and a background coroutine collects as many queued items as possible and calls the API in a single batched call.

This is an initial version of auto-batching. In fact, this could be applied to all models, not only auto-regressive models (basically LLMs).
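The queue-plus-background-coroutine pattern described above can be sketched roughly as follows. This is an illustrative sketch only, not the PR's actual BatchMixin implementation: the class and method names here are hypothetical, and the real version dispatches through the xoscar batch API rather than a stand-in function.

```python
import asyncio


class AutoBatchingModel:
    """Hypothetical sketch of the auto-batching idea: callers enqueue
    requests, and a background coroutine drains as many queued items
    as possible into one batched call."""

    def __init__(self, max_batch_size=32):
        self._queue = asyncio.Queue()
        self._max_batch_size = max_batch_size
        self._worker = None

    async def _batch_worker(self):
        while True:
            # Block for the first request, then greedily drain the rest.
            batch = [await self._queue.get()]
            while len(batch) < self._max_batch_size:
                try:
                    batch.append(self._queue.get_nowait())
                except asyncio.QueueEmpty:
                    break
            texts = [text for text, _ in batch]
            # One model call serves N concurrent requests.
            embeddings = await self._embed_batch(texts)
            for (_, fut), emb in zip(batch, embeddings):
                fut.set_result(emb)

    async def _embed_batch(self, texts):
        # Stand-in for a real batched model call (e.g. via the xoscar batch API).
        return [len(t) for t in texts]

    async def embed(self, text):
        # Each caller enqueues its request and awaits a per-request future.
        if self._worker is None:
            self._worker = asyncio.create_task(self._batch_worker())
        fut = asyncio.get_running_loop().create_future()
        await self._queue.put((text, fut))
        return await fut


async def main():
    model = AutoBatchingModel()
    # Three concurrent requests are coalesced into a single batched call.
    return await asyncio.gather(*(model.embed(t) for t in ["a", "bb", "ccc"]))


results = asyncio.run(main())
print(results)  # [1, 2, 3]
```

The key design point is that `embed` returns a future immediately after enqueueing, so concurrent callers never serialize on the model; the worker decides batch boundaries by draining whatever is waiting when it wakes up.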

Fixes #4123

@XprobeBot XprobeBot added this to the v1.x milestone Nov 2, 2025
@qinxuye (Contributor, Author) commented Nov 2, 2025

@llyycchhee please help review this PR and see if there is anything that can be improved.

Benchmarks welcome.



Development

Successfully merging this pull request may close these issues.

Embedding model concurrency performance issue

2 participants