
Add inference-time sequence packing support#292

Merged
Ingvarstep merged 9 commits into urchade:main from vivekkalyanarangan30:main
Oct 9, 2025

Conversation

vivekkalyanarangan30 (Contributor) commented Sep 16, 2025

Summary

  • Add a new inference-packing utility that builds packed batches with block-diagonal masks and helpers to unpack outputs.
  • Expose configuration knobs on the GLiNER API so packing can be toggled globally or per-call.
  • Wire the encoder to use packed execution when configured, including automatic pack/unpack around the transformer forward pass.
  • Results are identical with or without packing (verified by tests).
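The PR's actual helper names aren't visible in this thread, so the following is a minimal sketch of the technique under illustrative names (`pack_sequences` and `unpack_outputs` are assumptions, not the PR's API): short sequences are greedily concatenated into shared rows, a block-diagonal mask keeps attention from crossing sequence boundaries, and recorded spans let per-sequence outputs be sliced back apart.

```python
import numpy as np

def pack_sequences(seqs, max_len):
    """Greedily pack variable-length token sequences into rows of at most
    max_len tokens; spans records (row, start, length) per input sequence."""
    rows, spans = [], []
    for seq in seqs:
        if not rows or len(rows[-1]) + len(seq) > max_len:
            rows.append([])
        spans.append((len(rows) - 1, len(rows[-1]), len(seq)))
        rows[-1].extend(seq)
    width = max(len(r) for r in rows)
    ids = np.zeros((len(rows), width), dtype=np.int64)
    # Block-diagonal attention mask: a position may attend only to
    # positions belonging to the same original sequence.
    mask = np.zeros((len(rows), width, width), dtype=bool)
    for (row, start, length), seq in zip(spans, seqs):
        ids[row, start:start + length] = seq
        mask[row, start:start + length, start:start + length] = True
    return ids, mask, spans

def unpack_outputs(packed, spans):
    """Slice each sequence's outputs back out of the packed rows."""
    return [packed[row, start:start + length] for row, start, length in spans]
```

For example, packing `[[1, 2, 3], [4, 5], [6]]` with `max_len=4` produces two rows (`[1, 2, 3]` and `[4, 5, 6]`), and the mask for the second row blocks attention between the `[4, 5]` and `[6]` sub-blocks. Because the mask exactly reproduces per-sequence attention, outputs match the unpacked baseline, which is what makes the PR's "identical results" property possible.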

Benchmarks

All runs on CPU, roberta-base, batch_size=64, max_length=512.

| Scenario | Baseline tokens/s | Packed tokens/s | Speedup | Padding ↓ |
|---|---|---|---|---|
| short_zipf | 2.00e+03 | 3.88e+03 | 1.94× | 61.5% → 12.0% |
| short_uniform | 2.44e+03 | 3.47e+03 | 1.42× | 45.9% → 13.4% |
| mixed_tail | 6.18e+02 | 3.43e+03 | 5.55× | 87.4% → 19.6% |
| flat_long | 4.94e+03 | 4.10e+03 | 0.83× | 0.0% → 0.0% |

👉 Packing yields 1.4–5.5× throughput improvements when input lengths are short or skewed, while performance is neutral (or slightly worse) when all sequences are long and uniform.
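The "Padding ↓" column measures the share of positions in the batch tensors that carry no real tokens. A small sketch of that calculation (the in-order, pad-to-batch-max batching scheme is an assumption for illustration, not taken from the PR's bench code):

```python
def padding_fraction(lengths, batch_size):
    """Fraction of positions wasted on padding when sequences of the given
    lengths are batched in order and padded to each batch's maximum length."""
    real = wasted = 0
    for i in range(0, len(lengths), batch_size):
        chunk = lengths[i:i + batch_size]
        real += sum(chunk)
        wasted += len(chunk) * max(chunk) - sum(chunk)
    return wasted / (real + wasted)
```

For a batch holding one 512-token and one 64-token sequence, the fraction is (512 − 64) / 1024 ≈ 43.8%; packing shortens the padded width by merging short sequences into shared rows, which is why the skewed scenarios above see the largest drops.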

  • Document how to enable and benchmark inference-time sequence packing in README_Extended.
  • Extend bench/bench_gliner_e2e.py to support benchmarking full GLiNER models end to end.
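The tokens/s figures in the table amount to total tokens processed divided by wall-clock time. A sketch of that measurement loop (the real bench/bench_gliner_e2e.py drives full GLiNER models; the `forward` callable here is a stand-in):

```python
import time

def tokens_per_second(forward, batches):
    """Run forward over each batch and report total tokens per elapsed second.
    batches is a list of batches, each a list of token sequences."""
    total_tokens = sum(len(seq) for batch in batches for seq in batch)
    start = time.perf_counter()
    for batch in batches:
        forward(batch)
    elapsed = time.perf_counter() - start
    return total_tokens / elapsed
```

Comparing this metric for the packed and baseline paths over the same inputs yields the speedup column directly.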

vivekkalyanarangan30 (Contributor, Author) commented:

@urchade hopefully this helps.

urchade (Owner) commented Sep 26, 2025:

@Ingvarstep 👀

urchade requested a review from Ingvarstep on September 26, 2025 at 11:51.
Ingvarstep (Collaborator) commented:

@vivekkalyanarangan30 , awesome job, thanks for contributing!

Ingvarstep merged commit 4518525 into urchade:main on Oct 9, 2025.