Conversation

@jkrukowski (Owner)

Key Changes

  • Added attention mask support: Tokenizer now returns a BatchTokenizeResult struct containing both tokens and attention masks (sketched below)
  • Implemented masked mean pooling: Padding tokens are now correctly excluded from pooling calculations (also sketched below)
  • Updated all embedding models: Bert, CLIP, ModernBert, Roberta, and XLMRoberta now use attention masks in the forward pass
  • Refactored accuracy tests: Split the monolithic test file into focused per-model test suites with batch accuracy tests
  • Enhanced test infrastructure: Added shared utilities and updated the Python generation script for batch testing
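
A minimal sketch of the two pieces above, with illustrative names: the field names on BatchTokenizeResult and the maskedMeanPool helper are placeholders for this description and not necessarily the library's exact API.

```swift
// Hypothetical shape of the batch tokenization result: token ids padded to the
// longest text in the batch, plus a parallel attention mask (1 = real token, 0 = padding).
struct BatchTokenizeResult {
    var tokens: [[Int32]]
    var attentionMasks: [[Int32]]
}

// Masked mean pooling over one sequence's hidden states, shown with plain arrays:
// positions where the mask is 0 are excluded from both the sum and the divisor.
func maskedMeanPool(hiddenStates: [[Float]], attentionMask: [Int32]) -> [Float] {
    let hiddenSize = hiddenStates.first?.count ?? 0
    var sum = [Float](repeating: 0, count: hiddenSize)
    var count: Float = 0
    for (state, mask) in zip(hiddenStates, attentionMask) where mask == 1 {
        for i in 0..<hiddenSize {
            sum[i] += state[i]
        }
        count += 1
    }
    // Guard against an all-padding row (should not occur in practice).
    guard count > 0 else { return sum }
    return sum.map { $0 / count }
}
```

Excluding padding positions from both the numerator and the divisor is what keeps a text's embedding the same whether it is encoded alone or padded inside a batch.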

Models Updated

  • BertModel
  • ClipModel
  • ModernBertModel
  • RobertaModel
  • XLMRobertaModel

Breaking Changes

  • tokenizeTextsPaddingToLongest methods now return BatchTokenizeResult instead of [[Int32]]
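
A hedged before/after sketch of the migration for callers; the tokenizer value and the property names follow the sketch above and are assumptions, not verbatim library usage.

```swift
// Before: the method returned the padded token ids directly.
// let tokenIds: [[Int32]] = tokenizer.tokenizeTextsPaddingToLongest(texts)

// After: the method returns a BatchTokenizeResult; pull out both pieces.
// let result = tokenizer.tokenizeTextsPaddingToLongest(texts)
// let tokenIds = result.tokens
// let attentionMasks = result.attentionMasks
```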

co-author: @dang-hai

Co-authored-by: dang-hai <dan.duonghai@gmail.com>
@jkrukowski merged commit 63b5f21 into main on November 21, 2025.
@jkrukowski deleted the jankrukowski/fix-batch-processing branch on November 21, 2025 at 08:31.