Fix key partitioner to use bytesize instead of size for multi-byte characters #804

Open
mensfeld wants to merge 1 commit into master from fix/issue-629-bytesize-partitioner

@mensfeld (Member) commented Feb 5, 2026

Summary

Fixes #629

The key partitioner was using #size instead of #bytesize, which could cause incorrect partition assignments for keys with multi-byte characters. #size counts characters while #bytesize counts bytes, so the two differ as soon as a key contains a multi-byte character. Since Kafka works with byte arrays, the partitioner should use #bytesize to ensure correct partition assignment based on the actual byte length of the key.
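For illustration, here is a minimal sketch of the difference between the two methods. The key value is made up for the example and does not come from the PR or its specs.

```ruby
# Character count vs. byte count for a key with multi-byte characters.
key = "żółć" # illustrative key: 4 characters, 2 bytes each in UTF-8

char_len = key.size     # number of characters
byte_len = key.bytesize # number of bytes

# Treating the character count as a byte length exposes only part of the key.
visible = key.byteslice(0, char_len)

puts "size=#{char_len} bytesize=#{byte_len} visible=#{visible}"
```

For pure ASCII keys both methods return the same number, which is likely why the problem only shows up with multi-byte keys.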

Changes

  • Updated lib/rdkafka/bindings.rb to use str.bytesize instead of str.size in the partition key calculation
  • Added comprehensive test coverage for multi-byte character handling in spec/lib/rdkafka/producer_spec.rb
  • Updated CHANGELOG.md
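The effect of the length argument can be sketched with a stand-in partitioner. This is not the code in lib/rdkafka/bindings.rb: `partition_for`, the CRC32 hash, the partition count of 12, and the keys are all assumptions chosen for illustration. The sketch assumes the length decides how many bytes of the key get hashed.

```ruby
require "zlib"

# Hypothetical stand-in for a consistent partitioner: hash the first `len`
# bytes of the key and map the result onto the available partitions.
def partition_for(key, len, partition_count)
  Zlib.crc32(key.byteslice(0, len)) % partition_count
end

key_a = "żółć-1" # 6 characters, 10 bytes in UTF-8
key_b = "żółć-2" # 6 characters, 10 bytes in UTF-8

# With the character count, only the first 6 bytes ("żół") are hashed,
# so the distinguishing suffix never reaches the hash.
by_size_a = partition_for(key_a, key_a.size, 12)
by_size_b = partition_for(key_b, key_b.size, 12)

# With the byte count, the whole key is hashed.
by_bytes_a = partition_for(key_a, key_a.bytesize, 12)
by_bytes_b = partition_for(key_b, key_b.bytesize, 12)

puts "size:     #{by_size_a} / #{by_size_b}"
puts "bytesize: #{by_bytes_a} / #{by_bytes_b}"
```

Under these assumptions, the two distinct keys always land on the same partition when `size` is used, because their hashed prefixes are identical. With `bytesize` the full key contributes to the hash.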

Testing

Added two new test contexts:

  1. Verifies that multi-byte keys consistently route to the same partition across multiple produces
  2. Verifies that strings with the same character count but different byte sizes are handled correctly

🤖 Generated with Claude Code

…aracters

Fixes #629

The key partitioner was using `#size` instead of `#bytesize` which could
cause incorrect partition assignments for keys with multi-byte characters.
Since Kafka works with byte arrays, the partitioner should use `#bytesize`
to ensure correct partition assignment based on the actual byte length.

Changes:
- Updated lib/rdkafka/bindings.rb to use `str.bytesize` instead of `str.size`
- Added comprehensive tests for multi-byte character handling
- Updated CHANGELOG.md

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Labels

None yet

Development

Successfully merging this pull request may close these issues.

Key partitioner is using #size instead of bytesize potentially causing incorrect partition assignments

2 participants