
Fix byte size estimation with kafka producer #1393

Merged 1 commit into dpkp:master on Feb 24, 2018

Conversation

blakeembrey (Contributor) commented Feb 22, 2018

Unfortunately, byte size estimation broke in 1.4.x because it accidentally passes the raw key and value to the estimator instead of the serialised byte strings.
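For context, a minimal sketch of the shape of the bug and fix inside KafkaProducer.send() (kafka/producer/kafka.py); the exact surrounding code in 1.4.x may differ slightly:

```python
# Sketch only: key and value are serialized first, but the size estimate
# was being computed from the raw objects rather than the serialized bytes.
key_bytes = self._serialize(self.config['key_serializer'], topic, key)
value_bytes = self._serialize(self.config['value_serializer'], topic, value)

# Before (buggy): estimates against the unserialized objects.
# message_size = self._estimate_size_in_bytes(key, value, headers)

# After (fixed): estimates against the byte strings that are actually sent.
message_size = self._estimate_size_in_bytes(key_bytes, value_bytes, headers)
self._ensure_valid_record_size(message_size)
```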

jeffwidman (Collaborator) left a comment

Makes sense to me at first glance. It'd be good to have @tvoinarovskyi double-check, since he wrote this originally in fbea5f0, just in case he did it on purpose for some reason...

tvoinarovskyi (Collaborator) commented

Wow, sorry about that.

tvoinarovskyi (Collaborator) commented

I'll take another look in the evening, but it seems like a legitimate error. :)

jeffwidman (Collaborator) commented Feb 22, 2018

Also, we might want to change the function signature of _estimate_size_in_bytes(): asking for key, value is misleading, which helps explain how this mistake happened when the function was actually used. Maybe rename the parameters to key_bytes, value_bytes for clarity.
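A hypothetical sketch of what the proposed rename could look like (illustrative only, not the merged code):

```python
# Hypothetical rename: parameter names make it explicit that the estimator
# expects already-serialized byte strings, not raw (pre-serializer) objects.
def _estimate_size_in_bytes(self, key_bytes, value_bytes, headers=[]):
    """Estimate the on-the-wire size of a record from its serialized parts."""
```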

jeffwidman (Collaborator) commented Feb 22, 2018

Thanks for the fast update @blakeembrey.

_estimate_size_in_bytes() passes the values through to DefaultRecordBatchBuilder.estimate_size_in_bytes() or LegacyRecordBatchBuilder.estimate_size_in_bytes(), both of which also take key, value in their signatures... so I'm afraid it's turtles all the way down.

Probably all 5 of these usages should get updated for clarity: https://github.com/dpkp/kafka-python/search?utf8=%E2%9C%93&q=estimate_size_in_bytes&type=

Sorry to nitpick here; it's just that these functions are somewhat self-documenting, and since you're already knee-deep in it, you might as well make the naming consistent all the way through.
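For reference, the call chain looks roughly like this (paraphrased from KafkaProducer; see the search link above for the actual definitions, which may differ in detail):

```python
from kafka.record.default_records import DefaultRecordBatchBuilder
from kafka.record.legacy_records import LegacyRecordBatchBuilder

def _estimate_size_in_bytes(self, key, value, headers=[]):
    # Dispatches to a record-format-specific estimator; both of those
    # class methods also name their parameters key, value.
    magic = self._max_usable_produce_magic()
    if magic == 2:
        return DefaultRecordBatchBuilder.estimate_size_in_bytes(
            key, value, headers)
    else:
        return LegacyRecordBatchBuilder.estimate_size_in_bytes(
            magic, self.config['compression_type'], key, value)
```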

blakeembrey (Contributor, author) commented

@jeffwidman I've updated more of the places where key, value is passed around and expected to be bytes. I believe that headers are also expected to be a list of tuples of bytes.
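As a hypothetical usage sketch (illustrative values, not from this PR): when no serializers are configured, the key and value handed to send() are already bytes, and headers travel as a list of (name, value) tuples whose values are bytes:

```python
from kafka import KafkaProducer

producer = KafkaProducer(bootstrap_servers='localhost:9092')

# key and value are bytes, and the header value is bytes as well; the size
# estimate is computed from these serialized byte strings.
producer.send(
    'example-topic',
    key=b'user-42',
    value=b'{"clicked": true}',
    headers=[('trace-id', b'abc123')],
)
producer.flush()
```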

jeffwidman previously approved these changes on Feb 23, 2018 and left a comment:

@blakeembrey thanks! Looks like tests are failing?

jeffwidman dismissed their stale review on February 23, 2018 at 05:34:

Oops, tests are failing...

blakeembrey (Contributor, author) commented

I’ll have to look into it next week.

tvoinarovskyi (Collaborator) commented

Please leave the kafka/record part intact; it uses the same naming as in Java.

jeffwidman (Collaborator) commented

If that's the case, then we should probably just revert back to the original fix of c8945c0 and drop the other commits. I'm fine with that, and sorry for the runaround @blakeembrey.

blakeembrey (Contributor, author) commented

@jeffwidman No problem, just deleted the extra commits.

tvoinarovskyi (Collaborator) commented

Good job!

tvoinarovskyi merged commit e66d8c4 into dpkp:master on Feb 24, 2018