Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

datalake: manage protobuf descriptor lifetimes correctly #24154

Merged
merged 2 commits into from
Nov 18, 2024

Conversation

ballard26
Copy link
Contributor

The protobuf descriptor reference passed around by resolved_schema is to an object owned by a descriptor pool
in pandaproxy::schema_registry::protobuf_schema_definition. Therefore this PR changes resolved_schema to also pass around protobuf_schema_definition so that the lifetime of the reference doesn't exceed that of the object.

A follow-up PR to this one will include unit tests that will fail without this change.

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v24.2.x
  • v24.1.x
  • v23.3.x

Release Notes

  • none

@@ -96,7 +96,7 @@ struct schema_translating_visitor {
auto type = type_to_iceberg(*d).value();
co_return type_and_buf{
.type = resolved_type{
.schema = *d,
.schema = wrapped_protobuf_descriptor { *d, pb_def },
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can pb_def be moved?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's possible, pb_def is just a wrapper around a shared_ptr to the underlying descriptor pool though so it won't change much. I'll switch to moving it though in case the shared_ptr is ever switched to something else.

@@ -96,7 +96,7 @@ struct schema_translating_visitor {
auto type = type_to_iceberg(*d).value();
co_return type_and_buf{
.type = resolved_type{
.schema = *d,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we also document this in the descriptor method? Thanks!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do

@vbotbuildovich
Copy link
Collaborator

vbotbuildovich commented Nov 18, 2024

@vbotbuildovich
Copy link
Collaborator

the below tests from https://buildkite.com/redpanda/redpanda/builds/58178#01933ddf-68cc-4fd5-98f3-80b61e9d5d8e have failed and will be retried

gtest_raft_rpunit

@vbotbuildovich
Copy link
Collaborator

non flaky failures in https://buildkite.com/redpanda/redpanda/builds/58178#01933e3b-6273-4d21-9d8c-6a0366fcf3d3:

"rptest.tests.shard_placement_test.ShardPlacementTest.test_node_join.disable_license=True"

@vbotbuildovich
Copy link
Collaborator

Retry command for Build#58178

please wait until all jobs are finished before running the slash command

/ci-repeat 1
tests/rptest/tests/shard_placement_test.py::ShardPlacementTest.test_node_join@{"disable_license":true}

@rockwotj rockwotj merged commit 1102318 into redpanda-data:dev Nov 18, 2024
16 checks passed
Copy link
Contributor

@andrwng andrwng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the fix!

@andrwng
Copy link
Contributor

andrwng commented Nov 18, 2024

/backport v24.3.x

@@ -96,7 +96,7 @@ struct schema_translating_visitor {
auto type = type_to_iceberg(*d).value();
co_return type_and_buf{
.type = resolved_type{
.schema = *d,
.schema = wrapped_protobuf_descriptor { *d, std::move(pb_def) },
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// descriptor references an object owned by schema_def.
but what makes the guarantee that moving pb_def doesn't invalidate that same ownership property that it looks like you are trying to maintain?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call, right now its indirection. I.e, the descriptor pool is held by pb_def in a shared_ptr. It may be better to move pb_def first then get the reference though in case that changes in the future.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ahh i see its heap allocated. got it thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants