Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kafka: Consumer group stale static member properties on rejoin #23693

Merged
merged 2 commits into from
Oct 10, 2024

Conversation

IoannisRP
Copy link
Contributor

@IoannisRP IoannisRP commented Oct 9, 2024

Fixes: CORE-7749

When a static group member rejoins, only its protocols were updated based on the rejoin request.

The relevant updateMember functions have been extended to also include the client-id, client-host, session-timeout and rebalance-timeout properties.

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v24.2.x
  • v24.1.x
  • v23.3.x

Release Notes

Bug Fixes

  • Fixes a bug where only a group static member's protocols would be updated on rejoin, even if more properties had been passed to the rejoin command

@CLAassistant
Copy link

CLAassistant commented Oct 9, 2024

CLA assistant check
All committers have signed the CLA.

@IoannisRP IoannisRP marked this pull request as ready for review October 9, 2024 11:04
@BenPope BenPope requested review from BenPope and a team October 9, 2024 11:19
@pgellert pgellert self-requested a review October 9, 2024 11:32
BenPope
BenPope previously approved these changes Oct 9, 2024
Copy link
Member

@BenPope BenPope left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

src/v/kafka/server/group.cc Outdated Show resolved Hide resolved
src/v/kafka/server/group.cc Outdated Show resolved Hide resolved
r.client_id,
r.client_host,
r.data.session_timeout_ms,
r.data.rebalance_timeout_ms);
auto old_protocols = _members.at(new_member_id)->protocols().copy();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have an understanding of why old_protocols are a backup of the new protocols?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I couldn't find any reason for it to make sense to store the "old" protocols after they have been updated. I believe it is a bug. However, as I couldn't test it, I preferred to not touch it as part of this change and see what we can do about it as a separate thread.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tests/rptest/tests/consumer_group_test.py Outdated Show resolved Hide resolved
tests/rptest/tests/consumer_group_test.py Outdated Show resolved Hide resolved
tests/rptest/tests/consumer_group_test.py Outdated Show resolved Hide resolved
Up to now, group::update_member function would only update the
protocols of the group_member. The fields updated now include the
client id and host fields and the session and rebalance timeout
durations.
BenPope
BenPope previously approved these changes Oct 9, 2024
pgellert
pgellert previously approved these changes Oct 9, 2024
Copy link
Contributor

@pgellert pgellert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, nice first PR

tests/rptest/tests/consumer_group_test.py Outdated Show resolved Hide resolved
tests/rptest/tests/consumer_group_test.py Outdated Show resolved Hide resolved
r.client_id,
r.client_host,
r.data.session_timeout_ms,
r.data.rebalance_timeout_ms);
auto old_protocols = _members.at(new_member_id)->protocols().copy();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vbotbuildovich
Copy link
Collaborator

the below tests from https://buildkite.com/redpanda/redpanda/builds/56148#01927240-2408-45bb-b566-8b87d7d87867 have failed and will be retried

gtest_storage_e2e_rpfixture

@vbotbuildovich
Copy link
Collaborator

non flaky failures in https://buildkite.com/redpanda/redpanda/builds/56148#01927298-45f6-4ef0-b290-f69f4709cfd8:

"rptest.tests.cluster_config_test.ClusterConfigTest.test_valid_settings"

@vbotbuildovich
Copy link
Collaborator

Retry command for Build#56148

please wait until all jobs are finished before running the slash command

/ci-repeat 1
tests/rptest/tests/cluster_config_test.py::ClusterConfigTest.test_valid_settings

@vbotbuildovich
Copy link
Collaborator

@michael-redpanda
Copy link
Contributor

michael-redpanda commented Oct 10, 2024

CI Failure:

@michael-redpanda michael-redpanda merged commit c3ed8fe into redpanda-data:dev Oct 10, 2024
15 of 18 checks passed
@vbotbuildovich
Copy link
Collaborator

/backport v24.2.x

@vbotbuildovich
Copy link
Collaborator

/backport v24.1.x

@vbotbuildovich
Copy link
Collaborator

/backport v23.3.x

@vbotbuildovich
Copy link
Collaborator

Failed to create a backport PR to v24.1.x branch. I tried:

git remote add upstream https://github.com/redpanda-data/redpanda.git
git fetch --all
git checkout -b backport-pr-23693-v24.1.x-942 remotes/upstream/v24.1.x
git cherry-pick -x e1bc0df34c 72365bb767

Workflow run logs.

@vbotbuildovich
Copy link
Collaborator

Failed to create a backport PR to v23.3.x branch. I tried:

git remote add upstream https://github.com/redpanda-data/redpanda.git
git fetch --all
git checkout -b backport-pr-23693-v23.3.x-437 remotes/upstream/v23.3.x
git cherry-pick -x e1bc0df34c 72365bb767

Workflow run logs.

@vbotbuildovich
Copy link
Collaborator

Failed to create a backport PR to v24.2.x branch. I tried:

git remote add upstream https://github.com/redpanda-data/redpanda.git
git fetch --all
git checkout -b backport-pr-23693-v24.2.x-229 remotes/upstream/v24.2.x
git cherry-pick -x e1bc0df34c 72365bb767

Workflow run logs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants