Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[membership] Replace Ringop with PeerProvider interface #4653

Merged

Conversation

mantas-sidlauskas
Copy link
Contributor

What changed?
Membership resolver can be started with any PeerProvider implementation now.

This means:

  • membership.Resolver is not tightly coupled with Ringpop anymore
  • Ringpop is now just implementing PeerProvider interface.
  • Not using labels for HostInfo. Labels are moved to part of ringpop version of PeerProvider

Why?
This change allows to implement different type of peer provider easily.

How did you test it?
Unit tests, local test

Potential risks

Release notes

Documentation Changes

@coveralls
Copy link

coveralls commented Nov 22, 2021

Pull Request Test Coverage Report for Build 9775defa-410e-4629-871e-e87527d04086

  • 90 of 227 (39.65%) changed or added relevant lines in 7 files are covered.
  • 54 unchanged lines in 10 files lost coverage.
  • Overall coverage decreased (-0.003%) to 56.905%

Changes Missing Coverage Covered Lines Changed/Added Lines %
common/membership/resolver.go 24 26 92.31%
cmd/server/cadence/server.go 0 11 0.0%
common/membership/hashring.go 58 82 70.73%
common/peerprovider/ringpopprovider/provider.go 0 100 0.0%
Files with Coverage Reduction New Missed Lines %
common/task/weightedRoundRobinTaskScheduler.go 1 89.64%
service/history/execution/mutable_state_task_refresher.go 1 73.82%
common/task/fifoTaskScheduler.go 2 84.54%
service/history/queue/transfer_queue_processor.go 2 57.24%
service/matching/taskListManager.go 2 74.19%
common/membership/resolver.go 3 89.29%
service/history/queue/timer_queue_processor_base.go 5 78.6%
common/membership/hashring.go 8 76.05%
common/persistence/nosql/nosqlplugin/cassandra/tasks.go 14 72.8%
common/persistence/nosql/nosqlTaskStore.go 16 58.37%
Totals Coverage Status
Change from base Build 8da7310a-7645-43e8-9f24-a07d6848c2dc: -0.003%
Covered Lines: 82441
Relevant Lines: 144874

💛 - Coveralls

@mantas-sidlauskas mantas-sidlauskas requested a review from a team November 22, 2021 17:45
Copy link
Member

@Groxx Groxx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

still reading, but sending partial notes anyway

common/membership/hashring.go Show resolved Hide resolved
common/membership/hashring.go Outdated Show resolved Hide resolved
common/membership/hashring.go Outdated Show resolved Hide resolved
}

func (r *ringpopHashring) compareMembers(addrs []string) (map[string]struct{}, bool) {
func (r *ring) compareMembers(addrs []string) (map[string]struct{}, bool) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

compareMembersLocked too, as membersMap needs to be protected by the lock

common/peerprovider/ringpopprovider/provider.go Outdated Show resolved Hide resolved
common/membership/hashring.go Show resolved Hide resolved
Copy link
Member

@Groxx Groxx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit fuzzy on if duplicates are bad or just unexpected in all this tbh. one place checks, one doesn't, and I'm not sure what two subscribers in either location would imply behaviorally - naively it seems like it'd be fine, though it could be a sign of a double-start or something.

common/peerprovider/ringpopprovider/provider.go Outdated Show resolved Hide resolved
common/membership/hashring.go Outdated Show resolved Hide resolved
common/membership/hashring.go Outdated Show resolved Hide resolved
common/membership/hashring.go Outdated Show resolved Hide resolved
r.logger.Error("Failed to send event to subscriber, channel full", tag.Subscriber(name))
}
func (r *ring) refreshWithBackoff() error {
if r.members.refreshed.After(time.Now().Add(-minRefreshInternal)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

side-benefit of that lock-in-struct pattern: easier to notice incorrect locking!

this needs to be locked or it'll be a racy read. which also brings up a general complication: if you get e.g. 5 calls at once, they can all pass this check and refresh 5x. probably worth fixing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed "with backoff" method, now there's only refreshWithLock()

common/membership/hashring_test.go Outdated Show resolved Hide resolved
common/membership/hashring_test.go Outdated Show resolved Hide resolved
select {
case ch <- change:
default:
r.logger.Error("Failed to send listener notification, channel full", tag.Subscriber(name))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it really an error? or is it expected to be so rare that it may be a sign of problems?

common/peerprovider/ringpopprovider/provider.go Outdated Show resolved Hide resolved
common/membership/resolver_test.go Show resolved Hide resolved
common/membership/resolver_test.go Show resolved Hide resolved
Copy link
Member

@Groxx Groxx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Locking issues need to be fixed, but the rest are probably "whatever you agree with is fine", it's mostly minor stuff. Looks good overall :)

So I'll approve to unblock whenever you're ready, but feel free to re-request review if you want me to check the changes.

@mantas-sidlauskas mantas-sidlauskas enabled auto-merge (squash) November 24, 2021 14:55
@mantas-sidlauskas mantas-sidlauskas merged commit 770e9ec into cadence-workflow:master Nov 24, 2021
r.smu.RLock()
defer r.smu.RUnlock()

for name, ch := range r.subscribers {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mantas-sidlauskas I was looking at why we never see Ring membership changed log and looks like after this refactoring we stopped notifying the subscribers. Was this intentional or missed?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants