Skip to content

Update test case to set correct resource quota and work with multi-cluster setup #2264

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 9 commits into
base: main
Choose a base branch
from

Conversation

johscheuer
Copy link
Member

@johscheuer johscheuer commented Apr 7, 2025

Description

Test case fix for multi-cluster coordination setup and reduce log level for multi-cluster coordination. Fixed some other minor bugs in the global coordination setup that I detected during debugging.

Type of change

  • Bug fix (non-breaking change which fixes an issue)

Discussion

Testing

Ran some test manually.

Documentation

Follow-up

@foundationdb-ci
Copy link

Result of fdb-kubernetes-operator-pr on Linux CentOS 7

  • Commit ID: e2737fb
  • Duration 3:16:45
  • Result: ❌ FAILED
  • Error: Error while executing command: if $fail_test; then exit 1; fi. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@johscheuer johscheuer closed this Apr 7, 2025
@johscheuer johscheuer reopened this Apr 7, 2025
@foundationdb-ci
Copy link

Result of fdb-kubernetes-operator-pr on Linux CentOS 7

  • Commit ID: e2737fb
  • Duration 4:12:55
  • Result: ❌ FAILED
  • Error: Error while executing command: if $fail_test; then exit 1; fi. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@johscheuer johscheuer closed this Apr 8, 2025
@johscheuer johscheuer reopened this Apr 8, 2025
@foundationdb-ci
Copy link

Result of fdb-kubernetes-operator-pr on Linux CentOS 7

  • Commit ID: e2737fb
  • Duration 3:05:23
  • Result: ❌ FAILED
  • Error: Error while executing command: if $fail_test; then exit 1; fi. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Copy link

Result of fdb-kubernetes-operator-pr on Linux CentOS 7

  • Commit ID: 1dd04a9
  • Duration 2:59:15
  • Result: ❌ FAILED
  • Error: Error while executing command: if $fail_test; then exit 1; fi. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@johscheuer johscheuer closed this Apr 8, 2025
@johscheuer johscheuer reopened this Apr 8, 2025
@foundationdb-ci
Copy link

Result of fdb-kubernetes-operator-pr on Linux CentOS 7

  • Commit ID: 1dd04a9
  • Duration 3:08:13
  • Result: ❌ FAILED
  • Error: Error while executing command: if $fail_test; then exit 1; fi. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@johscheuer johscheuer closed this Apr 9, 2025
@johscheuer johscheuer reopened this Apr 9, 2025
@foundationdb-ci
Copy link

Result of fdb-kubernetes-operator-pr on Linux CentOS 7

  • Commit ID: 1dd04a9
  • Duration 3:19:38
  • Result: ❌ FAILED
  • Error: Error while executing command: if $fail_test; then exit 1; fi. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Copy link

Result of fdb-kubernetes-operator-pr on Linux CentOS 7

  • Commit ID: 7325f22
  • Duration 4:12:19
  • Result: ❌ FAILED
  • Error: Error while executing command: if $fail_test; then exit 1; fi. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@johscheuer johscheuer closed this Apr 9, 2025
@johscheuer johscheuer reopened this Apr 9, 2025
@foundationdb-ci
Copy link

Result of fdb-kubernetes-operator-pr on Linux CentOS 7

  • Commit ID: 7325f22
  • Duration 3:17:57
  • Result: ❌ FAILED
  • Error: Error while executing command: if $fail_test; then exit 1; fi. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@johscheuer johscheuer closed this Apr 9, 2025
@johscheuer johscheuer reopened this Apr 9, 2025
@foundationdb-ci
Copy link

Result of fdb-kubernetes-operator-pr on Linux CentOS 7

  • Commit ID: 2066efa
  • Duration 3:10:48
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Copy link

Result of fdb-kubernetes-operator-pr on Linux CentOS 7

  • Commit ID: 2066efa
  • Duration 3:01:22
  • Result: ❌ FAILED
  • Error: Error while executing command: if $fail_test; then exit 1; fi. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@johscheuer johscheuer closed this Apr 10, 2025
@johscheuer johscheuer reopened this Apr 10, 2025
@foundationdb-ci
Copy link

Result of fdb-kubernetes-operator-pr on Linux CentOS 7

  • Commit ID: 2066efa
  • Duration 4:13:19
  • Result: ❌ FAILED
  • Error: Error while executing command: if $fail_test; then exit 1; fi. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@johscheuer johscheuer closed this Apr 29, 2025
@johscheuer johscheuer reopened this Apr 29, 2025
@foundationdb-ci
Copy link

Result of fdb-kubernetes-operator-pr on Linux RHEL 9

  • Commit ID: 93117ff
  • Duration 4:13:49
  • Result: ❌ FAILED
  • Error: Error while executing command: if $fail_test; then exit 1; fi. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@johscheuer johscheuer requested a review from nicmorales9 April 30, 2025 17:21
@johscheuer johscheuer marked this pull request as ready for review April 30, 2025 17:22
@foundationdb-ci
Copy link

Result of fdb-kubernetes-operator-pr on Linux RHEL 9

  • Commit ID: 62fb251
  • Duration 4:14:18
  • Result: ❌ FAILED
  • Error: Error while executing command: if $fail_test; then exit 1; fi. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Copy link

Result of fdb-kubernetes-operator-pr on Linux RHEL 9

  • Commit ID: aa64c2d
  • Duration 4:13:49
  • Result: ❌ FAILED
  • Error: Error while executing command: if $fail_test; then exit 1; fi. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Copy link

Result of fdb-kubernetes-operator-pr on Linux RHEL 9

  • Commit ID: 6bfbfc0
  • Duration 3:29:10
  • Result: ❌ FAILED
  • Error: Error while executing command: if $fail_test; then exit 1; fi. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

Copy link
Member Author

@johscheuer johscheuer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Result of fdb-kubernetes-operator-pr on Linux RHEL 9

  • Commit ID: 6bfbfc0
  • Duration 3:29:10
  • Result: ❌ FAILED
  • Error: Error while executing command: if $fail_test; then exit 1; fi. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

I fixed a few other things in the global coordination but I disabled it yet for the e2e testing and I will be working on the other failures in a new PR before enabling it again.

@foundationdb-ci
Copy link

Result of fdb-kubernetes-operator-pr on Linux RHEL 9

  • Commit ID: a6661cc
  • Duration 2:43:10
  • Result: ❌ FAILED
  • Error: Error while executing command: if $fail_test; then exit 1; fi. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@johscheuer johscheuer closed this May 7, 2025
@johscheuer johscheuer reopened this May 7, 2025
@foundationdb-ci
Copy link

Result of fdb-kubernetes-operator-pr on Linux RHEL 9

  • Commit ID: a6661cc
  • Duration 2:40:46
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@johscheuer johscheuer closed this May 7, 2025
@johscheuer johscheuer reopened this May 7, 2025
@foundationdb-ci
Copy link

Result of fdb-kubernetes-operator-pr on Linux RHEL 9

  • Commit ID: a6661cc
  • Duration 2:45:24
  • Result: ❌ FAILED
  • Error: Error while executing command: if $fail_test; then exit 1; fi. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@johscheuer johscheuer closed this May 8, 2025
@johscheuer johscheuer reopened this May 8, 2025
@foundationdb-ci
Copy link

Result of fdb-kubernetes-operator-pr on Linux RHEL 9

  • Commit ID: a6661cc
  • Duration 2:42:57
  • Result: ❌ FAILED
  • Error: Error while executing command: if $fail_test; then exit 1; fi. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@johscheuer johscheuer closed this May 8, 2025
@johscheuer johscheuer reopened this May 8, 2025
@foundationdb-ci
Copy link

Result of fdb-kubernetes-operator-pr on Linux RHEL 9

  • Commit ID: a6661cc
  • Duration 3:03:31
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@johscheuer johscheuer closed this May 8, 2025
@johscheuer johscheuer reopened this May 8, 2025
@foundationdb-ci
Copy link

Result of fdb-kubernetes-operator-pr on Linux RHEL 9

  • Commit ID: a6661cc
  • Duration 2:57:00
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

Copy link
Contributor

@nicmorales9 nicmorales9 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some minor things but overall LGTM

func (processGroupID ProcessGroupID) GetProcessClass() ProcessClass {
tmp := strings.Split(string(processGroupID), "-")
// The process class will always be the second last item, the last item is the ID number.
return ProcessClass(tmp[len(tmp)-2])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this guaranteed to be 2 or more sections (i.e. is a dash guaranteed)?

endif
# else
# FEATURE_SYNCHRONIZATION_MODE="global"
# endif
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are we specifically testing this feature anywhere or waiting for some improvements?

// getAddressStringForIncludeAndExclude will return a string with addresses or localities that can be used for exclusion
// or inclusion. If any of the addresses defines a port, it will be reset by this method to ensure the whole pod gets
// excluded or included.
func getAddressStringForIncludeAndExclude(addresses []fdbv1beta2.ProcessAddress) string {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I feel like it would read a bit easier to rename this something more like getAddressStringsWithoutPorts and explain the use in the docstring, my first instinct is "how is the address list used for both"?

var mapResult map[fdbv1beta2.ProcessGroupID][]string
fdbClient.logger.V(1).Info("Fetching process addresses for global coordination in FDB", "prefix", prefix)
defer func() {
fdbClient.logger.V(1).Info("Done process addresses for global coordination in FDB", "prefix", prefix, "results", len(mapResult))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
fdbClient.logger.V(1).Info("Done process addresses for global coordination in FDB", "prefix", prefix, "results", len(mapResult))
fdbClient.logger.V(1).Info("Done fetching process addresses for global coordination in FDB", "prefix", prefix, "results", len(mapResult))

@@ -139,13 +146,53 @@ func AllProcessesReadyForExclusion(logger logr.Logger, pendingProcessGroups map[
return readyProcessGroups, nil
}

// GetAddressesFromCoordinationState will return the addresses based on the coordination state. If addresses should be included, the process address(es) are included and if localities should be included, the localities are included too.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I'm understanding correctly I think this is clearer:

Suggested change
// GetAddressesFromCoordinationState will return the addresses based on the coordination state. If addresses should be included, the process address(es) are included and if localities should be included, the localities are included too.
// GetAddressesFromCoordinationState will return the addresses based on the coordination state. If addresses should be included, the process address(es) are included in the result, and if localities should be included, the localities are included in the result too.

@@ -288,6 +364,19 @@ func UpdateGlobalCoordinationState(logger logr.Logger, cluster *fdbv1beta2.Found
}
}

addresses, ok := processAddresses[processGroup.ProcessGroupID]
if !ok || len(addresses) == 0 {
logger.V(1).Info("found process without process addresses, will be added", "processGroupID", processGroup.ProcessGroupID, "addresses", processGroup.Addresses)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is not 100% clear from the message whether you are saying that the process address will be added to the process or that the process itself will be added to something
I would reword to something more like
"found process without process addresses, the process will be added to the coordination state"
same for the old process log message below

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants