Skip to content

Conversation

@echen-98
Copy link
Contributor

@echen-98 echen-98 commented Oct 21, 2021

Nice.

Description of changes:
Properly retrieve the latest state for these 3 fields and compare them to the desired state. For each of these 3 spec fields, obtaining the corresponding latest state requires processing of a different Status field due to the structure of the DescribeReplicationGroups API response

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@echen-98
Copy link
Contributor Author

Removed the block in custom_update_api as we shouldn't need it anymore. Even if multi AZ and auto failover are present in the delta, CustomModifyReplicationGroup is still executed first and any field which needs to be updated separate from the generated code path will be.

Test succeeded locally; running prow job to ensure this change doesn't affect any other tests.

# assert AF and multi AZ
assert resource['status']['automaticFailover'] == "disabled"
assert resource['status']['multiAZ'] == "disabled"
assert resource['status']['automaticFailover'] == "enabled"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these were just wrong assertions; if you check the file the last patch enables both of these fields

func multiAZRequiresUpdate(desired *resource, latest *resource) bool {
// no preference for multi AZ specified; no update required
if desired.ko.Spec.MultiAZEnabled == nil {
return false
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If desired.ko.Spec.MultiAZEnabled is nil or &false, and latest.ko.Spec.MultiAZEnable is &true no updates are required? there are no API calls to make such update?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it is nil, then the user has no preference and the server will select the default value (can't remember if it's enabled or disabled off the top of my head).

One thing to note is that latest.ko.Spec.MultiAZEnabled will always the same here as desired.ko.Spec.MultiAZEnabled because latest is copied from desired and MultiAZEnabled (bool) is not part of the DescribeReplicationGroups API response, so we get the status directly from MultiAZ (string/enum).

If desired.ko.Spec.MultiAZEnabled is &false and *latest.ko.Spec.MultiAZ is "enabled", then line 134 handles this case and there will be an API call to disable multi AZ. The test test_rg_fault_tolerance covers this scenario.

Comment on lines 132 to 136
if *desired.ko.Spec.MultiAZEnabled {
return *latest.ko.Status.MultiAZ != "enabled"
} else {
return *latest.ko.Status.MultiAZ != "disabled"
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines 151 to 155
if *desired.ko.Spec.AutomaticFailoverEnabled {
return *latest.ko.Status.AutomaticFailover != "enabled"
} else {
return *latest.ko.Status.AutomaticFailover != "disabled"
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

continue
}

if member.CurrentRole != nil && *member.CurrentRole == "primary" && member.CacheClusterID != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

QQ: you can only have one "primary" cluster in a node group?

Copy link
Contributor Author

@echen-98 echen-98 Oct 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, and primary "cluster" (it's just a node) ID can only be specified when there is one node group in the replication group.

Comment on lines 158 to 161
// primaryClusterIDRequiresUpdate retrieves the current primary cluster ID and determines whether
// an update is required. If no desired state is specified or there is an issue retrieving the
// latest state, return false, nil. Otherwise, return false or true depending on equality of
// the latest and desired states, and a non-nil pointer to the latest value
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The triple space after the comment token // are not necessary. Use only one in every line. This might impact the generation of the GoDoc (untested)

Comment on lines +58 to +69
if multiAZRequiresUpdate(desired, latest) {
delta.Add("Spec.MultiAZEnabled", desired.ko.Spec.MultiAZEnabled, latest.ko.Status.MultiAZ)
}

if autoFailoverRequiresUpdate(desired, latest) {
delta.Add("Spec.AutomaticFailoverEnabled", desired.ko.Spec.AutomaticFailoverEnabled,
latest.ko.Status.AutomaticFailover)
}

if updateRequired, current := primaryClusterIDRequiresUpdate(desired, latest); updateRequired {
delta.Add("Spec.PrimaryClusterID", desired.ko.Spec.PrimaryClusterID, *current)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since filterDelta does not only remove deltas anymore, maybe put this logic in a different function? Or rename it to something more generalistic?

@echen-98
Copy link
Contributor Author

Made changes as requested. Waiting for test run to complete

Comment on lines 74 to 79
// 1. When automaticFailoverEnabled differs:
// if automaticFailoverEnabled == false; do nothing in this custom logic, let the modify execute first.
// else if automaticFailoverEnabled == true then following logic should execute first.
// 2. When multiAZ differs
// if multiAZ = true then below is fine.
// else if multiAZ = false ; do nothing in custom logic, let the modify execute.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Considering the required order of operation, it seems this logic is still needed:
When delta is present for auto failover and/or multi AZ fields:

  • When desired value is false then invoke modifyReplicationGroup (default implementation in sdk.go) first
  • When desired value is true then let the rest of the custom modify logic execute first for AF/multiAZ to be applied later once all custom logic has executed.

Copy link
Contributor Author

@echen-98 echen-98 Oct 25, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When desired value is false then invoke modifyReplicationGroup (default implementation in sdk.go) first

modifyReplicationGroup only leads to the default/generated implementation if it returns nil, nil, which happens when there is no difference in security group IDs or engine version. That is to say, if we want to skip straight to the generated code, we should immediately return nil, nil instead of calling rm.modifyReplicationGroup.

When desired value is true then let the rest of the custom modify logic execute first for AF/multiAZ to be applied later once all custom logic has executed.

This second case is equivalent to removing the below block, so the uncertainty remains with the first case above. Which of the below fields require multi AZ and auto failover to be disabled first (replica count is one, but I'm not sure if there's another)? If order matters then we should have had a test which failed upon removing this block, but we can add one now if it doesn't exist.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Went ahead and reverted the changes to this particular file. Since we have limited time to iterate on this (as discussed offline), we can hold these thoughts for later. Same with non-blocking comments below, let's merge if this test run passes.

Copy link
Member

@a-hilaly a-hilaly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

MultiAZEnabled:
compare:
is_ignored: true
PrimaryClusterId: # note: "PrimaryClusterID" will not function properly
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting, maybe we should open an Issue for this?

// an update is required. If no desired state is specified or there is an issue retrieving the
// latest state, return false, nil. Otherwise, return false or true depending on equality of
// the latest and desired states, and a non-nil pointer to the latest value
func primaryClusterIDRequiresUpdate(desired *resource, latest *resource) (bool, *string) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tiny QQ: Any reason why are you using a *string and not just a string?

@jaypipes jaypipes closed this Nov 9, 2021
@jaypipes jaypipes reopened this Nov 9, 2021
}

if multiAZRequiresUpdate(desired, latest) {
delta.Add("Spec.MultiAZEnabled", desired.ko.Spec.MultiAZEnabled, latest.ko.Status.MultiAZ)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why isn't the Spec.MultiAZEnabled field being set to the return value of the ReadOne operation's Output shape's MultiAZ field?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are different types; see response below for more detail

Comment on lines +130 to +136
// true maps to "enabled"; false maps to "disabled"
// this accounts for values such as "enabling" and "disabling"
if *desired.ko.Spec.MultiAZEnabled {
return *latest.ko.Status.MultiAZ != string(svcapitypes.MultiAZStatus_enabled)
} else {
return *latest.ko.Status.MultiAZ != string(svcapitypes.MultiAZStatus_disabled)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two things:

  1. There should not be a Status.MultiAZ field. There should only be a single field: Spec.MultiAZEnabled

  2. This kind of custom code should go in your updateSpecFields function in custom_set_output.go and look something like this:

	// MultiAZEnabled is a boolean in the Spec field but a string enum in the
	// output shape field. Here, we reconcile that difference...
	if (respRG.MultiAZ == nil) {
		// Or set this to a default value if nil isn't default?
		resource.ko.Spec.MultiAZEnabled = nil
	} else {
		resource.ko.Spec.MultiAZEnabled = (*respRG.MultiAZ == string(svcapitypes.MultiAZStatus_enabled))
	}

If you do the above, you don't have to worry about delta comparisons at all.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should not be a Status.MultiAZ field. There should only be a single field: Spec.MultiAZEnabled

Agreed in principle, but the existence of a separate MultiAZ field is an ElastiCache API quirk rather than a conscious decision on our part.

This kind of custom code should go in your updateSpecFields function in custom_set_output.go and look something like this:

I'm glad you brought this up because it's helping me externalize why we're moving away from updateSpecFields in many cases.

Initially, I wanted to use updateSpecFields to update every single spec field during ReadOne, so that the generated delta code could handle everything and there would be no need for manual comparison between desired.Spec and latest.Status.

However, Spec fields often don't allow us to store all the information we're getting from the API response. For example, Spec.AutomaticFailoverEnabled is a bool, whose associated field Status.AutomaticFailover is a string/enum which could be any of "enabled", "disabled", "enabling", or "disabling". We would have to map each of these four states back to a bool (square peg, round hole sort of thing), when it would make more sense to just make a decision based on the full version of the information (Status) since we have it anyway.

Granted, the MultiAZ case is a little different since at the moment, the only two possible values are "enabled" and "disabled". However, it's entirely possible that in the future ElastiCache decides to have this field update asynchronously or make any other change which could introduce additional states/values for Status.MultiAZ. So the suggested approach could break given these changes, and even if no service-side changes occur, I would say it's still non-ideal to force a string/enum into a bool when we can just make a decision based on the full, unadulterated information from the API response.

As far as the role of updateSpecFields, I think it absolutely should exist, but the function name and intended purpose should change. The majority of this function is already calling the DescribeCacheClusters API and populating the associated fields in the replication group object, since the latest state for some Spec fields cannot be retrieved directly from the DescribeReplicationGroups API. I would say that this should be the primary purpose of this function in the future.

So to sum up, the function currently named updateSpecFields is an extension of SdkFind/ReadOne in that we are still retrieving the latest state of the replication group (but via a different API). There should be limited mapping back to Spec fields to avoid losing information; we only do it here because this state will be lost if we don't save it back to the latest object. On the other hand, modifyDelta should look at the most complete representation of the latest state of each Spec field available to it (which is often in the Status) and make a decision based on that. Therefore I think we should keep things as-is.

@nmvk
Copy link
Contributor

nmvk commented Nov 16, 2021

/lgtm

@ack-bot ack-bot added the lgtm Indicates that a PR is ready to be merged. label Nov 16, 2021
@ack-bot
Copy link
Collaborator

ack-bot commented Nov 16, 2021

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: A-Hilaly, echen-98, nmvk

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [A-Hilaly,echen-98,nmvk]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ack-bot ack-bot merged commit 277966a into aws-controllers-k8s:main Nov 17, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved lgtm Indicates that a PR is ready to be merged.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants