Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CCE: create one node pool for each availability zone #44

Merged
merged 5 commits into from
Aug 1, 2022
Merged

CCE: create one node pool for each availability zone #44

merged 5 commits into from
Aug 1, 2022

Conversation

k11h-de
Copy link
Contributor

@k11h-de k11h-de commented Jul 27, 2022

Hello iits team,

We noticed that one can define a list of availability zones (az) for the variable node_config.availability_zones in the CCE module. When doing so only one node pool is created and the first item in the list is used in the config.

In this PR, I simply added a count to the terraform module to make sure one node pool is created for each az.
This adds real high availability to the CCE cluster in case one OTC az goes down.

I was able to test this successfully. The existing node pool was renamed in-place (not destroyed and recreated).

Hope this is helpful
Cheers Karsten


@k11h-de k11h-de changed the title create one node pool for each availability zone CCE: create one node pool for each availability zone Jul 27, 2022
@canaykin
Copy link
Contributor

canaykin commented Jul 29, 2022

Hi Karsten,

You are absolutely correct that it was a bad idea to always use the first AZ for the single node pool, and I love the solution to create one node pool for each AZ.

I added some suggestions but they are mainly syntax changes rather than the logic. The only problem I have with this change is the fact that it is 100% not backwards compatible and definitely breaking.

Changing the node pool names to add az prefix alone will cause destroy and recreate all of the existing node pools and will most certainly cause downtime.

I will discuss this with @victorgetz too as I believe the improvement is justified to be added for multi AZ but we will need to plan a major version release for this. Meanwhile I will try to find a way to make this backwards compatible but I am unable to find a solution so far.

Regardless, thank you very much for the contribution.
Best,
Can.

@canaykin
Copy link
Contributor

canaykin commented Jul 29, 2022

I was able to test this successfully. The existing node pool was renamed in-place (not destroyed and recreated).

I have just seen this, if this is the case we should definitely merge it. The question then becomes will it break if we use foreach instead of count. I will test both approaches and post an update.

@canaykin
Copy link
Contributor

canaykin commented Jul 29, 2022

Karsten, you are a genius.

If count is used instead of for_each terraform automatically moves the unindexed resource to a 0 index in its state:

Terraform will perform the following actions:

  # module.cce.opentelekomcloud_cce_node_pool_v3.cluster_node_pool[0] will be updated in-place
  # (moved from module.cce.opentelekomcloud_cce_node_pool_v3.cluster_node_pool)
  ~ resource "opentelekomcloud_cce_node_pool_v3" "cluster_node_pool" {
        id                       = "REDACTED"
      ~ name                     = "cendo-dev-node-pool" -> "cendo-dev-node-pool-eu-de-01"
        # (15 unchanged attributes hidden)


        # (2 unchanged blocks hidden)
    }

While I find the for_each cleaner, it breaks the approach and causes a destroy and recreate instead. With count we can merge it. I updated my suggestions to match this.

modules/cce/cluster.tf Outdated Show resolved Hide resolved
modules/cce/output.tf Outdated Show resolved Hide resolved
modules/cce/variables.tf Outdated Show resolved Hide resolved
modules/cce/output.tf Outdated Show resolved Hide resolved
k11h-de and others added 4 commits July 29, 2022 21:10
Co-authored-by: canaykin <70644420+canaykin@users.noreply.github.com>
Co-authored-by: canaykin <70644420+canaykin@users.noreply.github.com>
Co-authored-by: canaykin <70644420+canaykin@users.noreply.github.com>
Co-authored-by: canaykin <70644420+canaykin@users.noreply.github.com>
@k11h-de
Copy link
Contributor Author

k11h-de commented Jul 29, 2022

@canaykin
Thanks a lot for your feedback and suggestions. I hope other users find this useful too.

One thing came to my mind to consider (and test):
In case the old (default) value was ["eu-de-03"] and one changes this to ["eu-de-02", "eu-de-03"] so the item with index [0] changes - wouldn't this cause (destroy and) re-creation?

@canaykin
Copy link
Contributor

canaykin commented Aug 1, 2022

@canaykin Thanks a lot for your feedback and suggestions. I hope other users find this useful too.

One thing came to my mind to consider (and test): In case the old (default) value was ["eu-de-03"] and one changes this to ["eu-de-02", "eu-de-03"] so the item with index [0] changes - wouldn't this cause (destroy and) re-creation?

Absolutely, but in that case a change is expected as the ordering of elements in a list is important. User is expected to know [a,b]=[b,a]. More importantly, this will only have an effect if use changes this in the future, but not if the user changed it before the release.

@canaykin canaykin merged commit 2d8d6a4 into iits-consulting:master Aug 1, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants