Fix volume attachment limit calculation #1561

Merged Apr 11, 2023 (1 commit into kubernetes-sigs:master)

@torredil
Member

torredil commented Apr 6, 2023

Is this a bug fix or adding new feature?

  • Bug fix.

What is this PR about? / Why do we need it?

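Currently, the blockVolumes count derived from the block device mappings returned by the EC2 metadata service is incorrect, because it counts newline characters in the mapping response rather than attached EBS volumes: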

func (d *nodeService) getVolumesLimit() int64 {

blockDevMappings = strings.Count(mappings, "\n")

For example, on Windows the virtual devices for instance store volumes (the ephemeral entries) are counted as well. Subtracting that inflated block device count from the available limit yields an attachment limit of -1:

$ kubectl logs ebs-csi-node-windows-gjllb -n kube-system -c ebs-plugin
      
I0404 21:25:23.743697    2772 driver.go:73] "Driver Information" Driver="ebs.csi.aws.com" Version="v1.17.0"
I0404 21:25:25.521753    2772 node.go:81] "regionFromSession Node service" region=""
I0404 21:25:25.523961    2772 metadata.go:85] "retrieving instance data from ec2 metadata"
I0404 21:25:25.544851    2772 metadata.go:92] "ec2 metadata is available"
I0404 21:25:25.546760    2772 metadata_ec2.go:25] "Retrieving EC2 instance identity metadata" regionFromSession=""
I0404 21:25:25.549832    2772 metadata_ec2.go:70] "Retrieving EC2 block device mapping metadata" mappings=<
	ami
	ebs1
	ephemeral0
	ephemeral1
	ephemeral10
	ephemeral11
	ephemeral12
	ephemeral13
	ephemeral14
	ephemeral15
	ephemeral16
	ephemeral17
	ephemeral18
	ephemeral19
	ephemeral2
	ephemeral20
	ephemeral21
	ephemeral22
	ephemeral23
	ephemeral24
	ephemeral25
	ephemeral3
	ephemeral4
	ephemeral5
	ephemeral6
	ephemeral7
	ephemeral8
	ephemeral9
	root
 >
I0404 21:25:28.465190    2772 node.go:749] "blockVolumes count" blockVolumes=28

This explains why the CSINode Allocatable property was not being set on Windows nodes. With this change, only entries in the block device mapping that start with ebs are counted. This fixes double counting of the root volume and stops entries that are not EBS volumes (such as ami and the ephemeral devices) from being counted.
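For illustration, here is a minimal sketch contrasting the two counting approaches described above. It is not the merged diff: the function names countNewlines and countEBSMappings are hypothetical, and the only assumption carried over from the description is that the metadata response is a newline-separated list of mapping names like the one in the log.

```go
package main

import "strings"

// countNewlines mirrors the pre-fix expression quoted in the description:
// it counts newline characters, so every mapping entry (ami, ephemeralN,
// root, ...) inflates the result. Hypothetical helper name.
func countNewlines(mappings string) int {
	return strings.Count(mappings, "\n")
}

// countEBSMappings sketches the behavior described for the fix: only
// entries whose name starts with "ebs" are counted. Hypothetical helper
// name; the real driver code may be structured differently.
func countEBSMappings(mappings string) int {
	count := 0
	for _, entry := range strings.Split(mappings, "\n") {
		// TrimSpace drops the leading tabs and any trailing "\r".
		if strings.HasPrefix(strings.TrimSpace(entry), "ebs") {
			count++
		}
	}
	return count
}
```

Applied to the 29-entry mapping in the log above, the newline count gives 28 (matching blockVolumes=28), while the prefix-based count gives 1, since ebs1 is the only entry that starts with ebs.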

What testing is done?

  • make test
  • External storage test: volumeLimits [It] should verify that all csinodes have volume limits
  • CSINode Allocatables count present and correct:
$ kubectl describe csinode

Name:               ip-192-168-50-216.ap-northeast-2.compute.internal
Labels:             <none>
Annotations:        storage.alpha.kubernetes.io/migrated-plugins:
                      kubernetes.io/aws-ebs,kubernetes.io/azure-disk,kubernetes.io/azure-file,kubernetes.io/cinder,kubernetes.io/gce-pd
CreationTimestamp:  Thu, 06 Apr 2023 18:58:23 +0000
Spec:
  Drivers:
    ebs.csi.aws.com:
      Node ID:  i-0d5778d50d069306f
      Allocatables:
        Count:        37

Signed-off-by: Eddie Torres <torredil@amazon.com>
@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Apr 6, 2023
@k8s-ci-robot k8s-ci-robot requested review from gtxu and rdpsin April 6, 2023 19:20
@k8s-ci-robot k8s-ci-robot added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Apr 6, 2023
@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 6, 2023
@ConnorJC3
Contributor

/retest

morning data collection

@ConnorJC3
Contributor

/retest

rollback test

@ConnorJC3
Contributor

/retest

flake

@ConnorJC3
Contributor

/approve

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ConnorJC3, gtxu, hanyuel

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ConnorJC3
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 10, 2023
@torredil
Member Author

/test pull-aws-ebs-csi-driver-external-test-eks

@gtxu
Contributor

gtxu commented Apr 10, 2023

/retest

@torredil
Member Author

/retest

@ConnorJC3
Contributor

/retest

How unlucky can we get

@ConnorJC3
Contributor

/retest

Another round on this roller coaster

@k8s-ci-robot k8s-ci-robot merged commit 4e96fd3 into kubernetes-sigs:master Apr 11, 2023