-
Notifications
You must be signed in to change notification settings - Fork 108
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Tag volumes on Create/Attach #130
Conversation
driver/controller.go
Outdated
@@ -253,6 +261,8 @@ func (d *Driver) ControllerPublishVolume(ctx context.Context, req *csi.Controlle | |||
return nil, err | |||
} | |||
|
|||
d.tagVolume(ctx, vol) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're not checking for the error here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed. In an earlier iteration I was contemplating making these updates asynchronously so as to not scuttle a volume attach. I changed that approach, but forgot to handle the error value up here.
I wasn't sure what the best code would be, so I went with internal
. eea3ef3#diff-aa5513ce5e9cac724e0d7994151fed94R267
driver/controller.go
Outdated
|
||
ctx, cancel := context.WithTimeout(ctx, doAPITimeout) | ||
resp, err := d.tags.TagResources(ctx, d.doVolTag, tagReq) | ||
cancel() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe you should call this in a defer
, otherwise the context will be cancelled immediately, which is also passed down below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch on the context reuse. I was meaning to use the parent context for each API call rather than chaining them like that. Tests would have missed that since we faked the TagService.
I struggled on the right way to deal with context here. From my perspective, as long as we're making progress we should reset the timeout for each call. So I didn't want to use a single context/cancel.
I had originally opted to leave the cancel()
bare rather than in a defer because I thought deferring a cancel() and then overriding that cancel function-ref with a new cancel for the next call was confusing. defer cancel()
dereferences the function-ref at the line where the defer is; not the end of the function. Using the defer does mean those first two timeouts might fire (with no effect) after their associated API call completes but before the defer cancels them. The only harm is a fairly trivial amount of inefficiency in exchange for a lot of clarity.
ctx, cancel = context.WithTimeout(ctx, doAPITimeout) | ||
resp, err = d.tags.TagResources(ctx, d.doVolTag, tagReq) | ||
cancel() | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I couldn't understand what this code is meant to do. If we create a tag, why do we check for it's existence again?
Also seems like we're not checking resp
here and just continue. Finally, we should return from the error check immediately and not continue. I know you're overriding the err
here, but I feel like that makes the code open for future bugs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When using the TagsService directly (as opposed to as an argument to CreateVolume
) the tag must exist before you can assign resources (like a volume) to it. We get a 404 if the tag doesn't exist. In that case we create the tag, and then retry the TagResources
which applies the tag to our volume.
driver/driver.go
Outdated
@@ -106,6 +112,7 @@ func NewDriver(ep, token, url string) (*Driver, error) { | |||
}) | |||
|
|||
return &Driver{ | |||
doVolTag: os.Getenv(doVolTagEnv), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We shouldn't assign it here. Instead this should be passed to cmd/do-csi-plugin
via a flag, such as --do-tag
. And then in our controller we would use the still the same environment variable in this form:
- name: csi-do-plugin
image: digitalocean/do-csi-plugin:v1.0.0
args :
- "--endpoint=$(CSI_ENDPOINT)"
- "--token=$(DIGITALOCEAN_ACCESS_TOKEN)"
- "--url=$(DIGITALOCEAN_API_URL)"
- "--do-tag=$DIGITALOCEAN_TAG_VOLUMES)"
This is to prevent any side effects on the binary and cmd/do-csi-plugin
should the only place we pass in external values)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
Also I think vendoring went wrong and our build is also failing. I would suggest to first upgrade |
- rework tagVolumes for clarity and to fix accidental context reuse in after cancel - use flag rather than environmental var - rename doVolTag to doTag in anticipation of upcoming snapshot tagging - check errors on tagVolumes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi Cody, the current code looks better now. I've added couple of comments that simplifies some of our operations and also covers an edge case.
if err != nil { | ||
ll.Errorf("error tagging volume: %s", err) | ||
return nil, status.Errorf(codes.Internal, "failed to tag volume") | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One thing that came to my mind yesterday is that we shouldn't tag the volume if no values have been passed to --do-tag
. This driver is also used by people who don't use our managed service, so in their case --do-tag
will be most probably empty. We should change it therefore to:
if d.doTag != "" {
err = d.tagVolume(ctx, vol)
if err != nil {
ll.Errorf("error tagging volume: %s", err)
return nil, status.Errorf(codes.Internal, "failed to tag volume")
}
}
Please merge the vendor changes and rebase on top of it before merging this 👍 |
* tag DO volumes on create/attach
* tag DO volumes on create/attach
* Cherry-pick: Add tagging support for Volumes via the new `--do-tag` flag [[GH-130]](#130) * Cherry-pick: Fix support for volume snapshots by setting snapshot id on volume creation [[GH-129]](#129) * Cherry-pick: Goreportcard fixes (typos, exported variables, etc..) [[GH-121]](#121) * Cherry-pick: Rename the cluster role bindings for the `node-driver-registrar` to be consistent with the other role bindings. [[GH-118]](#118) * Cherry-pick: Remove the `--token` flag for the `csi-do-node` driver. Drivers running on the node don't need the token anymore. [[GH-118]](#118) * Cherry-pick: Don't check the volume limits on the worker nodes (worker nodes are not able to talk to DigitalOcean API) [[GH-142]](#142) * Cherry-pick: Update `godo` (DigitalOcean API package) version to v1.13.0 [[GH-143]](#143) * Cherry-pick: Fix race in snapshot integration test. [[GH-146]](#146) * Cherry-pick: Add tagging support for Volume snapshots via the new `--do-tag` flag [[GH-145]](#145)
* Cherry-pick: Add tagging support for Volumes via the new `--do-tag` flag [[GH-130]](#130) * Cherry-pick: Fix support for volume snapshots by setting snapshot id on volume creation [[GH-129]](#129) * Cherry-pick: Goreportcard fixes (typos, exported variables, etc..) [[GH-121]](#121) * Cherry-pick: Rename the cluster role bindings for the `node-driver-registrar` to be consistent with the other role bindings. [[GH-118]](#118) * Cherry-pick: Remove the `--token` flag for the `csi-do-node` driver. Drivers running on the node don't need the token anymore. [[GH-118]](#118) * Cherry-pick: Don't check the volume limits on the worker nodes (worker nodes are not able to talk to DigitalOcean API) [[GH-142]](#142) * Cherry-pick: Update `godo` (DigitalOcean API package) version to v1.13.0 [[GH-143]](#143) * Cherry-pick: Fix race in snapshot integration test. [[GH-146]](#146) * Cherry-pick: Add tagging support for Volume snapshots via the new `--do-tag` flag [[GH-145]](#145)
Tag DigitalOcean volumes on Create and Attach when the
--do-tag
flag is specified. This work is being done to improve billing readability for DO managed Kubernetes cluster.