
Storage capacity #248

Merged: 4 commits from the storage-capacity branch, merged into kubernetes-csi:master on Feb 24, 2021

Conversation

@pohly (Contributor) commented Feb 12, 2021

What type of PR is this?
/kind feature

What this PR does / why we need it:

We need a way to demonstrate and test both distributed provisioning and storage capacity tracking.

This PR also fixes the long-standing issue of not properly protecting global variables against concurrent access.

Which issue(s) this PR fixes:
Fixes #72

Special notes for your reviewer:

This needs to be complemented by special Prow jobs that use the new deployment: kubernetes/test-infra#20848

Does this PR introduce a user-facing change?:

simulate storage capacity constraints and publish available capacity information

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/feature Categorizes issue or PR as related to a new feature. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Feb 12, 2021
@k8s-ci-robot k8s-ci-robot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Feb 12, 2021
@pohly (Contributor, Author) commented Feb 12, 2021

I can split out the locking change into a separate PR, if desired. It's already provided as a separate commit.

@pohly (Contributor, Author) commented Feb 16, 2021

/test pull-kubernetes-csi-csi-driver-host-path-distributed-on-kubernetes-master
/test pull-kubernetes-csi-csi-driver-host-distributed-on-kubernetes-1-19

@msau42 (Collaborator) commented Feb 17, 2021

/assign @verult

@pohly force-pushed the storage-capacity branch 2 times, most recently from e2fd59b to 2317581 (February 19, 2021)
name: csi-data-dir

- name: hostpath
image: k8s.gcr.io/sig-storage/hostpathplugin:v1.6.0
Contributor (Author):

Originally I had "canary" here with a TODO. I think we can switch directly to using the next release number; that way we avoid some churn.

@verult (Contributor) left a comment:

This is cool! Largely looks good to me

		volumes:   map[string]hostPathVolume{},
		snapshots: map[string]hostPathSnapshot{},
	}
	hp.discoverExistingSnapshots()
Contributor:

Just curious: does the order of operations matter between discovering snapshots and volumes? Wondering why you moved snapshots before volumes

Contributor (Author):

I don't think it matters and I didn't have any particular reason for changing it. I can revert the order, just to be safe.

	// gRPC calls involving any of the fields below must be serialized
	// by locking this mutex before starting. Internal helper
	// functions assume that the mutex has been locked.
	mutex sync.Mutex
Contributor:

Instead of locking/unlocking in top-level gRPC functions only where needed (and having to inspect the helper code to identify where locking should take place), what do you think about moving these 3 fields into a separate struct and defining read/write methods that lock/unlock at the top of the method body?

e.g.

type hostPathCache struct { // TODO: needs a better name
	mutex     sync.Mutex
	volumes   map[string]hostPathVolume
	snapshots map[string]hostPathSnapshot
}

func (c *hostPathCache) GetVolume(volID string) (hostPathVolume, bool) {
	c.mutex.Lock()
	defer c.mutex.Unlock()
	vol, ok := c.volumes[volID]
	return vol, ok
}

// ... similar accessors for snapshots ...

type hostPath struct {
	// ...
	cache *hostPathCache
}

Contributor:

And is it worth having an RWMutex instead?

Contributor (Author):

A production-quality driver would definitely want to be more intelligent about its locking. But this is just a demo driver where the main priority is correctness, not performance when executing many operations in parallel.

Therefore I prefer to keep the locking as simple as possible, i.e. lock in the gRPC functions after inspecting parameters and before using local state. The rule for helper functions is simple: helpers don't lock, and a gRPC function must take the lock before calling them. Only a few read-only functions may (but don't have to) be treated as exceptions.

What you propose may have a correctness issue, for example: CreateVolume checks for an existing volume with GetVolume, then proceeds to create it. In parallel, the function is called again for the same volume and also proceeds to create it. Both complete, and we leak one volume because only one of them gets stored and returned to the caller.
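
For illustration, a minimal sketch of this convention and of the race avoided by holding the lock across the check-and-create sequence (the types and helpers are simplified stand-ins, not the actual driver code):

package main

import (
	"fmt"
	"sync"
)

type hostPathVolume struct {
	VolID   string
	VolName string
}

type hostPath struct {
	// mutex serializes all gRPC calls that touch the fields below.
	// Helpers assume that the caller already holds the lock.
	mutex   sync.Mutex
	volumes map[string]hostPathVolume
}

// createVolumeIdempotent stands in for a CreateVolume gRPC handler: it takes
// the lock once, after request validation, and holds it across the
// check-and-insert sequence. Because the existence check and the insert
// happen under the same lock, two concurrent calls for the same name cannot
// both create a volume, which is the leak described above.
func (hp *hostPath) createVolumeIdempotent(volID, name string) hostPathVolume {
	hp.mutex.Lock()
	defer hp.mutex.Unlock()

	if vol, ok := hp.volumeByNameLocked(name); ok {
		return vol // already exists, return the stored volume
	}
	vol := hostPathVolume{VolID: volID, VolName: name}
	hp.volumes[volID] = vol
	return vol
}

// volumeByNameLocked follows the helper rule: it does not lock and must only
// be called while hp.mutex is held.
func (hp *hostPath) volumeByNameLocked(name string) (hostPathVolume, bool) {
	for _, vol := range hp.volumes {
		if vol.VolName == name {
			return vol, true
		}
	}
	return hostPathVolume{}, false
}

func main() {
	hp := &hostPath{volumes: map[string]hostPathVolume{}}
	v1 := hp.createVolumeIdempotent("vol-1", "pvc-a")
	v2 := hp.createVolumeIdempotent("vol-2", "pvc-a") // same name: returns the existing volume
	fmt.Println(v1.VolID, v2.VolID)                   // vol-1 vol-1, nothing leaked
}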

Contributor:

Ah yeah thanks for the catch, sgtm


-	vol, err := createHostpathVolume(volumeID, req.GetName(), capacity, requestedAccessType, false /* ephemeral */)
+	kind := req.GetParameters()[storageKind]
+	vol, err := hp.createVolume(volumeID, req.GetName(), capacity, requestedAccessType, false /* ephemeral */, kind)
 	if err != nil {
 		return nil, status.Errorf(codes.Internal, "failed to create volume %v: %v", volumeID, err)
Contributor:

The createVolume RPC error codes never get returned here.

Contributor (Author):

Good catch, thanks!

I think all of these cases where we wrap some error with codes.Internal can simply be replaced with returning the error directly (or wrapping it with %w). The result will be that instead of the non-descriptive codes.Internal the caller gets a codes.Unknown, which has the same effect.

There's also an opportunity to remove duplicate code by letting the getVolume/SnapshotBy... functions return a NotFound error themselves.
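
A small illustration of that point (a hypothetical helper, not the actual diff): wrapping with codes.Internal discards a more specific status set deeper down, while returning the error as-is preserves it:

package main

import (
	"fmt"

	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"
)

// createVolume stands in for a helper that already returns a gRPC status
// error with a meaningful code.
func createVolume() error {
	return status.Error(codes.ResourceExhausted, "requested capacity exceeds the fake capacity")
}

func wrapped() error {
	if err := createVolume(); err != nil {
		// Wrapping with codes.Internal hides the original code from the caller.
		return status.Errorf(codes.Internal, "failed to create volume: %v", err)
	}
	return nil
}

func passedThrough() error {
	if err := createVolume(); err != nil {
		// Returning the error unchanged preserves its status code; a plain,
		// non-status error would surface as codes.Unknown, which is no worse
		// than the non-descriptive codes.Internal.
		return err
	}
	return nil
}

func main() {
	fmt.Println(status.Code(wrapped()))       // Internal
	fmt.Println(status.Code(passedThrough())) // ResourceExhausted
}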

@@ -343,6 +375,10 @@ func (hp *hostPath) deleteVolume(volID string) error {
 	if err := os.RemoveAll(path); err != nil && !os.IsNotExist(err) {
 		return err
 	}
+	glog.V(4).Infof("deleting hostpath volume: %s = %+v", volID, vol)
Contributor:

Should we log this before the RemoveAll?

Contributor (Author):

Hmm, I put it after the work is done to mirror the "adding hostpath volume" message, i.e. once we get that message, we know that hp.volumes is really being changed.

Contributor:

just a nit, but maybe we could move this after all the work is done (i.e. after delete()) and log "added/deleted hostpath volume"

Contributor (Author):

Yes, that's better. Changed.
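
For reference, a sketch of the agreed placement, logging only once all the work is done (types and helper names here are simplified stand-ins for the driver's own):

package main

import (
	"os"
	"path/filepath"

	"github.com/golang/glog"
)

type hostPathVolume struct {
	VolID   string
	VolName string
}

type hostPath struct {
	volumes map[string]hostPathVolume
}

func getVolumePath(volID string) string {
	return filepath.Join("/csi-data-dir", volID)
}

// deleteVolume logs only after both the on-disk data and the map entry are
// gone, mirroring the "added hostpath volume" message: once the message
// appears, the state really has changed.
func (hp *hostPath) deleteVolume(volID string) error {
	vol, ok := hp.volumes[volID]
	if !ok {
		return nil // already deleted
	}
	path := getVolumePath(volID)
	if err := os.RemoveAll(path); err != nil && !os.IsNotExist(err) {
		return err
	}
	delete(hp.volumes, volID)
	glog.V(4).Infof("deleted hostpath volume: %s = %+v", volID, vol)
	return nil
}

func main() {
	hp := &hostPath{volumes: map[string]hostPathVolume{"vol-1": {VolID: "vol-1", VolName: "pvc-a"}}}
	_ = hp.deleteVolume("vol-1")
}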

// adds the volume to the list.
//
// It returns the volume path or err if one occurs. That error is suitable as result of a gRPC call.
func (hp *hostPath) createVolume(volID, name string, cap int64, volAccessType accessType, ephemeral bool, kind string) (hpv *hostPathVolume, finalErr error) {
Contributor:

Should we return invalid arg error if kind is specified but capacity is disabled?

Contributor (Author):

Makes sense, added.
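
A sketch of the added check (the configuration field name is an assumption, not necessarily the driver's actual one):

package main

import (
	"fmt"

	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"
)

type accessType int

type hostPathVolume struct {
	VolID   string
	VolName string
	VolSize int64
	Kind    string
}

type hostPath struct {
	// capacityEnabled is a stand-in for "per-kind capacities were configured".
	capacityEnabled bool
}

// createVolume rejects a "kind" parameter when capacity simulation is off;
// otherwise a kind set in the storage class would be silently ignored.
func (hp *hostPath) createVolume(volID, name string, capBytes int64, volAccessType accessType, ephemeral bool, kind string) (*hostPathVolume, error) {
	if kind != "" && !hp.capacityEnabled {
		return nil, status.Errorf(codes.InvalidArgument,
			"storage kind %q specified, but capacity simulation is disabled", kind)
	}
	// ... create the directory or block file and record the volume ...
	return &hostPathVolume{VolID: volID, VolName: name, VolSize: capBytes, Kind: kind}, nil
}

func main() {
	hp := &hostPath{capacityEnabled: false}
	_, err := hp.createVolume("vol-1", "pvc-a", 1<<30, 0, false, "fast")
	fmt.Println(status.Code(err)) // InvalidArgument
}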

@pohly force-pushed the storage-capacity branch 2 times, most recently from 7f6697a to 86b2ad1 (February 22, 2021)
@pohly (Contributor, Author) commented Feb 22, 2021

/test pull-kubernetes-csi-csi-driver-host-path-distributed-on-kubernetes-master
/test pull-kubernetes-csi-csi-driver-host-distributed-on-kubernetes-1-19

@verult (Contributor) left a comment:

/lgtm
/approve

(I don't think I have approval permissions though)

@@ -175,9 +176,9 @@ func (hp *hostPath) NodePublishVolume(ctx context.Context, req *csi.NodePublishV
}
}

-	volume := hostPathVolumes[req.GetVolumeId()]
+	volume := hp.volumes[req.GetVolumeId()]
Contributor:

Could the requested volume ID not exist?

Contributor (Author):

Hmm, good question. This change came from a global search/replace; I haven't really thought about it.

There was indeed a check earlier that the volume exists, and we even have a copy of that volume in vol. So the code is correct, just a bit unusual. I think this is better:

-       volume := hp.volumes[req.GetVolumeId()]
-       volume.NodeID = hp.nodeID
-       hp.volumes[req.GetVolumeId()] = volume
+       vol.NodeID = hp.nodeID
+       hp.updateVolume(req.GetVolumeId(), vol)
        return &csi.NodePublishVolumeResponse{}, nil
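
The hp.updateVolume helper referenced here is not part of this hunk; a plausible shape, following the locking rule from the earlier thread (helpers don't lock, the calling gRPC function holds hp.mutex), might look like this — an assumption, not the actual implementation:

package main

import (
	"fmt"
	"sync"
)

type hostPathVolume struct {
	VolID  string
	NodeID string
}

type hostPath struct {
	mutex   sync.Mutex
	nodeID  string
	volumes map[string]hostPathVolume
}

// updateVolume replaces the stored copy of a volume. It does not lock itself;
// the gRPC function calling it is expected to already hold hp.mutex.
func (hp *hostPath) updateVolume(volID string, volume hostPathVolume) {
	hp.volumes[volID] = volume
}

func main() {
	hp := &hostPath{nodeID: "node-1", volumes: map[string]hostPathVolume{"vol-1": {VolID: "vol-1"}}}

	// NodePublishVolume-style usage, as in the suggested diff above.
	hp.mutex.Lock()
	vol := hp.volumes["vol-1"]
	vol.NodeID = hp.nodeID
	hp.updateVolume("vol-1", vol)
	hp.mutex.Unlock()

	fmt.Println(hp.volumes["vol-1"].NodeID) // node-1
}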

@@ -428,7 +428,7 @@ func (hp *hostPath) loadFromSnapshot(size int64, snapshotId, destPath string, mo
 	out, err := executor.Command(cmd[0], cmd[1:]...).CombinedOutput()
 	glog.V(4).Infof("Command Finish: %v", string(out))
 	if err != nil {
-		return status.Errorf(codes.Internal, "failed pre-populate data from snapshot %v: %v: %s", snapshotId, err, out)
+		return fmt.Errorf("failed pre-populate data from snapshot %v: %w: %s", snapshotId, err, out)
Contributor:

There's one more instance of codes.Internal in this function above. Should that be changed too?

Contributor (Author):

I had kept it for consistency with the two error returns around it which both return a status error. But thinking about it again, I agree that it is better to also change it. Done.

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 22, 2021

Commit messages from the PR's four commits:

The code used to share state via global variables (bad practice!)
without proper locking (even worse). The different servers were
implemented as individual structs, which meant that parameters had to
be duplicated.

A simpler approach is to have one struct with clearly defined
read-only and read/write members, with coarse locking at the outer
gRPC layer.

This fakes capacity by pretending to have linear storage for different
kinds and subtracting the size of existing volumes from that.
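
A sketch of the kind of accounting this describes (field names are illustrative, not the driver's actual configuration):

package main

import "fmt"

type hostPathVolume struct {
	VolSize int64
	Kind    string
}

type hostPath struct {
	// capacityPerKind is the fake linear storage configured for each kind.
	capacityPerKind map[string]int64
	volumes         map[string]hostPathVolume
}

// availableCapacity returns the configured budget for a kind minus the sizes
// of the existing volumes of that kind; this is what GetCapacity would report.
func (hp *hostPath) availableCapacity(kind string) int64 {
	available := hp.capacityPerKind[kind]
	for _, vol := range hp.volumes {
		if vol.Kind == kind {
			available -= vol.VolSize
		}
	}
	if available < 0 {
		available = 0
	}
	return available
}

func main() {
	hp := &hostPath{
		capacityPerKind: map[string]int64{"fast": 100 << 30}, // 100 GiB of fake "fast" storage
		volumes: map[string]hostPathVolume{
			"vol-1": {VolSize: 10 << 30, Kind: "fast"},
		},
	}
	fmt.Println(hp.availableCapacity("fast")) // 96636764160 bytes (90 GiB) left
}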

A test deployment with some example storage classes, an example app
with a generic ephemeral inline volume, and storage capacity tracking
enabled is provided for use on a Kubernetes cluster where these alpha
features are enabled.

When that feature is not enabled in the cluster, the driver
deployment is also done without storage capacity enabled. This then
serves as a test that distributed provisioning works.

prow.sh can be used to test this new deployment, for example with:

CSI_PROW_SANITY_POD=csi-hostpath-socat-0 \
  CSI_PROW_SANITY_CONTAINER=socat \
  CSI_PROW_DEPLOYMENT=kubernetes-distributed \
  CSI_PROW_KUBERNETES_VERSION=1.19.0 \
  ./.prow.sh

Wrapping errors with an "Internal" status error was at best
useless (it does not tell the caller anything and the default
"Unknown" error has the same effect) and at worst hid the actual
status (when wrapping createVolume). Therefore most of those wrappers
get removed.

The handling of "not found" also was inconsistent. Some places checked
the map directly, others made assumptions about the error returned by
the helper functions. Now the helper functions themselves generate a
gRPC status error which can be returned directly.
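
A sketch of that pattern (getVolumeByID is modeled on the helpers the commit message mentions; the exact signature may differ):

package main

import (
	"fmt"

	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"
)

type hostPathVolume struct {
	VolID   string
	VolName string
}

type hostPath struct {
	volumes map[string]hostPathVolume
}

// getVolumeByID returns a gRPC NotFound status error itself, so callers can
// propagate the error directly instead of checking the map and constructing
// their own status in every gRPC function.
func (hp *hostPath) getVolumeByID(volID string) (hostPathVolume, error) {
	vol, ok := hp.volumes[volID]
	if !ok {
		return hostPathVolume{}, status.Errorf(codes.NotFound, "volume id %s does not exist", volID)
	}
	return vol, nil
}

func main() {
	hp := &hostPath{volumes: map[string]hostPathVolume{}}
	if _, err := hp.getVolumeByID("missing"); err != nil {
		fmt.Println(status.Code(err)) // NotFound
	}
}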

The parameter becomes invalid when the driver is not configured
to have different storage kinds.
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 23, 2021
@verult (Contributor) commented Feb 24, 2021

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 24, 2021
@pohly (Contributor, Author) commented Feb 24, 2021

/assign @jsafrane

For approval.

@jsafrane (Contributor):

/approve

@k8s-ci-robot:

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jsafrane, pohly, verult

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 24, 2021
@k8s-ci-robot k8s-ci-robot merged commit 7a0de75 into kubernetes-csi:master Feb 24, 2021
Linked issue closed by this PR: #72 (map access needs lock)