Add design for StorageClass parameters #30
Conversation
Signed-off-by: John Strunk <jstrunk@redhat.com>
Can one of the admins verify this patch?

Thanks @JohnStrunk for this: At the very minimum I would like to pass …

I can work on this code implementation. Assigning to myself.

@centos-ci ok to test
amarts left a comment:
LGTM
From glusterfs, we mainly need 'volume type', and 'volume options', and as the example covers both these, it should be good to get in for now!
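As a purely illustrative sketch of the two things mentioned here (the parameter names below are placeholders, not necessarily the keys chosen by the design doc):

```yaml
# Illustrative only: parameter names are placeholders, not the design doc's keys.
parameters:
  # Gluster volume type (e.g., replicate vs. disperse)
  volumeType: "replicate"
  # Gluster volume options to apply to the provisioned volume
  volumeOptions: "performance.readdir-ahead on"
```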
From @humblec's feedback, I have rechecked the allowed syntax, and my proposed …
jarrpa left a comment:
Left a few questions and comments, with more waiting until the layout gets flattened.
A general comment: I see that there are a number of parameters from the current (in-tree) provisioner that are not represented in this proposal. I haven't compared them closely, but have they been consciously omitted, possibly because they're no longer relevant?
```yaml
parameters:
  # Maximum size of an individual brick. Volumes that exceed this will be
  # created as distributed-* volumes.
  maxBrickSize: 100Gi
```
Is this actually a feature of Gluster I'm not aware of? Or is this logic yet-to-be-implemented in the CSI driver?
This will interact w/ GD2 IVP to limit brick size to make recovery and capacity balancing easier. I thought Heketi had something similar, though not quite used as intended here... Perhaps I'm just thinking of vol resize.
Ah, okay. Might be something that could be enforced within the driver, for Heketi anyway.
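As a worked reading of the snippet under review (an illustration of the stated intent, not the driver's actual algorithm): with `maxBrickSize: 100Gi`, a request larger than 100Gi would be laid out as a distributed volume rather than a single brick.

```yaml
# Illustration of the maxBrickSize semantics quoted above (assumed layout,
# not implemented behavior): a 250Gi PVC against this class would be backed
# by a distributed volume of roughly ceil(250/100) = 3 bricks of <= 100Gi
# each, instead of one 250Gi brick.
parameters:
  maxBrickSize: 100Gi
```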
docs/storageclass-description.md (outdated)
```yaml
# PV capacity * capacityReservation will be reserved for use by this
# volume. The reservation should account for snaps, clones, compression,
# and deduplication. Float >= 0
capacityReservation: "1.0"
```
This name does not really reflect its function. Since we're explicitly using LVM, what about something like thinpoolFactor?
I'm trying to make the function reflect the name. 😁
The idea really is to "reserve" this brick capacity and not allocate it for other volumes. This is more than just setting the thinp size. It needs to account for VDO as well.
My big struggle here is that it's pretty straightforward for snapshots, but I'm not sure what to do about the "factor" when cloning. If I have an initial volume w/ reservation=10x and I clone it, presumably, the 10x accounted for the clone... What should the reservation parameter of the clone be? And does the answer change if I'm cloning a single image multiple times vs. cloning a clone of a clone of a...?
Due to the uncertainty here, I'm tempted to drop it for now, but w/o some sort of reservation, clone and snap are pretty useless.
Side note: We're also evaluating removal of LVM due to performance & scale issues.
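As a concrete reading of the reservation comment quoted above (the values here are illustrative, and the clone semantics are exactly the open question): the parameter acts as a multiplier on the PV's requested capacity.

```yaml
# Illustrative only: with "PV capacity * capacityReservation" reserved, a
# 10Gi PV provisioned from this class would have 10Gi * 2.0 = 20Gi of
# backing capacity set aside to absorb snapshots, clones, compression,
# and deduplication overhead.
parameters:
  capacityReservation: "2.0"
```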
I was working from: https://kubernetes.io/docs/concepts/storage/storage-classes/#glusterfs
This will not be true if we plan on ever supporting more than one Gluster cluster per CSI driver. As we currently promote deploying at least two Gluster clusters in a "full" OpenShift installation, I would think it'd be ridiculous to have two copies of the same driver running.
That would definitely be appreciated!
I've always been confused by this, as well. I'd also like to know more about why this is.
I don't have any plans to support a single driver accessing multiple gluster clusters. The reasons are:
As this needs some more discussion, moving out of GCS/0.2.
The only coordination is FUSE client version >= Server version. The Operator already controls both, such that all Gluster clusters should have the same version / container images. On Operator update, if it comes with a new default image for the server containers, we already have plans to disallow further Operator updates until the full Gluster system reaches quiescence.
Complexity worth the feature, in my opinion. :)
We could still move that to the Operator by having it take care of configuring the StorageClasses with the appropriate credentials automatically.
While the CPU and memory requirements are small, we do still have to consider that the pods themselves count toward the maximum number of pods per node. This means (with the current example deployments) that for every Gluster cluster you have at least two controller pods in the Kube/OCP cluster and at least one node pod on (typically) EVERY schedulable node in the Kube/OCP cluster.
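For a rough sense of scale (an illustrative count based on the deployment model described above, not measured numbers): two Gluster clusters on a ten-node Kube/OCP cluster would mean roughly 2 × 2 = 4 controller pods plus 2 × 10 = 20 node-plugin pods, all counting against the per-node pod limits.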
Signed-off-by: John Strunk <jstrunk@redhat.com>
The current PR LGTM. I still have my concerns about continuing to rely on a single-cluster-per-driver model, but that can be addressed in another PR.
Any other reviews?
@JohnStrunk as in comment #30 (comment), I agree with @jarrpa. IMO, we should have support for multiple clusters for a single CSI driver. On that note, I would like to have @obnoxxx's thoughts.
@JohnStrunk one other high-level question: how are you planning to match these allowed SC params to GD2 volumecreate fields?
We can have optional fields for specifying cluster details, but that's not how GCS is going to be deployed. If someone wants to do it all manually (e.g., deploy CSI w/o the rest of the stack), then sure, specifying via SC is fine. Upgrade, versioning, and authentication are on them.
Likely, some of these options cannot currently be expressed to GD2 IVP. Implementation here will need to be coordinated w/ the GD2 team. I assume you're referring to things such as …
@JohnStrunk Most of the params have issues in coordination with the GD2 API. For example, if we parse … how do we pass arbitertype to GD2?
It appears thin arbiter is designated via a flag. See: gluster/glusterd2#702
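A possible shape for exposing this, purely as a sketch (the parameter name `arbiterType` is hypothetical and not part of the proposal; how the driver would translate it into the flag referenced in gluster/glusterd2#702 is the open question here):

```yaml
# Hypothetical parameter name, not from the design doc: the driver would have
# to map something like this onto the thin-arbiter flag on the GD2
# volume-create request (see gluster/glusterd2#702).
parameters:
  arbiterType: "thin"
```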
In keeping w/ #154:
Describe what this PR does
This PR adds a design for StorageClass parameter fields.
Is there anything that requires special attention?
My goal for this PR is to agree on where we want to end up with the fields in the StorageClass. Since the StorageClass object is the main interface for cluster admins to define the types of storage that should be provided to users, the parameters that we choose to expose, and how we expose them, will significantly affect the admins' UX.
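For illustration, a StorageClass combining the parameters quoted in the review threads above might look like the sketch below (the provisioner name is a placeholder, and the parameter names and defaults are exactly what this PR is meant to settle):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: glusterfs-csi
# Placeholder; use the provisioner name registered by the deployed CSI driver.
provisioner: org.example.gluster.csi
parameters:
  # Maximum size of an individual brick. Volumes that exceed this will be
  # created as distributed-* volumes.
  maxBrickSize: 100Gi
  # PV capacity * capacityReservation will be reserved for use by this
  # volume. The reservation should account for snaps, clones, compression,
  # and deduplication. Float >= 0
  capacityReservation: "1.0"
```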
Related issues:
none.