(feat): Support user creation on cluster init#56
(feat): Support user creation on cluster init#56utdrmac wants to merge 8 commits intovalkey-io:mainfrom
Conversation
Signed-off-by: drmac <github@matthewboehm.com>
Signed-off-by: drmac <github@matthewboehm.com>
Signed-off-by: drmac <github@matthewboehm.com>
Signed-off-by: drmac <github@matthewboehm.com>
Signed-off-by: drmac <github@matthewboehm.com>
Signed-off-by: drmac <github@matthewboehm.com>
|
alright. make test and make test-e2e both 100% successful in my lab just a few moments ago. |
jdheyburn
left a comment
There was a problem hiding this comment.
Thanks for the contribution! I've some comments, some on code readability, and the logic that you're enforcing. Other comments are around not restricting how Valkey can be configured based on the design of the CRD.
I think documenting the logic under docs/ would be useful so that we can understand what's going on.
At this stage we're open to breaking changes to enhance the CRDs, so the CRD design can be worked on then.
| } | ||
|
|
||
| func getDefaultSecretName(cn string) string { | ||
| return cn + "-secret" |
There was a problem hiding this comment.
I am not a fan of adding the resource Kind into the name of the resource, we already know it's a secret. Is there any reason to keep it?
kubectl get secret valkey-cluster-secret
There was a problem hiding this comment.
I was following the naming pattern used by other database operators I've been studying. This is just the default name, and users can override if they want in the CR.
| instanceLabel = "app.kubernetes.io/instance" | ||
| ) | ||
|
|
||
| func getInternalSecretName(cn string) string { |
There was a problem hiding this comment.
Can we avoid acronyms on variable names? I understand cn here is referring to the clusterName that we're passing from higher up, but if we override the value with something else then it would not align with that. Perhaps prefix would be a better alternative? I saw that there are a few places this is being set.
| Password string `json:"password,omitempty"` | ||
|
|
||
| // Reference to a key in UsersSecretRef containing the password for this user | ||
| // Defaults to the username | ||
| PasswordKeyRef string `json:"passwordKeyRef,omitempty"` |
There was a problem hiding this comment.
ACLs can support multiple passwords for a user. Such an example case would be if the password needs to be rolled, you can have both passwords active while the client switches between them.
Not critical for this PR, but so long as we're open to breaking changes while we're in alpha.
There was a problem hiding this comment.
IMO, that's an advanced operation outside the scope of the operator. If the community consensus is to have such functionality, there's no argument from me, and I'll implement it. Or we can keep things simple for now, and add rotation later on.
| type UserAcl struct { | ||
|
|
||
| // Raw ACL line, including password | ||
| Permissions string `json:"permissions,omitempty"` |
There was a problem hiding this comment.
Thinking out loud: I wonder if the operator should help provide an abstraction away from the syntax for permissions.
There was a problem hiding this comment.
We could, but I didn't want the operator to have to be responsible for validating ACL commands, and syntax. That seemed an overly complicated job for the operator.
| permissions: ~* +@all | ||
| charlie: | ||
| permissions: -@all +get ~secretkey ~valkey:* | ||
| password: sup3rS3cr3t |
There was a problem hiding this comment.
Wouldn't this fail the validation that you have in place?
password:
description: |-
sha256 password
kubebuilder:validation:MinLength=64
kubebuilder:validation:MaxLength=65
kubebuilder:XValidation:message="Password should be a sha256 hash"
type: string
There was a problem hiding this comment.
Wow, so much for e2e tests... Shouldn't this have been caught by the K8S api when the e2e test attempted to apply an invalid CR? (Will remove this anyways)
| sort.Strings(sortedNames) | ||
|
|
||
| // Loop over users and build acl line | ||
| for _, un := range sortedNames { |
There was a problem hiding this comment.
Could you replace un with userName please
| } | ||
|
|
||
| return []byte(aclFileContents) | ||
| } |
There was a problem hiding this comment.
The acl set user spec mentioned the hashed password has to be converted to hexidecimal. Is that being done here?
#<hashedpassword>: Adds the specified hashed password to the list of user passwords. A hashed password is hashed with SHA256 and translated into a hexadecimal string. Example: #c3ab8ff13720e8ad9047dd39466b3c8974e592c2fa383d4a3960714caef0c4f2.
There was a problem hiding this comment.
Yes. It's on users.go:266 // If password string is less than 64 characters, assume plaintext password, and hash it
| // If the password string is 64 characters, assume it is hashed, and prefix it | ||
| hashedPass = "#" + pw |
There was a problem hiding this comment.
This assumption could cause confusion to the user when their 64 char password does not work as intended.
| UsersSecretRef string `json:"usersSecretRef,omitempty"` | ||
|
|
||
| // Array of users with raw ACL, or reference to a key in the UsersSecretRef | ||
| Users map[string]UserAcl `json:"users,omitempty"` |
There was a problem hiding this comment.
I suggest we use listType=map and listMapKey=name here to follow API design guidelines.
https://kubernetes.io/docs/reference/using-api/server-side-apply/
There was a problem hiding this comment.
valkeycluster_types.go:101:2: must apply listType to an array, found object
valkeycluster_types.go:101:2: must apply listMapKey to an array, found object
|
The more I think about it, I would rather we didn't support a password in cleartext. While I agree it is convenient for users, it is insecure and adds complexity to the operator. I believe we should only support reading passwords from secrets, as this is an adopted pattern already in other operators. As for secret creation, we can wrap this in a helm chart or in whatever deployment tool the user has. Then there is a separation of concerns which we don't have to get involved with. I have updated my CRD design for authentication here, I'd appreciate it if you could take a look and leave some comments or additional use cases: https://github.com/valkey-io/valkey-operator/pull/63/changes Apologies that we did not outline the requirements initially; I am hoping now that we have a CRD design proposal that can act as a North Star for how we should be designing APIs. |
I'll agree with that.
I looked over it, and it seems straightforward to me. When do you consider the new CRD finalized in such a way that I can begin implementing it? |
|
From a users/acls configs I don't think I will be changing that. It's open for comments, but given that we're in alpha and developing fast we can implement it now and accepted changes later on. How does that sound? @utdrmac |
|
Good for me. I've got dedicated time to this all week. Will get started. |
|
@utdrmac I added some more detail on the spec below, if that helps. I wonder if it's worth writing the aclfile to a Secret which is mounted whereever |
From my quick research on that, it could be up to a minute for the local kubelet to "remount" the changed Secret. For cluster init, having the mounted secret is easy to ensure users are present at the beginning. With a cluster running, and the file delay, does there need to be a mechanism that accesses each primary, logs in to valkey, and for-each (re)creates the users? That would mean the operator now needs to create/manage a "cluster-operator admin" user, which certainly needs to be present on init. |
|
The eventual consistency is probably fine for this first iteration, so long as part of the Reconcile loop ACL LOAD is being called. We can always add on the extra functionality to set the users live (if users cannot wait for the eventual consistency). For both we would want a user that the operator could authenticate as to perform such reloading of configs. Perhaps we can create one by default. What are your thoughts? |
I'm down for that. That's the same behavior with our other operators. For example, with our MySQL operator, we generate a random password for root (if not provided) and store in the secret for users to fetch if they need to use it. TODO: Document pre-defined |
|
Cool. I think the operator can bootstrap its own admin user first with the permissions that it needs. Later on we can allow the user to override this. I would prefer a generic name other than |
|
Closing this to open a new PR with cleaner git history |
Resolves #36
This feature request allows for the creation of users on cluster init. Users can be added by creating an entry under
spec.valkeyConfig.auth.usersPermissions must be provided otherwise the user is skipped. A password, or password reference must be provided otherwise the user is skipped. Passwords can be provided as plaintext, or as sha256 hash directly in the spec, or the password can be searched for in a referenced secret. If the secret is not specified, a secret with the name "${clustername}-secret" will be searched for.The operator will fetch, or create, a secret named "internal-${clustername}-secret". This secret will be populated with the ACLs built from the spec. It will be automatically mounted to /config/users/ as users.acl. The default config has been adjusted to look for this file.