(feat): Support user creation on cluster init by utdrmac · Pull Request #56 · valkey-io/valkey-operator

utdrmac · 2026-01-16T23:05:46Z

Resolves #36

This feature request allows for the creation of users on cluster init. Users can be added by creating an entry under spec.valkeyConfig.auth.users Permissions must be provided otherwise the user is skipped. A password, or password reference must be provided otherwise the user is skipped. Passwords can be provided as plaintext, or as sha256 hash directly in the spec, or the password can be searched for in a referenced secret. If the secret is not specified, a secret with the name "${clustername}-secret" will be searched for.

The operator will fetch, or create, a secret named "internal-${clustername}-secret". This secret will be populated with the ACLs built from the spec. It will be automatically mounted to /config/users/ as users.acl. The default config has been adjusted to look for this file.

Signed-off-by: drmac <github@matthewboehm.com>

api/v1alpha1/valkeycluster_types.go

internal/controller/users.go

internal/controller/deployment.go

Signed-off-by: drmac <github@matthewboehm.com>

internal/controller/users.go

Signed-off-by: drmac <github@matthewboehm.com>

utdrmac · 2026-01-23T16:17:46Z

alright. make test and make test-e2e both 100% successful in my lab just a few moments ago.

jdheyburn

Thanks for the contribution! I've some comments, some on code readability, and the logic that you're enforcing. Other comments are around not restricting how Valkey can be configured based on the design of the CRD.

I think documenting the logic under docs/ would be useful so that we can understand what's going on.

At this stage we're open to breaking changes to enhance the CRDs, so the CRD design can be worked on then.

jdheyburn · 2026-01-24T14:53:15Z

internal/controller/users.go

+}
+
+func getDefaultSecretName(cn string) string {
+	return cn + "-secret"


I am not a fan of adding the resource Kind into the name of the resource, we already know it's a secret. Is there any reason to keep it?

kubectl get secret valkey-cluster-secret

I was following the naming pattern used by other database operators I've been studying. This is just the default name, and users can override if they want in the CR.

jdheyburn · 2026-01-24T14:55:33Z

internal/controller/users.go

+	instanceLabel     = "app.kubernetes.io/instance"
+)
+
+func getInternalSecretName(cn string) string {


Can we avoid acronyms on variable names? I understand cn here is referring to the clusterName that we're passing from higher up, but if we override the value with something else then it would not align with that. Perhaps prefix would be a better alternative? I saw that there are a few places this is being set.

Sure, I can change the acronym

jdheyburn · 2026-01-24T15:00:11Z

api/v1alpha1/valkeycluster_types.go

+	Password string `json:"password,omitempty"`
+
+	// Reference to a key in UsersSecretRef containing the password for this user
+	// Defaults to the username
+	PasswordKeyRef string `json:"passwordKeyRef,omitempty"`


ACLs can support multiple passwords for a user. Such an example case would be if the password needs to be rolled, you can have both passwords active while the client switches between them.

Not critical for this PR, but so long as we're open to breaking changes while we're in alpha.

IMO, that's an advanced operation outside the scope of the operator. If the community consensus is to have such functionality, there's no argument from me, and I'll implement it. Or we can keep things simple for now, and add rotation later on.

jdheyburn · 2026-01-24T15:02:39Z

api/v1alpha1/valkeycluster_types.go

+type UserAcl struct {
+
+	// Raw ACL line, including password
+	Permissions string `json:"permissions,omitempty"`


Thinking out loud: I wonder if the operator should help provide an abstraction away from the syntax for permissions.

https://valkey.io/commands/acl-setuser/

We could, but I didn't want the operator to have to be responsible for validating ACL commands, and syntax. That seemed an overly complicated job for the operator.

jdheyburn · 2026-01-24T15:05:01Z

config/samples/v1alpha1_valkeycluster.yaml

+          permissions: ~* +@all
+        charlie:
+          permissions: -@all +get ~secretkey ~valkey:*
+          password: sup3rS3cr3t


Wouldn't this fail the validation that you have in place?

password: description: |- sha256 password kubebuilder:validation:MinLength=64 kubebuilder:validation:MaxLength=65 kubebuilder:XValidation:message="Password should be a sha256 hash" type: string

Wow, so much for e2e tests... Shouldn't this have been caught by the K8S api when the e2e test attempted to apply an invalid CR? (Will remove this anyways)

jdheyburn · 2026-01-24T15:08:38Z

internal/controller/users.go

+	sort.Strings(sortedNames)
+
+	// Loop over users and build acl line
+	for _, un := range sortedNames {


Could you replace un with userName please

jdheyburn · 2026-01-24T15:11:05Z

internal/controller/users.go

+	}
+
+	return []byte(aclFileContents)
+}


The acl set user spec mentioned the hashed password has to be converted to hexidecimal. Is that being done here?

#<hashedpassword>: Adds the specified hashed password to the list of user passwords. A hashed password is hashed with SHA256 and translated into a hexadecimal string. Example: #c3ab8ff13720e8ad9047dd39466b3c8974e592c2fa383d4a3960714caef0c4f2.

https://valkey.io/commands/acl-setuser/

Yes. It's on users.go:266 // If password string is less than 64 characters, assume plaintext password, and hash it

jdheyburn · 2026-01-24T15:13:39Z

internal/controller/users.go

+			// If the password string is 64 characters, assume it is hashed, and prefix it
+			hashedPass = "#" + pw


This assumption could cause confusion to the user when their 64 char password does not work as intended.

jdheyburn · 2026-01-24T15:24:32Z

api/v1alpha1/valkeycluster_types.go

+	UsersSecretRef string `json:"usersSecretRef,omitempty"`
+
+	// Array of users with raw ACL, or reference to a key in the UsersSecretRef
+	Users map[string]UserAcl `json:"users,omitempty"`


I suggest we use listType=map and listMapKey=name here to follow API design guidelines.

https://kubernetes.io/docs/reference/using-api/server-side-apply/

valkeycluster_types.go:101:2: must apply listType to an array, found object valkeycluster_types.go:101:2: must apply listMapKey to an array, found object

jdheyburn · 2026-01-24T16:58:59Z

@utdrmac

The more I think about it, I would rather we didn't support a password in cleartext. While I agree it is convenient for users, it is insecure and adds complexity to the operator.

I believe we should only support reading passwords from secrets, as this is an adopted pattern already in other operators. As for secret creation, we can wrap this in a helm chart or in whatever deployment tool the user has. Then there is a separation of concerns which we don't have to get involved with.

I have updated my CRD design for authentication here, I'd appreciate it if you could take a look and leave some comments or additional use cases:

https://github.com/valkey-io/valkey-operator/pull/63/changes

Apologies that we did not outline the requirements initially; I am hoping now that we have a CRD design proposal that can act as a North Star for how we should be designing APIs.

utdrmac · 2026-01-26T19:08:15Z

@jdheyburn

I believe we should only support reading passwords from secrets, as this is an adopted pattern already in other operators.

I'll agree with that.

I have updated my CRD design for authentication here

I looked over it, and it seems straightforward to me. When do you consider the new CRD finalized in such a way that I can begin implementing it?

jdheyburn · 2026-01-26T19:48:34Z

From a users/acls configs I don't think I will be changing that. It's open for comments, but given that we're in alpha and developing fast we can implement it now and accepted changes later on. How does that sound? @utdrmac

utdrmac · 2026-01-26T19:53:40Z

Good for me. I've got dedicated time to this all week. Will get started.

jdheyburn · 2026-01-26T20:48:51Z

@utdrmac I added some more detail on the spec below, if that helps. I wonder if it's worth writing the aclfile to a Secret which is mounted whereever aclfile is reading it from - which could then read it via ACL LOAD. The only query I have is how long the propagation would be from Secret being updated to being available on the mount. I believe it may take some time for changes to appear on the mount.

(feat) Support ACL user creation on cluster init #36 (comment)

utdrmac · 2026-01-26T21:05:07Z

The only query I have is how long the propagation would be from Secret being updated to being available on the mount. I believe it may take some time for changes to appear on the mount.

From my quick research on that, it could be up to a minute for the local kubelet to "remount" the changed Secret. For cluster init, having the mounted secret is easy to ensure users are present at the beginning. With a cluster running, and the file delay, does there need to be a mechanism that accesses each primary, logs in to valkey, and for-each (re)creates the users? That would mean the operator now needs to create/manage a "cluster-operator admin" user, which certainly needs to be present on init.

jdheyburn · 2026-01-26T21:18:24Z

The eventual consistency is probably fine for this first iteration, so long as part of the Reconcile loop ACL LOAD is being called. We can always add on the extra functionality to set the users live (if users cannot wait for the eventual consistency).

For both we would want a user that the operator could authenticate as to perform such reloading of configs. Perhaps we can create one by default.

What are your thoughts?

utdrmac · 2026-01-26T21:47:30Z

Perhaps we can create one by default.

I'm down for that. That's the same behavior with our other operators. For example, with our MySQL operator, we generate a random password for root (if not provided) and store in the secret for users to fetch if they need to use it.

TODO: Document pre-defined clusterAdmin user; fetching password.

jdheyburn · 2026-01-26T22:01:53Z

Cool. I think the operator can bootstrap its own admin user first with the permissions that it needs. Later on we can allow the user to override this.

I would prefer a generic name other than clusterAdmin, perhaps simply operator would be fine, since this would be applicable for the operator in all Valkey modes (including non-Cluster).

utdrmac · 2026-02-10T16:44:20Z

Closing this to open a new PR with cleaner git history

utdrmac added 4 commits January 15, 2026 22:48

add support for users

c323af4

Signed-off-by: drmac <github@matthewboehm.com>

merge upstream

cb2ef36

Signed-off-by: drmac <github@matthewboehm.com>

add support for creating users

7f47abd

Signed-off-by: drmac <github@matthewboehm.com>

golangci fix

3b1dbf1

Signed-off-by: drmac <github@matthewboehm.com>

sandeepkunusoth reviewed Jan 16, 2026

View reviewed changes

api/v1alpha1/valkeycluster_types.go Outdated Show resolved Hide resolved