Skip to content

Conversation

@squaremo
Copy link
Contributor

@squaremo squaremo commented Oct 15, 2025

After removing the references from Group objects to OrganizationUser objects (#314), the next step is to rewrite RBAC to rely only on the claims in the userinfo. This breaks the coupling between the RBAC and the user database, so that an external provider can be used.

The custom claims in question are in an optional, namespaced object in the userinfo claims. The identity service populates it for users and service accounts that it holds; platform-specific hooks will need to be used for other providers.

The correctness of the changes is demonstrated by continuing to pass the ACL tests in pkg/rbac/groups_test.go (with the necessary change to provide the custom claims).

NB this is currently targeted at decouple-rbac-groups because it depends on the .Subjects field in the Group type. It'll be best to get #314 merged, then retarget this one.

@squaremo squaremo force-pushed the decouple-auth-from-userdb branch from c87b1e7 to 2e1d83b Compare October 15, 2025 15:58
@squaremo squaremo force-pushed the decouple-rbac-groups branch 4 times, most recently from 2d1bf1f to 371f5c9 Compare October 28, 2025 17:17
This adds a test that ACLs are calculated correctly, given a
particular set up of group membership.

The aim of this test is that, with the obvious necessary adjustments
to the setup, it should still pass after migrations, demonstrating
that the migration has not affected ACL calculation.
Since I'm trying to make a change in a backward-compatible way, I am
going to use the API handler rather than the Kubernetes client for
creating users. If the change goes well, the tests should pass without
alteration -- because "backwarbcompatible.

This ends up saving lines, because the handler does all the object
creation.
This commit adds a field for referring to users directly. If using an
external IdP, we will not have the User and OrganizationUser records
to refer to (in .UserIDs).

This is to some extent backward-compatible: if the callers keep using
the UserIDs field, then the server will keep populating it -- while also
populating the Subjects field, to use in calculations.

The part that isn't backward-compatible is that once you send a
request with Subjects supplied, then UserIDs is forgotten about. These
fields are not interchangeable, because the subjects can refer to
users somewhere else.
This modifies another place that groups are updated: when user records
are created, updated or deleted. Since this only happens when the UNI
user database is in use, both .UserIDs (refering to OrganizationUser
objects) and .Subjects are updated.
Superficial changes so that the linter passes this.
@squaremo squaremo force-pushed the decouple-rbac-groups branch from 371f5c9 to a90f76e Compare October 29, 2025 11:31
This changes the type of a group's subjects to be `{ID, Email,
Issuer}`. The motivation is to give the system, user interfaces, and
troubleshooting humans, some idea of provenance.

The key for the records is then {ID, Issuer}, with the email
effectively being a display name. For this reason, when adding and
removing subjects from groups, both the ID and Issuer are
considered. (It's not _that_ likely that subject IDs will collide --
only really if two user databases use the email as the subject, which
UNI does, but no-one else.)

The UNI user database is a special case, in that it uses the User
email as the subject ID. The email is used as the subject because UNI
issues tokens with the email as the subject; when RBAC is decoupled
entirely from the user database, it will need to be able to look up
group subjects by whatever is in the token.
In many places, an issuer URL is needed for minting tokens. Most of
these are in response to HTTP requests, and use the request Host
header as a shortcut: the rationale would be, if the request was
routed here, this must be the issuer.

However, the server has a flag for setting the URL to use in the
well-known endpoint -- and any token that is expected to validate at
the server should use the same URL. To enforce so, this commit passes
the issuer URL value through to everything dealing with token Issuer
claims, so those components have the same value whatever an individual
request suggests. (To be clear: I don't think there's a danger of this
being a security hole, so much as it's a trap leaving it to
coincidence.)

That includes the Group client, which needs to put an issuer URL in
the Subject records it synthesises.
@squaremo squaremo force-pushed the decouple-rbac-groups branch from a90f76e to ba28658 Compare October 29, 2025 13:43
In a previous commit, the Group type was changed so that it uses the
user subject as the key for refering to users, rather than the ID of
the OrganizationUser object representing membership of the
organisation. In running systems, the UserIDs field will be populated,
so we need to port those to use subjects instead.

This does the simplest version of that, by running a func from main.go
on startup. The func will update the subjects field. It only makes
sense to do this once, so it will only port objects if the subject
field has no entries. The UserIDs field is left alone, so that
processes still using it do not break.

The subject is obtained by looking at the OrganizationUser object
referenced in the organisation namespace; then finding the
corresponding User record in the "global" namespace (usually
`unikorn-identity`). The email is used as both the subject record's
email value (for display), and for the ID -- usually you'd expect the
ID to be something opaque, but since UNI issues tokens with the email
as the subject, we use that as the ID.
I'm trying to decouple authorisation from identity, so that (at least
in principle) an authorisation service can be run separately to the
identity service, and without the user database kept by the identity
service.

One bit to detangle is that `RBAC.GetACL()` relies on knowing the kind
of account represented by an access token.

This works at present because the handler that calls GetACL is behind
middleware (pkg/middleware/openapi/local) which puts that information
in the context. But since this relies on running in the same place as
the token issuer, which has the decryption keys, a decoupled
authorisation service would not be able to see that information.

So: include the account type amongst the claims returned by /userinfo,
so that it will be visible to the middleware, whether called remotely
or in-process with the issuer. The custom claims go in a namespaced
sub-object, to avoid pollution.

This custom claim will _not_ be present by default in /userinfo
returned by other providers; but that can be worked around:

 - assume a /userinfo response without the field is a user (works if
   system and service accounts are only internal to UNI);

 - add the claim to /userinfo results from the provider, by using a
   platform-specific extension mechanism (platforms like Auth0 let you
   add claims into the ID tokens).
This commit adds checks that variously-constructed access tokens will
result in appropriate custom claims in their /userinfo response.
One obstacle in trying to disentangle authorisation from the IdP part
of UNI is that the authorisation server needs to know organisation
memberships, in order for it to calculate ACLs. If the memberships
(along with the users) are recorded in another IdP, it can't access
them.

There's different ways to solve this:

- have a parallel implementation of authorisation that works with the
  IdP of choice; or,
- integrate IdP management APIs into the authorisation service, to be
  switched on with flags;

but the cleanest one is to include the membership information in the
ID token (or rather, the /userinfo response) as a custom claim. Any
external IdP service is likely to have mechanisms for putting custom
claims into tokens.
This commit separates the user- and membership-oriented operations
from the RBAC queries. The eventual aim is that RBAC does not depend
on running with the user database, since that precludes using an
external provider for authentication.

Mostly this is a simple case of moving methods to a new abstraction in
pkg/userdb, then fixing up the callers.

One wrinkle is that the Onboard method in oauth2 needs to produce a
"super user" context, and that can only be done with understanding of
RBAC; but at least this dependency runs from oauth2 to RBAC, and not
from RBAC to userdb.
This commit removes the user database methods from RBAC, which have
been moved to pkg/userdb, and rewrites the GetACLs method to use the
information in the userinfo instead. This means the RBAC no longer
relies on the service account, user and org'user records; so it will
work with an external provider, so long as that provider can be
persuaded to include the custom claims in userinfo.

This also needs the Subject field to be populated in Group
objects. For a live system, this would mean running some migration to
populate those fields from the OrganizationUser records.
Broadly, these are the same as users, but ServiceAccounts can only
belong to one org so that's checked explicitly.

This also fixes, and checks, something that ~worked before by
coincidence. If you ask for a service account's ACL for an
organisation, and it's not bound for that organisation, then it would
succeed but give you no project-level permissions. But this was simply
because the groups and projects in the service accounts home
organisation were not that likely to line up with the groups and
projects in the queried organisation.

Now if the queried organization is not the service account's home
organization, it returns an error, just as for a user who is not a
member of the organization.
@squaremo squaremo force-pushed the decouple-auth-from-userdb branch from 2e1d83b to 812adf1 Compare October 30, 2025 10:55
@squaremo squaremo force-pushed the decouple-rbac-groups branch 2 times, most recently from 9c16346 to ba28658 Compare October 31, 2025 09:11
Base automatically changed from decouple-rbac-groups to main October 31, 2025 11:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants