Skip to content

Conversation

@drebelsky
Copy link
Contributor

@drebelsky drebelsky commented Sep 2, 2025

There seems to be a problem with OIDC refresh. As far as I can tell, the issue is in the underlying C# Kubernetes library. When the id-token is invalid (expired), it appears to use the refresh-token, but it doesn't store the new values of id-token and refresh-token to the ~/.kube/config file. The library the C# Kubernetes library uses for OIDC is now archived, so it was removed as a dependency by version 16.0.7 (see kubernetes-client/csharp#1621). That implementation seems to call the wrong URL to refresh, which causes supercluster to still fail to refresh. However, it does mean that supercluster doesn't consume the refresh-token, so kubectl is able to update auth.

Initially, this PR bumped the KubernetesClient version from 15.0.1 to 16.0.7. This has the downsides brought up below. Now, we use kubectl to attempt to do an id-token refresh if using oidc auth. Note that since we'll presumably need to upgrade the KubernetesClient version at some point, we do need to address proper refresh at some point, also.

@drebelsky
Copy link
Contributor Author

drebelsky commented Sep 3, 2025

It seems a little unreliable to rely on the library performing incorrectly. Additionally, this has an issue with the token expiring in the middle of a run—the run will fail to refresh the OIDC token and exit. We can still refresh using kubectl in this case, but this requires perhaps a little more effort monitoring a supercluster run and may increase testing time (e.g., imagine the case where the token expires just before the end of a run, so the entire run needs to be redone).

Some options:

  • add a wrapper that looks for the OIDC error and tries to call out to kubectl to refresh the token.
  • correctly implement refresh + saving to kube config
    • maintain fork of csharp k8s library
    • try to do something fancy with F# object expressions
    • try to get change accepted upstream
  • assume no supercluster run takes close to a day (our current setting for how long an id-token lasts) and refresh token at supercluster startup (either via kubectl or manual flow)

@drebelsky drebelsky closed this Sep 3, 2025
@drebelsky drebelsky reopened this Sep 3, 2025
@drebelsky drebelsky force-pushed the oidc branch 4 times, most recently from c018f60 to 4454f94 Compare September 5, 2025 15:40
Copy link
Contributor

@bboston7 bboston7 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for tackling this!

@drebelsky drebelsky requested a review from bboston7 September 8, 2025 17:03
@drebelsky drebelsky marked this pull request as ready for review September 8, 2025 19:33
Copy link
Contributor

@bboston7 bboston7 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking great! Just found a small typo. Please rebase after fixing and we'll get this merged!

@bboston7
Copy link
Contributor

bboston7 commented Sep 10, 2025

Just making a note here for my own memory: This should gracefully handle machines without kubectl by falling back on the old behavior (proceeding without updating the token). We should be good to merge after that's added.

Co-authored-by: Brett Boston <bboston7@users.noreply.github.com>
Copy link
Contributor

@bboston7 bboston7 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

@bboston7 bboston7 merged commit b71115b into stellar:main Sep 11, 2025
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants