Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: TokenRefreshInterceptor throws when running incluster #3445

Merged
merged 5 commits into from
Sep 21, 2021

Conversation

tsaarni
Copy link
Contributor

@tsaarni tsaarni commented Sep 3, 2021

Description

Token refresh was unsuccessful after failed authentication when running incluster. The reason was that TokenRefreshInterceptor was expecting kubeconfig file to exists, even though in incluster mode it does not exist.

  • Removes kubeconfig handling from TokenRefreshInterceptor and re-uses existing logic in autoconfig instead, which is prepared for both in/out-of-cluster use cases.
  • Adds system property for previously hardcoded token path to facilitate unit testing.
  • Adds copy of auth-provider from model to config to avoid config parsing in TokenRefreshInterceptor.

Fixes #3312

Signed-off-by: Tero Saarni tero.saarni@est.tech

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • Feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change
  • Chore (non-breaking change which doesn't affect codebase;
    test, version modification, documentation, etc.)

Checklist

  • Code contributed by me aligns with current project license: Apache 2.0
  • I Added CHANGELOG entry regarding this change
  • I have implemented unit tests to cover my changes
  • I have added/updated the javadocs and other documentation accordingly
  • No new bugs, code smells, etc. in SonarCloud report
  • I tested my code in Kubernetes
  • I tested my code in OpenShift

Token refresh after failed authentication was unsuccessful since
TokenRefreshInterceptor was looking for reloading kubeconfig file even when
running in incluster mode.

This change removes kubeconfig handling from TokenRefreshInterceptor and
re-uses the logic in autoconfig instead, which is prepared for both in/out-of
cluster use cases.

Signed-off-by: Tero Saarni <tero.saarni@est.tech>
@tsaarni tsaarni marked this pull request as ready for review September 6, 2021 18:59
@tsaarni
Copy link
Contributor Author

tsaarni commented Sep 6, 2021

I've done some manual testing with token renewal as well. The procedure is documented here.

It seems OIDC path was never covered by a test case. Since I touched that line, SonarCloud alerted about it here. Should this PR introduce OIDC renewal test case?

@rohanKanojia
Copy link
Member

rohanKanojia commented Sep 7, 2021

It seems OIDC path was never covered by a test case. Since I touched that line, SonarCloud alerted about it here. Should this PR introduce OIDC renewal test case?

Yeah, Would be nice if you can write a test to cover this :-) . I don't mind it having in this PR

@tsaarni
Copy link
Contributor Author

tsaarni commented Sep 7, 2021

Yeah, Would be nice if you can write a test to cover this :-) . I don't mind it having in this PR

@rohanKanojia OK, I'll check if I can figure out an approach to test that part as well.

By the way, some checks seem bit flaky, is that to be expected? They seemed unrelated to the change.

@rohanKanojia
Copy link
Member

You're right. I've restarted them for now. Somehow these failures only seem to happen on Github Actions.

@tsaarni
Copy link
Contributor Author

tsaarni commented Sep 7, 2021

I've now added a test case shouldRefreshOIDCToken() to ensure that TokenRefreshInterceptor also refreshes token in case of OIDC auth provider.

@sonarcloud
Copy link

sonarcloud bot commented Sep 7, 2021

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 0 Code Smells

84.0% 84.0% Coverage
0.0% 0.0% Duplication

@tsaarni
Copy link
Contributor Author

tsaarni commented Sep 16, 2021

Thank you @oscerd for the review. I'd like to kindly ask if @manusa and @shawkins could review too?
It would be nice to get this forward since this bug breaks applications on clusters with expiring service account tokens.

Copy link
Contributor

@shawkins shawkins left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code changes seem good - if I understand correctly, we get an unauthorized response, then it won't assume kubeconfig exists and look to the AuthProvider if possible.

@tsaarni
Copy link
Contributor Author

tsaarni commented Sep 16, 2021

if I understand correctly, we get an unauthorized response, then it won't assume kubeconfig exists and look to the AuthProvider if possible.

Yes when we get unauthorized, the logic in Config.autoConfigure() is first look to look for kubeconfig. If it exists, it might contain AuthProvider config for OIDC provider which is used to fetch refreshed token (in that case we execute branch on line 3), or just pure token (line 5). If kubeconfig does not exist Config.autoConfigure() will look for service account token file and re-read the renewed token from there (we'll also end up in branch on line 5).

1 Config newestConfig = Config.autoConfigure(currentContextName);
2 if (newestConfig.getAuthProvider() != null && newestConfig.getAuthProvider().getName().equalsIgnoreCase("oidc")) {
3   newAccessToken = OpenIDConnectionUtils.resolveOIDCTokenFromAuthConfig(newestConfig.getAuthProvider().getConfig());
4 } else {
5   newAccessToken = newestConfig.getOauthToken();
6 }

@tsaarni
Copy link
Contributor Author

tsaarni commented Sep 16, 2021

Is it possible to re-trigger failed tests to get a clean run? I think those are flakes.

@tsaarni
Copy link
Contributor Author

tsaarni commented Sep 18, 2021

One more observation:

My PR continues using the approach from existing implementation where reloading is based on 401 Unauthorized response. This postpones the reload to the very last possible moment, or one could also say it reloads "too late". While this works correctly, the service account token on disk is refreshed in advance and 401 Unauthorized could be avoided.

Looking at other implementations kubernetes-client/java#967 reloads if 60 seconds has passed from previous reload. It seems to me this is not as straightforward to implement with Config.autoConfigure(), since it always goes through the whole process of checking the kubeconfig, falling back to service account if not successful.

Why would one use time based reloading?

To help transitioning to expiring tokens, KEP 1205 describes that tokens include kubernetes.io/warnafter field that is used to call out clients that do not reload often. Metrics serviceaccount_stale_tokens_total and audit annotation authentication.k8s.io/stale-token can be used for monitoring this (see here). When client uses 401 Unauthorized for reloading, it might cause some unnecessary concerns due to false "stale token" signals.

@manusa
Copy link
Member

manusa commented Sep 21, 2021

Why would one use time based reloading?

I think we discussed this in some other issue.

I would only go for time based reloading if the expiration date is something known (e.g. kubernetes.io/warnafter is available).
Having a periodic check for something that might be unnecessary most of the times and that is easily solved by refreshing the token after the first unauthorized response, seems like an overhead to me.

Copy link
Member

@manusa manusa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thx!

@manusa manusa merged commit 5b7b0e7 into fabric8io:master Sep 21, 2021
@manusa manusa added this to the 5.8.0 milestone Sep 21, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

reload token when kube config file does not exist, errors shows
5 participants