Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Non-obvious behavior with long lived oidc.Provider objects: remoteKeySet caches original context #339

Closed
kuttas opened this issue Jun 1, 2022 · 4 comments · Fixed by #364

Comments

@kuttas
Copy link

kuttas commented Jun 1, 2022

I noticed a non-obvious "bug" in our codebase when using the go-oidc package to verify JWTs. It wasn't expected to me or documented in oidc.NewProvider that the supplied context must never be canceled if you are going to re-use the resulting Provider, though on further investigation it is mentioned in the underlying comments for oidc.NewRemoteKeySet. This is a little weird because it means that a Provider object created in the context of an incoming HTTP server request cannot be cached and re-used safely in future requests unless you explicitly pass in a separate context that will never be canceled (e.g. context.Background().

My default assumption was to pass in the context from the HTTP request, but HTTP request contexts are canceled after ServeHTTP is finished (according to https://pkg.go.dev/net/http#Request.Context). This led to a fairly hard-to-track-down bug in our codebase since we have a runtime cache of oidc.Providers, but sometimes they would get created during one request but wouldn't fetch JWKS until a future request, at which point the fetch would fail with: failed to verify signature: fetching keys oidc: get keys failed Get "https://<some-oidc-provider>/.well-known/jwks.json": context canceled

The fact that remoteKeySet internally stores the context runs counter to the Golang standard recommendations of never storing contexts in structs, and always using the one passed in (e.g. the guidance given here https://go.dev/blog/context-and-structs and here https://pkg.go.dev/context). In this case, it seems odd/unintuitive that remoteKeySet (which actually accepts a context arg in most methods) would use its own stored context r.ctx in updateKeys() even though keysFromRemote (which is the only method that calls updateKeys()) takes a context arg and could just pass that in?

Surprisingly it seems almost no one else has run into this; I only found one other issue via google search that seemed similar (dexidp/dex#1255).

Is this an oversight or intentional?

@ericchiang
Copy link
Collaborator

Some background:

Part of the issue is that the remote key set needs a context to refresh the keys in the background. In hindsight, creating a remote key set probably always should have taken a context.

So instead of:

func (p *Provider) Verifier(config *Config) *IDTokenVerifier

Something like:

func (p *Provider) NewVerifier(ctx context.Context, config *Config) *IDTokenVerifier

Which would hopefully produce a clearer intent about what context corresponds to what variable lifetime.

Maybe we could add "NewVerifier" and deprecate "Verifier"?

@kuttas
Copy link
Author

kuttas commented Jun 1, 2022

I think that explicitly taking in a new context for the verifier makes sense, though I think that (generally) the context is less important in New* methods and more important in the methods that actually do the work. In this case I think IDTokenVerifier.Verify() is the method whose context should be used for requests, if possible?

When you say that the remote key set needs to refresh a context in the background, are you referring to the goroutine in keysFromRemote? If I understand right, the select statement in there is going to block til the background operation completes, so from a lexical scope point-of-view, the context that is passed in to keysFromRemote should be valid for the entire duration of the r.updateKeys() call inside, right? Is there a reason that updateKeys doesn't take ctx as a parameter? I could see that, perhaps, the intent is to ensure that the RemoteKeySet always uses the same context values as the context passed in to create the Provider? (Apologies if that doesn't make sense, I am fairly new to Golang concurrency)

On a related note, it seems odd that Provider creates the remoteKeySet but never actually uses it directly from what I can see; instead it only gets stored in the Provider (which is long lived for us) but used by the IDTokenVerifier created from the Provider (which is transient for us). Is there a recommended best practice for which of these objects should be short- vs long-lived (and which is safe to cache) in a server application that needs to verify many tokens (e.g. an API server which validates Bearer tokens) across many requests?

@ericchiang
Copy link
Collaborator

Hey sorry for the delay. I was traveling.

I'm not entirely sure if this answers your questions about goroutines, but the reason updateKeys doesn't use the context from keysFromRemote is because many clients may be requesting a key at a given time, and I don't want one of those clients canceling the context to impact other clients.

keys, err := r.updateKeys()

Probably worth a comment in the code.

We have the following documentation on RemoteKeySet but not IDTokenVerifier. Will send an update for the comment:

The returned KeySet is a long lived verifier that caches keys based on any keys change. Reuse a common remote key set instead of creating new ones as needed.

@parsnips
Copy link

parsnips commented Nov 4, 2022

@kuttas great sleuthing. I encountered this same thing this week with a perma-canceled cache context. A warning on the tin here would be good. Even better, if the context on the remoteKeySet struct were to be excised somehow.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants