Non-obvious behavior with long lived `oidc.Provider` objects: `remoteKeySet` caches original context #339

kuttas · 2022-06-01T04:48:51Z

I noticed a non-obvious "bug" in our codebase when using the go-oidc package to verify JWTs. It wasn't expected to me or documented in oidc.NewProvider that the supplied context must never be canceled if you are going to re-use the resulting Provider, though on further investigation it is mentioned in the underlying comments for oidc.NewRemoteKeySet. This is a little weird because it means that a Provider object created in the context of an incoming HTTP server request cannot be cached and re-used safely in future requests unless you explicitly pass in a separate context that will never be canceled (e.g. context.Background().

My default assumption was to pass in the context from the HTTP request, but HTTP request contexts are canceled after ServeHTTP is finished (according to https://pkg.go.dev/net/http#Request.Context). This led to a fairly hard-to-track-down bug in our codebase since we have a runtime cache of oidc.Providers, but sometimes they would get created during one request but wouldn't fetch JWKS until a future request, at which point the fetch would fail with: failed to verify signature: fetching keys oidc: get keys failed Get "https://<some-oidc-provider>/.well-known/jwks.json": context canceled

The fact that remoteKeySet internally stores the context runs counter to the Golang standard recommendations of never storing contexts in structs, and always using the one passed in (e.g. the guidance given here https://go.dev/blog/context-and-structs and here https://pkg.go.dev/context). In this case, it seems odd/unintuitive that remoteKeySet (which actually accepts a context arg in most methods) would use its own stored context r.ctx in updateKeys() even though keysFromRemote (which is the only method that calls updateKeys()) takes a context arg and could just pass that in?

Surprisingly it seems almost no one else has run into this; I only found one other issue via google search that seemed similar (dexidp/dex#1255).

Is this an oversight or intentional?

The text was updated successfully, but these errors were encountered:

ericchiang · 2022-06-01T11:16:31Z

Some background:

Part of the issue is that the remote key set needs a context to refresh the keys in the background. In hindsight, creating a remote key set probably always should have taken a context.

So instead of:

func (p *Provider) Verifier(config *Config) *IDTokenVerifier

Something like:

func (p *Provider) NewVerifier(ctx context.Context, config *Config) *IDTokenVerifier

Which would hopefully produce a clearer intent about what context corresponds to what variable lifetime.

Maybe we could add "NewVerifier" and deprecate "Verifier"?

kuttas · 2022-06-01T16:38:24Z

I think that explicitly taking in a new context for the verifier makes sense, though I think that (generally) the context is less important in New* methods and more important in the methods that actually do the work. In this case I think IDTokenVerifier.Verify() is the method whose context should be used for requests, if possible?

When you say that the remote key set needs to refresh a context in the background, are you referring to the goroutine in keysFromRemote? If I understand right, the select statement in there is going to block til the background operation completes, so from a lexical scope point-of-view, the context that is passed in to keysFromRemote should be valid for the entire duration of the r.updateKeys() call inside, right? Is there a reason that updateKeys doesn't take ctx as a parameter? I could see that, perhaps, the intent is to ensure that the RemoteKeySet always uses the same context values as the context passed in to create the Provider? (Apologies if that doesn't make sense, I am fairly new to Golang concurrency)

On a related note, it seems odd that Provider creates the remoteKeySet but never actually uses it directly from what I can see; instead it only gets stored in the Provider (which is long lived for us) but used by the IDTokenVerifier created from the Provider (which is transient for us). Is there a recommended best practice for which of these objects should be short- vs long-lived (and which is safe to cache) in a server application that needs to verify many tokens (e.g. an API server which validates Bearer tokens) across many requests?

ericchiang · 2022-06-08T15:09:10Z

Hey sorry for the delay. I was traveling.

I'm not entirely sure if this answers your questions about goroutines, but the reason updateKeys doesn't use the context from keysFromRemote is because many clients may be requesting a key at a given time, and I don't want one of those clients canceling the context to impact other clients.

go-oidc/oidc/jwks.go

Line 185 in 26c5037

keys, err := r.updateKeys()

Probably worth a comment in the code.

We have the following documentation on RemoteKeySet but not IDTokenVerifier. Will send an update for the comment:

The returned KeySet is a long lived verifier that caches keys based on any keys change. Reuse a common remote key set instead of creating new ones as needed.

parsnips · 2022-11-04T04:52:15Z

@kuttas great sleuthing. I encountered this same thing this week with a perma-canceled cache context. A warning on the tin here would be good. Even better, if the context on the remoteKeySet struct were to be excised somehow.

ericchiang mentioned this issue Jan 13, 2023

oidc: restrict use of context.Background() #364

Merged

ericchiang closed this as completed in #364 Feb 4, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Non-obvious behavior with long lived `oidc.Provider` objects: `remoteKeySet` caches original context #339

Non-obvious behavior with long lived `oidc.Provider` objects: `remoteKeySet` caches original context #339

kuttas commented Jun 1, 2022 •

edited

Loading

ericchiang commented Jun 1, 2022

kuttas commented Jun 1, 2022

ericchiang commented Jun 8, 2022

parsnips commented Nov 4, 2022

Non-obvious behavior with long lived oidc.Provider objects: remoteKeySet caches original context #339

Non-obvious behavior with long lived oidc.Provider objects: remoteKeySet caches original context #339

Comments

kuttas commented Jun 1, 2022 • edited Loading

ericchiang commented Jun 1, 2022

kuttas commented Jun 1, 2022

ericchiang commented Jun 8, 2022

parsnips commented Nov 4, 2022

Non-obvious behavior with long lived `oidc.Provider` objects: `remoteKeySet` caches original context #339

Non-obvious behavior with long lived `oidc.Provider` objects: `remoteKeySet` caches original context #339

kuttas commented Jun 1, 2022 •

edited

Loading