Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[TT-12965] ctx.GetOASDefinition() has very poor performance #6471

Open
JanMa opened this issue Aug 23, 2024 · 5 comments
Open

[TT-12965] ctx.GetOASDefinition() has very poor performance #6471

JanMa opened this issue Aug 23, 2024 · 5 comments

Comments

@JanMa
Copy link

JanMa commented Aug 23, 2024

Branch/Environment/Version

  • Branch/Version: 5.4.0
  • Environment: On-Prem

Describe the bug
To work around some limitations in Tyks mTLS support (see #6259), I am maintaining a custom go plugin for mTLS at Paymenttools. We recently enhanced that plugin to support OAS APIs as well and noticed a large performance regression.

To get custom per-api plugin configuration at runtime, the plugin calls ctx.GetOASDefinition on the context object provided to the plugin. This function performs a deep copy of the stored *oas.OAS object before returning it which first marshals and then unmarshals the object to JSON.

This turns out to be a very expensive operation and severely degrades performance of your plugin if the function is called in the hot path for every incoming request. By removing this function and manually getting the *oas.OAS pointer from the passed context, we could reduce our memory usage by 95% and CPU usage by 46% which severely improved the overall performance of our Tyk setup.

Below you can see two generated flame graphs which show the difference (please ignore the color inconsistency).

Memory:
mem_flamegraph

CPU:
cpu_flamegraph

Proposed solution

We've been running this fix for some time now in production and could not notice any negative side effects which might be explained by not operating on a deep copy of the passed OAS definition. Therefore I propose to remove the unneeded deep copy of the *oas.OAS object before returning it.

@JanMa JanMa added the bug label Aug 23, 2024
@buger
Copy link
Member

buger commented Aug 24, 2024

Hi! I wonder if you can overcome it by having in memory cache inside plugin. When api is reloaded your plugin will be reloaded in any case. What do you think?

@JanMa
Copy link
Author

JanMa commented Aug 26, 2024

Hi @buger, we already make heavy use of caching for the authentication part of our plugin. The cache is initialized in the plugin's init function and the high level workflow is as follows:

  1. get OAS definition from passed context
  2. fetch API name and plugin config data
  3. check cache if the key <api name>-<client cert hash> already exists
    1. if yes, allow request
  4. perform client cert validation with config from fetched plugin config data
    1. if cert is valid, add key to cache and allow request

We use the same plugin for multiple APIs so we need it to be configurable and make use of the plugin config data.
I guess we could work around fetching the OAS definition only once and then caching it based on some other value which we know without it 🤔 .

But this would mean we can't ensure that a new plugin config is immediately applied when the API config is changed in Tyk. Because we share the same plugin with multiple APIs, the plugin's init function is only triggered the first time it is loaded.

@buger
Copy link
Member

buger commented Aug 28, 2024

When api is changed on Tyk side, your plugin is reloaded with api. So its cache kind will be reloaded as well.

@lghiur lghiur changed the title ctx.GetOASDefinition() has very poor performance [TT-12965] ctx.GetOASDefinition() has very poor performance Sep 2, 2024
@blagerweij
Copy link

@JanMa we noticed a similar performance hit in our plugin, out of curiosity how have you been able to work around the issue? Something like this ?

	v := r.Context().Value(ctx.OASDefinition)
	var apiDef *oas.OAS = nil
	if v != nil {
		apiDef = v.(*oas.OAS)
	}

And is your Go plugin an 'Auth' plugin, and are you using the Session state to cache the user, or are you using a custom cache?

@JanMa
Copy link
Author

JanMa commented Sep 3, 2024

Hi @blagerweij ,
we worked around the issue like this:

	v := r.Context().Value(ctx.OASDefinition)
	if v == nil {
		w.WriteHeader(http.StatusInternalServerError)
		return
	}

	apidef, ok := v.(*oas.OAS)
	if !ok {
		w.WriteHeader(http.StatusInternalServerError)
		return
	}

The plugin is an Auth plugin, but for mutual TLS. This means we can't make use of the ID Extractor mechanism provided by Tyk because we have no auth information which is passed in a header. Instead we just use a go-cache cache which the plugin initializes in it's init() function.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants