@buraksezer buraksezer commented Nov 7, 2025

PR for https://tyktech.atlassian.net/browse/TT-15683

Ticket Details

TT-15683
Status In Code Review
Summary Add JWKS cache flush to the Dashboard API and MDCB

Generated at: 2025-11-14 12:27:29

@buraksezer buraksezer force-pushed the feat/TT-15683/jwks-cache-flush-mdcb branch from 1e0c8f8 to f496005 Compare November 7, 2025 11:05
github-actions bot commented Nov 7, 2025

🎯 Recommended Merge Targets

Based on JIRA ticket TT-15683: Add JWKS cache flush to the Dashboard API and MDCB

Fix Version: Tyk 5.11.0

⚠️ Warning: Expected release branches not found in repository

Required:

  • master - No matching release branches found. Fix will be included in future releases.

📋 Workflow

  1. Merge this PR to master first

github-actions bot commented Nov 7, 2025

API Changes

--- prev.txt	2025-11-14 12:28:12.031183591 +0000
+++ current.txt	2025-11-14 12:28:02.574202432 +0000
@@ -10930,8 +10930,9 @@
 	KeySpaceUpdateNotification   NotificationCommand = "KeySpaceUpdateNotification"
 	OAuthPurgeLapsedTokens       NotificationCommand = "OAuthPurgeLapsedTokens"
 	// NoticeDeleteAPICache is the command with which event is emitted from dashboard to invalidate cache for an API.
-	NoticeDeleteAPICache NotificationCommand = "DeleteAPICache"
-	NoticeUserKeyReset   NotificationCommand = "UserKeyReset"
+	NoticeDeleteAPICache            NotificationCommand = "DeleteAPICache"
+	NoticeUserKeyReset              NotificationCommand = "UserKeyReset"
+	NoticeInvalidateJWKSCacheForAPI NotificationCommand = "InvalidateJWKSCacheForAPI"
 )
 func (n NotificationCommand) String() string
 

probelabs bot commented Nov 7, 2025

🔍 Code Analysis Results

This PR introduces a mechanism to flush the JSON Web Key Set (JWKS) cache for a specific API, triggerable either through a direct API call or via a Multi-Data Center Bridge (MDCB) notification.

The core change refactors the cache flushing logic into a new standalone function, invalidateJWKSCacheByAPIID, making it reusable. This function is now invoked by the existing HTTP handler and is also triggered by a new Redis pub/sub event, NoticeInvalidateJWKSCacheForAPI. This enables centralized cache invalidation from the Tyk Dashboard across a distributed gateway cluster. A new test, Test_NoticeInvalidateJWKSCacheForAPI, has been added to validate the notification-based flushing mechanism.

Files Changed Analysis

  • gateway/mw_jwt.go: The JWKS cache invalidation logic has been extracted from the HTTP handler invalidateJWKSCacheForAPIID into a new, reusable function invalidateJWKSCacheByAPIID(apiID string). The original handler is simplified to call this new function.
  • gateway/redis_signals.go: A new notification command, NoticeInvalidateJWKSCacheForAPI, is defined. The gateway's Redis event handler (handleRedisEvent) is updated to listen for this command and trigger invalidateJWKSCacheByAPIID using the API ID from the event payload.
  • gateway/mw_jwt_test.go: A new test case, Test_NoticeInvalidateJWKSCacheForAPI, has been added. It verifies that publishing the NoticeInvalidateJWKSCacheForAPI notification successfully clears the JWKS cache for the specified API.

Architecture & Impact Assessment

What this PR accomplishes

This PR provides a way to manually invalidate the JWKS cache for a specific API. This is critical in scenarios where keys at a JWKS URL are rotated, allowing administrators to force the gateway to fetch the new keys immediately rather than waiting for the cache to expire.

Key technical changes introduced

  1. Refactored Cache Invalidation: The core logic for flushing the JWKS cache is now in a dedicated function, invalidateJWKSCacheByAPIID, promoting reuse and clarity.
  2. New MDCB/Redis Signal: A new signal, NoticeInvalidateJWKSCacheForAPI, is introduced to broadcast cache flush requests to all connected gateways in a cluster.
  3. Dual Invalidation Triggers: The cache flush can now be initiated via two distinct paths: a direct REST API call to a gateway or a broadcast signal from a central component like the Tyk Dashboard.
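The three changes above can be sketched roughly as follows. This is a simplified illustration, not the gateway's code: `invalidateJWKSCacheByAPIID` and `NoticeInvalidateJWKSCacheForAPI` are names taken from this PR, while `jwksStore`, `Notification`, `handleFlushRequest`, `handleNotification`, the boolean return value, and the assumption that the event payload is the bare API ID are all hypothetical stand-ins.

```go
package main

import (
	"fmt"
	"sync"
)

// NotificationCommand and the constant below mirror the API diff in this PR.
type NotificationCommand string

const NoticeInvalidateJWKSCacheForAPI NotificationCommand = "InvalidateJWKSCacheForAPI"

// Notification is a simplified, hypothetical event shape.
// Payload is assumed to carry the API ID.
type Notification struct {
	Command NotificationCommand
	Payload string
}

// jwksStore is a hypothetical stand-in for the gateway's per-API JWKS caches.
type jwksStore struct {
	mu    sync.Mutex
	byAPI map[string]map[string]string // API ID -> key ID -> key material
}

var store = &jwksStore{byAPI: make(map[string]map[string]string)}

// put caches one key for an API.
func (s *jwksStore) put(apiID, kid, key string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.byAPI[apiID] == nil {
		s.byAPI[apiID] = make(map[string]string)
	}
	s.byAPI[apiID][kid] = key
}

// size reports how many keys are cached for an API.
func (s *jwksStore) size(apiID string) int {
	s.mu.Lock()
	defer s.mu.Unlock()
	return len(s.byAPI[apiID])
}

// invalidateJWKSCacheByAPIID is the single shared flush path.
// It reports whether a cache existed for the API (an assumption of this sketch).
func invalidateJWKSCacheByAPIID(apiID string) bool {
	store.mu.Lock()
	defer store.mu.Unlock()
	if _, ok := store.byAPI[apiID]; !ok {
		return false
	}
	delete(store.byAPI, apiID)
	return true
}

// handleFlushRequest models the direct management API trigger;
// apiID stands in for the {apiID} route variable.
func handleFlushRequest(apiID string) bool {
	return invalidateJWKSCacheByAPIID(apiID)
}

// handleNotification models the pub/sub trigger and ignores unrelated commands.
func handleNotification(n Notification) bool {
	switch n.Command {
	case NoticeInvalidateJWKSCacheForAPI:
		return invalidateJWKSCacheByAPIID(n.Payload)
	default:
		return false
	}
}

func main() {
	store.put("api-1", "kid-1", "public-key")
	flushed := handleNotification(Notification{
		Command: NoticeInvalidateJWKSCacheForAPI,
		Payload: "api-1",
	})
	fmt.Println(flushed, store.size("api-1"))
}
```

Because both triggers end in the same function, a fix to the flush logic applies to the REST path and the notification path alike.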

Affected system components

  • Tyk Gateway: The JWT middleware and the Redis-based signaling component are directly modified.
  • Tyk Dashboard / MDCB: (Implicitly) The Dashboard will need to be updated to send the new NoticeInvalidateJWKSCacheForAPI signal to leverage this feature across a cluster.
  • Gateway Management API: The existing endpoint for cache invalidation (POST /tyk/apis/{apiID}/jwks-cache/flush) remains functionally the same but now uses the refactored logic.

Visualization

The following diagram illustrates the two flows for JWKS cache invalidation:

graph TD
    subgraph Direct API Call
        Admin -- "POST /tyk/apis/{apiID}/jwks-cache/flush" --> A[Gateway API Endpoint]
        A --> B(invalidateJWKSCacheForAPIID handler)
    end

    subgraph MDCB Notification
        Dashboard -- "Publishes event" --> D[Redis Pub/Sub]
        D -- "NoticeInvalidateJWKSCacheForAPI" --> E[Gateway Redis Listener]
        E --> F(handleRedisEvent)
    end

    B --> C{invalidateJWKSCacheByAPIID}
    F --> C
    C --> G([JWKS Cache for API ID])
    G -- "Flush()" --> H((Cache Flushed))

Scope Discovery & Context Expansion

The changes are well-contained within the gateway's JWT handling and Redis signaling modules. The introduction of the NoticeInvalidateJWKSCacheForAPI signal implies a corresponding change is required in the Tyk Dashboard (or another management component) to publish this event, which is outside the scope of this repository but necessary for the feature to be fully utilized in a distributed environment.

The existing API endpoint is part of the gateway's internal management API, ensuring that both the direct API call and the new MDCB notification use the same underlying cache flush logic for consistent behavior. This enhancement improves the gateway's manageability in a distributed architecture where configuration and state changes must be propagated efficiently.

Metadata
  • Review Effort: 2 / 5
  • Primary Label: feature

Powered by Visor from Probelabs

Last updated: 2025-11-14T12:31:11.353Z | Triggered by: synchronize | Commit: cde4503

💡 TIP: You can chat with Visor using /visor ask <your question>

probelabs bot commented Nov 7, 2025

🔍 Code Analysis Results

✅ Security Check Passed

No security issues found – changes LGTM.

Architecture Issues (1)

Severity Location Issue
🟡 Warning gateway/mw_jwt.go:1600-1602
The use of `panic` on a failed type assertion can cause the entire gateway process to crash. While a type mismatch in the `JWKCaches` map indicates a severe programming error, a panic is an overly aggressive failure mode for a production gateway. A single API's misconfigured cache could lead to a denial of service for all APIs handled by the gateway instance. A more resilient design would be to log a critical error and return, thereby isolating the fault to the specific API and preventing a system-wide outage.
💡 Suggestion: Replace the `panic` with a critical error log. This ensures that the programming error is recorded for developers to fix, but it prevents the gateway from crashing, improving the overall stability and resilience of the system. For example: `mainLog.Errorf("Value in JWKCache is not of type cache.Repository for API ID: %s", apiID)`.

✅ Performance Check Passed

No performance issues found – changes LGTM.

Quality Issues (1)

Severity Location Issue
🟡 Warning gateway/mw_jwt.go:1601
The refactored function `invalidateJWKSCacheByAPIID` contains a `panic` which can crash the service when called from the Redis event handler goroutine. This goroutine does not have a `recover` block, so an unhandled panic will terminate the application. While this panic checks for a programmer error (an invariant violation), moving this code from an HTTP handler (where panics are often recovered) to a context where it's called by a background worker increases the reliability risk.
💡 Suggestion: Replace the `panic` with robust error logging to prevent the gateway from crashing due to potential internal state corruption. This makes the service more resilient.
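
One way to act on the two `panic` warnings above is the comma-ok form of a type assertion, which reports a mismatch instead of crashing the process. The sketch below is illustrative only: `flusher`, `countingCache`, `flushJWKSCache`, and the use of a `sync.Map` named `jwkCaches` are assumptions made for the example, and the standard `log` package stands in for the gateway's own logger.

```go
package main

import (
	"fmt"
	"log"
	"sync"
)

// flusher is a hypothetical minimal interface; the review names
// cache.Repository as the type the gateway expects.
type flusher interface{ Flush() }

// countingCache is a stand-in cache that records how often it was flushed.
type countingCache struct{ flushes int }

func (c *countingCache) Flush() { c.flushes++ }

// jwkCaches holds untyped values keyed by API ID (assumed layout).
var jwkCaches sync.Map

// flushJWKSCache flushes the cache for apiID. A missing entry or a value
// of the wrong type yields an error instead of a panic.
func flushJWKSCache(apiID string) error {
	v, ok := jwkCaches.Load(apiID)
	if !ok {
		return fmt.Errorf("no JWKS cache for API ID %q", apiID)
	}
	c, ok := v.(flusher) // comma-ok assertion never panics
	if !ok {
		err := fmt.Errorf("JWKS cache for API ID %q has unexpected type %T", apiID, v)
		log.Printf("error: %v", err) // record the invariant violation
		return err
	}
	c.Flush()
	return nil
}

func main() {
	jwkCaches.Store("api-1", &countingCache{})
	jwkCaches.Store("api-2", "not a cache") // simulated programming error

	fmt.Println(flushJWKSCache("api-1"))
	fmt.Println(flushJWKSCache("api-2") != nil)
}
```

Returning the error lets each caller decide how to react: an HTTP handler can map it to a response, while a background event handler can log it and continue.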

Dependency Issues (1)

Severity Location Issue
🟡 Warning gateway/redis_signals.go:44
This change introduces a new Redis notification command, `NoticeInvalidateJWKSCacheForAPI`, which the gateway is now configured to consume. For this feature to be fully functional in a Multi-Data Center Bridge (MDCB) environment, a corresponding change is required in the publisher component (Tyk Dashboard, using tyk-sink) to send this command. Without the publisher-side implementation, this new cache-flushing capability will not be triggerable via MDCB.
💡 Suggestion: Ensure a corresponding ticket and pull request exist for the Tyk Dashboard/tyk-sink to implement the publishing of the `NoticeInvalidateJWKSCacheForAPI` command to complete the end-to-end feature.

Connectivity Issues (2)

Severity Location Issue
🟡 Warning gateway/redis_signals.go:163
The introduction of the `NoticeInvalidateJWKSCacheForAPI` notification command allows for remote flushing of the JWKS cache via Redis pub/sub. If an attacker gains the ability to publish messages to the Tyk Redis instance, they could send a high volume of these notifications. This would force Tyk gateways to continuously flush their JWKS caches and re-fetch keys from upstream JWKS providers, leading to performance degradation and a potential Denial of Service (DoS) against both the gateway and the upstream identity provider.
💡 Suggestion: To enhance resilience, consider implementing rate-limiting within the gateway for this cache flush operation. This would prevent a flood of notifications for the same API ID from causing excessive upstream requests. Additionally, ensure that Redis instances are deployed in a trusted network with strict access controls.
🟡 Warning gateway/redis_signals.go:44
The PR introduces a new Redis notification command `NoticeInvalidateJWKSCacheForAPI` for the gateway to consume. However, for this feature to be fully functional through the Multi-Data Center Bridge (MDCB), a corresponding change is required in the publisher component (e.g., Tyk Dashboard via tyk-sink) to send this command. Without the publisher-side implementation, this feature is incomplete, and the new listener logic will not be triggered in an MDCB environment.
💡 Suggestion: Verify that a corresponding pull request exists or is created for the Tyk Dashboard/tyk-sink to publish the `NoticeInvalidateJWKSCacheForAPI` command. The functionality is not complete without the sender-side implementation.
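
The rate-limiting suggestion in the first connectivity warning could take the form of a per-API minimum interval between flushes. This is a minimal sketch under stated assumptions, not something this PR implements: `flushThrottle`, `manualClock`, `defaultFlushInterval`, and the 30-second value are hypothetical, chosen only to show the idea.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// defaultFlushInterval is an arbitrary value picked for this example.
const defaultFlushInterval = 30 * time.Second

// flushThrottle allows at most one flush per API ID per interval.
type flushThrottle struct {
	mu       sync.Mutex
	interval time.Duration
	last     map[string]time.Time // API ID -> time of last allowed flush
	now      func() time.Time     // injectable clock, useful in tests
}

func newFlushThrottle(interval time.Duration, now func() time.Time) *flushThrottle {
	return &flushThrottle{
		interval: interval,
		last:     make(map[string]time.Time),
		now:      now,
	}
}

// Allow reports whether a flush for apiID may proceed now and, if so,
// records the current time as the last flush for that API.
func (t *flushThrottle) Allow(apiID string) bool {
	t.mu.Lock()
	defer t.mu.Unlock()
	now := t.now()
	if prev, ok := t.last[apiID]; ok && now.Sub(prev) < t.interval {
		return false
	}
	t.last[apiID] = now
	return true
}

// manualClock is a helper whose time only moves when Advance is called.
type manualClock struct{ t time.Time }

func (c *manualClock) Now() time.Time          { return c.t }
func (c *manualClock) Advance(d time.Duration) { c.t = c.t.Add(d) }

func main() {
	th := newFlushThrottle(defaultFlushInterval, time.Now)
	fmt.Println(th.Allow("api-1")) // first flush for api-1
	fmt.Println(th.Allow("api-1")) // repeated immediately
	fmt.Println(th.Allow("api-2")) // different API, tracked separately
}
```

A notification handler would call `Allow` before flushing and drop or log the event when it returns false. The `last` map in this sketch grows with the number of distinct API IDs seen, so a real implementation would likely need to prune stale entries.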

Last updated: 2025-11-14T12:31:12.339Z | Triggered by: synchronize | Commit: cde4503

@buraksezer buraksezer force-pushed the feat/TT-15683/jwks-cache-flush-mdcb branch from f496005 to 670aaa5 Compare November 10, 2025 09:38
@buraksezer buraksezer force-pushed the feat/TT-15683/jwks-cache-flush-mdcb branch from 670aaa5 to cde4503 Compare November 14, 2025 12:27
@buraksezer buraksezer enabled auto-merge (squash) November 14, 2025 12:27
@buraksezer buraksezer merged commit 849fb5a into master Nov 14, 2025
49 checks passed
@buraksezer buraksezer deleted the feat/TT-15683/jwks-cache-flush-mdcb branch November 14, 2025 13:04
@buraksezer commented

/release to release-5.11

probelabs bot commented Dec 11, 2025

⚠️ Cherry-pick encountered conflicts. A draft PR was created: #7631