Skip to content

Conversation

@NurayAhmadova
Copy link
Contributor

@NurayAhmadova NurayAhmadova commented Dec 10, 2025

Description

Related Issue

Motivation and Context

How This Has Been Tested

Screenshots (if appropriate)

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Refactoring or add test (improvements in base code or adds test coverage to functionality)

Checklist

  • I ensured that the documentation is up to date
  • I explained why this PR updates go.mod in detail with reasoning why it's required
  • I would like a code coverage CI quality gate exception and have explained why

Ticket Details

TT-5545
Status In Dev
Summary Issue when retrieving a list of keys attached to an API

Generated at: 2025-12-17 06:09:39

@github-actions
Copy link
Contributor

github-actions bot commented Dec 10, 2025

API Changes

--- prev.txt	2025-12-17 06:10:10.027317190 +0000
+++ current.txt	2025-12-17 06:10:00.421316623 +0000
@@ -8873,6 +8873,14 @@
 CONSTANTS
 
 const (
+
+	// KeyListingWorkerCount defines how many parallel goroutines we use to fetch key details.
+	KeyListingWorkerCount = 20
+
+	// KeyListingBufferSize defines the channel buffer size.
+	KeyListingBufferSize = 100
+)
+const (
 	LDAPStorageEngine apidef.StorageEngineCode = "ldap"
 	RPCStorageEngine  apidef.StorageEngineCode = "rpc"
 )

@probelabs
Copy link

probelabs bot commented Dec 10, 2025

This PR addresses a bug where the /tyk/keys endpoint incorrectly returned all keys, ignoring the api_id query parameter. The fix ensures that when an api_id is provided, only keys with explicit access to that API are returned.

To handle potentially large numbers of keys efficiently, the key retrieval logic has been refactored to use a concurrent worker pool. This approach processes key session lookups in parallel, preventing the request from blocking for an extended period and improving performance. Additionally, the endpoint now respects context.Context, allowing it to handle request timeouts and cancellations gracefully, which enhances the gateway's overall resilience.

Files Changed Analysis

  • gateway/api.go: The core logic in handleGetAllKeys has been rewritten. It now uses a worker pool to concurrently fetch session details for each key and filter them based on the api_id. New helper functions (filterKeysByAPIID, keyHasAccess, etc.) have been introduced to encapsulate the concurrent processing.
  • gateway/api_test.go: Substantial new tests have been added. TestKeyHandler_BatchFiltering_Integration provides comprehensive coverage for the new filtering logic, including scenarios with specific API keys and master keys. TestKeyHandler_ContextCancellation verifies that the endpoint correctly times out if the request context is canceled. An existing test was also updated to assert the correct negative case.

Architecture & Impact Assessment

  • What this PR accomplishes: Fixes a critical bug in the key listing endpoint, ensuring the api_id filter functions correctly. It also significantly improves the performance and robustness of this endpoint through concurrency and timeout handling.
  • Key technical changes introduced:
    • Implementation of a worker pool to parallelize the fetching and validation of key session data.
    • Integration of context.Context to manage the request lifecycle and handle timeouts/cancellations.
    • Server-side filtering logic that inspects the AccessRights map within each key's session state.
  • Affected system components: This change affects the Gateway's Admin API, specifically the /tyk/keys endpoint. Consumers of this API, such as the Tyk Dashboard or other administrative tools, will now receive correctly filtered key lists.
flowchart TD
    A[GET /tyk/keys?api_id=...] --> B{api_id provided?};
    B -- No --> C[Return all keys];
    B -- Yes --> D[Fetch all key names from storage];
    D --> E[Create Worker Pool & Job Channel];
    E --> F[Feed key names to workers];
    subgraph Concurrent Processing
        G[For each key, worker fetches session & checks AccessRights]
    end
    F --> G;
    G -- Has Access --> H[Add key to result list];
    G -- No Access --> I[Discard key];
    H --> J[Wait for all workers to complete];
    I --> J;
    J --> K[Return filtered & sorted key list];
Loading

Scope Discovery & Context Expansion

  • The change is primarily contained within the gateway package. However, its performance characteristics directly impact the underlying session data store (e.g., Redis).
  • The implementation uses a concurrent N+1 query pattern, where SessionDetail is called for each key. This was a deliberate choice to maintain compatibility with environments like Redis Cluster, where batch operations can be problematic.
  • The concurrency level is managed by new constants (KeyListingWorkerCount, KeyListingBufferSize). While these values are hardcoded, they mitigate the risk of overwhelming the data store. Future work could involve making these configurable.
  • This fix is crucial for administrative workflows, particularly in the Tyk Dashboard, where viewing an accurate list of keys for a specific API is a common requirement.
Metadata
  • Review Effort: 3 / 5
  • Primary Label: bug

Powered by Visor from Probelabs

Last updated: 2025-12-17T06:16:42.153Z | Triggered by: pr_updated | Commit: 916bb1d

💡 TIP: You can chat with Visor using /visor ask <your question>

@probelabs
Copy link

probelabs bot commented Dec 10, 2025

Security Issues (1)

Severity Location Issue
🟠 Error gateway/api.go:1788-1798
The changes introduce a potential Denial of Service vulnerability in the `GET /tyk/keys` endpoint. When the `api_id` query parameter is provided, the gateway now fetches session details for each key to check for access rights. However, if the set of keys to check is not limited by a `filter`, this can lead to scanning all keys in the Redis database and then performing a session lookup for each one. This is triggered in two scenarios: 1. When key hashing is disabled, a request with an `api_id` but an empty `filter` parameter will scan all keys. 2. When key hashing is enabled, the `filter` is always ignored and treated as empty, meaning any request with an `api_id` will scan all keys. This can exhaust server resources (CPU, memory) and overload the Redis database, leading to a denial of service.
💡 SuggestionDisallow filtering by `api_id` when the key set is not scoped by a filter. When not using hashed keys, enforce that the `filter` parameter is present and non-empty if `api_id` is also present. For hashed keys, where filtering is not supported, disallow the `api_id` parameter to prevent the expensive full-scan operation.

Example remediation:

// In keyHandler function
apiID := r.URL.Query().Get(&#34;api_id&#34;)

if gwConfig.HashKeys {
    if apiID != &#34;&#34; {
        doJSONWrite(w, http.StatusNotImplemented, apiError(&#34;Filtering keys by api_id is not supported when key hashing is enabled.&#34;))
        return
    }
    // ... existing code for hashed keys without api_id
    obj, code = gw.handleGetAllKeys(r.Context(), &#34;&#34;, apiID, true)
} else {
    filter := r.URL.Query().Get(&#34;filter&#34;)
    if apiID != &#34;&#34; &amp;&amp; filter == &#34;&#34; {
        doJSONWrite(w, http.StatusBadRequest, apiError(&#34;The &#39;filter&#39; parameter is required when filtering by &#39;api_id&#39;&#34;))
        return
    }
    obj, code = gw.handleGetAllKeys(r.Context(), filter, apiID, false)
}

Architecture Issues (1)

Severity Location Issue
🟡 Warning gateway/api.go:83-87
The worker pool for key filtering uses hardcoded constants for its worker count (`KeyListingWorkerCount = 20`) and channel buffer size (`KeyListingBufferSize = 100`). While these are reasonable defaults, they are not tunable. This can lead to suboptimal performance across different deployment environments, either by underutilizing resources on high-capacity systems or by placing excessive load on the session store in resource-constrained environments.
💡 SuggestionConsider making these parameters configurable in the gateway's main configuration file (`tyk.conf`). This would provide operators with the flexibility to tune the concurrency and memory footprint of this feature to match their specific workload and infrastructure capabilities, enhancing both scalability and stability.

Performance Issues (2)

Severity Location Issue
🟢 Info gateway/api.go:748-786
The `filterKeysByAPIID` function processes keys concurrently, but it still results in an N+1 query pattern, where 'N' is the number of keys. For each key, a separate call (`gw.GlobalSessionManager.SessionDetail`) is made to the data store to fetch its session details. While the use of a worker pool parallelizes these lookups and is a significant improvement over a sequential approach, it can still generate a high load on the data store when retrieving a list for an API with many keys.
💡 SuggestionThe current implementation is a pragmatic solution that improves performance through concurrency. For future optimization, if the data store supports it (e.g., Redis `MGET`), fetching sessions in batches could further reduce network overhead and load on the data store. However, this may be complex in a Redis Cluster environment where keys for `MGET` must belong to the same slot. The recommendation to use a secondary index (as mentioned for `getAllSessionKeys`) would be the most effective long-term solution as it would eliminate the need for this N+1 lookup pattern entirely.
🟡 Warning gateway/api.go:793-801
The `getAllSessionKeys` function retrieves all session keys from the storage backend using a pattern match (`*`), which can lead to a `SCAN` operation in Redis. For databases with a very large number of keys, this operation can be slow and resource-intensive, potentially impacting Redis performance. Although `SCAN` is non-blocking, it still requires iterating over the keyspace.
💡 SuggestionTo optimize key retrieval for large-scale deployments, consider creating and maintaining a secondary index. For instance, a Redis Set could be maintained for each API, containing the keys that have access to it (e.g., `api-keys:<api_id>`). This would allow fetching all relevant keys with a single `SMEMBERS` command, which is much more efficient than scanning. This would require changes to the key creation/update logic to maintain the index.

Quality Issues (1)

Severity Location Issue
🟡 Warning gateway/api.go:81-84
The worker count and buffer size for key listing are hardcoded. This can make it difficult to tune performance for different environments without changing the code. A high worker count might overwhelm the session store (e.g., Redis) in smaller deployments, while a low count might not fully utilize resources in larger ones.
💡 SuggestionConsider making `KeyListingWorkerCount` and `KeyListingBufferSize` configurable via the gateway's configuration file, with the current values as sensible defaults. This would provide operators with the flexibility to adjust the concurrency level based on their specific infrastructure and workload.

Powered by Visor from Probelabs

Last updated: 2025-12-17T06:16:46.129Z | Triggered by: pr_updated | Commit: 916bb1d

💡 TIP: You can chat with Visor using /visor ask <your question>

@NurayAhmadova NurayAhmadova force-pushed the TT-5545-issue-when-retrieving-a-list-of-keys-attached-to-an-api branch 2 times, most recently from 42ca167 to 5270341 Compare December 11, 2025 08:11
@NurayAhmadova NurayAhmadova requested review from MaciekMis, pvormste and shults and removed request for pvormste December 15, 2025 16:49
}

sessionsObj := apiAllKeys{fixedSessions}
jobs := make(chan string, KeyListingBufferSize)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So if it's hard to fetch a couple of sessions in one query, extract this fetching block into a separate method please. Currently, it's hard to read.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done, please re-review

@NurayAhmadova NurayAhmadova force-pushed the TT-5545-issue-when-retrieving-a-list-of-keys-attached-to-an-api branch from 2480bec to 916bb1d Compare December 17, 2025 06:09
@NurayAhmadova NurayAhmadova requested a review from shults December 17, 2025 06:09
@sonarqubecloud
Copy link

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants