fix: cache vars and remapping normalization #2318
…en, cleanup skip/include variables when we get normalized op from cache
Walkthrough

Adds two new caches (variables normalization and variables remapping) and cache-entry types; NormalizeVariables and RemapVariables return cache-hit flags; cache-hit booleans are recorded on operation context, emitted as HTTP headers and OTel attributes; tests, telemetry, and config updated for unified cache-response-header behavior.
Router-nonroot image scan passed ✅ No security vulnerabilities found in image:
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
router/core/operation_processor.go (1)
944-979: Copy cached variables before storing entries
`VariablesNormalizationCacheEntry.variables` currently aliases `o.parsedOperation.Request.Variables`, which is backed by the pooled `parseKit` buffer. After the entry is cached, the kit gets reused (`OperationProcessor.freeKit`), so later requests mutate that shared backing array, corrupting the cached payload and any future hits. Please copy the slice before caching so the entry owns its data. Apply the same fix in both places where we construct `VariablesNormalizationCacheEntry`:

```diff
@@
-        entry := VariablesNormalizationCacheEntry{
+        cachedVariables := append([]byte(nil), o.parsedOperation.Request.Variables...)
+        entry := VariablesNormalizationCacheEntry{
             uploadsMapping:           uploadsMapping,
             id:                       o.parsedOperation.ID,
             normalizedRepresentation: o.parsedOperation.NormalizedRepresentation,
-            variables:                o.parsedOperation.Request.Variables,
+            variables:                cachedVariables,
             reparse:                  false,
         }
@@
-        entry := VariablesNormalizationCacheEntry{
+        cachedVariables := append([]byte(nil), o.parsedOperation.Request.Variables...)
+        entry := VariablesNormalizationCacheEntry{
             uploadsMapping:           uploadsMapping,
             id:                       o.parsedOperation.ID,
             normalizedRepresentation: o.parsedOperation.NormalizedRepresentation,
-            variables:                o.parsedOperation.Request.Variables,
+            variables:                cachedVariables,
             reparse:                  true,
         }
```
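The aliasing hazard called out above is a general Go pitfall: storing a sub-slice of a pooled buffer lets later reuse of the pool mutate the cached bytes. A minimal, self-contained sketch (the type and function names here are illustrative, not the router's actual types):

```go
package main

import "fmt"

// cacheEntry owns its variables bytes only if we copy them on insert.
type cacheEntry struct {
	variables []byte
}

// newEntryAliased stores the caller's slice directly: any later write to
// the backing array remains visible through the cached entry.
func newEntryAliased(vars []byte) cacheEntry {
	return cacheEntry{variables: vars}
}

// newEntryCopied detaches the entry from the caller's backing array.
func newEntryCopied(vars []byte) cacheEntry {
	return cacheEntry{variables: append([]byte(nil), vars...)}
}

func main() {
	buf := []byte(`{"id":1}`) // stands in for a pooled parse buffer
	aliased := newEntryAliased(buf)
	copied := newEntryCopied(buf)

	buf[6] = '9' // pool reuse overwrites the buffer

	fmt.Println(string(aliased.variables)) // corrupted: {"id":9}
	fmt.Println(string(copied.variables))  // intact:    {"id":1}
}
```

The `append([]byte(nil), …)` idiom in the suggested diff is exactly the `newEntryCopied` variant here.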
🧹 Nitpick comments (1)
router/core/graph_server.go (1)
707-731: Expose cache metrics for the new caches

The new variables/remap caches are now part of the mux, but we never register them with `metricInfos`, so their hit/miss stats stay invisible in OTLP/Prometheus. Please append both caches to `metricInfos` alongside the existing ones so operators can monitor them.

```diff
@@
     if s.normalizationCache != nil {
         metricInfos = append(metricInfos, rmetric.NewCacheMetricInfo("query_normalization", srv.engineExecutionConfiguration.NormalizationCacheSize, s.normalizationCache.Metrics))
     }

+    if s.variablesNormalizationCache != nil {
+        metricInfos = append(metricInfos, rmetric.NewCacheMetricInfo("variables_normalization", srv.engineExecutionConfiguration.NormalizationCacheSize, s.variablesNormalizationCache.Metrics))
+    }
+
+    if s.remapVariablesCache != nil {
+        metricInfos = append(metricInfos, rmetric.NewCacheMetricInfo("variables_remap", srv.engineExecutionConfiguration.NormalizationCacheSize, s.remapVariablesCache.Metrics))
+    }
+
     if s.persistedOperationCache != nil {
         metricInfos = append(metricInfos, rmetric.NewCacheMetricInfo("persisted_query_normalization", 1024, s.persistedOperationCache.Metrics))
     }
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- router-tests/normalization_cache_test.go (1 hunks)
- router/core/graph_server.go (4 hunks)
- router/core/operation_processor.go (10 hunks)
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-10-01T20:39:16.113Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 2252
File: router-tests/telemetry/telemetry_test.go:9684-9693
Timestamp: 2025-10-01T20:39:16.113Z
Learning: Repo preference: In router-tests/telemetry/telemetry_test.go, keep strict > 0 assertions for request.operation.*Time (parsingTime, normalizationTime, validationTime, planningTime) in telemetry-related tests; do not relax to >= 0 unless CI flakiness is observed.
Applied to files:
router-tests/normalization_cache_test.go
📚 Learning: 2025-09-17T20:55:39.456Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 2172
File: router/core/graph_server.go:0-0
Timestamp: 2025-09-17T20:55:39.456Z
Learning: The Initialize method in router/internal/retrytransport/manager.go has been updated to properly handle feature-flag-only subgraphs by collecting subgraphs from both routerConfig.GetSubgraphs() and routerConfig.FeatureFlagConfigs.ConfigByFeatureFlagName, ensuring all subgraphs receive retry configuration.
Applied to files:
router/core/graph_server.go
🧬 Code graph analysis (2)
router-tests/normalization_cache_test.go (2)
router-tests/testenv/testenv.go (4)
Run (105-122), Config (284-341), Environment (1731-1767), GraphQLRequest (1907-1915)
router/core/graphql_handler.go (1)
NormalizationCacheHeader (39-39)
router/core/graph_server.go (1)
router/core/operation_processor.go (3)
NormalizationCacheEntry (755-759), VariablesNormalizationCacheEntry (761-767), RemapVariablesCacheEntry (769-773)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
- GitHub Check: build-router
- GitHub Check: build_test
- GitHub Check: integration_test (./events)
- GitHub Check: integration_test (./. ./fuzzquery ./lifecycle ./modules)
- GitHub Check: image_scan
- GitHub Check: integration_test (./telemetry)
- GitHub Check: image_scan (nonroot)
- GitHub Check: build_push_image
- GitHub Check: build_push_image (nonroot)
- GitHub Check: build_test
- GitHub Check: Analyze (javascript-typescript)
- GitHub Check: Analyze (go)
Actionable comments posted: 0
🧹 Nitpick comments (2)
router/core/operation_processor.go (2)
813-817: Consider handling or documenting jsonparser.Delete errors.

The `jsonparser.Delete` calls on line 816 can return errors, but they're being silently ignored. While delete operations on valid JSON should rarely fail, consider either:

- Handling the error (log or return it)
- Adding a comment explaining why it's safe to ignore

Example:

```diff
 for _, varName := range skipIncludeVariableNames {
-    o.parsedOperation.Request.Variables = jsonparser.Delete(o.parsedOperation.Request.Variables, varName)
+    var err error
+    o.parsedOperation.Request.Variables, err = jsonparser.Delete(o.parsedOperation.Request.Variables, varName)
+    if err != nil {
+        // Log or handle error
+    }
 }
```
943-950: Address or remove the TODO-style comment.

The comment on line 943 suggests uncertainty about `keyGen` reset discipline: "should not be needed if we properly reset after use - check do we have any remaining places where we do not reset keygen - maybe wrap into a type which will reset once we got key".

After reviewing the code, the resets appear properly placed (lines 885, 950, 995, 1045, 1053, 1193, 1202). Consider either:

- Removing the comment if the concern has been addressed
- Creating a wrapper type as suggested for better encapsulation
- Converting this to a proper TODO with a tracking ticket
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
router/core/operation_processor.go (12 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
router/core/operation_processor.go (1)
router-tests/testenv/testenv.go (1)
GraphQLRequest (1909-1917)
🔇 Additional comments (7)
router/core/operation_processor.go (7)
48-78: LGTM! Documentation improvements.

The enhanced comments provide clearer explanations of the field purposes and when they're available.

106-128: LGTM! Cache fields properly added.

The new cache fields follow the established naming conventions and type patterns used for other caches in the struct.

163-176: LGTM! Cache fields properly integrated.

The cache fields are correctly added to the OperationCache struct following existing patterns.
767-794: LGTM! Well-documented cache entry types.

The cache entry structures are well-designed with clear documentation. The `reparse` flag in `VariablesNormalizationCacheEntry` is a smart optimization to avoid unnecessary document reparsing when the normalized form hasn't changed.
889-990: LGTM! Well-implemented caching logic with smart optimization.

The caching implementation is solid:

- Cache key generation appropriately uses both variables and normalized representation
- The `reparse` flag optimization avoids unnecessary document reparsing when the normalized form is unchanged (lines 954-963)
- Proper error handling and cache miss fallback
- Upload mappings are correctly preserved in cache entries
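The reparse optimization praised here can be sketched in miniature: a cache hit only pays for re-parsing the document when normalization actually changed its text. The types and function below are illustrative stand-ins, not the router's real API:

```go
package main

import "fmt"

// entry is a hypothetical variables-normalization cache entry.
type entry struct {
	normalized string
	reparse    bool // true only if normalization changed the document text
}

// applyHit restores cached state; parse stands in for the expensive
// setAndParseOperationDoc-style step. It returns 1 if parsing ran.
func applyHit(e entry, parse func(string)) int {
	if !e.reparse {
		return 0 // document text unchanged: skip the expensive reparse
	}
	parse(e.normalized)
	return 1
}

func main() {
	calls := 0
	parse := func(string) { calls++ }
	// hit where normalization did not change the text: no reparse
	applyHit(entry{normalized: "query { a }", reparse: false}, parse)
	// hit where it did change the text: pay for one reparse
	applyHit(entry{normalized: "query { b }", reparse: true}, parse)
	fmt.Println(calls) // 1
}
```

On a hot path where most hits leave the normalized form unchanged, the saved parse dominates the cost of the flag.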
992-1073: LGTM! RemapVariables caching correctly implemented.

The caching logic is well-implemented:

- Cache key appropriately uses only the normalized representation (line 993), since variable remapping depends on operation structure, not input variable values
- Correct handling of cache hits with document reparsing (lines 1008-1010)
- Proper error handling and fallback
- keyGen properly reset after each use
1432-1441: LGTM! Cache initialization properly wired.

The new caches are correctly integrated into the `NewOperationProcessor` initialization:

- Condition check includes the new cache types (lines 1432-1433)
- Fields properly assigned from options (lines 1439-1440)
- Follows the established pattern for other caches
Actionable comments posted: 0
🧹 Nitpick comments (1)
router/core/operation_processor.go (1)
944-944: Consider removing the TODO-style comment.

The comment suggests checking whether `keyGen` is properly reset after use, but inspection shows `keyGen.Reset()` is consistently called after each use throughout the file (lines 885, 951, 996, 1047, 1055). This appears to be a leftover development note that could be removed for code clarity.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- router/core/graphql_prehandler.go (3 hunks)
- router/core/operation_processor.go (12 hunks)
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-09-02T12:52:27.677Z
Learnt from: endigma
Repo: wundergraph/cosmo PR: 2181
File: router/core/operation_processor.go:0-0
Timestamp: 2025-09-02T12:52:27.677Z
Learning: Hash validation for persisted queries with query bodies is performed in router/core/graphql_prehandler.go in the handleOperation function, not in the APQ processing logic in operation_processor.go. This validates that extensions.persistedQuery.sha256Hash matches the query body before any APQ operations occur.
Applied to files:
router/core/graphql_prehandler.go
📚 Learning: 2025-09-02T12:52:27.677Z
Learnt from: endigma
Repo: wundergraph/cosmo PR: 2181
File: router/core/operation_processor.go:0-0
Timestamp: 2025-09-02T12:52:27.677Z
Learning: Hash validation for persisted queries with query bodies is performed in router/core/graphql_prehandler.go in the handleOperation function at lines 571-578, not in the APQ processing logic in operation_processor.go. This validates that extensions.persistedQuery.sha256Hash matches the computed query hash before any APQ operations occur. There's also a test case that verifies this behavior.
Applied to files:
router/core/graphql_prehandler.go
📚 Learning: 2025-10-01T20:39:16.113Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 2252
File: router-tests/telemetry/telemetry_test.go:9684-9693
Timestamp: 2025-10-01T20:39:16.113Z
Learning: Repo preference: In router-tests/telemetry/telemetry_test.go, keep strict > 0 assertions for request.operation.*Time (parsingTime, normalizationTime, validationTime, planningTime) in telemetry-related tests; do not relax to >= 0 unless CI flakiness is observed.
Applied to files:
router/core/graphql_prehandler.go
🧬 Code graph analysis (2)
router/core/graphql_prehandler.go (1)
router/pkg/otel/attributes.go (2)
WgVariablesNormalizationCacheHit (39-39), WgVariablesRemappingCacheHit (40-40)
router/core/operation_processor.go (1)
router-tests/testenv/testenv.go (1)
GraphQLRequest (1909-1917)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
- GitHub Check: Analyze (javascript-typescript)
- GitHub Check: Analyze (go)
- GitHub Check: build_test
- GitHub Check: image_scan (nonroot)
- GitHub Check: image_scan
- GitHub Check: integration_test (./events)
- GitHub Check: build_push_image
- GitHub Check: build_push_image (nonroot)
- GitHub Check: integration_test (./. ./fuzzquery ./lifecycle ./modules)
- GitHub Check: integration_test (./telemetry)
- GitHub Check: build_test
🔇 Additional comments (8)
router/core/operation_processor.go (5)
767-794: Well-documented cache entry types.

The new cache entries are clearly documented with helpful comments explaining each field's purpose. The `reparse` flag in `VariablesNormalizationCacheEntry` is a smart optimization that avoids unnecessary document reparsing when the normalized form hasn't changed.

Based on learnings: This addresses a past review comment requesting comments and descriptions for these structures.
812-817: Correct handling of skip/include variables after cache hit.

This logic properly removes skip/include variables when retrieving a cached normalized operation, ensuring consistency with the non-cached path where these variables are removed during normalization.
880-991: Solid caching implementation for variable normalization.

The cache key appropriately includes both variables and normalized representation since normalization depends on both. The `reparse` flag optimization is well-designed: when the normalized form doesn't change, it avoids the expensive `setAndParseOperationDoc()` call on cache retrieval.
993-1075: Appropriate caching implementation for variable remapping.

The cache key correctly includes only the normalized representation (not variable values) since remapping depends on operation structure. Unlike `NormalizeVariables`, this always calls `setAndParseOperationDoc()` after a cache hit, which is appropriate since remapping modifies the operation structure.
1434-1443: Proper wiring of new caches into OperationProcessor.

The new caches are correctly integrated following the existing pattern for other cache types.
router/core/graphql_prehandler.go (3)
786-786: Correct integration of variables normalization cache.

The cached boolean is properly captured from `NormalizeVariables` and propagated to both the OTEL span and the request context for telemetry tracking.

Also applies to: 801-802
804-831: Excellent documentation for file upload path mapping.

The enhanced comments with concrete examples make the complex upload path remapping logic much clearer. The distinction between direct argument passing vs. nested object usage is now well-explained.
833-836: Correct integration of variable remapping cache.

The cached boolean is properly captured and propagated to telemetry. The updated comment at lines 833-835 helpfully clarifies the key-value relationship in the `remapVariables` map (new names as keys, old names as values).

Also applies to: 852-853
Actionable comments posted: 0
🧹 Nitpick comments (4)
router-tests/normalization_cache_test.go (4)
87-106: Clarify or remove misleading comment.

The comment on line 90 states "Test with unused variables that should be removed," but the test doesn't include any unused variables. The `$id` variable is used in both queries. Consider removing this comment or clarifying what aspect of variable handling is being tested.
108-131: Add t.Parallel() for consistency.

This subtest doesn't call `t.Parallel()`, unlike the other subtests in this test function. For consistency and to improve test execution performance, consider adding it unless there's a specific reason for sequential execution.

Apply this diff:

```diff
 t.Run("Cache key isolation - different operations don't collide", func(t *testing.T) {
+    t.Parallel()
     testenv.Run(t, &testenv.Config{}, func(t *testing.T, xEnv *testenv.Environment) {
```
64-85: Consider adding explanatory comments for cache hit patterns.

Line 83 expects `cacheHit{false, false, true}` (normalization MISS, variables MISS, remapping HIT) when the inline value changes from `id: 1` to `id: 2`. While the logic is correct, a brief comment explaining why remapping hits while the others miss would improve readability and make the test's intent clearer for future maintainers.
133-151: Consider adding explanatory comments for cache hit patterns.

Line 149 expects `cacheHit{true, false, true}` (normalization HIT, variables MISS, remapping HIT) when the variable value changes. A brief comment explaining this pattern would help clarify that normalization hits because the query structure is unchanged, variables misses because the value differs, and remapping hits because the structure remains the same.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
router-tests/normalization_cache_test.go (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-10-01T20:39:16.113Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 2252
File: router-tests/telemetry/telemetry_test.go:9684-9693
Timestamp: 2025-10-01T20:39:16.113Z
Learning: Repo preference: In router-tests/telemetry/telemetry_test.go, keep strict > 0 assertions for request.operation.*Time (parsingTime, normalizationTime, validationTime, planningTime) in telemetry-related tests; do not relax to >= 0 unless CI flakiness is observed.
Applied to files:
router-tests/normalization_cache_test.go
🧬 Code graph analysis (1)
router-tests/normalization_cache_test.go (2)
router-tests/testenv/testenv.go (2)
TestResponse (1924-1928), GraphQLRequest (1909-1917)
router/core/graphql_handler.go (3)
NormalizationCacheHeader (39-39), VariablesNormalizationCacheHeader (40-40), VariablesRemappingCacheHeader (41-41)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
- GitHub Check: image_scan (nonroot)
- GitHub Check: image_scan
- GitHub Check: integration_test (./events)
- GitHub Check: integration_test (./. ./fuzzquery ./lifecycle ./modules)
- GitHub Check: build_push_image
- GitHub Check: build_push_image (nonroot)
- GitHub Check: integration_test (./telemetry)
- GitHub Check: build_test
- GitHub Check: Analyze (javascript-typescript)
- GitHub Check: Analyze (go)
🔇 Additional comments (1)
router-tests/normalization_cache_test.go (1)
22-38: LGTM! Well-designed test helper.

The `assertCacheHeaders` helper effectively reduces duplication and provides clear, specific error messages for each cache stage. Good use of `t.Helper()` for accurate test failure reporting.
Actionable comments posted: 0
🧹 Nitpick comments (1)
router/core/operation_processor.go (1)
1430-1439: Cache wiring is correct; consider extracting condition for readability.

The new caches are properly wired into the `OperationCache`. The conditional on lines 1430-1431 checks all six cache options, which is functionally correct.

The multi-condition check could be extracted for improved readability:

```diff
+func (opts *OperationProcessorOptions) hasCaching() bool {
+    return opts.NormalizationCache != nil || opts.ValidationCache != nil ||
+        opts.QueryDepthCache != nil || opts.OperationHashCache != nil ||
+        opts.EnablePersistedOperationsCache || opts.VariablesNormalizationCache != nil ||
+        opts.RemapVariablesCache != nil
+}
+
-    if opts.NormalizationCache != nil || opts.ValidationCache != nil || opts.QueryDepthCache != nil || opts.OperationHashCache != nil || opts.EnablePersistedOperationsCache ||
-        opts.VariablesNormalizationCache != nil || opts.RemapVariablesCache != nil {
+    if opts.hasCaching() {
         processor.operationCache = &OperationCache{
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- router-tests/normalization_cache_test.go (1 hunks)
- router/core/operation_processor.go (11 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-10-01T20:39:16.113Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 2252
File: router-tests/telemetry/telemetry_test.go:9684-9693
Timestamp: 2025-10-01T20:39:16.113Z
Learning: Repo preference: In router-tests/telemetry/telemetry_test.go, keep strict > 0 assertions for request.operation.*Time (parsingTime, normalizationTime, validationTime, planningTime) in telemetry-related tests; do not relax to >= 0 unless CI flakiness is observed.
Applied to files:
router-tests/normalization_cache_test.go
🧬 Code graph analysis (2)
router-tests/normalization_cache_test.go (2)
router-tests/testenv/testenv.go (2)
TestResponse (1924-1928), GraphQLRequest (1909-1917)
router/core/graphql_handler.go (3)
NormalizationCacheHeader (39-39), VariablesNormalizationCacheHeader (40-40), VariablesRemappingCacheHeader (41-41)
router/core/operation_processor.go (1)
router-tests/testenv/testenv.go (1)
GraphQLRequest (1909-1917)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
- GitHub Check: integration_test (./events)
- GitHub Check: image_scan
- GitHub Check: image_scan (nonroot)
- GitHub Check: integration_test (./telemetry)
- GitHub Check: integration_test (./. ./fuzzquery ./lifecycle ./modules)
- GitHub Check: build_push_image (nonroot)
- GitHub Check: build_test
- GitHub Check: build_push_image
- GitHub Check: Analyze (javascript-typescript)
- GitHub Check: Analyze (go)
🔇 Additional comments (7)
router-tests/normalization_cache_test.go (2)
14-38: LGTM - Clean test infrastructure.

The `cacheHit` struct and `assertCacheHeaders` helper provide a clean abstraction for validating cache behavior. The error messages in the assertions are descriptive and will aid debugging if tests fail.
40-157: Excellent test coverage for cache interactions.

The test suite comprehensively validates normalization, variables normalization, and remapping cache behavior across multiple scenarios. The comment on lines 150-152 is particularly helpful in explaining why specific caches hit or miss during list coercion.
The expected cache patterns align correctly with the cache key generation logic:
- Normalization cache keys include query structure
- Variables normalization keys include both normalized representation and variable values
- Remapping keys include only normalized representation
router/core/operation_processor.go (5)
767-794: Well-designed cache entry structures with clear documentation.

The cache entry types appropriately capture the state needed for each normalization phase:

- VariablesNormalizationCacheEntry includes the `reparse` flag optimization to avoid unnecessary document reparsing
- RemapVariablesCacheEntry captures the internal ID used for downstream caches
- Documentation clearly explains each field's purpose
880-887: Cache key generation correctly reflects dependencies.

The cache key design appropriately captures dependencies:

- normalizeVariablesCacheKey includes both variables and normalized representation, since variable normalization depends on actual values
- remapVariablesCacheKey includes only the normalized representation, since remapping depends only on query structure

This design maximizes cache hits while maintaining correctness.

Also applies to: 991-996
890-989: Solid cache integration with smart reparse optimization.

The `NormalizeVariables` cache implementation is well-designed:

- Cache key includes both variables and normalized representation - correctly captures dependencies
- Reparse optimization (lines 952-965) avoids expensive document reparsing when only variable values changed but structure remained unchanged
- Error handling is consistent with existing patterns
- Return signature now includes `cached` flag and `mapping` for upload path handling

The logic correctly handles both cache hit and miss paths while maintaining all necessary state.
999-1071: RemapVariables cache integration follows consistent pattern.

The `RemapVariables` cache implementation correctly:

- Uses normalized representation as cache key - remapping depends only on query structure, not variable values
- Always reparses after cache hit (line 1008) - necessary since downstream processing needs the parsed document
- Calculates InternalID - used as the cache key for validation, complexity, and planner caches downstream
- Returns cached flag for observability (used in headers and telemetry)

The logic is consistent with the `NormalizeVariables` pattern and correctly maintains all required state.
813-817: Correct cleanup of skip/include variables after cache hit.

This code ensures consistency between cache hit and miss paths:

- On cache miss: normalization removes skip/include variables from `doc.Input.Variables`, which is copied to request variables (line 838)
- On cache hit: the normalized representation is loaded without skip/include, but the original request variables still have them

The manual deletion ensures both paths produce identical results.
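That cleanup step can be illustrated with a dependency-free sketch. The router performs the deletion on raw bytes with jsonparser; this version uses `encoding/json` instead, and the function name is illustrative:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// stripVariables removes the named keys from a JSON variables object,
// mirroring the skip/include cleanup done after a normalization cache
// hit. RawMessage keeps the remaining values byte-identical.
func stripVariables(raw []byte, names ...string) ([]byte, error) {
	var vars map[string]json.RawMessage
	if err := json.Unmarshal(raw, &vars); err != nil {
		return nil, err
	}
	for _, n := range names {
		delete(vars, n)
	}
	return json.Marshal(vars)
}

func main() {
	raw := []byte(`{"id":1,"includeName":true}`)
	out, err := stripVariables(raw, "includeName")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out)) // {"id":1}
}
```

After this step, the cached path carries the same variables payload as the freshly normalized path, which is exactly the consistency property the review comment verifies.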
Actionable comments posted: 0
🧹 Nitpick comments (1)
router-tests/telemetry/telemetry_test.go (1)
3758-3758: Attribute count bumps on "Operation - Normalize" spans may be brittle

Bumping `require.Len(t, sn[2].Attributes(), …)` to account for the two new normalization/remapping cache-hit attributes is correct now, but these tests are sensitive to any future attribute additions. Consider asserting the presence (and values) of the new cache-related attributes explicitly and relaxing the length check (e.g. `>= baseline`) to reduce churn when span metadata evolves.

Also applies to: 4480-4480, 4935-4935, 7097-7097, 9554-9554
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
router-tests/telemetry/telemetry_test.go (31 hunks)
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-10-01T20:39:16.113Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 2252
File: router-tests/telemetry/telemetry_test.go:9684-9693
Timestamp: 2025-10-01T20:39:16.113Z
Learning: Repo preference: In router-tests/telemetry/telemetry_test.go, keep strict > 0 assertions for request.operation.*Time (parsingTime, normalizationTime, validationTime, planningTime) in telemetry-related tests; do not relax to >= 0 unless CI flakiness is observed.
Applied to files:
router-tests/telemetry/telemetry_test.go
📚 Learning: 2025-11-12T18:38:46.619Z
Learnt from: StarpTech
Repo: wundergraph/cosmo PR: 2327
File: graphqlmetrics/core/metrics_service.go:435-442
Timestamp: 2025-11-12T18:38:46.619Z
Learning: In graphqlmetrics/core/metrics_service.go, the opGuardCache TTL (30 days) is intentionally shorter than the gql_metrics_operations database retention period (90 days). This design reduces memory usage while relying on database constraints to prevent duplicate inserts after cache expiration.
Applied to files:
router-tests/telemetry/telemetry_test.go
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
- GitHub Check: build_push_image
- GitHub Check: image_scan (nonroot)
- GitHub Check: build_push_image (nonroot)
- GitHub Check: image_scan
- GitHub Check: integration_test (./events)
- GitHub Check: integration_test (./telemetry)
- GitHub Check: integration_test (./. ./fuzzquery ./lifecycle ./modules)
- GitHub Check: build_test
- GitHub Check: Analyze (go)
- GitHub Check: Analyze (javascript-typescript)
🔇 Additional comments (6)
router-tests/telemetry/telemetry_test.go (6)
581-612: New cache metrics for variables_normalization/remap_variables match existing patterns

The added hit/key/cost/max datapoints for `cache_type=variables_normalization` and `remap_variables` are consistent with the existing cache families (plan/query_normalization/validation) and correctly use the same base attributes and capacities for the default, no-feature-flag scenario.

Also applies to: 704-749, 839-868, 923-936
1081-1112: Persisted/non-persisted cache telemetry extended coherently to new cache types

For the mixed persisted/non-persisted case, the hit/miss, keys added, and cost totals for `variables_normalization` and `remap_variables` align with the request sequence and mirror how the tests treat the existing cache types. The expectations look internally consistent.

Also applies to: 1204-1249, 1339-1368, 1423-1436
1545-1576: Prometheus+OTLP cache telemetry keeps new cache types in sync

In the Prometheus-enabled scenario, the additional `variables_normalization`/`remap_variables` datapoints are wired exactly like the OTLP-only case (same attributes, same counts), so both exporters see a consistent cache story.

Also applies to: 1668-1713, 1803-1832, 1887-1900
2011-2042: Small validation cache eviction metrics cover new cache types without skewing semantics

The "small validation cache" test now reports request/key/cost/max metrics for `variables_normalization` and `remap_variables` with added/evicted/updated values that stay in step with the other caches and don't disturb the validation-cache eviction expectations.

Also applies to: 2134-2179, 2269-2298, 2353-2366
2476-2515: Feature-flag cache telemetry correctly duplicates new cache types across main/FF dimensions

For the feature-flag scenario, `variables_normalization` and `remap_variables` are added everywhere the other cache types appear (requests/keys/cost/max), with parallel counts for `mainAttributes` and `featureFlagAttributes`. That keeps the attribute surface symmetric across feature-flagged vs non-flagged traffic.

Also applies to: 2491-2505, 2507-2521, 2523-2537, 2603-2634, 2728-2775, 2874-2921, 3015-3046, 3112-3143, 3201-3214, 3244-3257
11704-11816: Exclusion filtering updated to cover new cache_type labels

In the custom-exporter exclusion test, including `variables_normalization` and `remap_variables` in the expected `router.graphql.cache.requests.stats` datapoints ensures label-filtering behavior is validated for all cache families, not just the original ones. Values are intentionally ignored, so the added datapoints are safe.
Actionable comments posted: 0
♻️ Duplicate comments (1)
router/core/graph_server.go (1)
747-755: Relying on cache `Close()` nil-guard in `graphMux.Shutdown` is acceptable

Calling `Close()` unconditionally on all caches (plan, persisted, normalization, variablesNormalization, remapVariables, complexity, validation, operationHash) simplifies the shutdown logic and is safe given `Cache.Close` already handles nil receivers and closed caches. This keeps cleanup consistent as new caches are added.
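The nil-guard behavior described above can be sketched with a minimal stand-in type; all names here are illustrative, not the router's actual ristretto wrapper:

```go
package main

import "fmt"

// Cache is a minimal stand-in for the router's cache wrapper; the real
// type wraps ristretto, but the nil-guard idea is the same.
type Cache struct {
	closed bool
}

// Close is safe on a nil receiver and idempotent, so a shutdown path
// can close every cache field unconditionally.
func (c *Cache) Close() {
	if c == nil || c.closed {
		return
	}
	c.closed = true
}

func main() {
	var planCache *Cache // never initialized, e.g. feature disabled
	normCache := &Cache{}

	// Unconditional shutdown loop: nil and live caches are both safe.
	for _, c := range []*Cache{planCache, normCache, normCache} {
		c.Close()
	}
	fmt.Println(normCache.closed) // prints true
}
```

The payoff is that the shutdown function needs no per-cache nil checks, even as new cache fields are added.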
🧹 Nitpick comments (3)
aws-lambda-router/cmd/main.go (1)
7-10: Lambda main wiring matches new engine config; consider softer shutdown error handling
- The updated `EngineExecutionConfiguration` literal (only setting `EnableSingleFlight`, `EnableRequestTracing`, and `MaxConcurrentResolvers`) correctly drops the now-removed execution-plan cache header flag and will compile against the new config surface.
- Using `HTTP_PORT` to switch between pure HTTP mode and Lambda, and ignoring the expected "server closed" error in the HTTP path, is a reasonable pattern here.
- In the SIGTERM shutdown callback, you currently `panic` on `r.Shutdown` error; given this runs in the termination path, it might be slightly friendlier to log the error at `Error`/`Warn` level and return, instead of panicking, unless you explicitly want crashes to surface in logs as hard failures.

Also applies to: 26-33, 59-63, 76-81, 89-97
router/pkg/config/config.schema.json (1)
2755-2758: Schema property for unified cache response headers looks consistent

The new `engine.debug.enable_cache_response_headers` boolean cleanly replaces the previous per-cache header toggles and matches the naming used in fixtures and testdata. The description is clear; if more cache types later expose hit/miss headers, you might broaden the wording to "GraphQL operation-level caches" instead of listing specific caches, but that's optional.

router/core/graph_server.go (1)
508-525: New variables/remap caches and metrics wiring are coherent

The additional `variablesNormalizationCache` and `remapVariablesCache` fields on `graphMux`, their creation in `buildOperationCaches`, and their inclusion in `configureCacheMetrics` (as `variables_normalization` and `remap_variables`) are internally consistent:

- They are enabled under the same `EnableNormalizationCache`/`NormalizationCacheSize` gate as the main normalization cache.
- They use the same sizing and metrics flags as other caches.
- Metric registration only occurs when the caches are non-nil.

This cleanly extends both caching and cache metrics without changing existing behavior.

You could optionally gate `remapVariablesCache` creation on `!srv.engineExecutionConfiguration.DisableVariablesRemapping` to avoid allocating an unused cache when remapping is disabled, but that's not required for correctness.

Also applies to: 567-603, 676-718
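The optional gating suggested above can be sketched as follows; the config fields and constructor name are illustrative, modeled on the flags mentioned in this review:

```go
package main

import "fmt"

// engineConfig mirrors just the fields relevant to the suggestion;
// names follow the review but are not the router's exact identifiers.
type engineConfig struct {
	EnableNormalizationCache  bool
	NormalizationCacheSize    int64
	DisableVariablesRemapping bool
}

type cache struct{ maxCost int64 }

// buildRemapVariablesCache only allocates the remap-variables cache when
// normalization caching is on and variables remapping is not disabled.
func buildRemapVariablesCache(cfg engineConfig) *cache {
	if !cfg.EnableNormalizationCache || cfg.DisableVariablesRemapping {
		return nil
	}
	return &cache{maxCost: cfg.NormalizationCacheSize}
}

func main() {
	on := engineConfig{EnableNormalizationCache: true, NormalizationCacheSize: 1024}
	off := engineConfig{EnableNormalizationCache: true, DisableVariablesRemapping: true}

	fmt.Println(buildRemapVariablesCache(on) != nil)  // prints true
	fmt.Println(buildRemapVariablesCache(off) == nil) // prints true
}
```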
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (9)
- aws-lambda-router/cmd/main.go (2 hunks)
- router-tests/testenv/testenv.go (1 hunks)
- router/core/graph_server.go (6 hunks)
- router/core/graphql_handler.go (4 hunks)
- router/pkg/config/config.go (1 hunks)
- router/pkg/config/config.schema.json (1 hunks)
- router/pkg/config/fixtures/full.yaml (1 hunks)
- router/pkg/config/testdata/config_defaults.json (1 hunks)
- router/pkg/config/testdata/config_full.json (1 hunks)
🧰 Additional context used
🧠 Learnings (11)
📚 Learning: 2025-11-12T18:38:46.619Z
Learnt from: StarpTech
Repo: wundergraph/cosmo PR: 2327
File: graphqlmetrics/core/metrics_service.go:435-442
Timestamp: 2025-11-12T18:38:46.619Z
Learning: In graphqlmetrics/core/metrics_service.go, the opGuardCache TTL (30 days) is intentionally shorter than the gql_metrics_operations database retention period (90 days). This design reduces memory usage while relying on database constraints to prevent duplicate inserts after cache expiration.
Applied to files:
router/core/graph_server.go
📚 Learning: 2025-08-14T17:46:00.930Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 2137
File: router/pkg/metric/oltp_event_metric_store.go:68-68
Timestamp: 2025-08-14T17:46:00.930Z
Learning: In the Cosmo router codebase, meter providers are shut down centrally as part of the router lifecycle, so individual metric stores like otlpEventMetrics use no-op Shutdown() methods rather than calling meterProvider.Shutdown(ctx) directly.
Applied to files:
router/core/graph_server.go
📚 Learning: 2025-07-30T09:29:46.660Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 2090
File: router/pkg/config/config.schema.json:0-0
Timestamp: 2025-07-30T09:29:46.660Z
Learning: The "operation_name_trim_limit" configuration property in router/pkg/config/config.schema.json should be placed at the security level as a sibling to complexity_limits, not inside the complexity_limits object.
Applied to files:
router/pkg/config/config.schema.json
📚 Learning: 2025-07-21T15:06:36.664Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 2067
File: router/pkg/config/config.schema.json:1637-1644
Timestamp: 2025-07-21T15:06:36.664Z
Learning: In the Cosmo router project, when extending JSON schema validation for security-sensitive fields like JWKS secrets, backwards compatibility is maintained by implementing warnings in the Go code rather than hard validation constraints in the schema. This allows existing configurations to continue working while alerting users to potential security issues.
Applied to files:
router/pkg/config/config.schema.json
📚 Learning: 2025-08-26T17:19:28.178Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 2167
File: router/internal/expr/retry_context.go:3-10
Timestamp: 2025-08-26T17:19:28.178Z
Learning: In the Cosmo router codebase, prefer using netErr.Timeout() checks over explicit context.DeadlineExceeded checks for timeout detection in retry logic, as Go's networking stack typically wraps context deadline exceeded errors in proper net.Error implementations.
Applied to files:
aws-lambda-router/cmd/main.go
📚 Learning: 2025-08-12T13:50:45.964Z
Learnt from: Noroth
Repo: wundergraph/cosmo PR: 2132
File: router-plugin/plugin.go:139-146
Timestamp: 2025-08-12T13:50:45.964Z
Learning: In the Cosmo router plugin system, the plugin framework creates its own logger independently. When creating a logger in NewRouterPlugin, it's only needed for the gRPC server setup (passed via setup.GrpcServerInitOpts.Logger) and doesn't need to be assigned back to serveConfig.Logger.
Applied to files:
aws-lambda-router/cmd/main.go
📚 Learning: 2025-08-12T09:13:38.973Z
Learnt from: Noroth
Repo: wundergraph/cosmo PR: 2132
File: router-plugin/plugin.go:88-104
Timestamp: 2025-08-12T09:13:38.973Z
Learning: In the Cosmo router plugin system, plugin logs are written to stdout and incorporated by the router into its zap logger, which handles timestamping. Therefore, plugin loggers should use DisableTime: true to avoid redundant timestamps that could interfere with the router's log processing.
Applied to files:
aws-lambda-router/cmd/main.go
📚 Learning: 2025-09-02T12:52:27.677Z
Learnt from: endigma
Repo: wundergraph/cosmo PR: 2181
File: router/core/operation_processor.go:0-0
Timestamp: 2025-09-02T12:52:27.677Z
Learning: Hash validation for persisted queries with query bodies is performed in router/core/graphql_prehandler.go in the handleOperation function, not in the APQ processing logic in operation_processor.go. This validates that extensions.persistedQuery.sha256Hash matches the query body before any APQ operations occur.
Applied to files:
router/core/graphql_handler.go
📚 Learning: 2025-09-02T12:52:27.677Z
Learnt from: endigma
Repo: wundergraph/cosmo PR: 2181
File: router/core/operation_processor.go:0-0
Timestamp: 2025-09-02T12:52:27.677Z
Learning: Hash validation for persisted queries with query bodies is performed in router/core/graphql_prehandler.go in the handleOperation function at lines 571-578, not in the APQ processing logic in operation_processor.go. This validates that extensions.persistedQuery.sha256Hash matches the computed query hash before any APQ operations occur. There's also a test case that verifies this behavior.
Applied to files:
router/core/graphql_handler.go
📚 Learning: 2025-11-19T15:13:57.821Z
Learnt from: dkorittki
Repo: wundergraph/cosmo PR: 2273
File: router/core/graphql_handler.go:0-0
Timestamp: 2025-11-19T15:13:57.821Z
Learning: In the Cosmo router (wundergraph/cosmo), error handling follows a two-phase pattern: (1) Prehandler phase handles request parsing, validation, and setup errors using `httpGraphqlError` and `writeOperationError` (in files like graphql_prehandler.go, operation_processor.go, parse_multipart.go, batch.go); (2) Execution phase handles resolver execution errors using `WriteError` in GraphQLHandler.ServeHTTP. Because all `httpGraphqlError` instances are caught in the prehandler before ServeHTTP is invoked, any error type checks for `httpGraphqlError` in the execution-phase WriteError method are unreachable code.
Applied to files:
router/core/graphql_handler.go
📚 Learning: 2025-10-01T20:39:16.113Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 2252
File: router-tests/telemetry/telemetry_test.go:9684-9693
Timestamp: 2025-10-01T20:39:16.113Z
Learning: Repo preference: In router-tests/telemetry/telemetry_test.go, keep strict > 0 assertions for request.operation.*Time (parsingTime, normalizationTime, validationTime, planningTime) in telemetry-related tests; do not relax to >= 0 unless CI flakiness is observed.
Applied to files:
router-tests/testenv/testenv.go
🧬 Code graph analysis (2)
router/core/graph_server.go (3)
- router/core/operation_processor.go (3)
  - `NormalizationCacheEntry` (764-768)
  - `VariablesNormalizationCacheEntry` (770-786)
  - `RemapVariablesCacheEntry` (788-796)
- router/core/engine_loader_hooks.go (1)
  - `NewEngineRequestHooks` (55-86)
- router/core/graphql_handler.go (1)
  - `HandlerOptions` (68-84)

router-tests/testenv/testenv.go (1)
- router/pkg/config/config.go (1)
  - `EngineDebugConfiguration` (355-371)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: integration_test (./. ./fuzzquery ./lifecycle ./modules)
- GitHub Check: Analyze (go)
🔇 Additional comments (8)
router/pkg/config/fixtures/full.yaml (1)
353-356: Engine debug fixture aligned with new cache header flag
`engine.debug.enable_cache_response_headers: false` matches the new consolidated flag in the schema and keeps this "full" fixture consistent with the JSON testdata. No issues from a config-surface perspective.

router/pkg/config/testdata/config_defaults.json (1)
349-366: Default debug config correctly reflects unified cache header toggle

Adding `"EnableCacheResponseHeaders": false` under `EngineExecutionConfiguration.Debug` cleanly replaces the old per-header booleans and keeps the defaults JSON in sync with the schema and YAML fixture. Looks good.

router/pkg/config/testdata/config_full.json (1)
729-746: Full-config testdata updated for consolidated cache header flag

The addition of `"EnableCacheResponseHeaders": false` to the debug block aligns this "full" config snapshot with the new unified cache response header setting. No behavioral concerns.

router-tests/testenv/testenv.go (1)
1303-1327: Engine execution/debug config wiring looks consistent with new cache/header flags

The execution config in `configureRouter` correctly enables the normalization cache, sets a reasonable cache size, and uses the new unified `EnableCacheResponseHeaders` debug flag; this should exercise the new cache behavior and headers in tests without surprises.

router/pkg/config/config.go (1)
355-370: Unified debug flag `EnableCacheResponseHeaders` is well-integrated

The updated `EngineDebugConfiguration` cleanly consolidates cache header toggles under `EnableCacheResponseHeaders` with consistent env/yaml tags; this matches how the flag is consumed in `graph_server.go` and `graphql_handler.go`.

router/core/graphql_handler.go (2)
36-42: Cache headers and unified `setDebugCacheHeaders` behavior look correct

The new cache header constants for execution plan, persisted operations, normalization, variables normalization, and variables remapping align with the new caches added in `graph_server.go`. `GraphQLHandler`'s `enableCacheResponseHeaders` flag and `setDebugCacheHeaders` helper cleanly emit `"HIT"`/`"MISS"` across all cache types in one place, only when enabled, which keeps header behavior consistent and easy to reason about.

Also applies to: 129-133, 442-456
68-103: Nil TracerProvider defensive guard is unnecessary

Based on my verification, the review comment is incorrect. The codebase has an explicit design pattern to always initialize `tracerProvider` to avoid nil pointer panics. At router initialization (router.go:240-242), a noop tracer provider is always created. `HandlerOptions` always receives this non-nil provider from `s.tracerProvider`, and there are no code paths where it could be nil. This aligns with the codebase's deliberate pattern documented in the router comment: "Create noop tracer and meter to avoid nil pointer panics and to avoid checking for nil everywhere."

The suggested defensive check contradicts the codebase's established design pattern. The router always initializes `TracerProvider` with a noop provider (router.go:242) specifically to eliminate nil checks throughout the codebase. All callers of `NewGraphQLHandler` provide a non-nil tracer provider from the router instance, making the proposed guard redundant.

router/core/graph_server.go (1)
1241-1255: Operation processor wiring and loader hooks integration for new caches look correct
`OperationProcessorOptions` now receives `VariablesNormalizationCache` and `RemapVariablesCache` from the corresponding `graphMux` fields, matching the entry types defined in `operation_processor.go`. `NewEngineRequestHooks` is initialized with the mux's metric store, optional subgraph access logger, tracer provider, and attribute expressions, and then passed via `HandlerOptions.EngineLoaderHooks`. `HandlerOptions` also correctly wires `EnableCacheResponseHeaders` from `s.engineExecutionConfiguration.Debug.EnableCacheResponseHeaders` and enables response header propagation when `headerRules` are present.

This keeps the new caching behavior and loader instrumentation fully connected through the server → mux → handler stack.
Also applies to: 1344-1352, 1355-1364
Actionable comments posted: 2
♻️ Duplicate comments (1)
router/pkg/config/testdata/config_defaults.json (1)
363-365: Same consolidation concern as `config_full.json`

This test fixture has the same issue: deprecated cache header flags (lines 363-364) coexist with the new consolidated flag (line 365).

Refer to the review comment on `router/pkg/config/testdata/config_full.json` lines 743-745 for the full analysis and suggested fixes. Both test fixtures should adopt a consistent approach regarding deprecated fields.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
- router/core/router.go (1 hunks)
- router/pkg/config/config.go (1 hunks)
- router/pkg/config/config.schema.json (2 hunks)
- router/pkg/config/fixtures/full.yaml (1 hunks)
- router/pkg/config/testdata/config_defaults.json (1 hunks)
- router/pkg/config/testdata/config_full.json (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- router/pkg/config/fixtures/full.yaml
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-09-29T10:28:07.122Z
Learnt from: dkorittki
Repo: wundergraph/cosmo PR: 2192
File: router/pkg/config/config.go:1028-1029
Timestamp: 2025-09-29T10:28:07.122Z
Learning: The deprecation strategy for IntrospectionEnabled field in router/pkg/config/config.go is to first mark it as deprecated, then migrate all call sites to use IntrospectionConfig.Enabled, and finally remove the deprecated field. The envDefault tag should be kept during the deprecation period for backward compatibility.
Applied to files:
router/core/router.go
📚 Learning: 2025-09-17T20:55:39.456Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 2172
File: router/core/graph_server.go:0-0
Timestamp: 2025-09-17T20:55:39.456Z
Learning: The Initialize method in router/internal/retrytransport/manager.go has been updated to properly handle feature-flag-only subgraphs by collecting subgraphs from both routerConfig.GetSubgraphs() and routerConfig.FeatureFlagConfigs.ConfigByFeatureFlagName, ensuring all subgraphs receive retry configuration.
Applied to files:
router/pkg/config/config.go
🔇 Additional comments (3)
router/pkg/config/config.go (2)
368-372: LGTM! Proper deprecation strategy.

The deprecation approach correctly retains the old fields with clear deprecation comments pointing to the new consolidated flag. This aligns with the deprecation handling in `router/core/router.go` (lines 512-523), which logs warnings and enables the new flag when deprecated fields are set.

Based on learnings, this follows the established deprecation pattern: mark as deprecated, migrate call sites, then remove in a future release.
377-408: No breaking API changes to the `EngineExecutionConfiguration` struct, only formatting updates.

The struct retains all previously public fields. The changes are purely cosmetic: field alignment adjustments and a deprecation comment for `EnableExecutionPlanCacheResponseHeader` (directing users to `EngineDebugConfiguration.EnableCacheResponseHeaders`). No fields were removed, and no migration is required.

router/core/router.go (1)
512-523: LGTM! Clean deprecation handling.

The three deprecation blocks correctly handle the migration from individual cache header flags to the consolidated `EnableCacheResponseHeaders` flag. Each block:
- Checks if a deprecated flag is set
- Logs a clear deprecation warning with migration guidance
- Enables the new consolidated flag for backward compatibility
This ensures existing configurations continue to work while guiding users toward the new unified flag.
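The three-step pattern described above (check, warn, enable) can be sketched like this; the field and option names are illustrative, modeled on the flags this PR deprecates:

```go
package main

import "fmt"

// debugConfig holds the three deprecated per-cache flags and the new
// unified one; field names are illustrative.
type debugConfig struct {
	EnablePlanCacheResponseHeader                bool // deprecated
	EnableNormalizationCacheResponseHeader       bool // deprecated
	EnablePersistedOperationsCacheResponseHeader bool // deprecated
	EnableCacheResponseHeaders                   bool // unified flag
}

// applyDeprecations warns for each set deprecated flag and enables the
// unified flag so existing configurations keep working.
func applyDeprecations(cfg *debugConfig, warn func(string)) {
	deprecated := []struct {
		name string
		set  bool
	}{
		{"enable_plan_cache_response_header", cfg.EnablePlanCacheResponseHeader},
		{"enable_normalization_cache_response_header", cfg.EnableNormalizationCacheResponseHeader},
		{"enable_persisted_operations_cache_response_header", cfg.EnablePersistedOperationsCacheResponseHeader},
	}
	for _, d := range deprecated {
		if d.set {
			warn(d.name + " is deprecated, use engine.debug.enable_cache_response_headers instead")
			cfg.EnableCacheResponseHeaders = true
		}
	}
}

func main() {
	cfg := &debugConfig{EnablePlanCacheResponseHeader: true}
	applyDeprecations(cfg, func(msg string) { fmt.Println("WARN:", msg) })
	fmt.Println(cfg.EnableCacheResponseHeaders) // prints true
}
```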
Actionable comments posted: 0
🧹 Nitpick comments (1)
router/pkg/config/config.go (1)
362-415: Header debug flags and deprecation path are wired correctly

The introduction of `EnableCacheResponseHeaders` in `EngineDebugConfiguration`, while retaining the per-cache header flags and `EnableExecutionPlanCacheResponseHeader` with clear deprecation comments, keeps configuration backwards compatible and makes the unified flag easy to adopt (env and yaml tags are consistent with the rest of the config surface).

The only follow-up I'd suggest is to ensure the JSON schema and user-facing docs explicitly mark the three deprecated header flags as such, so users discover the new unified flag easily; there's no code issue in this struct as written.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
- router-tests/testenv/testenv.go (1 hunks)
- router/core/context.go (1 hunks)
- router/core/graph_server.go (6 hunks)
- router/core/router.go (1 hunks)
- router/pkg/config/config.go (1 hunks)
- router/pkg/config/config.schema.json (2 hunks)
- router/pkg/config/testdata/config_defaults.json (1 hunks)
- router/pkg/config/testdata/config_full.json (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (5)
- router/core/router.go
- router/pkg/config/testdata/config_defaults.json
- router/pkg/config/testdata/config_full.json
- router/pkg/config/config.schema.json
- router/core/context.go
🧰 Additional context used
🧠 Learnings (5)
📚 Learning: 2025-11-12T18:38:46.619Z
Learnt from: StarpTech
Repo: wundergraph/cosmo PR: 2327
File: graphqlmetrics/core/metrics_service.go:435-442
Timestamp: 2025-11-12T18:38:46.619Z
Learning: In graphqlmetrics/core/metrics_service.go, the opGuardCache TTL (30 days) is intentionally shorter than the gql_metrics_operations database retention period (90 days). This design reduces memory usage while relying on database constraints to prevent duplicate inserts after cache expiration.
Applied to files:
router/core/graph_server.go
📚 Learning: 2025-08-14T17:46:00.930Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 2137
File: router/pkg/metric/oltp_event_metric_store.go:68-68
Timestamp: 2025-08-14T17:46:00.930Z
Learning: In the Cosmo router codebase, meter providers are shut down centrally as part of the router lifecycle, so individual metric stores like otlpEventMetrics use no-op Shutdown() methods rather than calling meterProvider.Shutdown(ctx) directly.
Applied to files:
router/core/graph_server.go
📚 Learning: 2025-10-01T20:39:16.113Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 2252
File: router-tests/telemetry/telemetry_test.go:9684-9693
Timestamp: 2025-10-01T20:39:16.113Z
Learning: Repo preference: In router-tests/telemetry/telemetry_test.go, keep strict > 0 assertions for request.operation.*Time (parsingTime, normalizationTime, validationTime, planningTime) in telemetry-related tests; do not relax to >= 0 unless CI flakiness is observed.
Applied to files:
router-tests/testenv/testenv.go
📚 Learning: 2025-09-29T10:28:07.122Z
Learnt from: dkorittki
Repo: wundergraph/cosmo PR: 2192
File: router/pkg/config/config.go:1028-1029
Timestamp: 2025-09-29T10:28:07.122Z
Learning: The deprecation strategy for IntrospectionEnabled field in router/pkg/config/config.go is to first mark it as deprecated, then migrate all call sites to use IntrospectionConfig.Enabled, and finally remove the deprecated field. The envDefault tag should be kept during the deprecation period for backward compatibility.
Applied to files:
router-tests/testenv/testenv.go
router/pkg/config/config.go
📚 Learning: 2025-09-17T20:55:39.456Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 2172
File: router/core/graph_server.go:0-0
Timestamp: 2025-09-17T20:55:39.456Z
Learning: The Initialize method in router/internal/retrytransport/manager.go has been updated to properly handle feature-flag-only subgraphs by collecting subgraphs from both routerConfig.GetSubgraphs() and routerConfig.FeatureFlagConfigs.ConfigByFeatureFlagName, ensuring all subgraphs receive retry configuration.
Applied to files:
router-tests/testenv/testenv.go
🧬 Code graph analysis (1)
router-tests/testenv/testenv.go (1)
router/pkg/config/config.go (1)
- `EngineDebugConfiguration` (362-382)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
- GitHub Check: build_test
- GitHub Check: image_scan (nonroot)
- GitHub Check: build_push_image (nonroot)
- GitHub Check: integration_test (./telemetry)
- GitHub Check: build_push_image
- GitHub Check: integration_test (./events)
- GitHub Check: integration_test (./. ./fuzzquery ./lifecycle ./modules)
- GitHub Check: build_test
- GitHub Check: Analyze (go)
🔇 Additional comments (8)
router/core/graph_server.go (7)
510-528: LGTM! Cache fields added correctly.

The new cache fields follow the existing patterns and are properly grouped with other cache fields in the struct.
583-605: LGTM! Cache initialization follows established patterns.

The cache configurations appropriately mirror the normalization cache settings, and reusing `NormalizationCacheSize` makes sense given that variables normalization and remapping are normalization-related operations.
715-721: LGTM! Cache metrics properly registered.

The metrics registration follows the existing pattern and uses descriptive cache names that align with the functionality.
751-758: LGTM! Simplified shutdown logic.

The unconditional cache `Close()` calls are safe since the ristretto cache `Close()` method handles nil internally. This simplification improves readability.
1282-1283: LGTM! Caches correctly wired into operation processor.

The new caches are properly passed to the operation processor, following the same pattern as existing caches.
1377-1386: LGTM! Readability improvement.

Extracting the `loaderHooks` into a variable improves code readability by separating the complex initialization from the struct literal.
1387-1397: LGTM! HandlerOptions properly configured.

The handler options correctly use the extracted `loaderHooks` variable, maintaining the same functionality while improving code organization.
router-tests/testenv/testenv.go (1)
1310-1334: `EngineExecutionConfiguration` test defaults and unified cache headers look consistent

The test `engineExecutionConfig` literal matches the updated `EngineExecutionConfiguration`/`EngineDebugConfiguration` shape, and using `Debug.EnableCacheResponseHeaders = true` instead of the per-cache header flags keeps tests aligned with the new unified header behaviour. No functional issues spotted here.
StarpTech
left a comment
LGTM
Use cache for variables normalization and variables remapping. One new config option to enable hit/miss cache headers was introduced, and 3 config options were deprecated. Co-authored-by: Yury Smolski <140245+ysmolski@users.noreply.github.com>
Use cache for variables normalization and variables remapping.
Closes ENG-8477
Summary by CodeRabbit
Performance
Observability
New Features / Debug
Tests
Chores