-
Notifications
You must be signed in to change notification settings - Fork 28
Attempt fallback source when query is not retryable #799
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Attempt fallback source when query is not retryable #799
Conversation
WalkthroughThis PR changes error handling for missing-parameter/invalid-data cases to use a new ImpossibleForTheQuery error variant instead of retry-with-backoff, updates related error payloads, and adds SourceManager source-rotation/fallback plumbing plus an E2E test that exercises fallback when the primary source returns missing params. Changes
Sequence Diagram(s)sequenceDiagram
participant Q as QueryExecutor
participant SM as SourceManager
participant S1 as PrimarySource
participant S2 as FallbackSource
Q->>SM: executeQuery(initialSource = S1)
SM->>S1: GetItemsOrThrow()
S1-->>SM: Error: ImpossibleForTheQuery("missing required fields...")
rect #ffdede
Note over SM: resolve fallback using currentSource
SM->>SM: getNextSyncSource(~currentSource = S1)
SM-->>SM: returns S2
end
SM->>SM: sourceRef = S2\nshouldUpdateActiveSource = true
SM->>S2: GetItemsOrThrow()
S2-->>SM: Success (data)
rect #eaffea
Note over SM: commit activeSource if flagged
SM->>SM: if shouldUpdateActiveSource => activeSource = sourceRef
end
SM-->>Q: Return data from S2
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes
Possibly related PRs
Suggested reviewers
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (6)
codegenerator/cli/npm/envio/src/sources/HyperFuelSource.res (1)
261-265: Good switch to fallback-triggering retry type. Add range context to message.The new ImpossibleForTheQuery is correct for invalid/missing fields. Recommend including the attempted block range to aid triage.
Apply this diff:
- ImpossibleForTheQuery({ - message: `Source returned invalid data with missing required fields: ${missingParams->Js.Array2.joinWith( - ", ", - )}`, - }) + ImpossibleForTheQuery({ + message: `Source returned invalid data with missing required fields: ${missingParams->Js.Array2.joinWith(", ")}. Range: ${fromBlock->Int.toString}-${toBlock->Option.getWithDefault(currentBlockHeight)->Int.toString}`, + })codegenerator/cli/npm/envio/src/sources/HyperSyncSource.res (1)
291-296: Consistent fallback semantics; include block range in message.Align message with range context for faster debugging.
Apply this diff:
- ImpossibleForTheQuery({ - message: `Source returned invalid data with missing required fields: ${missingParams->Js.Array2.joinWith( - ", ", - )}`, - }) + ImpossibleForTheQuery({ + message: `Source returned invalid data with missing required fields: ${missingParams->Js.Array2.joinWith(", ")}. Range: ${fromBlock->Int.toString}-${toBlock->Option.getWithDefault(currentBlockHeight)->Int.toString}`, + })scenarios/test_codegen/test/E2E_test.res (1)
562-635: Nice E2E for fallback path; add one assertion for clarity.Add an assertion that fallback height wasn’t queried (or was queried exactly once if that’s expected) to make the “don’t switch active source” claim explicit.
Example:
// The fallback source should now be called immediately Assert.deepEqual( sourceMockFallback.getItemsOrThrowCalls->Array.length, 1, ~message="fallback source should be called after primary fails with invalid data", ) + // Ensure we didn't switch active source for height resolution + Assert.deepEqual( + sourceMockFallback.getHeightOrThrowCalls->Array.length, + 0, + ~message="should not switch to fallback for height after transient invalid-data failure", + )codegenerator/cli/npm/envio/src/sources/RpcSource.res (2)
675-681: Good conversion to new error shape; include source and range.Add source name and block range to the message for easier support.
Apply this diff:
- FailedGettingItems({ - exn, - attemptedToBlock: toBlock, - retry: ImpossibleForTheQuery({ - message: "Failed to parse events using hypersync client decoder. Please double-check your ABI.", - }), - }), + FailedGettingItems({ + exn, + attemptedToBlock: toBlock, + retry: ImpossibleForTheQuery({ + message: `Failed to parse events using hypersync client decoder. Source: ${name}. Range: ${fromBlock->Int.toString}-${toBlock->Int.toString}. Please double-check your ABI.`, + }), + }),
810-816: Consistent fallback trigger on viem parse errors; include source name.Message already includes block/log; add source for parity.
Apply this diff:
- retry: ImpossibleForTheQuery({ - message: `Failed to parse event with viem, please double-check your ABI. Block number: ${blockNumber->Int.toString}, log index: ${logIndex->Int.toString}`, - }), + retry: ImpossibleForTheQuery({ + message: `Failed to parse event with viem. Source: ${name}. Block number: ${blockNumber->Int.toString}, log index: ${logIndex->Int.toString}. Please double-check your ABI.`, + }),codegenerator/cli/npm/envio/src/sources/Source.res (1)
26-26: Verification confirmed—add optional documentation for clarity.The code correctly implements
ImpossibleForTheQuery. SourceManager (line 443–449) immediately attemptsgetNextSyncSourcewith fallback enabled when this variant is encountered, resetting retry count and not updating active source. Apply the suggested doc:+ /** Retry hint semantics: + * - WithSuggestedToBlock: shrink range and retry same source. + * - WithBackoff: transient issue; retry same source after delay. + * - ImpossibleForTheQuery: data contract mismatch or invalid source response; try a different source immediately. + */ type getItemsRetry =
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
codegenerator/cli/npm/envio/src/sources/HyperFuelSource.res(1 hunks)codegenerator/cli/npm/envio/src/sources/HyperSyncSource.res(1 hunks)codegenerator/cli/npm/envio/src/sources/RpcSource.res(2 hunks)codegenerator/cli/npm/envio/src/sources/Source.res(1 hunks)codegenerator/cli/npm/envio/src/sources/SourceManager.res(10 hunks)scenarios/test_codegen/test/E2E_test.res(1 hunks)
🧰 Additional context used
📓 Path-based instructions (4)
**/*.{res,resi}
📄 CodeRabbit inference engine (.cursor/rules/rescript.mdc)
**/*.{res,resi}: Never use[| item |]to create an array. Use[ item ]instead.
Must always use=for setting value to a field. Use:=only for ref values created usingreffunction.
ReScript has record types which require a type definition before hand. You can access record fields by dot likefoo.myField.
It's also possible to define an inline object, it'll have quoted fields in this case.
Use records when working with structured data, and objects to conveniently pass payload data between functions.
Never use %raw to access object fields if you know the type.
Files:
scenarios/test_codegen/test/E2E_test.rescodegenerator/cli/npm/envio/src/sources/Source.rescodegenerator/cli/npm/envio/src/sources/HyperSyncSource.rescodegenerator/cli/npm/envio/src/sources/SourceManager.rescodegenerator/cli/npm/envio/src/sources/HyperFuelSource.rescodegenerator/cli/npm/envio/src/sources/RpcSource.res
**/*.res
📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)
Prefer reading ReScript .res modules directly
Files:
scenarios/test_codegen/test/E2E_test.rescodegenerator/cli/npm/envio/src/sources/Source.rescodegenerator/cli/npm/envio/src/sources/HyperSyncSource.rescodegenerator/cli/npm/envio/src/sources/SourceManager.rescodegenerator/cli/npm/envio/src/sources/HyperFuelSource.rescodegenerator/cli/npm/envio/src/sources/RpcSource.res
codegenerator/cli/**
📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)
The Rust CLI lives in codegenerator/cli
Files:
codegenerator/cli/npm/envio/src/sources/Source.rescodegenerator/cli/npm/envio/src/sources/HyperSyncSource.rescodegenerator/cli/npm/envio/src/sources/SourceManager.rescodegenerator/cli/npm/envio/src/sources/HyperFuelSource.rescodegenerator/cli/npm/envio/src/sources/RpcSource.res
codegenerator/cli/npm/envio/**
📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)
Shared library-fied runtime lives in codegenerator/cli/npm/envio
Files:
codegenerator/cli/npm/envio/src/sources/Source.rescodegenerator/cli/npm/envio/src/sources/HyperSyncSource.rescodegenerator/cli/npm/envio/src/sources/SourceManager.rescodegenerator/cli/npm/envio/src/sources/HyperFuelSource.rescodegenerator/cli/npm/envio/src/sources/RpcSource.res
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build_and_test
🔇 Additional comments (6)
codegenerator/cli/npm/envio/src/sources/SourceManager.res (6)
352-357: LGTM: Source tracking pattern improves concurrency safety.The use of a local
sourceRefwith deferred commit tosourceManager.activeSourceprevents concurrent queries from observing intermediate source switches during retries. This improves isolation between concurrent partition fetches.
397-434: LGTM: Permanent source removal correctly handled.The error handling for irrecoverable errors (
UnsupportedSelection,FailedGettingFieldSelection) correctly:
- Computes next source before deletion to maintain rotation order
- Permanently removes the failed source from the source set
- Raises an error if no alternative exists
- Commits the source switch with
shouldUpdateActiveSource=truefor permanent transitions
443-469: Clarify the temporary fallback behavior for ImpossibleForTheQuery.The error handling sets
shouldUpdateActiveSource := falseat line 466, meaning the source switch is temporary and subsequent queries will revert to the originalactiveSource. This contrasts with other error types that commit the switch permanently.This behavior implies that
ImpossibleForTheQueryrepresents a query-specific failure (e.g., missing parameters for this particular query) rather than a source-level failure. However, if a query is "impossible" for a source, why would the next query succeed with the same source?Additionally, the check at line 451 (
nextSource !== initialSource) determines if another source is available. This comparison means "we've rotated back to the starting source," but consider this scenario:
initialSource= sourceA (sync)source(current) = sourceB (sync)nextSource= sourceA (rotated back)- The check would pass (another source exists), but we've already tried all sources
Could you clarify:
- What types of failures trigger
ImpossibleForTheQuery?- Why is the source switch temporary rather than permanent?
- Should the availability check compare
nextSource !== sourceinstead to detect if rotation found a different source?
470-518: LGTM: Graduated retry strategy with backoff correctly implemented.The error handling for
WithBackoffimplements a graduated strategy:
- Tolerates first 2 failures on the same source
- Attempts source rotation every 2nd retry thereafter
- Includes fallback sources after 10 retries for broader resilience
- Permanently commits successful switches with
shouldUpdateActiveSource=trueThe check at line 506 (
nextSource !== source) correctly determines if rotation yielded a different source.
303-339: ****The rotation concern raised is incorrect. At line 398,
getNextSyncSourceis called with~currentSource=sourcebefore the source is deleted at line 402. The deletion happens only aftergetNextSyncSourcereturns andnextSourceis assigned.All three call sites follow this pattern:
- Line 398:
getNextSyncSource(~currentSource=source)called, thensources->Utils.Set.delete(source)at line 402- Line 445:
getNextSyncSource(~currentSource=source)called during query failure handling- Line 486:
getNextSyncSource(~currentSource=source)called during retry logicSince
currentSourceis always the source variable currently executing a query, it remains insourceManager.sourceswhengetNextSyncSourceis invoked. The condition at line 319 (if source === currentSource) will correctly sethasActive := true, and the rotation logic will properly split sources into before/after arrays.Likely an incorrect or invalid review comment.
525-527: Clarify whether non-deterministic activeSource selection is intentional or requires synchronization.The concern identifies a real issue, though the characterization needs refinement. JavaScript's single-threaded event loop means the writes at line 288 (
waitForNewBlock) and line 526 (executeQuery) execute sequentially, not in true parallel—but the finalactiveSourcevalue is indeed unpredictable based on completion order rather than optimal source selection.The deferred update strategy with
initialSourcecapture (line 352) appears intentional to isolate query execution, but it doesn't account for writes fromwaitForNewBlock(line 288), which can overwrite the deferred update. No documentation explains whether this non-deterministic behavior is acceptable or needs coordination.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
codegenerator/cli/npm/envio/src/sources/RpcSource.res(2 hunks)
🧰 Additional context used
📓 Path-based instructions (4)
**/*.{res,resi}
📄 CodeRabbit inference engine (.cursor/rules/rescript.mdc)
**/*.{res,resi}: Never use[| item |]to create an array. Use[ item ]instead.
Must always use=for setting value to a field. Use:=only for ref values created usingreffunction.
ReScript has record types which require a type definition before hand. You can access record fields by dot likefoo.myField.
It's also possible to define an inline object, it'll have quoted fields in this case.
Use records when working with structured data, and objects to conveniently pass payload data between functions.
Never use %raw to access object fields if you know the type.
Files:
codegenerator/cli/npm/envio/src/sources/RpcSource.res
codegenerator/cli/**
📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)
The Rust CLI lives in codegenerator/cli
Files:
codegenerator/cli/npm/envio/src/sources/RpcSource.res
**/*.res
📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)
Prefer reading ReScript .res modules directly
Files:
codegenerator/cli/npm/envio/src/sources/RpcSource.res
codegenerator/cli/npm/envio/**
📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)
Shared library-fied runtime lives in codegenerator/cli/npm/envio
Files:
codegenerator/cli/npm/envio/src/sources/RpcSource.res
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build_and_test
| FailedGettingItems({ | ||
| exn, | ||
| blockNumber: fromBlock, | ||
| logIndex: 0, | ||
| attemptedToBlock: toBlock, | ||
| retry: ImpossibleForTheQuery({ | ||
| message: "Failed to parse events using hypersync client decoder. Please double-check your ABI.", | ||
| }), | ||
| }), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Semantic mismatch: FailedGettingItems used for parsing failure.
This error is raised when decoder.decodeEvents fails to parse already-fetched events, but FailedGettingItems implies a fetch/retrieval failure. The data was successfully retrieved from the RPC; it's the decoding that failed. Using FailedGettingItems conflates distinct failure modes (fetch vs. parse) and could mislead operators debugging issues or analyzing error metrics.
Consider using a more semantically accurate variant name like FailedParsingItems or introducing a dedicated parsing-error variant.
| FailedGettingItems({ | ||
| exn, | ||
| blockNumber, | ||
| logIndex, | ||
| attemptedToBlock: toBlock, | ||
| retry: ImpossibleForTheQuery({ | ||
| message: `Failed to parse event with viem, please double-check your ABI. Block number: ${blockNumber->Int.toString}, log index: ${logIndex->Int.toString}`, | ||
| }), | ||
| }), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Semantic mismatch and loss of precision in error reporting.
Two concerns:
-
Semantic mismatch: Like line 681, this uses
FailedGettingItemsfor a Viem parsing failure. The log was successfully fetched; parsing failed. This conflates fetch and parse failures. -
Loss of precision: The error uses
attemptedToBlock: toBlock(the query's upper bound from line 623), but the actual failing log has a specificblockNumber(line 784) andlogIndex(line 785). While the message string includes these values, storing them in structured fields would improve queryability in logging/monitoring systems and make debugging easier.
Consider using a semantically accurate error variant and preserving block-level precision in structured fields.
🤖 Prompt for AI Agents
In codegenerator/cli/npm/envio/src/sources/RpcSource.res around lines 816 to
822, the code currently emits FailedGettingItems for a viem parsing failure and
only stores attemptedToBlock (query upper bound) while embedding the real
blockNumber/logIndex in a message string; change the error to a semantically
accurate variant (e.g., FailedParsingEvent or another parsing-specific error
used elsewhere) and include structured fields for the actual failing block and
log index (e.g., actualBlockNumber and actualLogIndex or blockNumber and
logIndex) in the error object instead of relying solely on attemptedToBlock,
leaving attemptedToBlock only if still useful for context; keep the
human-readable message but ensure the structured fields carry the precise values
for observability/querying.
Summary by CodeRabbit
Bug Fixes
New Features
Tests