Attempt fallback source when query is not retryable #799

DZakh · 2025-10-28T15:30:18Z

Summary by CodeRabbit

Bug Fixes
- Clarified error messages for sources returning invalid/missing fields and adjusted retry behavior so certain invalid-data cases are no longer retried.
New Features
- Improved source selection and failover so the system can switch to a fallback source when a primary returns invalid data, deferring active-source updates until after query execution.
Tests
- Added an end-to-end test verifying fallback behavior when primary source fails with missing parameters.

coderabbitai · 2025-10-28T15:30:46Z

Walkthrough

This PR changes error handling for missing-parameter/invalid-data cases to use a new ImpossibleForTheQuery error variant instead of retry-with-backoff, updates related error payloads, and adds SourceManager source-rotation/fallback plumbing plus an E2E test that exercises fallback when the primary source returns missing params.

Changes

Cohort / File(s)	Summary
Error Type System `codegenerator/cli/npm/envio/src/sources/Source.res`	Added `ImpossibleForTheQuery({message: string})` to `getItemsRetry`; removed `FailedParsingItems` from `getItemsError`.
HyperFuel / HyperSync `codegenerator/cli/npm/envio/src/sources/HyperFuelSource.res`, `.../HyperSyncSource.res`	Replaced retry-with-backoff for `UnexpectedMissingParams` with `ImpossibleForTheQuery` and updated error messages; removed `backoffMillis` from those branches.
RPC Parsing Errors `codegenerator/cli/npm/envio/src/sources/RpcSource.res`	Replaced `FailedParsingItems` usages with `FailedGettingItems` carrying `attemptedToBlock` and a `retry` field containing `ImpossibleForTheQuery` with contextual messages (now include block/log info in one path).
Source Rotation & Fallback Logic `codegenerator/cli/npm/envio/src/sources/SourceManager.res`	`getNextSyncSource` signature now accepts `~currentSource`; `executeQuery` uses `sourceRef` + `shouldUpdateActiveSource` to defer activeSource mutation and to select/commit fallback sources on specific errors.
E2E Testing `scenarios/test_codegen/test/E2E_test.res`	New test: "Should attempt fallback source when primary source fails with missing params" — wires primary+fallback, asserts fallback invoked on `ImpossibleForTheQuery` and call counts updated.
Manifest `package.json`	Unchanged/for context.

Sequence Diagram(s)

sequenceDiagram
    participant Q as QueryExecutor
    participant SM as SourceManager
    participant S1 as PrimarySource
    participant S2 as FallbackSource

    Q->>SM: executeQuery(initialSource = S1)
    SM->>S1: GetItemsOrThrow()
    S1-->>SM: Error: ImpossibleForTheQuery("missing required fields...")

    rect #ffdede
    Note over SM: resolve fallback using currentSource
    SM->>SM: getNextSyncSource(~currentSource = S1)
    SM-->>SM: returns S2
    end

    SM->>SM: sourceRef = S2\nshouldUpdateActiveSource = true
    SM->>S2: GetItemsOrThrow()
    S2-->>SM: Success (data)

    rect #eaffea
    Note over SM: commit activeSource if flagged
    SM->>SM: if shouldUpdateActiveSource => activeSource = sourceRef
    end

    SM-->>Q: Return data from S2

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Areas to focus on:
- SourceManager.res — verify deferred activeSource mutation and getNextSyncSource(~currentSource) behavior (race/state correctness).
- RpcSource.res — ensure callers handle the new FailedGettingItems payload shape and nested retry.
- Consistency across HyperFuel/HyperSync/Rpc sources for ImpossibleForTheQuery semantics and logging/messages.
- E2E test (E2E_test.res) — validate it accurately exercises the fallback path and assertions reflect expected call counts.

Possibly related PRs

Internal E2E tests framework + 2.29 start block regression fix #744 — Adds centralized Mock.Source and test harness utilities used by the new E2E fallback test and source-fallback logic.

Suggested reviewers

JasoonS
JonoPrest

Poem

🐇 I hopped between sources, quick and spry,
When params went missing I didn't sigh.
No backoff pause, I pivoted true,
Found a fallback and fetched something new —
A little hop, resilience in the sky.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check	✅ Passed	The pull request title "Attempt fallback source when query is not retryable" is directly related to the core changes across the entire changeset. The modifications introduce a new `ImpossibleForTheQuery` error variant that marks certain queries as non-retryable, update multiple source files to use this variant for specific error conditions, and significantly alter the `SourceManager` to attempt fallback sources when encountering these non-retryable errors instead of repeatedly retrying. The title accurately captures the main behavioral shift—the system's new capability to switch to fallback sources rather than retrying impossible queries. It is concise, specific, and would allow a teammate scanning history to understand the primary architectural change being introduced.

✨ Finishing touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch dz/attempt-fallback-source-when-query-is-not-retryable

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (6)

codegenerator/cli/npm/envio/src/sources/HyperFuelSource.res (1)

261-265: Good switch to fallback-triggering retry type. Add range context to message.

The new ImpossibleForTheQuery is correct for invalid/missing fields. Recommend including the attempted block range to aid triage.

Apply this diff:

-              ImpossibleForTheQuery({
-                message: `Source returned invalid data with missing required fields: ${missingParams->Js.Array2.joinWith(
-                    ", ",
-                  )}`,
-              })
+              ImpossibleForTheQuery({
+                message: `Source returned invalid data with missing required fields: ${missingParams->Js.Array2.joinWith(", ")}. Range: ${fromBlock->Int.toString}-${toBlock->Option.getWithDefault(currentBlockHeight)->Int.toString}`,
+              })

codegenerator/cli/npm/envio/src/sources/HyperSyncSource.res (1)

291-296: Consistent fallback semantics; include block range in message.

Align message with range context for faster debugging.

Apply this diff:

-              ImpossibleForTheQuery({
-                message: `Source returned invalid data with missing required fields: ${missingParams->Js.Array2.joinWith(
-                    ", ",
-                  )}`,
-              })
+              ImpossibleForTheQuery({
+                message: `Source returned invalid data with missing required fields: ${missingParams->Js.Array2.joinWith(", ")}. Range: ${fromBlock->Int.toString}-${toBlock->Option.getWithDefault(currentBlockHeight)->Int.toString}`,
+              })

scenarios/test_codegen/test/E2E_test.res (1)

562-635: Nice E2E for fallback path; add one assertion for clarity.

Add an assertion that fallback height wasn’t queried (or was queried exactly once if that’s expected) to make the “don’t switch active source” claim explicit.

Example:

       // The fallback source should now be called immediately
       Assert.deepEqual(
         sourceMockFallback.getItemsOrThrowCalls->Array.length,
         1,
         ~message="fallback source should be called after primary fails with invalid data",
       )
+      // Ensure we didn't switch active source for height resolution
+      Assert.deepEqual(
+        sourceMockFallback.getHeightOrThrowCalls->Array.length,
+        0,
+        ~message="should not switch to fallback for height after transient invalid-data failure",
+      )

codegenerator/cli/npm/envio/src/sources/RpcSource.res (2)

675-681: Good conversion to new error shape; include source and range.

Add source name and block range to the message for easier support.

Apply this diff:

-            FailedGettingItems({
-              exn,
-              attemptedToBlock: toBlock,
-              retry: ImpossibleForTheQuery({
-                message: "Failed to parse events using hypersync client decoder. Please double-check your ABI.",
-              }),
-            }),
+            FailedGettingItems({
+              exn,
+              attemptedToBlock: toBlock,
+              retry: ImpossibleForTheQuery({
+                message: `Failed to parse events using hypersync client decoder. Source: ${name}. Range: ${fromBlock->Int.toString}-${toBlock->Int.toString}. Please double-check your ABI.`,
+              }),
+            }),

810-816: Consistent fallback trigger on viem parse errors; include source name.

Message already includes block/log; add source for parity.

Apply this diff:

-                        retry: ImpossibleForTheQuery({
-                          message: `Failed to parse event with viem, please double-check your ABI. Block number: ${blockNumber->Int.toString}, log index: ${logIndex->Int.toString}`,
-                        }),
+                        retry: ImpossibleForTheQuery({
+                          message: `Failed to parse event with viem. Source: ${name}. Block number: ${blockNumber->Int.toString}, log index: ${logIndex->Int.toString}. Please double-check your ABI.`,
+                        }),

codegenerator/cli/npm/envio/src/sources/Source.res (1)

26-26: Verification confirmed—add optional documentation for clarity.

The code correctly implements ImpossibleForTheQuery. SourceManager (line 443–449) immediately attempts getNextSyncSource with fallback enabled when this variant is encountered, resetting retry count and not updating active source. Apply the suggested doc:
+ /** Retry hint semantics:
+  * - WithSuggestedToBlock: shrink range and retry same source.
+  * - WithBackoff: transient issue; retry same source after delay.
+  * - ImpossibleForTheQuery: data contract mismatch or invalid source response; try a different source immediately.
+  */
  type getItemsRetry =

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3e3cba5 and e8892f5.

📒 Files selected for processing (6)

codegenerator/cli/npm/envio/src/sources/HyperFuelSource.res (1 hunks)
codegenerator/cli/npm/envio/src/sources/HyperSyncSource.res (1 hunks)
codegenerator/cli/npm/envio/src/sources/RpcSource.res (2 hunks)
codegenerator/cli/npm/envio/src/sources/Source.res (1 hunks)
codegenerator/cli/npm/envio/src/sources/SourceManager.res (10 hunks)
scenarios/test_codegen/test/E2E_test.res (1 hunks)

🧰 Additional context used

📓 Path-based instructions (4)

**/*.{res,resi}

📄 CodeRabbit inference engine (.cursor/rules/rescript.mdc)

**/*.{res,resi}: Never use [| item |] to create an array. Use [ item ] instead.
Must always use = for setting value to a field. Use := only for ref values created using ref function.
ReScript has record types which require a type definition before hand. You can access record fields by dot like foo.myField.
It's also possible to define an inline object, it'll have quoted fields in this case.
Use records when working with structured data, and objects to conveniently pass payload data between functions.
Never use %raw to access object fields if you know the type.

Files:

scenarios/test_codegen/test/E2E_test.res
codegenerator/cli/npm/envio/src/sources/Source.res
codegenerator/cli/npm/envio/src/sources/HyperSyncSource.res
codegenerator/cli/npm/envio/src/sources/SourceManager.res
codegenerator/cli/npm/envio/src/sources/HyperFuelSource.res
codegenerator/cli/npm/envio/src/sources/RpcSource.res

**/*.res

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

Prefer reading ReScript .res modules directly

Files:

scenarios/test_codegen/test/E2E_test.res
codegenerator/cli/npm/envio/src/sources/Source.res
codegenerator/cli/npm/envio/src/sources/HyperSyncSource.res
codegenerator/cli/npm/envio/src/sources/SourceManager.res
codegenerator/cli/npm/envio/src/sources/HyperFuelSource.res
codegenerator/cli/npm/envio/src/sources/RpcSource.res

codegenerator/cli/**

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

The Rust CLI lives in codegenerator/cli

Files:

codegenerator/cli/npm/envio/src/sources/Source.res
codegenerator/cli/npm/envio/src/sources/HyperSyncSource.res
codegenerator/cli/npm/envio/src/sources/SourceManager.res
codegenerator/cli/npm/envio/src/sources/HyperFuelSource.res
codegenerator/cli/npm/envio/src/sources/RpcSource.res

codegenerator/cli/npm/envio/**

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

Shared library-fied runtime lives in codegenerator/cli/npm/envio

Files:

codegenerator/cli/npm/envio/src/sources/Source.res
codegenerator/cli/npm/envio/src/sources/HyperSyncSource.res
codegenerator/cli/npm/envio/src/sources/SourceManager.res
codegenerator/cli/npm/envio/src/sources/HyperFuelSource.res
codegenerator/cli/npm/envio/src/sources/RpcSource.res

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: build_and_test

🔇 Additional comments (6)

codegenerator/cli/npm/envio/src/sources/SourceManager.res (6)

352-357: LGTM: Source tracking pattern improves concurrency safety.

The use of a local sourceRef with deferred commit to sourceManager.activeSource prevents concurrent queries from observing intermediate source switches during retries. This improves isolation between concurrent partition fetches.

397-434: LGTM: Permanent source removal correctly handled.

The error handling for irrecoverable errors (UnsupportedSelection, FailedGettingFieldSelection) correctly:

Computes next source before deletion to maintain rotation order

Permanently removes the failed source from the source set

Raises an error if no alternative exists

Commits the source switch with shouldUpdateActiveSource=true for permanent transitions

443-469: Clarify the temporary fallback behavior for ImpossibleForTheQuery.

The error handling sets shouldUpdateActiveSource := false at line 466, meaning the source switch is temporary and subsequent queries will revert to the original activeSource. This contrasts with other error types that commit the switch permanently.

This behavior implies that ImpossibleForTheQuery represents a query-specific failure (e.g., missing parameters for this particular query) rather than a source-level failure. However, if a query is "impossible" for a source, why would the next query succeed with the same source?

Additionally, the check at line 451 (nextSource !== initialSource) determines if another source is available. This comparison means "we've rotated back to the starting source," but consider this scenario:

initialSource = sourceA (sync)

source (current) = sourceB (sync)

nextSource = sourceA (rotated back)

The check would pass (another source exists), but we've already tried all sources

Could you clarify:

What types of failures trigger ImpossibleForTheQuery?

Why is the source switch temporary rather than permanent?

Should the availability check compare nextSource !== source instead to detect if rotation found a different source?

470-518: LGTM: Graduated retry strategy with backoff correctly implemented.

The error handling for WithBackoff implements a graduated strategy:

Tolerates first 2 failures on the same source

Attempts source rotation every 2nd retry thereafter

Includes fallback sources after 10 retries for broader resilience

Permanently commits successful switches with shouldUpdateActiveSource=true

The check at line 506 (nextSource !== source) correctly determines if rotation yielded a different source.

303-339: ****

The rotation concern raised is incorrect. At line 398, getNextSyncSource is called with ~currentSource=source before the source is deleted at line 402. The deletion happens only after getNextSyncSource returns and nextSource is assigned.

All three call sites follow this pattern:

Line 398: getNextSyncSource(~currentSource=source) called, then sources->Utils.Set.delete(source) at line 402

Line 445: getNextSyncSource(~currentSource=source) called during query failure handling

Line 486: getNextSyncSource(~currentSource=source) called during retry logic

Since currentSource is always the source variable currently executing a query, it remains in sourceManager.sources when getNextSyncSource is invoked. The condition at line 319 (if source === currentSource) will correctly set hasActive := true, and the rotation logic will properly split sources into before/after arrays.

Likely an incorrect or invalid review comment.

525-527: Clarify whether non-deterministic activeSource selection is intentional or requires synchronization.

The concern identifies a real issue, though the characterization needs refinement. JavaScript's single-threaded event loop means the writes at line 288 (waitForNewBlock) and line 526 (executeQuery) execute sequentially, not in true parallel—but the final activeSource value is indeed unpredictable based on completion order rather than optimal source selection.

The deferred update strategy with initialSource capture (line 352) appears intentional to isolate query execution, but it doesn't account for writes from waitForNewBlock (line 288), which can overwrite the deferred update. No documentation explains whether this non-deterministic behavior is acceptable or needs coordination.

…-retryable

coderabbitai

Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e8892f5 and 92539ab.

📒 Files selected for processing (1)

codegenerator/cli/npm/envio/src/sources/RpcSource.res (2 hunks)

🧰 Additional context used

📓 Path-based instructions (4)

**/*.{res,resi}

📄 CodeRabbit inference engine (.cursor/rules/rescript.mdc)

**/*.{res,resi}: Never use [| item |] to create an array. Use [ item ] instead.
Must always use = for setting value to a field. Use := only for ref values created using ref function.
ReScript has record types which require a type definition before hand. You can access record fields by dot like foo.myField.
It's also possible to define an inline object, it'll have quoted fields in this case.
Use records when working with structured data, and objects to conveniently pass payload data between functions.
Never use %raw to access object fields if you know the type.

Files:

codegenerator/cli/npm/envio/src/sources/RpcSource.res

codegenerator/cli/**

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

The Rust CLI lives in codegenerator/cli

Files:

codegenerator/cli/npm/envio/src/sources/RpcSource.res

**/*.res

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

Prefer reading ReScript .res modules directly

Files:

codegenerator/cli/npm/envio/src/sources/RpcSource.res

codegenerator/cli/npm/envio/**

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

Shared library-fied runtime lives in codegenerator/cli/npm/envio

Files:

codegenerator/cli/npm/envio/src/sources/RpcSource.res

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: build_and_test

coderabbitai · 2025-10-29T12:33:28Z

codegenerator/cli/npm/envio/src/sources/RpcSource.res

+            FailedGettingItems({
              exn,
-              blockNumber: fromBlock,
-              logIndex: 0,
+              attemptedToBlock: toBlock,
+              retry: ImpossibleForTheQuery({
+                message: "Failed to parse events using hypersync client decoder. Please double-check your ABI.",
+              }),
            }),


⚠️ Potential issue | 🟠 Major

Semantic mismatch: FailedGettingItems used for parsing failure.

This error is raised when decoder.decodeEvents fails to parse already-fetched events, but FailedGettingItems implies a fetch/retrieval failure. The data was successfully retrieved from the RPC; it's the decoding that failed. Using FailedGettingItems conflates distinct failure modes (fetch vs. parse) and could mislead operators debugging issues or analyzing error metrics.

Consider using a more semantically accurate variant name like FailedParsingItems or introducing a dedicated parsing-error variant.

coderabbitai · 2025-10-29T12:33:28Z

codegenerator/cli/npm/envio/src/sources/RpcSource.res

+                      FailedGettingItems({
                        exn,
-                        blockNumber,
-                        logIndex,
+                        attemptedToBlock: toBlock,
+                        retry: ImpossibleForTheQuery({
+                          message: `Failed to parse event with viem, please double-check your ABI. Block number: ${blockNumber->Int.toString}, log index: ${logIndex->Int.toString}`,
+                        }),
                      }),


⚠️ Potential issue | 🟠 Major

Semantic mismatch and loss of precision in error reporting.

Two concerns:

Semantic mismatch: Like line 681, this uses FailedGettingItems for a Viem parsing failure. The log was successfully fetched; parsing failed. This conflates fetch and parse failures.

Loss of precision: The error uses attemptedToBlock: toBlock (the query's upper bound from line 623), but the actual failing log has a specific blockNumber (line 784) and logIndex (line 785). While the message string includes these values, storing them in structured fields would improve queryability in logging/monitoring systems and make debugging easier.

Consider using a semantically accurate error variant and preserving block-level precision in structured fields.

🤖 Prompt for AI Agents

In codegenerator/cli/npm/envio/src/sources/RpcSource.res around lines 816 to 822, the code currently emits FailedGettingItems for a viem parsing failure and only stores attemptedToBlock (query upper bound) while embedding the real blockNumber/logIndex in a message string; change the error to a semantically accurate variant (e.g., FailedParsingEvent or another parsing-specific error used elsewhere) and include structured fields for the actual failing block and log index (e.g., actualBlockNumber and actualLogIndex or blockNumber and logIndex) in the error object instead of relying solely on attemptedToBlock, leaving attemptedToBlock only if still useful for context; keep the human-readable message but ensure the structured fields carry the precise values for observability/querying.

Attempt fallback source when query is not retriable

e8892f5

coderabbitai bot reviewed Oct 28, 2025

View reviewed changes

DZakh requested a review from JonoPrest October 28, 2025 16:51

JonoPrest approved these changes Oct 29, 2025

View reviewed changes

Merge branch 'main' into dz/attempt-fallback-source-when-query-is-not…

92539ab

…-retryable

DZakh enabled auto-merge (squash) October 29, 2025 12:30

coderabbitai bot reviewed Oct 29, 2025

View reviewed changes

DZakh merged commit d682d59 into main Oct 29, 2025
2 checks passed

DZakh deleted the dz/attempt-fallback-source-when-query-is-not-retryable branch October 29, 2025 12:38

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Attempt fallback source when query is not retryable #799

Attempt fallback source when query is not retryable #799

Uh oh!

DZakh commented Oct 28, 2025 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Oct 28, 2025 •

edited

Loading

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot Oct 29, 2025

Uh oh!

coderabbitai bot Oct 29, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Attempt fallback source when query is not retryable #799

Attempt fallback source when query is not retryable #799

Uh oh!

Conversation

DZakh commented Oct 28, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Oct 28, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Suggested reviewers

Poem

Pre-merge checks and finishing touches

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Oct 29, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Oct 29, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

DZakh commented Oct 28, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Oct 28, 2025 •

edited

Loading