-
Couldn't load subscription status.
- Fork 28
2x faster historical sync with RPC source #726
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
WalkthroughAdds a new InternalConfig.sourceSync type and migrates syncConfig references to it across templates and RPC code. Refactors RpcSource backoff to support a global “max” interval signal, updating function signatures and event construction. Adjusts HyperSync field-missing logic control flow. Updates/introduces tests to reflect new return types and add a skipped HyperSync scenario. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant Caller
participant RpcSource
participant Node as RPC Node
participant Dict as mutSuggestedBlockIntervals
participant Loader as Block Loader
Caller->>RpcSource: getNextPage(~mutSuggestedBlockIntervals, ~syncConfig: sourceSync, ~partitionId)
RpcSource->>Node: eth_getLogs(from..to)
alt Success
Node-->>RpcSource: logs[]
RpcSource->>Loader: getBlock(log.blockNumber)
Loader-->>RpcSource: block{ number, timestamp, ... }
RpcSource-->>Caller: page{ events with block cast to Internal.eventBlock }
else Error
Node-->>RpcSource: error exn
RpcSource->>RpcSource: getSuggestedBlockIntervalFromExn(exn) -> Some((range, isMax))?
alt Some((range, true)) [max suggested]
RpcSource->>Dict: set("max", range)
else Some((range, false)) [partition suggested]
RpcSource->>Dict: set(partitionId, range)
else None
RpcSource-->>Caller: propagate/handle without interval update
end
RpcSource-->>Caller: next action/result (per existing control flow)
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested reviewers
Poem
✨ Finishing Touches🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR/Issue comments)Type Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
🧹 Nitpick comments (5)
codegenerator/cli/npm/envio/src/InternalConfig.res (1)
22-30: Clarify units and bounds; consider naming consistency.
- Add brief field docs (blocks vs milliseconds; expected ranges).
- Optional: rename fallbackStallTimeout -> fallbackStallTimeoutMillis for consistency with other "*Millis" fields.
Apply doc comments:
type sourceSync = { - initialBlockInterval: int, - backoffMultiplicative: float, - accelerationAdditive: int, - intervalCeiling: int, - backoffMillis: int, - queryTimeoutMillis: int, - fallbackStallTimeout: int, + /* initial page size in blocks */ + initialBlockInterval: int, + /* multiplicative backoff factor (> 1.0) */ + backoffMultiplicative: float, + /* additive acceleration on success (blocks) */ + accelerationAdditive: int, + /* upper bound for block interval (blocks) */ + intervalCeiling: int, + /* delay between retries (ms) */ + backoffMillis: int, + /* per-query timeout (ms) */ + queryTimeoutMillis: int, + /* stall timeout when no progress observed (ms) */ + fallbackStallTimeout: int, }codegenerator/cli/npm/envio/src/sources/HyperSync.res (1)
79-95: Add a regression test for “missing transaction with empty required fields”.Ensure we still surface "transaction" in missingParams when nonOptionalTransactionFieldNames is []. This guards the Hyperliquid-specific case.
I can add a targeted test in scenarios/test_codegen/test/HyperSync_test.res to cover this path—want me to open a follow-up PR?
codegenerator/cli/npm/envio/src/sources/RpcSource.res (3)
19-26: Backoff recursion: consider a cap and jitter.Unbounded doubling can lead to long stalls or lock-step retries. Consider a ceiling (e.g., <= 60s) and small jitter to reduce thundering herds.
210-218: Timeout promise may cause unhandled rejections after race settles.If logsPromise resolves first, queryTimoutPromise can still reject later without a consumer. Attach a no-op catch to the timeout or use an abortable pattern.
Example (minimal swallow):
- let queryTimoutPromise = + let queryTimoutPromise = Time.resolvePromiseAfterDelay(~delayMilliseconds=sc.queryTimeoutMillis)->Promise.then(() => Promise.reject( QueryTimout( `Query took longer than ${Belt.Int.toString(sc.queryTimeoutMillis / 1000)} seconds`, ), ) ) + |> Promise.catch(_ => Promise.resolve())Also applies to: 237-246
4-4: Typo: QueryTimout → QueryTimeout (exception and usages).Minor naming polish for clarity.
Apply:
- exception QueryTimout(string) + exception QueryTimeout(string) - QueryTimout( + QueryTimeout(Also applies to: 214-214
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (7)
codegenerator/cli/npm/envio/src/InternalConfig.res(1 hunks)codegenerator/cli/npm/envio/src/sources/HyperSync.res(1 hunks)codegenerator/cli/npm/envio/src/sources/RpcSource.res(13 hunks)codegenerator/cli/templates/dynamic/codegen/src/ConfigYAML.res.hbs(1 hunks)codegenerator/cli/templates/static/codegen/src/Config.res(1 hunks)scenarios/test_codegen/test/HyperSync_test.res(1 hunks)scenarios/test_codegen/test/RpcSource_test.res(4 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
**/*.{res,resi}
📄 CodeRabbit inference engine (.cursor/rules/rescript.mdc)
**/*.{res,resi}: Never use[| item |]to create an array. Use[ item ]instead.
Must always use=for setting value to a field. Use:=only for ref values created usingreffunction.
ReScript has record types which require a type definition before hand. You can access record fields by dot likefoo.myField.
It's also possible to define an inline object, it'll have quoted fields in this case.
Use records when working with structured data, and objects to conveniently pass payload data between functions.
Never use %raw to access object fields if you know the type.
Files:
scenarios/test_codegen/test/HyperSync_test.rescodegenerator/cli/npm/envio/src/InternalConfig.rescodegenerator/cli/npm/envio/src/sources/HyperSync.resscenarios/test_codegen/test/RpcSource_test.rescodegenerator/cli/templates/static/codegen/src/Config.rescodegenerator/cli/npm/envio/src/sources/RpcSource.res
**/*.res
📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)
Prefer reading ReScript .res modules directly; compiled .js artifacts can be ignored
Files:
scenarios/test_codegen/test/HyperSync_test.rescodegenerator/cli/npm/envio/src/InternalConfig.rescodegenerator/cli/npm/envio/src/sources/HyperSync.resscenarios/test_codegen/test/RpcSource_test.rescodegenerator/cli/templates/static/codegen/src/Config.rescodegenerator/cli/npm/envio/src/sources/RpcSource.res
codegenerator/cli/templates/{static,dynamic}/codegen/src/**/*
📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)
Edit template versions under codegenerator/cli/templates/static/codegen/src or codegenerator/cli/templates/dynamic/codegen/src instead of editing generated/src
Files:
codegenerator/cli/templates/dynamic/codegen/src/ConfigYAML.res.hbscodegenerator/cli/templates/static/codegen/src/Config.res
🧠 Learnings (1)
📚 Learning: 2025-08-11T08:49:57.399Z
Learnt from: CR
PR: enviodev/hyperindex#0
File: scenarios/test_codegen/.cursor/rules/hyperindex.mdc:0-0
Timestamp: 2025-08-11T08:49:57.399Z
Learning: Applies to scenarios/test_codegen/**/*.ts : Always cast timestamps from events to BigInt (e.g., BigInt(event.block.timestamp))
Applied to files:
codegenerator/cli/npm/envio/src/sources/RpcSource.res
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build_and_test
🔇 Additional comments (7)
codegenerator/cli/npm/envio/src/InternalConfig.res (1)
22-30: Good addition: centralized sync config type looks solid.Seven-field record is clear and aligns with the PR goals.
codegenerator/cli/templates/dynamic/codegen/src/ConfigYAML.res.hbs (1)
7-7: No lingering Config.syncConfig usages – templates now use onlysyncConfigOptionsand referenceInternalConfig.sourceSync. ConfirmInternalConfigis included in bs-dependencies so generated projects compile.scenarios/test_codegen/test/RpcSource_test.res (1)
423-423: Tests updated for option<(int, bool)> shape — LGTM.Assertions now check Some((n, isMaxRange)) as per the new API. Good coverage of both suggested ranges and max-range cases.
Also applies to: 463-463, 528-528, 558-558
codegenerator/cli/npm/envio/src/sources/RpcSource.res (4)
597-604: Additive increase guard when a global max exists — LGTM.Skipping additive growth when the global max key is set prevents overshooting provider constraints. Nice.
371-373: Block fetch path simplification — LGTM.Fetching the block by number via the loader/backoff path is clearer and avoids stale field casts.
Also applies to: 493-501
666-678: Type assumptions on Ethers block fields — add verification/asserts.You’re casting Ethers block to Internal.eventBlock via Utils.magic and assuming number/timestamp shapes; mismatches across providers or Ethers versions will surface here. Consider a constructor that validates and maps fields explicitly (or asserts once and logs provider/version).
Would you like a typed adapter for Ethers.JsonRpcProvider.block -> Internal.eventBlock?
241-246: Persist suggested interval under max or partition key — LGTM
ConfirmedpartitionIdis consistently declared as astringin RPC source, SourceManager, and FetchState (and only numeric strings appear in tests); since it’s sourced from user config, its stability is a configuration concern and no in-code normalization is required.
| @inline | ||
| let addMissingParams = (acc, fieldNames, returnedObj, ~prefix) => { | ||
| fieldNames->Array.forEach(fieldName => { | ||
| switch returnedObj | ||
| ->(Utils.magic: 'a => Js.Dict.t<unknown>) | ||
| ->Utils.Dict.dangerouslyGetNonOption(fieldName) { | ||
| | Some(_) => () | ||
| | None => acc->Array.push(prefix ++ "." ++ fieldName)->ignore | ||
| if fieldNames->Utils.Array.notEmpty { | ||
| if !(returnedObj->Obj.magic) { | ||
| acc->Array.push(prefix)->ignore | ||
| } else { | ||
| for idx in 0 to fieldNames->Array.length - 1 { | ||
| let fieldName = fieldNames->Array.getUnsafe(idx) | ||
| switch returnedObj | ||
| ->(Utils.magic: 'a => Js.Dict.t<unknown>) | ||
| ->Utils.Dict.dangerouslyGetNonOption(fieldName) { | ||
| | Some(_) => () | ||
| | None => acc->Array.push(prefix ++ "." ++ fieldName)->ignore | ||
| } | ||
| } | ||
| } | ||
| }) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Always report missing object; avoid truthiness + magic casts.
The non-object detection is gated by fieldNames->notEmpty. If nonOptionalTransactionFieldNames is empty, a missing transaction object won’t be reported—contrary to the PR objective. Also, relying on JS truthiness via Obj.magic is brittle. Replace with explicit JSON classification and hoist the dict cast.
Apply this refactor:
-@inline
-let addMissingParams = (acc, fieldNames, returnedObj, ~prefix) => {
- if fieldNames->Utils.Array.notEmpty {
- if !(returnedObj->Obj.magic) {
- acc->Array.push(prefix)->ignore
- } else {
- for idx in 0 to fieldNames->Array.length - 1 {
- let fieldName = fieldNames->Array.getUnsafe(idx)
- switch returnedObj
- ->(Utils.magic: 'a => Js.Dict.t<unknown>)
- ->Utils.Dict.dangerouslyGetNonOption(fieldName) {
- | Some(_) => ()
- | None => acc->Array.push(prefix ++ "." ++ fieldName)->ignore
- }
- }
- }
- }
-}
+@inline
+let addMissingParams = (acc, fieldNames, returnedObj: Js.Json.t, ~prefix) => {
+ switch Js.Json.classify(returnedObj) {
+ | Js.Json.JSONObject(obj) => {
+ if fieldNames->Utils.Array.notEmpty {
+ let dict = obj
+ for idx in 0 to fieldNames->Array.length - 1 {
+ let fieldName = fieldNames->Array.getUnsafe(idx)
+ switch Js.Dict.get(dict, fieldName) {
+ | Some(_) => ()
+ | None => acc->Array.push(prefix ++ "." ++ fieldName)->ignore
+ }
+ }
+ }
+ }
+ | _ =>
+ /* Missing or non-object => record the whole prefix */
+ acc->Array.push(prefix)->ignore
+ }
+}Update call sites accordingly (outside the changed hunk):
/* in convertEvent */
missingParams->addMissingParams(Log.fieldNames, (event.log: Js.Json.t), ~prefix="log")
missingParams->addMissingParams(nonOptionalBlockFieldNames, (event.block: Js.Json.t), ~prefix="block")
missingParams->addMissingParams(nonOptionalTransactionFieldNames, (event.transaction: Js.Json.t), ~prefix="transaction")🤖 Prompt for AI Agents
In codegenerator/cli/npm/envio/src/sources/HyperSync.res around lines 79–95, the
non-object detection is currently gated by fieldNames->Utils.Array.notEmpty and
uses Obj.magic truthiness; change it to always detect non-object by classifying
returnedObj with Js.Json.classify, and if it's not an object push the prefix
once (so missing object is always reported regardless of fieldNames) and return;
if it is an object hoist a single Js.Dict.t<unknown> cast for reuse then, only
if fieldNames is non-empty, iterate fieldNames and use
Utils.Dict.dangerouslyGetNonOption against the hoisted dict to push missing
field names as prefix ++ "." ++ fieldName; also update the convertEvent call
sites as suggested to pass Js.Json.t for event.log/event.block/event.transaction
with ~prefix values.
| let extractBlockRange = (execResult, ~isMaxRange) => | ||
| switch execResult->Js.Re.captures { | ||
| | [_, Js.Nullable.Value(blockRangeLimit)] => | ||
| switch blockRangeLimit->Int.fromString { | ||
| | Some(blockRangeLimit) if blockRangeLimit > 0 => Some(blockRangeLimit) | ||
| | Some(blockRangeLimit) if blockRangeLimit > 0 => Some(blockRangeLimit, isMaxRange) | ||
| | _ => None | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: incorrect Option tuple construction (Some needs a single tuple argument).
In ReScript, option<(int, bool)> must be constructed as Some((x, y)), not Some(x, y). Current code won’t type-check and will fail at build time.
Apply:
- | Some(blockRangeLimit) if blockRangeLimit > 0 => Some(blockRangeLimit, isMaxRange)
+ | Some(blockRangeLimit) if blockRangeLimit > 0 => Some((blockRangeLimit, isMaxRange))
- | (Some(fromBlock), Some(toBlock)) if toBlock >= fromBlock =>
- Some(toBlock - fromBlock + 1, false)
+ | (Some(fromBlock), Some(toBlock)) if toBlock >= fromBlock =>
+ Some((toBlock - fromBlock + 1, false))
- | Some(_) => Some(2000, true)
+ | Some(_) => Some((2000, true))
- | Some(_) => Some(10000, true)
+ | Some(_) => Some((10000, true))Also applies to: 127-128, 153-154, 159-160
🤖 Prompt for AI Agents
In codegenerator/cli/npm/envio/src/sources/RpcSource.res around lines 110-116,
the code constructs options incorrectly using multiple arguments for Some;
change those to pass a single tuple value (e.g., Some((blockRangeLimit,
isMaxRange))) so option<(int,bool)> is built correctly; apply the same fix at
the other occurrences reported (lines 127-128, 153-154, 159-160) replacing
Some(x, y) with Some((x, y)).
| ?fallbackStallTimeout, | ||
| }: syncConfigOptions, | ||
| ): syncConfig => { | ||
| ): InternalConfig.sourceSync => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Verification agent
🧩 Analysis chain
Return type switch to InternalConfig.sourceSync looks good; verify template/call-site alignment.
This aligns the template with the new public type. Please double-check that all consumers (templates and runtime) now expect InternalConfig.sourceSync and that no references to Config.syncConfig remain.
Run to confirm no stale references:
🏁 Script executed:
#!/bin/bash
rg -nP 'Config\.syncConfig\b|:?\s*syncConfig\b'
rg -nP 'InternalConfig\.sourceSync\b'Length of output: 6790
Stale Config.syncConfig references found—update to InternalConfig.sourceSync
- scenarios/test_codegen/test/**/* and mocks still use
Config.getSyncConfig - static template at
codegenerator/cli/templates/static/codegen/src/eventFetching/NetworkSources.res - dynamic template
codegenerator/cli/templates/dynamic/codegen/src/ConfigYAML.res.hbs - HBS code in
codegenerator/cli/src/hbs_templating/codegen_templates.rs - runtime sources in
codegenerator/cli/npm/envio/src/sources/RpcSource.res
Ensure every call-site uses InternalConfig.sourceSync and remove all Config.syncConfig/getSyncConfig occurrences.
🤖 Prompt for AI Agents
In codegenerator/cli/templates/static/codegen/src/Config.res around line 29,
there are stale references to Config.syncConfig/getSyncConfig; replace all
call-sites to use InternalConfig.sourceSync instead, update the return/type
usage to match InternalConfig.sourceSync, and remove any remaining
Config.syncConfig/getSyncConfig occurrences; repeat this replacement in the
listed templates and files (scenarios/test_codegen/test/**/*, mocks,
codegenerator/cli/templates/static/codegen/src/eventFetching/NetworkSources.res,
codegenerator/cli/templates/dynamic/codegen/src/ConfigYAML.res.hbs,
codegenerator/cli/src/hbs_templating/codegen_templates.rs, and
codegenerator/cli/npm/envio/src/sources/RpcSource.res), ensuring imports/aliases
are adjusted and any type mismatches are fixed so callers compile against
InternalConfig.sourceSync.
| @@ -0,0 +1,41 @@ | |||
| open RescriptMocha | |||
|
|
|||
| let testApiToken = "3dc856dd-b0ea-494f-b27e-017b8b6b7e07" | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hardcoded API token committed — remove and rotate immediately.
Never commit secrets, even in skipped tests. Replace with an env-driven value and rotate the exposed token.
Apply:
- let testApiToken = "3dc856dd-b0ea-494f-b27e-017b8b6b7e07"
+ // Read from your CI/test runner env and document the requirement.
+ let testApiToken = "<SET_HYPERSYNC_API_TOKEN_IN_ENV>"Optionally, gate the test at runtime when the token is missing.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| let testApiToken = "3dc856dd-b0ea-494f-b27e-017b8b6b7e07" | |
| // Read from your CI/test runner env and document the requirement. | |
| let testApiToken = "<SET_HYPERSYNC_API_TOKEN_IN_ENV>" |
🤖 Prompt for AI Agents
In scenarios/test_codegen/test/HyperSync_test.res around line 3, a hardcoded API
token was committed; remove the literal string and replace it with an
environment-driven value (e.g., read from process.env or a test-specific config)
so the test reads the token at runtime, rotate the exposed token immediately,
and add a runtime guard in the test to skip/fail with a clear message if the env
var is missing to avoid accidental failures.
Summary by CodeRabbit
New Features
Bug Fixes
Configuration
Tests