Skip to content

Conversation

@DZakh
Copy link
Member

@DZakh DZakh commented Sep 22, 2025

This is only a refactoring PR, which shouldn't change the indexer behaviour.

It prevents storing Dynamic Contracts in the store until their registration event is processed.

This allows us:

  • to simplify indexer restart logic
  • don't need log index on indexer restart

Also, rewritten related test to use the new internal test framework.

Needed for new rollback on reorg solution - since it won't support writing history for blocks which are not processed.

Summary by CodeRabbit

  • New Features
    • Smarter per-chain batching of dynamic contract registrations.
    • Streamlined startup behavior by removing an extra cleanup step.
  • Performance
    • Reduced overhead in dynamic contract queue handling.
    • Added in-place array clearing to lower allocations.
  • Reliability
    • Improved rollback handling for dynamic contracts.
    • More consistent state propagation across components.
  • Refactor
    • Renamed batch state field to align with updated flow.
  • Tests
    • Added E2E rollback test for dynamic contracts.
    • Updated unit tests for new data shapes; removed obsolete restart/cleanup tests.

@DZakh DZakh requested a review from JonoPrest September 22, 2025 11:34
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Sep 22, 2025

Walkthrough

Renamed Batch.t field fetchStates to updatedFetchStates. Converted FetchState.dcsToStore from option to array and updated associated logic. ChainManager reworked batch formation to compute per-chain batchDcs and leftover dcs, propagating updatedFetchStates. Removed DynamicContractRegistry cleanup query and its usages/tests. Added Array.clearInPlace. Updated tests accordingly and added a new rollback test.

Changes

Cohort / File(s) Summary
Batch struct rename
codegenerator/cli/npm/envio/src/Batch.res
Renamed public field fetchStates to updatedFetchStates in type t.
FetchState dcsToStore type change
codegenerator/cli/npm/envio/src/FetchState.res
Changed dcsToStore from option<array<indexingContract>> to array<indexingContract>; updated init, merge, registerDynamicContracts, and rollback paths accordingly.
ChainManager batching flow
codegenerator/cli/templates/static/codegen/src/eventFetching/ChainManager.res
Reworked batching: computes per-chain batchDcs vs leftDcsToStore, branches on sizePerChain, updates and emits updatedFetchStates. Replaced downstream fetchStates usages.
Global state wiring
codegenerator/cli/templates/static/codegen/src/globalState/GlobalState.res
Switched references from batch.fetchStates to batch.updatedFetchStates in processing and dispatch paths.
Storage cleanup removal
codegenerator/cli/npm/envio/src/PgStorage.res, codegenerator/cli/npm/envio/src/db/InternalTable.res
Removed DynamicContractRegistry.makeCleanUpOnRestartQuery and its invocation from resumeInitialState.
Utils addition
codegenerator/cli/npm/envio/src/Utils.res
Added Array.clearInPlace: array<'a> => unit (JS arr.length = 0).
Test updates for API changes
scenarios/test_codegen/test/ChainManager_test.res, scenarios/test_codegen/test/lib_tests/FetchState_test.res, scenarios/test_codegen/test/lib_tests/SourceManager_test.res
Updated tests to use updatedFetchStates and treat dcsToStore as [] instead of None.
Removed tests for cleanup query
scenarios/test_codegen/test/lib_tests/PgStorage_test.res
Deleted tests covering makeCleanUpOnRestartQuery.
Deleted complex recovery test
scenarios/erc20_multichain_factory/test/DynamicContractRecovery_test.res
Removed entire multi-chain dynamic contract restart recovery test suite.
New rollback test
scenarios/test_codegen/test/rollback/Rollback_test.res
Added E2E test covering dynamic contract registration, rollback, re-query, and registry updates.
Test helpers change
scenarios/test_codegen/test/helpers/Mock.res
Made itemMock.spec.handler optional; added optional contractRegister; updated loader and mapping logic.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor App
  participant ChainManager
  participant FetchStates as ChainMap<FetchState>
  participant DC as DynamicContracts
  participant Batch

  App->>ChainManager: createBatch(sizePerChain)
  ChainManager->>FetchStates: iterate chains
  loop per chain
    ChainManager->>DC: evaluate dcs vs nextProgress pointers
    alt include in current batch
      ChainManager->>FetchStates: set dcsToStore := batchDcs
    else defer
      ChainManager->>FetchStates: set dcsToStore := leftDcsToStore
    end
  end
  ChainManager->>Batch: emit { updatedFetchStates, items, ... }
Loading
sequenceDiagram
  autonumber
  actor Runtime
  participant GlobalState
  participant ChainManager
  participant UpdateQueues

  Runtime->>ChainManager: getNextBatch()
  ChainManager-->>Runtime: Batch { updatedFetchStates, items, ... }
  Runtime->>GlobalState: ProcessEventBatch(Batch)
  GlobalState->>GlobalState: updatedFetchStates := Batch.updatedFetchStates
  GlobalState->>UpdateQueues: dispatch(UpdateQueues, updatedFetchStates)
  Note over GlobalState,UpdateQueues: Downstream consumers now read updatedFetchStates
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested reviewers

  • JonoPrest
  • JasoonS

Poem

I nibbled the fields, renamed with care,
Packed dcs in baskets—batch here, save there.
Cleared arrays quick with a whisk of my ear,
Hopped past old cleanup, nothing to fear.
Rollbacks resolved, state freshly dressed—
Thump-thump! This bunny loves a well-formed request. 🐇✨

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The title "Persist only processed DCs" succinctly and accurately reflects the PR's primary intent to avoid storing dynamic contracts until their registration events have been processed, which matches the PR objectives and the diffs. It is concise, specific to the change, and free of extraneous detail. A teammate scanning history will understand the main behavioral change from the title alone.
Docstring Coverage ✅ Passed No functions found in the changes. Docstring coverage check skipped.
✨ Finishing touches
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch dz/prevent-dc-overstore

Tip

👮 Agentic pre-merge checks are now available in preview!

Pro plan users can now enable pre-merge checks in their settings to enforce checklists before merging PRs.

  • Built-in checks – Quickly apply ready-made checks to enforce title conventions, require pull request descriptions that follow templates, validate linked issues for compliance, and more.
  • Custom agentic checks – Define your own rules using CodeRabbit’s advanced agentic capabilities to enforce organization-specific policies and workflows. For example, you can instruct CodeRabbit’s agent to verify that API documentation is updated whenever API schema files are modified in a PR. Note: Upto 5 custom checks are currently allowed during the preview period. Pricing for this feature will be announced in a few weeks.

Please see the documentation for more information.

Example:

reviews:
  pre_merge_checks:
    custom_checks:
      - name: "Undocumented Breaking Changes"
        mode: "warning"
        instructions: |
          Pass/fail criteria: All breaking changes to public APIs, CLI flags, environment variables, configuration keys, database schemas, or HTTP/GraphQL endpoints must be documented in the "Breaking Change" section of the PR description and in CHANGELOG.md. Exclude purely internal or private changes (e.g., code not exported from package entry points or explicitly marked as internal).

Please share your feedback with us on this Discord post.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (9)
codegenerator/cli/npm/envio/src/FetchState.res (4)

1244-1247: Good: rollback filters pending DCs by removed addresses. Consider updating metrics.

We drop DCs from both indexingContracts and dcsToStore; update Prometheus.IndexingAddresses for accuracy after rollback.


1289-1303: Avoid unsafe_get when maxBatchSize is 0.

Current comment admits the hazard; add a fast path to prevent undefined access.

Apply:

   let hasFullBatch = ({queue} as fetchState: t, ~maxBatchSize) => {
+    if maxBatchSize <= 0 {
+      false
+    } else {
       // Queue is ordered from latest to earliest, so the earliest eligible
       // item for a full batch of size B is at index (length - B).
       // Do NOT subtract an extra 1 here; when length === B we should still
       // classify the queue as full and probe index 0.
       let targetBlockIdx = queue->Array.length - maxBatchSize
       if targetBlockIdx < 0 {
         false
       } else {
-        // Unsafe can fail when maxBatchSize is 0,
-        // but we ignore the case
-        queue->Js.Array2.unsafe_get(targetBlockIdx)->Internal.getItemBlockNumber <=
-          fetchState->bufferBlockNumber
+        queue->Js.Array2.unsafe_get(targetBlockIdx)
+        ->Internal.getItemBlockNumber <= fetchState->bufferBlockNumber
       }
     }
   }

86-105: Optional: copy dcsToStore in copy(fetchState) for symmetry.

Queue is copied; dcsToStore shares the underlying array. Copying avoids accidental external mutation.

-    dcsToStore: fetchState.dcsToStore,
+    dcsToStore: fetchState.dcsToStore->Array.copy,

590-601: Guard against duplicate DCs in dcsToStore — merge by address and keep earliest startBlock

Concatenating existing dcsToStore with the new batch can reintroduce duplicates across repeated registerDynamicContracts calls; merge entries by address and choose the earliest startBlock so writes remain idempotent.

Apply:

-        ~dcsToStore=switch fetchState.dcsToStore {
-        | [] => dcsToStore
-        | existingDcs => Array.concat(existingDcs, dcsToStore)
-        },
+        ~dcsToStore={
+          let byAddr = Js.Dict.empty()
+          /* seed with existing buffered DCs */
+          fetchState.dcsToStore->Array.forEach(dc =>
+            byAddr->Js.Dict.set(dc.address->Address.toString, dc)
+          )
+          /* merge new batch, keeping the entry with the smallest startBlock */
+          dcsToStore->Array.forEach(dc => {
+            let key = dc.address->Address.toString
+            switch byAddr->Js.Dict.get(key) {
+            | None => byAddr->Js.Dict.set(key, dc)
+            | Some(existing) =>
+              if dc.startBlock < existing.startBlock {
+                byAddr->Js.Dict.set(key, dc)
+              } else {
+                ()
+              }
+            }
+          })
+          byAddr->Js.Dict.values
+        },
scenarios/test_codegen/test/helpers/Mock.res (1)

505-528: Simplify optional handler wiring.

Use Option.map to reduce branching; behavior unchanged.

-                            handler: switch item.handler {
-                            | Some(handler) =>
-                              (
-                                ({context} as args) => {
-                                  // We don't want preload optimization for the tests
-                                  if context.isPreload {
-                                    Promise.resolve()
-                                  } else {
-                                    handler(args)
-                                  }
-                                }
-                              )->(
-                                Utils.magic: Types.HandlerTypes.loader<unit, unit> => option<
-                                  Internal.handler,
-                                >
-                              )
-
-                            | None => None
-                            },
+                            handler:
+                              item.handler->Option.map(handler =>
+                                (
+                                  ({context} as args) =>
+                                    context.isPreload ? Promise.resolve() : handler(args)
+                                )->(
+                                  Utils.magic: Types.HandlerTypes.loader<unit, unit> =>
+                                  Internal.handler
+                                )
+                              ),
codegenerator/cli/templates/static/codegen/src/globalState/GlobalState.res (1)

955-958: Avoid scheduling no-op DC writes when all per-chain arrays are empty.

Utils.Dict.size counts keys, not values. If ChainManager inserts keys with empty arrays, this branch still fires unnecessarily.

Apply this guard:

-        if dcsToStoreByChainId->Utils.Dict.size > 0 {
-          let shouldSaveHistory = state.config->Config.shouldSaveHistory(~isInReorgThreshold)
-          inMemoryStore->InMemoryStore.setDcsToStore(dcsToStoreByChainId, ~shouldSaveHistory)
-        }
+        let hasAnyDcs =
+          dcsToStoreByChainId
+          ->Js.Dict.entries
+          ->Array.some(((_, dcs)) => dcs->Array.length > 0)
+        if hasAnyDcs {
+          let shouldSaveHistory = state.config->Config.shouldSaveHistory(~isInReorgThreshold)
+          inMemoryStore->InMemoryStore.setDcsToStore(dcsToStoreByChainId, ~shouldSaveHistory)
+        }
codegenerator/cli/templates/static/codegen/src/eventFetching/ChainManager.res (1)

130-173: Persist only processed DCs: logic looks solid; skip writing empty per-chain DC lists.

The progress-based gating for DC persistence aligns with “processed only.” One nit: avoid adding entries with empty arrays to dcsToStoreByChainId to prevent downstream no-ops.

Apply:

-          dcsToStoreByChainId->Js.Dict.set(fetchState.chainId->Int.toString, batchDcs)
+          if batchDcs->Array.length > 0 {
+            dcsToStoreByChainId->Js.Dict.set(fetchState.chainId->Int.toString, batchDcs)
+          }

Also consider replacing dangerouslyGetNonOption with a safe getter to reduce footguns (non-blocking).

scenarios/test_codegen/test/lib_tests/FetchState_test.res (2)

708-709: Make dcsToStore order-insensitive in assertions (optional).

Several asserts depend on array order that isn’t guaranteed by design. To reduce flakiness, compare on a sorted projection (e.g., by (address, startBlock)).

Example helper:

let sortDcs = dcs =>
  dcs->Array.sortBy(dc => (Address.toString(dc.address), dc.startBlock))

Then wrap both sides with sortDcs in the affected assertions.

Also applies to: 758-759, 1226-1228


2119-2119: Remove stray log from test output.

-    Js.log(updatedFetchState)
+    // (removed noisy log)
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8be862b and bc09209.

📒 Files selected for processing (14)
  • codegenerator/cli/npm/envio/src/Batch.res (1 hunks)
  • codegenerator/cli/npm/envio/src/FetchState.res (4 hunks)
  • codegenerator/cli/npm/envio/src/PgStorage.res (0 hunks)
  • codegenerator/cli/npm/envio/src/Utils.res (1 hunks)
  • codegenerator/cli/npm/envio/src/db/InternalTable.res (0 hunks)
  • codegenerator/cli/templates/static/codegen/src/eventFetching/ChainManager.res (3 hunks)
  • codegenerator/cli/templates/static/codegen/src/globalState/GlobalState.res (2 hunks)
  • scenarios/erc20_multichain_factory/test/DynamicContractRecovery_test.res (0 hunks)
  • scenarios/test_codegen/test/ChainManager_test.res (3 hunks)
  • scenarios/test_codegen/test/helpers/Mock.res (2 hunks)
  • scenarios/test_codegen/test/lib_tests/FetchState_test.res (14 hunks)
  • scenarios/test_codegen/test/lib_tests/PgStorage_test.res (0 hunks)
  • scenarios/test_codegen/test/lib_tests/SourceManager_test.res (1 hunks)
  • scenarios/test_codegen/test/rollback/Rollback_test.res (1 hunks)
💤 Files with no reviewable changes (4)
  • scenarios/test_codegen/test/lib_tests/PgStorage_test.res
  • codegenerator/cli/npm/envio/src/PgStorage.res
  • scenarios/erc20_multichain_factory/test/DynamicContractRecovery_test.res
  • codegenerator/cli/npm/envio/src/db/InternalTable.res
🧰 Additional context used
📓 Path-based instructions (3)
**/*.{res,resi}

📄 CodeRabbit inference engine (.cursor/rules/rescript.mdc)

**/*.{res,resi}: Never use [| item |] to create an array. Use [ item ] instead.
Must always use = for setting value to a field. Use := only for ref values created using ref function.
ReScript has record types which require a type definition before hand. You can access record fields by dot like foo.myField.
It's also possible to define an inline object, it'll have quoted fields in this case.
Use records when working with structured data, and objects to conveniently pass payload data between functions.
Never use %raw to access object fields if you know the type.

Files:

  • codegenerator/cli/npm/envio/src/Utils.res
  • codegenerator/cli/templates/static/codegen/src/eventFetching/ChainManager.res
  • scenarios/test_codegen/test/helpers/Mock.res
  • scenarios/test_codegen/test/lib_tests/SourceManager_test.res
  • scenarios/test_codegen/test/rollback/Rollback_test.res
  • codegenerator/cli/npm/envio/src/Batch.res
  • scenarios/test_codegen/test/ChainManager_test.res
  • codegenerator/cli/npm/envio/src/FetchState.res
  • scenarios/test_codegen/test/lib_tests/FetchState_test.res
  • codegenerator/cli/templates/static/codegen/src/globalState/GlobalState.res
**/*.res

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

Prefer reading ReScript .res modules directly; compiled .js artifacts can be ignored

Files:

  • codegenerator/cli/npm/envio/src/Utils.res
  • codegenerator/cli/templates/static/codegen/src/eventFetching/ChainManager.res
  • scenarios/test_codegen/test/helpers/Mock.res
  • scenarios/test_codegen/test/lib_tests/SourceManager_test.res
  • scenarios/test_codegen/test/rollback/Rollback_test.res
  • codegenerator/cli/npm/envio/src/Batch.res
  • scenarios/test_codegen/test/ChainManager_test.res
  • codegenerator/cli/npm/envio/src/FetchState.res
  • scenarios/test_codegen/test/lib_tests/FetchState_test.res
  • codegenerator/cli/templates/static/codegen/src/globalState/GlobalState.res
codegenerator/cli/templates/{static,dynamic}/codegen/src/**/*

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

Edit template versions under codegenerator/cli/templates/static/codegen/src or codegenerator/cli/templates/dynamic/codegen/src instead of editing generated/src

Files:

  • codegenerator/cli/templates/static/codegen/src/eventFetching/ChainManager.res
  • codegenerator/cli/templates/static/codegen/src/globalState/GlobalState.res
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build_and_test
🔇 Additional comments (13)
codegenerator/cli/npm/envio/src/FetchState.res (2)

63-66: dcsToStore now a concrete array — LGTM.

The non-optional shape simplifies control flow and aligns with persistence semantics.


1131-1134: Initialize dcsToStore to [] — LGTM.

Consistent with the new non-nullable contract.

scenarios/test_codegen/test/helpers/Mock.res (1)

357-362: itemMock optional fields — LGTM.

Optional handler/contractRegister make the helper more flexible.

scenarios/test_codegen/test/lib_tests/SourceManager_test.res (1)

193-199: Mocks updated to dcsToStore: [] — LGTM.

Matches the new non-optional state shape.

scenarios/test_codegen/test/ChainManager_test.res (3)

162-166: Switched to updatedFetchStates in pattern match — LGTM.

Keeps tests aligned with Batch.t changes.


181-187: Use updatedFetchStates when propagating fetchState — LGTM.

Correctly maps chain → updated fetchState.


277-283: Mocks updated to dcsToStore: [] — LGTM.

Consistent with FetchState.t.

codegenerator/cli/npm/envio/src/Batch.res (1)

12-14: Rename acknowledged — no downstream destructuring found

Repo search shows no object destructuring of {items, fetchStates}; destructuring sites use updatedFetchStates (e.g., GlobalState.res, ChainManager_test.res). Remaining fetchStates references are local params/vars (Batch.res, ChainManager template) and need no change.

codegenerator/cli/templates/static/codegen/src/eventFetching/ChainManager.res (1)

179-181: updatedFetchStates propagation and progressedChains computation look correct.

Using updatedFetchStates for buffers/metrics and returning it in Batch.t matches the new API.

Also applies to: 210-216, 239-244

scenarios/test_codegen/test/lib_tests/FetchState_test.res (2)

160-168: Tests updated for non-optional dcsToStore: LGTM.

Expectations now consistently use [] instead of None, matching the new FetchState shape.

Also applies to: 236-241, 303-306, 401-404, 878-880, 918-920, 1224-1228, 1655-1670, 1698-1704, 1778-1784, 1868-1872


126-129: v11 undefined workaround acknowledged.

Keep until v12 upgrade; add a TODO to remove afterwards.

Also applies to: 415-418

scenarios/test_codegen/test/rollback/Rollback_test.res (1)

777-1028: New “Rollback Dynamic Contract” E2E test validates “persist only processed DCs.”

Good coverage: pre-persist gating, rollback depth discovery, and post-rollback DB state. Assertions on getItemsOrThrowCalls sequence are especially helpful.

codegenerator/cli/templates/static/codegen/src/globalState/GlobalState.res (1)

905-906: Switch to batch.updatedFetchStates is correct; UpdateQueues wiring looks consistent.

Repository-wide search found no remaining references to "Batch.fetchStates". Batch.t declares updatedFetchStates (codegenerator/cli/npm/envio/src/Batch.res) and GlobalState uses batch.updatedFetchStates (codegenerator/cli/templates/static/codegen/src/globalState/GlobalState.res — lines ~905 and ~938–942); ChainManager still uses a local fetchStates variable (expected).

Comment on lines +191 to +194
let clearInPlace: array<'a> => unit = %raw(`(arr) => {
arr.length = 0
}`)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting. Is this the best api to achieve clearing?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. One of the best

Comment on lines +130 to +133
// Needed to:
// - Recalculate the computed queue sizes
// - Accumulate registered dynamic contracts to store in the db
// - Trigger onBlock pointer update
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about leaving this as a doc comment?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nevermind I thought it was a function

Comment on lines +360 to +361
handler?: Types.HandlerTypes.loader<unit, unit>,
contractRegister?: Types.HandlerTypes.contractRegister<unit>,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this now optional?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is for tests, we can have an event either with handler, or contract register, or both, or without.

Copy link
Collaborator

@JonoPrest JonoPrest left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, looks good 👍🏼

@DZakh DZakh merged commit 718a8f6 into main Sep 23, 2025
2 checks passed
@DZakh DZakh deleted the dz/prevent-dc-overstore branch September 23, 2025 17:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants