Skip to content

Conversation

@DZakh
Copy link
Member

@DZakh DZakh commented Oct 27, 2025

Summary by CodeRabbit

  • New Features

    • Added per-chain latency metrics to monitor progress.
    • Expanded effect-call metrics: active calls, per-call timings, and cumulative timing.
  • Security

    • Console endpoints now require Bearer token authorization; Authorization header supported.
  • Improvements

    • Prometheus data saving enabled unconditionally.
    • Refined storage-load and observability metrics for clearer operational insights.
  • Tests

    • E2E adjusted to expect metric reset behavior after restart.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Oct 27, 2025

Walkthrough

Adds Batch.last-event lookup and per-chain progress latency emission; refactors effect call tracking to use active/total counters with per-call timing and prevCallStartTimerRef; updates Prometheus metrics and LoadLayer instrumentation; introduces console Bearer token authorization for console endpoints; makes Prometheus saving unconditional.

Changes

Cohort / File(s) Summary
Batch utilities
codegenerator/cli/npm/envio/src/Batch.res
Added findLastEventItem(batch: t, ~chainId) that scans batch.items from the end and returns the last Internal.Event matching chainId or None.
Effect type & init
codegenerator/cli/npm/envio/src/Internal.res, codegenerator/cli/npm/envio/src/Envio.res
Renamed mutable callsCountactiveCallsCount on effect; added mutable prevCallStartTimerRef: Hrtime.timeRef; updated experimental_createEffect to initialize activeCallsCount and prevCallStartTimerRef (field rename in returned record).
Prometheus metrics
codegenerator/cli/npm/envio/src/Prometheus.res
Added ProgressLatency gauge (envio_progress_latency) with chainId label; refactored EffectCalls into timeCounter, sumTimeCounter, totalCallsCount, activeCallsCount; replaced StorageLoad.totalTimeCounter with sumTimeCounter and adjusted accumulation.
Global progress & latency
codegenerator/cli/templates/static/codegen/src/globalState/GlobalState.res
When updating progressed chains, compute per-chain latency by finding last event via findLastEventItem, derive block timestamp, compute latencyMs, and set Prometheus.ProgressLatency for that chain if present.
Effect lifecycle / LoadLayer
codegenerator/cli/templates/static/codegen/src/LoadLayer.res
Use effectName for keys/labels; track activeCallsCount with gauge increments/decrements; initialize/use prevCallStartTimerRef for cumulative timing; record per-call timeCounter, increment sumTimeCounter and totalCallsCount in finally; adjust caching behavior and error traces.
Server & config
codegenerator/cli/templates/static/codegen/src/Index.res, codegenerator/cli/templates/static/codegen/src/Env.res
Replaced shouldUseTui param with consoleBearerToken: option<string> in startServer; added Authorization header to allowed headers and Bearer-token authorization for /console/state and /console/syncCache; made Prometheus saving unconditional in SaveDataStrategy.
Tests
scenarios/test_codegen/test/E2E_test.res
Adjusted expectations after restart: check enviro_effect_calls_count reset and expect in-memory resume value "1" instead of "2".

Sequence Diagram(s)

sequenceDiagram
    participant LL as LoadLayer
    participant H as Effect.Handler
    participant M as Prometheus (EffectCalls)
    participant C as Cache
    participant T as TimerRef

    rect rgb(245,250,255)
    Note over LL,M: Effect call lifecycle (new flow)
    LL->>LL: increment effect.activeCallsCount
    LL->>M: set activeCallsCount gauge
    LL->>T: maybe init prevCallStartTimerRef
    LL->>H: invoke handler
    H-->>LL: return output / throw
    LL->>C: write in-memory cache (if success)
    LL->>LL: finally block
    LL->>LL: decrement activeCallsCount
    LL->>M: update activeCallsCount gauge
    LL->>M: observe timeCounter (per-call)
    LL->>M: increment sumTimeCounter (cumulative)
    LL->>M: increment totalCallsCount
    LL->>T: reset/update prevCallStartTimerRef
    end
Loading
sequenceDiagram
    participant GS as GlobalState.updateProgressedChains
    participant B as Batch
    participant F as Batch.findLastEventItem
    participant P as Prometheus.ProgressLatency

    Note over GS,P: Per-chain latency computation
    GS->>B: receive batch for chainId
    GS->>F: findLastEventItem(batch, chainId)
    alt found
        F-->>GS: Some(eventItem)
        GS->>GS: extract block timestamp, compute latencyMs
        GS->>P: set(~latencyMs, ~chainId)
    else not found
        F-->>GS: None
        GS->>GS: skip latency metric
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • Areas needing extra attention:
    • LoadLayer.res: correctness of timer init/reset and finally-block ordering.
    • Prometheus.res: metric names, labels, and counter/gauge semantics.
    • Internal/Envio: effect record rename/initialization and all call sites.
    • Index.res: bearer-token authorization logic and request header handling.

Possibly related PRs

Suggested reviewers

  • JonoPrest
  • JasoonS

Poem

🐰 I hopped through batches, timers in paw,

Counted active calls with a careful law,
Latency found where the last event lay,
Bearer-token guards the console's way,
A rabbit cheers for metrics saved today 🥕

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The title "Effect API metrics + latency metrics + Dev Console with disabled TUI" directly and specifically relates to the changeset. It identifies three core features being introduced: (1) updated Effect API metrics via Prometheus changes across multiple files (Prometheus.res, LoadLayer.res, Envio.res, Internal.res), (2) latency metrics tracking added through new ProgressLatency module and GlobalState.res integration, and (3) Dev Console authorization with token-based access replacing TUI-based control (Index.res changes). The title is clear, specific, and captures the main scope of this multi-feature pull request without being vague or overly broad.
✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch dz/effect-official

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 388d5ee and 461cf9a.

📒 Files selected for processing (1)
  • scenarios/test_codegen/test/E2E_test.res (2 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.{res,resi}

📄 CodeRabbit inference engine (.cursor/rules/rescript.mdc)

**/*.{res,resi}: Never use [| item |] to create an array. Use [ item ] instead.
Must always use = for setting value to a field. Use := only for ref values created using ref function.
ReScript has record types which require a type definition before hand. You can access record fields by dot like foo.myField.
It's also possible to define an inline object, it'll have quoted fields in this case.
Use records when working with structured data, and objects to conveniently pass payload data between functions.
Never use %raw to access object fields if you know the type.

Files:

  • scenarios/test_codegen/test/E2E_test.res
**/*.res

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

Prefer reading ReScript .res modules directly

Files:

  • scenarios/test_codegen/test/E2E_test.res
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build_and_test
🔇 Additional comments (2)
scenarios/test_codegen/test/E2E_test.res (2)

419-423: LGTM! Correct validation of metric reset on restart.

The assertion correctly verifies that the envio_effect_calls_count metric is reset to an empty list after restarting the indexer, which aligns with the refactored metric system that tracks active calls rather than cumulative totals.


494-494: LGTM! Correct expected value after metric refactoring.

The change from "2" to "1" correctly reflects the refactored metric behavior. After restart, the envio_effect_calls_count metric no longer accumulates lifetime totals but tracks calls per session. In the second batch, only one new effect call is executed ("test-2" is a cache miss), hence the expected value is "1".


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

🧹 Nitpick comments (5)
codegenerator/cli/npm/envio/src/Batch.res (1)

532-550: Logic LGTM; minor readability and API polish suggested

Reverse scan is correct and bounds-safe. Consider small refinements:

  • Use a different name than eventItem for the cast result to avoid shadowing.
  • Add an explicit return type and a brief doc comment.

Suggested tweak:

-let findLastEventItem = (batch: t, ~chainId) => {
+/** Returns the last event item for a chainId in the batch, if any. */
+let findLastEventItem = (batch: t, ~chainId): option<Internal.eventItem> => {
   let idx = ref(batch.items->Array.length - 1)
   let result = ref(None)
   while idx.contents >= 0 && result.contents === None {
     let item = batch.items->Array.getUnsafe(idx.contents)
     switch item {
-    | Internal.Event(_) as eventItem => {
-        let eventItem = eventItem->Internal.castUnsafeEventItem
-        if eventItem.chain->ChainMap.Chain.toChainId === chainId {
-          result := Some(eventItem)
+    | Internal.Event(_) as evItem => {
+        let ev = evItem->Internal.castUnsafeEventItem
+        if ev.chain->ChainMap.Chain.toChainId === chainId {
+          result := Some(ev)
         } else {
           idx := idx.contents - 1
         }
       }
     | Internal.Block(_) => idx := idx.contents - 1
     }
   }
   result.contents
 }
codegenerator/cli/templates/static/codegen/src/Index.res (3)

90-99: Harden Bearer parsing.

Current check requires exact match. Consider trimming and case-insensitive “Bearer ” prefix to be resilient to clients.

-      switch req.headers->Js.Dict.get("authorization") {
-      | Some(authorization) if authorization === `Bearer ${token}` => true
+      switch req.headers->Js.Dict.get("authorization")->Belt.Option.map(s => s->Js.String.trim) {
+      | Some(authorization)
+          if authorization->Js.String.toLowerCase->Js.String.startsWith("bearer ") &&
+             authorization->Js.String.sliceToEnd(7) === token => true
       | _ => false
       }

111-120: /console/syncCache auth gate looks good.

Auth-guarded dumpEffectCache with boolean response is fine. Consider 401/403 instead of false for clarity, similar to state endpoint.


255-316: Dev token sourcing; prefer explicit env var.

Deriving consoleBearerToken from Env.Db.password === "testing" is brittle. Consider a dedicated env var (e.g., ENVIO_CONSOLE_TOKEN) with clear dev-only default.

-      ~consoleBearerToken={
-        if Env.Db.password === "testing" {
-          Some("testing")
-        } else {
-          None
-        }
-      },
+      ~consoleBearerToken={
+        switch NodeJs.process->NodeJs.getEnv("ENVIO_CONSOLE_TOKEN") {
+        | Some(token) when token !== "" => Some(token)
+        | _ => None
+        }
+      },
codegenerator/cli/npm/envio/src/Prometheus.res (1)

599-623: EffectCalls metric suite: LGTM; minor doc tweak suggested.

timeCounter/sumTimeCounter/totalCallsCount/activeCallsCount cover needed facets. If totalCallsCount includes failures, update help to “completed (fulfilled or rejected)” to match behavior.

-    ~help="Cumulative number of resolved Effect function calls during the indexing process.",
+    ~help="Cumulative number of completed (fulfilled or rejected) Effect function calls during the indexing process.",
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3e3cba5 and b502a51.

📒 Files selected for processing (8)
  • codegenerator/cli/npm/envio/src/Batch.res (1 hunks)
  • codegenerator/cli/npm/envio/src/Envio.res (1 hunks)
  • codegenerator/cli/npm/envio/src/Internal.res (1 hunks)
  • codegenerator/cli/npm/envio/src/Prometheus.res (3 hunks)
  • codegenerator/cli/templates/static/codegen/src/Env.res (1 hunks)
  • codegenerator/cli/templates/static/codegen/src/Index.res (4 hunks)
  • codegenerator/cli/templates/static/codegen/src/LoadLayer.res (4 hunks)
  • codegenerator/cli/templates/static/codegen/src/globalState/GlobalState.res (1 hunks)
🧰 Additional context used
📓 Path-based instructions (6)
**/*.{res,resi}

📄 CodeRabbit inference engine (.cursor/rules/rescript.mdc)

**/*.{res,resi}: Never use [| item |] to create an array. Use [ item ] instead.
Must always use = for setting value to a field. Use := only for ref values created using ref function.
ReScript has record types which require a type definition before hand. You can access record fields by dot like foo.myField.
It's also possible to define an inline object, it'll have quoted fields in this case.
Use records when working with structured data, and objects to conveniently pass payload data between functions.
Never use %raw to access object fields if you know the type.

Files:

  • codegenerator/cli/templates/static/codegen/src/globalState/GlobalState.res
  • codegenerator/cli/npm/envio/src/Internal.res
  • codegenerator/cli/templates/static/codegen/src/Env.res
  • codegenerator/cli/npm/envio/src/Batch.res
  • codegenerator/cli/templates/static/codegen/src/LoadLayer.res
  • codegenerator/cli/templates/static/codegen/src/Index.res
  • codegenerator/cli/npm/envio/src/Envio.res
  • codegenerator/cli/npm/envio/src/Prometheus.res
codegenerator/cli/**

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

The Rust CLI lives in codegenerator/cli

Files:

  • codegenerator/cli/templates/static/codegen/src/globalState/GlobalState.res
  • codegenerator/cli/npm/envio/src/Internal.res
  • codegenerator/cli/templates/static/codegen/src/Env.res
  • codegenerator/cli/npm/envio/src/Batch.res
  • codegenerator/cli/templates/static/codegen/src/LoadLayer.res
  • codegenerator/cli/templates/static/codegen/src/Index.res
  • codegenerator/cli/npm/envio/src/Envio.res
  • codegenerator/cli/npm/envio/src/Prometheus.res
codegenerator/cli/templates/static/**/*.res

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

Static templates are raw ReScript files copied verbatim under codegenerator/cli/templates/static

Files:

  • codegenerator/cli/templates/static/codegen/src/globalState/GlobalState.res
  • codegenerator/cli/templates/static/codegen/src/Env.res
  • codegenerator/cli/templates/static/codegen/src/LoadLayer.res
  • codegenerator/cli/templates/static/codegen/src/Index.res
{codegenerator/cli/templates/static/codegen/src/**,codegenerator/cli/templates/dynamic/codegen/src/**}

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

Template versions of runtime live under templates/{static,dynamic}/codegen/src and are the recommended editing targets

Files:

  • codegenerator/cli/templates/static/codegen/src/globalState/GlobalState.res
  • codegenerator/cli/templates/static/codegen/src/Env.res
  • codegenerator/cli/templates/static/codegen/src/LoadLayer.res
  • codegenerator/cli/templates/static/codegen/src/Index.res
**/*.res

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

Prefer reading ReScript .res modules directly

Files:

  • codegenerator/cli/templates/static/codegen/src/globalState/GlobalState.res
  • codegenerator/cli/npm/envio/src/Internal.res
  • codegenerator/cli/templates/static/codegen/src/Env.res
  • codegenerator/cli/npm/envio/src/Batch.res
  • codegenerator/cli/templates/static/codegen/src/LoadLayer.res
  • codegenerator/cli/templates/static/codegen/src/Index.res
  • codegenerator/cli/npm/envio/src/Envio.res
  • codegenerator/cli/npm/envio/src/Prometheus.res
codegenerator/cli/npm/envio/**

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

Shared library-fied runtime lives in codegenerator/cli/npm/envio

Files:

  • codegenerator/cli/npm/envio/src/Internal.res
  • codegenerator/cli/npm/envio/src/Batch.res
  • codegenerator/cli/npm/envio/src/Envio.res
  • codegenerator/cli/npm/envio/src/Prometheus.res
🧠 Learnings (2)
📚 Learning: 2025-10-13T12:26:45.351Z
Learnt from: CR
PR: enviodev/hyperindex#0
File: .cursor/rules/subgraph-migration.mdc:0-0
Timestamp: 2025-10-13T12:26:45.351Z
Learning: Applies to src/effects/**/*.ts : Create external calls (RPC/API) as Effect API functions under src/effects and define schemas via `experimental_createEffect`

Applied to files:

  • codegenerator/cli/npm/envio/src/Envio.res
📚 Learning: 2025-10-21T16:31:22.127Z
Learnt from: CR
PR: enviodev/hyperindex#0
File: .cursor/rules/navigation.mdc:0-0
Timestamp: 2025-10-21T16:31:22.127Z
Learning: Applies to generated/src/Prometheus.res : In generated runtime, Prometheus.res handles metrics

Applied to files:

  • codegenerator/cli/npm/envio/src/Prometheus.res
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build_and_test
🔇 Additional comments (10)
codegenerator/cli/templates/static/codegen/src/globalState/GlobalState.res (1)

224-236: Fix ms arithmetic overflow; compute in floats and clamp to 0

currentTimeMs and blockTimestampMs are computed as ints. Unix ms (~1.7e12) exceed 32-bit int; Float.toInt truncates and “* 1000” overflows, yielding wrong or negative latency. Use floats, then cast once, clamping negatives.

Apply:

-        // Calculate and set latency metrics
-        switch batch->Batch.findLastEventItem(~chainId=chain->ChainMap.Chain.toChainId) {
-        | Some(eventItem) => {
-            let blockTimestamp = eventItem.event.block->Types.Block.getTimestamp
-            let currentTimeMs = Js.Date.now()->Float.toInt
-            let blockTimestampMs = blockTimestamp * 1000
-            let latencyMs = currentTimeMs - blockTimestampMs
-            
-            Prometheus.ProgressLatency.set(~latencyMs, ~chainId=chain->ChainMap.Chain.toChainId)
-          }
-        | None => ()
-        }
+        // Calculate and set latency metrics (use floats to avoid int overflow)
+        switch batch->Batch.findLastEventItem(~chainId=chain->ChainMap.Chain.toChainId) {
+        | Some(eventItem) => {
+            let blockTimestampSec = eventItem.event.block->Types.Block.getTimestamp
+            let currentTimeMsF = Js.Date.now()
+            let blockTimestampMsF = Float.fromInt(blockTimestampSec) *. 1000.0
+            let latencyMs =
+              Js.Math.max(0.0, currentTimeMsF -. blockTimestampMsF)->Float.toInt
+            Prometheus.ProgressLatency.set(~latencyMs, ~chainId=chain->ChainMap.Chain.toChainId)
+          }
+        | None => ()
+        }

Also confirm Types.Block.getTimestamp returns seconds (not ms); otherwise drop the “*. 1000.0”.

codegenerator/cli/templates/static/codegen/src/LoadLayer.res (3)

74-76: Consistent effect name/key usage.

Using a local effectName for keys and labels is clearer and avoids accidental renames. LGTM.


85-85: Cache readiness check keyed by effectName.

Gate on Ready({cache}) and presence of effectName is sound. LGTM.


109-116: Correct labels for cache invalidation.

Invalidations counter and structured log now use effectName; good fix.

codegenerator/cli/templates/static/codegen/src/Index.res (3)

70-74: CORS headers updated to allow Authorization.

Good addition; avoids preflight failures for Bearer tokens. LGTM.


57-57: Verified: All startServer call sites updated.

The codebase contains a single startServer call site at line 253, which correctly passes all three parameters including the new ~consoleBearerToken parameter. The signature change has been fully applied.


101-110: Verify HTTP status code strategy with console client team.

The codebase intentionally returns Disabled({}) with HTTP 200 for unauthorized /console/state requests, consistent with other console endpoints like /console/syncCache (which returns Boolean(false)). This violates HTTP standards (RFC 7235 recommends 401 for auth failures with WWW-Authenticate header), but changing it would break the external console client's API contract. Before refactoring to use 401/403, verify that the console client can handle these status codes; if maintaining the current design for backward compatibility, document this behavior explicitly.

codegenerator/cli/npm/envio/src/Prometheus.res (3)

583-594: Progress latency metric: good addition.

envio_progress_latency gauge with chainId labels is clear and useful. LGTM.


658-663: StorageLoad sum time counter rename.

envio_storage_load_sum_time is clearer than totalTimeCounter. LGTM.


715-718: EndOperation uses sumTimeCounter correctly.

Per-call elapsed added to sumTimeCounter; matches naming. LGTM.

Comment on lines +71 to 73
activeCallsCount: 0,
prevCallStartTimerRef: %raw(`null`),
// This is the way to make the createEffect API
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Avoid %raw(null); align with an option-typed ref or a valid default

prevCallStartTimerRef is set via %raw(null) while typed as Hrtime.timeRef. Unsafe and bypasses the type system; may crash if used unguarded.

  • If you adopt option<Hrtime.timeRef> in Internal.res, initialize with None here:
-    activeCallsCount: 0,
-    prevCallStartTimerRef: %raw(`null`),
+    activeCallsCount: 0,
+    prevCallStartTimerRef: None,
  • If keeping a non-optional type, initialize with a valid Hrtime ref (e.g., Hrtime.makeTimer()) and add a boolean flag to indicate “active timing” instead of null.

Based on learnings.


🏁 Script executed:

#!/bin/bash
rg -nP --type=res -C3 'prevCallStartTimerRef'

Length of output: 27



---

🏁 Script executed:

```shell
#!/bin/bash
# Find Envio.res file
fd -name "Envio.res"

Length of output: 230


🏁 Script executed:

#!/bin/bash
# Search for prevCallStartTimerRef without file type constraint
rg -n -C3 "prevCallStartTimerRef" codegenerator/

Length of output: 3874


🏁 Script executed:

#!/bin/bash
# Look for Internal.res file
fd -name "Internal.res" codegenerator/

Length of output: 230



---

🏁 Script executed:

```shell
#!/bin/bash
# Properly find Internal.res file
fd Internal.res codegenerator/

Length of output: 44


🏁 Script executed:

#!/bin/bash
# Find Hrtime type definition
rg -n "type timeRef" codegenerator/

Length of output: 146


🏁 Script executed:

#!/bin/bash
# Check LoadLayer context to see if first read happens on initial creation
rg -n -B10 "prevCallStartTimerRef" codegenerator/cli/templates/static/codegen/src/LoadLayer.res | head -60

Length of output: 1495


🏁 Script executed:

#!/bin/bash
# Search for where effect record is first created to understand initialization order
rg -n "hadActiveCalls" codegenerator/cli/templates/static/codegen/src/LoadLayer.res -B5

Length of output: 464


Avoid %raw(null); align with an option-typed ref or a valid default

The concern is valid. prevCallStartTimerRef is initialized with %raw(\null`)at line 72 in Envio.res while typed as the non-optionalHrtime.timeRef` in Internal.res, which bypasses type safety.

Although current usage in LoadLayer.res guards against reading null on the first call (if hadActiveCalls at line 141), this fragile defensive programming should be replaced with proper typing:

  • Adopt option<Hrtime.timeRef> and initialize with None, or
  • Initialize with a valid Hrtime.timeRef default (e.g., Hrtime.makeTimer())
🤖 Prompt for AI Agents
In codegenerator/cli/npm/envio/src/Envio.res around lines 71–73, replace the
unsafe %raw(`null`) for prevCallStartTimerRef with a properly-typed value:
change prevCallStartTimerRef to option<Hrtime.timeRef> and initialize it to None
(or, if you prefer the alternate approach, initialize with a valid
Hrtime.makeTimer()); then update all places that read prevCallStartTimerRef to
pattern-match on None/Some (or provide a safe default) instead of assuming a
non-null value so you preserve type safety and remove the raw null bypass.

Comment on lines +302 to 305
// The number of functions that are currently running.
mutable activeCallsCount: int,
mutable prevCallStartTimerRef: Hrtime.timeRef,
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

prevCallStartTimerRef should be optional; current design invites null-sentinel misuse

Field is typed as Hrtime.timeRef but Envio.res initializes it with raw null. Without an option type, downstream code can crash if it dereferences without a guard.

Recommended change:

-  mutable prevCallStartTimerRef: Hrtime.timeRef,
+  mutable prevCallStartTimerRef: option<Hrtime.timeRef>,

Then initialize with None in Envio.res and handle Some(_) where used. This removes %raw(null) and restores type safety. If keeping non-optional, provide a non-null sentinel from Hrtime (e.g., Hrtime.makeTimer()) and ensure all sites don’t assume null.

Check usages and legacy name:


🏁 Script executed:

#!/bin/bash
# Ensure all uses guard null/option appropriately
rg -nP -C3 'prevCallStartTimerRef' 

# Ensure no stale references to callsCount remain
rg -nP -C2 '\bcallsCount\b'

Length of output: 3874


Change prevCallStartTimerRef to option<Hrtime.timeRef> for type safety.

The field is initialized with %raw(\null`)at Envio.res:72 but typed as non-optionalHrtime.timeRefat Internal.res:304. It's dereferenced directly in LoadLayer.res (lines 144, 174) without option guards. While usage appears guarded byhadActiveCalls`, the non-optional type with null initialization is unsafe and fragile.

Apply the recommended change:

-  mutable prevCallStartTimerRef: Hrtime.timeRef,
+  mutable prevCallStartTimerRef: option<Hrtime.timeRef>,

Update initialization in Envio.res:72 to prevCallStartTimerRef: None, and wrap assignments and dereferences in LoadLayer.res with option patterns (e.g., Some(timerRef)).

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// The number of functions that are currently running.
mutable activeCallsCount: int,
mutable prevCallStartTimerRef: Hrtime.timeRef,
}
// The number of functions that are currently running.
mutable activeCallsCount: int,
mutable prevCallStartTimerRef: option<Hrtime.timeRef>,
}
🤖 Prompt for AI Agents
In codegenerator/cli/npm/envio/src/Internal.res around lines 302-305, change the
type of prevCallStartTimerRef from Hrtime.timeRef to option<Hrtime.timeRef>;
also update the initializer in codegenerator/cli/npm/envio/src/Envio.res at line
72 to prevCallStartTimerRef: None; then update LoadLayer.res at the call sites
around lines 144 and 174 to wrap assignments with Some(timerRef) when storing a
timer and to pattern-match (e.g., switch/if-let/when Some(timerRef)) before
dereferencing, handling the None case as appropriate.

Comment on lines +73 to 75
let shouldSavePrometheus = _ => true

let shouldSaveData = self => self->shouldSavePrometheus || self->shouldSaveJsonFile
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Forcing Prometheus “on” ignores ENVIO_SAVE_BENCHMARK_DATA and breaks config expectations

This unconditionally enables Prometheus saving, making SaveDataStrategy.shouldSaveData always true regardless of user input. Likely unintended and a breaking behavior change for users who disable metrics or only want json-file.

Suggested fix (restore gating by variant and Bool(true)):

-    let shouldSavePrometheus = _ => true
+    let shouldSavePrometheus = self =>
+      switch self {
+      | Prometheus
+      | Bool(true) => true
+      | _ => false
+      }

Optionally, if “always-on Prometheus” is truly desired, please:

  • Rename the strategy or add a separate env (e.g., ENVIO_PROMETHEUS_ENABLED) to make intent explicit.
  • Document the change in release notes.

Run to confirm no other code assumes shouldSaveData can be false:


🏁 Script executed:

#!/bin/bash
rg -nP --type=res -C2 '\bBenchmark\.shouldSaveData\b|SaveDataStrategy\.shouldSaveData'

Length of output: 27


🏁 Script executed:

#!/bin/bash
# Search for shouldSaveData usage in ReScript files
rg -nP -C2 'shouldSaveData' --glob='*.res'

Length of output: 3349


🏁 Script executed:

#!/bin/bash
# Find SaveDataStrategy type definition and its variants
rg -nP -A10 'type.*SaveDataStrategy\b|type t\s*=|^[[:space:]]*(Bool|Prometheus|Json)' codegenerator/cli/templates/static/codegen/src/Env.res | head -50

Length of output: 343


🏁 Script executed:

#!/bin/bash
# Find ENVIO_SAVE_BENCHMARK_DATA references to confirm environment variable mapping
rg -nP 'ENVIO_SAVE_BENCHMARK_DATA|saveDataStrategy' codegenerator/cli/templates/static/codegen/src/Env.res | head -30

Length of output: 202


Forcing Prometheus "on" unconditionally breaks SaveDataStrategy configuration

The change at line 73 forces shouldSavePrometheus = _ => true, which makes line 75's shouldSaveData logic always evaluate to true via short-circuit OR evaluation (true || X = true). This breaks the configuration system—users setting Bool(false) (the default) via ENVIO_SAVE_BENCHMARK_DATA=false have their preference ignored, and benchmarking cannot be disabled. Code at EventProcessing.res:457 and GlobalState.res:372, 896 contains conditional checks implying shouldSaveData should be able to return false.

Restore gating by SaveDataStrategy variant:

-    let shouldSavePrometheus = _ => true
+    let shouldSavePrometheus = self =>
+      switch self {
+      | Prometheus
+      | Bool(true) => true
+      | _ => false
+      }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
let shouldSavePrometheus = _ => true
let shouldSaveData = self => self->shouldSavePrometheus || self->shouldSaveJsonFile
let shouldSavePrometheus = self =>
switch self {
| Prometheus
| Bool(true) => true
| _ => false
}
let shouldSaveData = self => self->shouldSavePrometheus || self->shouldSaveJsonFile
🤖 Prompt for AI Agents
In codegenerator/cli/templates/static/codegen/src/Env.res around lines 73 to 75,
the helper shouldSavePrometheus is hardcoded to always true which forces
shouldSaveData to always be true and ignores
SaveDataStrategy/ENVIO_SAVE_BENCHMARK_DATA; revert this by making
shouldSavePrometheus derive from the SaveDataStrategy variant (or the ENV flag)
instead of returning true unconditionally — implement pattern matching on
self.saveDataStrategy so Prometheus is only enabled for the variant(s) that
require it (and ensure shouldSaveData uses that result).

Comment on lines +134 to +151
let hadActiveCalls = effect.activeCallsCount > 0
effect.activeCallsCount = effect.activeCallsCount + remainingCallsCount
Prometheus.EffectCalls.activeCallsCount->Prometheus.SafeGauge.handleInt(
~labels=effectName,
~value=effect.activeCallsCount,
)

if hadActiveCalls {
Prometheus.EffectCalls.timeCounter->Prometheus.SafeCounter.incrementMany(
~labels=effectName,
~value=effect.prevCallStartTimerRef
->Hrtime.timeSince
->Hrtime.toMillis
->Hrtime.intFromMillis,
)
}
let timerRef = Hrtime.makeTimer()
effect.prevCallStartTimerRef = timerRef
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Active-calls gauge and union-of-interval timing: verify initialization.

The 0→N and N→M transitions with prevCallStartTimerRef segment total “busy” time correctly under the event loop. Ensure prevCallStartTimerRef is initialized at effect construction to a Hrtime.makeTimer() to avoid a first-interval spike.


🏁 Script executed:

#!/bin/bash
# Verify prevCallStartTimerRef default initialization in Internal/Envio
rg -n -C3 'prevCallStartTimerRef' --type=rs --type=res --type=resi

Length of output: 26


🏁 Script executed:

#!/bin/bash
# Search for prevCallStartTimerRef with corrected syntax using glob patterns
rg -n -C3 'prevCallStartTimerRef' -g '*.res' -g '*.resi'

Length of output: 3874


🏁 Script executed:

#!/bin/bash
# Check if Hrtime.makeTimer() is called during initialization anywhere
# Also verify if there are other similar patterns or initialization paths
rg -n 'makeTimer\(\)' -g '*.res' -g '*.resi' -C2

Length of output: 8189


🏁 Script executed:

#!/bin/bash
# Check for null guards before Hrtime.timeSince calls on prevCallStartTimerRef
rg -n -B5 -A2 'prevCallStartTimerRef\s*->\s*Hrtime\.timeSince' -g '*.res' -g '*.resi'

Length of output: 0


🏁 Script executed:

#!/bin/bash
# Look at the exact context around the timeSince calls to confirm guard presence
rg -n -B8 'timeSince' codegenerator/cli/templates/static/codegen/src/LoadLayer.res | grep -A10 'prevCallStartTimerRef'

Length of output: 1064


Initialize prevCallStartTimerRef to Hrtime.makeTimer() at effect construction.

The prevCallStartTimerRef field is initialized to %raw(\null`)incodegenerator/cli/npm/envio/src/Envio.res:72, but should follow the pattern used elsewhere in the codebase (e.g., SourceManager.res:73) and be initialized to Hrtime.makeTimer(). The current null initialization causes the first 0→1 transition to include a timing spike, as the guard at line 141 (if hadActiveCalls`) prevents the timeSince call on the initial invocation, deferring measurement until the next call after activeCallsCount becomes non-zero.

Fix: Change line 72 in codegenerator/cli/npm/envio/src/Envio.res from:

prevCallStartTimerRef: %raw(`null`),

to:

prevCallStartTimerRef: Hrtime.makeTimer(),
🤖 Prompt for AI Agents
In codegenerator/cli/npm/envio/src/Envio.res around line 72,
prevCallStartTimerRef is initialized to %raw(`null`) which causes the first 0→1
active-call transition to skip proper timing and produce a spike; replace that
initialization with a call to Hrtime.makeTimer() (following the pattern in
SourceManager.res) so prevCallStartTimerRef starts as a valid timer and
measurements are correct on the first transition.

Comment on lines +150 to +151
let timerRef = Hrtime.makeTimer()
effect.prevCallStartTimerRef = timerRef
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

sumTimeCounter overcounts: shared timerRef used for all calls in the batch.

timerRef is created once (Line 150) but used for every call’s sumTimeCounter (Lines 184‑187). This will add overlapping durations multiple times. Create a per-call timerRef.

Apply this diff:

@@
-      let timerRef = Hrtime.makeTimer()
-      effect.prevCallStartTimerRef = timerRef
+      let batchTimerRef = Hrtime.makeTimer()
+      effect.prevCallStartTimerRef = batchTimerRef
@@
-          Some(
-            effect.handler(arg)
+          Some({
+            let callTimerRef = Hrtime.makeTimer()
+            effect.handler(arg)
             ->Promise.thenResolve(output => {
               inMemTable.dict->Js.Dict.set(arg.cacheKey, output)
               if shouldStoreCache {
                 inMemTable.idsToStore->Array.push(arg.cacheKey)->ignore
               }
             })
             ->Promise.finally(() => {
@@
               Prometheus.EffectCalls.totalCallsCount->Prometheus.SafeCounter.increment(
                 ~labels=effectName,
               )
               Prometheus.EffectCalls.sumTimeCounter->Prometheus.SafeCounter.incrementMany(
                 ~labels=effectName,
-                ~value=timerRef->Hrtime.timeSince->Hrtime.toMillis->Hrtime.intFromMillis,
+                ~value=callTimerRef->Hrtime.timeSince->Hrtime.toMillis->Hrtime.intFromMillis,
               )
             }),
-          )
+          })

Also applies to: 184-187

🤖 Prompt for AI Agents
In codegenerator/cli/templates/static/codegen/src/LoadLayer.res around lines
150-151 and 184-187, timerRef is created once and reused for every call, causing
sumTimeCounter to overcount overlapping durations; change to create a new
Hrtime.makeTimer() per-call (move timerRef creation into the per-call loop or
into the place where each effect is initialized) and assign that per-call
timerRef to effect.prevCallStartTimerRef so each call measures its own duration
before updating sumTimeCounter.

Comment on lines +166 to 188
->Promise.finally(() => {
effect.activeCallsCount = effect.activeCallsCount - 1
Prometheus.EffectCalls.activeCallsCount->Prometheus.SafeGauge.handleInt(
~labels=effectName,
~value=effect.activeCallsCount,
)
Prometheus.EffectCalls.timeCounter->Prometheus.SafeCounter.incrementMany(
~labels=effectName,
~value=effect.prevCallStartTimerRef
->Hrtime.timeSince
->Hrtime.toMillis
->Hrtime.intFromMillis,
)
effect.prevCallStartTimerRef = Hrtime.makeTimer()

Prometheus.EffectCalls.totalCallsCount->Prometheus.SafeCounter.increment(
~labels=effectName,
)
Prometheus.EffectCalls.sumTimeCounter->Prometheus.SafeCounter.incrementMany(
~labels=effectName,
~value=timerRef->Hrtime.timeSince->Hrtime.toMillis->Hrtime.intFromMillis,
)
}),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major

🧩 Analysis chain

totalCallsCount counts failures; align metric/help.

Incrementing totalCallsCount in finally counts both fulfilled and rejected calls, but help says “resolved” (commonly interpreted as “fulfilled”). Either:

  • Change help to “completed (fulfilled or rejected)”, or
  • Only increment on success and add a separate failedCallsCount.

Option A (update help text in Prometheus.res):

-    ~help="Cumulative number of resolved Effect function calls during the indexing process.",
+    ~help="Cumulative number of completed (fulfilled or rejected) Effect function calls during the indexing process.",

Option B (count successes only here and add a failed counter elsewhere):

-            ->Promise.finally(() => {
+            ->Promise.thenResolve(() => {
               Prometheus.EffectCalls.totalCallsCount->Prometheus.SafeCounter.increment(
                 ~labels=effectName,
               )
+            })
+            ->Promise.catch(_err => {
+              // Optionally: Prometheus.EffectCalls.failedCallsCount->SafeCounter.increment(~labels=effectName)
+              Js.Promise.resolve()
+            })
+            ->Promise.finally(() => {
               Prometheus.EffectCalls.sumTimeCounter->Prometheus.SafeCounter.incrementMany(
                 ~labels=effectName,
-                ~value=callTimerRef->Hrtime.timeSince->Hrtime.toMillis->Hrtime.intFromMillis,
+                ~value=callTimerRef->Hrtime.timeSince->Hrtime.toMillis->Hrtime.intFromMillis,
               )
             }),

🏁 Script executed:

#!/bin/bash
# Spot other sites incrementing totalCallsCount to keep semantics consistent
rg -n -C2 'EffectCalls\.totalCallsCount'

Length of output: 529


🏁 Script executed:

#!/bin/bash
# Locate Prometheus.res file and find totalCallsCount definition with help text
fd -t f 'Prometheus\.res' | head -20

Length of output: 46


🏁 Script executed:

#!/bin/bash
# Search for totalCallsCount definition with surrounding context to find help text
rg -n -B5 -A5 'totalCallsCount' --type=res | grep -A10 -B10 'help'

Length of output: 27


🏁 Script executed:

#!/bin/bash
# Also check dynamic templates for consistency
rg -n 'EffectCalls\.totalCallsCount' --type=res

Length of output: 27


🏁 Script executed:

#!/bin/bash
# Read the Prometheus.res file to find totalCallsCount definition
cat codegenerator/cli/npm/envio/src/Prometheus.res | grep -A10 -B5 totalCallsCount

Length of output: 623


🏁 Script executed:

#!/bin/bash
# Search for totalCallsCount with help text across all files (no type restriction)
rg -n -B3 -A3 'totalCallsCount' | grep -A10 -B10 'help'

Length of output: 1110


Update help text to clarify totalCallsCount measures all completed calls, not just fulfilled ones.

The verification confirms the review's concern: the help text states "Cumulative number of resolved Effect function calls," but the code increments totalCallsCount within Promise.finally() (LoadLayer.res:181), which fires for both fulfilled and rejected promises. In Promise semantics, "resolved" typically denotes fulfilled-only outcomes, creating misleading documentation.

The minimal fix is to update the help text in codegenerator/cli/npm/envio/src/Prometheus.res:614:

-    ~help="Cumulative number of resolved Effect function calls during the indexing process.",
+    ~help="Cumulative number of completed (fulfilled or rejected) Effect function calls during the indexing process.",

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In codegenerator/cli/templates/static/codegen/src/LoadLayer.res around lines
166-188 the counter totalCallsCount is incremented inside Promise.finally (so it
counts both fulfilled and rejected calls); update the help text in
codegenerator/cli/npm/envio/src/Prometheus.res at/near line 614 to accurately
reflect that this metric measures completed/settled calls (both fulfilled and
rejected) — e.g. change "Cumulative number of resolved Effect function calls" to
"Cumulative number of completed (fulfilled or rejected) Effect function calls"
or similar wording that explicitly states both outcomes.

@DZakh DZakh changed the title Dz/effect official Official Effect API release Oct 28, 2025
@DZakh DZakh changed the title Official Effect API release Effect API metrics + latency metrics + Dev Console with disabled TUI Oct 29, 2025
@DZakh DZakh enabled auto-merge (squash) October 29, 2025 13:44
@DZakh DZakh merged commit 2e2976d into main Oct 29, 2025
2 checks passed
@DZakh DZakh deleted the dz/effect-official branch October 29, 2025 13:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants