-
Notifications
You must be signed in to change notification settings - Fork 0
feat: add iteration counter injection and infinite loop detection for loop transformations #18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
… loop transformations
…or enhanced analysis
✅ Snyk checks have passed. No issues have been found so far.
💻 Catch issues earlier using the plugins for VS Code, JetBrains IDEs, Visual Studio, and Eclipse. |
📝 WalkthroughWalkthroughAdds AST-level infinite-loop detection and injects per-loop runtime iteration guards using a new Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant Transformer as AgentScript Transformer
participant ChildVM as Child VM (sandbox)
participant ParentVM as Parent VM (validator)
participant Reporter as Perf Reporter
rect rgb(240,248,255)
note left of Transformer: Build-time loop transformation
Transformer->>Transformer: scan AST & collect loops
Transformer->>Transformer: inject per-loop counter\n(let __iter_N = 0) and guard\n(if (++__iter_N > __maxIterations) throw)
end
rect rgb(245,255,240)
note left of ChildVM: Runtime execution & validation
ChildVM->>ChildVM: execute transformed code\n(loop checks reference __maxIterations)
ChildVM->>ParentVM: emit rapid-enum events / ops
ParentVM-->>ChildVM: validate using\nrapidEnumerationThreshold / overrides
end
rect rgb(255,248,240)
note left of Reporter: Performance reporting
ChildVM->>Reporter: recordMetric(...)
Reporter->>Reporter: aggregate metrics\nwrite perf-results.json
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Poem
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
Fix all issues with AI Agents 🤖
In @libs/ast-guard/src/rules/infinite-loop.rule.ts:
- Around line 100-102: Remove the unreachable check for the `'!!'` operator in
the unary-expression handling: the AST never produces operator `'!!'`
(double-bang is parsed as two nested `UnaryExpression`s with operator `'!'`), so
delete the branch checking `node.operator === '!!'` in the isAlwaysTruthy logic
(or the method handling UnaryExpression) and rely on existing handling for `'!'`
and nested expressions instead.
In @libs/enclave-vm/src/__tests__/perf/utils/memory-utils.ts:
- Around line 233-242: The loop intends to prevent GC while measuring memory but
using "void obj" does nothing; replace that with retaining the created objects
in a local array (e.g., retained = []), push each obj returned by factory() into
that array after calling captureMemory(), use footprints.push(after.heapUsed -
before.heapUsed) as before, and only clear the retained array (retained.length =
0) when you want to allow GC (for example after the loop or at a deliberate
checkpoint); remove the no-op "void obj" so the reference is actually preserved.
In @libs/enclave-vm/src/double-vm/parent-vm-bootstrap.ts:
- Around line 733-735: Add a short docs section to
docs/draft/docs/libraries/enclave.mdx documenting the runtime global
__maxIterations: explain that scripts executed by the enclave VM can read the
global __maxIterations (set from the maxIterations configuration), that it is
exposed as a non-writable, non-configurable number for safety, and show a
minimal example snippet of reading it from user code plus any
migration/compatibility notes. Ensure the section name is clear (e.g., "Runtime
globals — __maxIterations") and include a cross-reference to the existing
maxIterations configuration documentation.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
libs/ast-guard/src/agentscript-transformer.ts (1)
351-418: Consider handling the case where parentBody is not found.If
findParentBodyreturnsnullor the loop isn't found in the parent body (line 413), the counter declaration won't be inserted, but the iteration check referencing__iter_Nwill still be injected. This could cause aReferenceErrorat runtime.While this scenario may be unlikely with valid JavaScript, it would be safer to handle this edge case explicitly.
🔎 Suggested defensive check
// Insert counter declaration before the loop const parentBody = findParentBody(ancestors); if (parentBody) { const loopIndex = parentBody.indexOf(node); if (loopIndex >= 0) { parentBody.splice(loopIndex, 0, createCounterDeclaration(counterName)); + } else { + // Loop not found in parent body - this shouldn't happen with valid AST + // but if it does, remove the injected check to avoid ReferenceError + console.warn(`Loop transformation warning: could not insert counter declaration`); } + } else { + console.warn(`Loop transformation warning: could not find parent body for counter declaration`); }
🧹 Nitpick comments (13)
libs/ast-guard/src/rules/infinite-loop.rule.ts (1)
128-134: Consider edge case:isAlwaysFalsy(null)returnstruefor missing node.Line 129 returns
truewhennodeis null/undefined, which is correct for internal usage (no expression means the check is "vacuously falsy" in the negation context). However, the Literal case comment on line 133 mentionsundefined, butundefinedis anIdentifier, not aLiteralin JS AST. The code handles this correctly via theIdentifiercase on line 142-145, so this is just a documentation nit.🔎 Suggested comment fix
case 'Literal': - // false, 0, "", null, undefined + // false, 0, "", null (note: `undefined` is an Identifier, not a Literal) return !node.value && node.value !== undefined;libs/enclave-vm/src/__tests__/perf/utils/benchmark-reporter.ts (3)
32-50: Potential naming collision:BenchmarkReportinterface exists instatistics.tswith different shape.The
relevant_code_snippetsshowslibs/enclave-vm/src/__tests__/perf/utils/statistics.tsalready exports aBenchmarkReportinterface with a different structure (containinglatency,throughput,memoryfields). This could cause confusion or import conflicts when both are used together.Consider renaming this interface to
BenchmarkTestReportorBenchmarkSuiteReportto differentiate it from the statistics-focused report.🔎 Proposed rename
-export interface BenchmarkReport { +export interface BenchmarkSuiteReport { timestamp: string; duration: number; // ... rest of interface }Then update usages in
buildReportandprintSummarymethods.
110-124: Consider race condition in concurrent test scenarios.The
loadMetricsFromFileandsaveMetricsToFilefunctions perform read-modify-write without locking. If Jest runs tests in parallel, metrics could be lost due to concurrent file access.This is likely acceptable for performance tests (which typically run sequentially), but worth noting if parallel execution is ever enabled.
156-172: Test metrics not associated with test results.In
buildReport,testResultsarray is populated with emptymetrics: []arrays (line 169), but theallMetricsfromgetAllMetrics()are stored separately inreport.metrics. This means individual test results don't show their associated metrics in the JSON output.This may be intentional design, but if you want metrics associated with specific tests, consider matching metrics to test results by test name.
libs/enclave-vm/src/double-vm/suspicious-patterns.ts (1)
66-91: Consider simplifying the config access logic.The current implementation has redundant checks and complex fallback paths. Lines 70-73 check
globalThis.validationConfigtwice using different access patterns, but they're semantically equivalent.🔎 Suggested simplification
let threshold = 30; // Default fallback try { - // eslint-disable-next-line @typescript-eslint/no-explicit-any - const config = - (globalThis as any).validationConfig || - (typeof (globalThis as any)['validationConfig'] !== 'undefined' - ? (globalThis as any)['validationConfig'] - : null); - if (!config) { - // Try direct access (works in parent VM where validationConfig is a local variable) - // @ts-expect-error validationConfig exists in parent VM runtime scope - const localConfig = typeof validationConfig !== 'undefined' ? validationConfig : null; - if (localConfig) { - const defaultThreshold = localConfig.rapidEnumerationThreshold ?? 30; - const overrides = localConfig.rapidEnumerationOverrides ?? {}; - threshold = overrides[operationName] ?? defaultThreshold; - } - } else { - const defaultThreshold = config.rapidEnumerationThreshold ?? 30; - const overrides = config.rapidEnumerationOverrides ?? {}; - threshold = overrides[operationName] ?? defaultThreshold; + // Try globalThis first, then local scope (parent VM injects validationConfig) + // eslint-disable-next-line @typescript-eslint/no-explicit-any + const config = (globalThis as any).validationConfig + // @ts-expect-error validationConfig exists in parent VM runtime scope + ?? (typeof validationConfig !== 'undefined' ? validationConfig : null); + + if (config) { + const defaultThreshold = config.rapidEnumerationThreshold ?? 30; + const overrides = config.rapidEnumerationOverrides ?? {}; + threshold = overrides[operationName] ?? defaultThreshold; } } catch {libs/enclave-vm/src/double-vm/double-vm-wrapper.ts (1)
153-164: Type narrowing issue with string error handling.The code at line 124 casts
errortoError(const err = error as Error), but then line 154 checkstypeof err === 'string'. While this works at runtime (TypeScript casts don't affect runtime values), it's misleading and bypasses type safety.🔎 Suggested improvement for clearer type handling
- } catch (error: unknown) { - const err = error as Error; + } catch (error: unknown) { // Update stats stats.duration = Date.now() - startTime; stats.endTime = Date.now(); // Report memory usage if tracking was enabled if (memoryTracker) { const memSnapshot = memoryTracker.getSnapshot(); stats.memoryUsage = memSnapshot.peakTrackedBytes; } // Handle MemoryLimitError specially - if (err instanceof MemoryLimitError) { + if (error instanceof MemoryLimitError) { return { success: false, error: { name: 'MemoryLimitError', - message: err.message, + message: error.message, code: 'MEMORY_LIMIT_EXCEEDED', - data: { usedBytes: err.usedBytes, limitBytes: err.limitBytes }, + data: { usedBytes: error.usedBytes, limitBytes: error.limitBytes }, }, stats, }; } // Determine whether to sanitize stack traces const shouldSanitize = config.sanitizeStackTraces ?? true; // Handle thrown strings (e.g., iteration limit exceeded throws a string literal) - if (typeof err === 'string') { + if (typeof error === 'string') { return { success: false, error: { name: 'DoubleVMExecutionError', - message: err, + message: error, code: 'DOUBLE_VM_EXECUTION_ERROR', }, stats, }; } + const err = error as Error; return { success: false,libs/enclave-vm/src/__tests__/perf/utils/perf-utils.ts (3)
80-84: DuplicateforceGCimplementation with different return type.This
forceGCreturnsvoid, whilememory-utils.ts(lines 43-49) hasforceGCreturningbooleanto indicate whether GC was actually triggered. Consider consolidating to a single implementation in one module and re-exporting, or align the return types for consistency across the utility suite.
96-100: Unusednameparameter.The
nameparameter is declared but never used in the function body. Either remove it or use it (e.g., for logging or labeling results).🔎 Proposed fix
export async function benchmark( - name: string, fn: () => Promise<unknown>, options?: Partial<BenchmarkOptions>, ): Promise<BenchmarkSample[]> {Or alternatively, include it in sample metadata if tracing is desired.
164-175: Unusednameparameter and missing cooldown support.Similar to
benchmark, thenameparameter is unused. Additionally,benchmarkSyncdoesn't applycooldownMsduring warmup iterations (unlike the async version at lines 111-113), which creates an inconsistency in behavior between the two functions.libs/enclave-vm/src/__tests__/perf/utils/memory-utils.ts (2)
181-189: DuplicateformatBytesimplementation.This function is identical to
formatBytesinstatistics.ts(lines 222-230). Consider consolidating into a single shared utility to avoid divergence.
137-143: Redundant return type annotation.The return type
Omit<MemoryTrackingResult<T>, 'samples'> & { samples: MemorySnapshot[] }is equivalent toMemoryTrackingResult<T>sincesamplesis alreadyMemorySnapshot[]in the interface. Consider simplifying to justMemoryTrackingResult<T>.libs/enclave-vm/src/__tests__/perf/utils/statistics.ts (2)
50-65:percentileassumes pre-sorted input.The function expects a sorted array but this isn't enforced. Consider either renaming to
percentileFromSortedto make the precondition explicit, or sorting internally for safety. Currently,calculateLatencyStatssorts before calling, which is correct, but other callers might miss this requirement.🔎 Proposed fix (option A: rename for clarity)
-export function percentile(sorted: number[], p: number): number { +export function percentileFromSorted(sortedValues: number[], p: number): number {
220-231: DuplicateformatBytesimplementation.This is identical to
memory-utils.ts(lines 181-189). Consolidate into one location.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (29)
.gitignorelibs/ast-guard/src/__tests__/agentscript-transformer.spec.tslibs/ast-guard/src/__tests__/infinite-loop.rule.spec.tslibs/ast-guard/src/agentscript-transformer.tslibs/ast-guard/src/presets/agentscript.preset.tslibs/ast-guard/src/rules/index.tslibs/ast-guard/src/rules/infinite-loop.rule.tslibs/enclave-vm/jest.perf.config.tslibs/enclave-vm/project.jsonlibs/enclave-vm/src/__tests__/double-vm.security.spec.tslibs/enclave-vm/src/__tests__/enclave.infinite-loop-attacks.spec.tslibs/enclave-vm/src/__tests__/enclave.multi-tenant.spec.tslibs/enclave-vm/src/__tests__/perf/double-vm.perf.spec.tslibs/enclave-vm/src/__tests__/perf/enclave.perf.spec.tslibs/enclave-vm/src/__tests__/perf/setup.tslibs/enclave-vm/src/__tests__/perf/utils/benchmark-reporter.tslibs/enclave-vm/src/__tests__/perf/utils/index.tslibs/enclave-vm/src/__tests__/perf/utils/memory-utils.tslibs/enclave-vm/src/__tests__/perf/utils/perf-utils.tslibs/enclave-vm/src/__tests__/perf/utils/statistics.tslibs/enclave-vm/src/__tests__/perf/worker-pool.perf.spec.tslibs/enclave-vm/src/adapters/worker-pool/worker-script.tslibs/enclave-vm/src/double-vm/double-vm-wrapper.tslibs/enclave-vm/src/double-vm/parent-vm-bootstrap.tslibs/enclave-vm/src/double-vm/suspicious-patterns.tslibs/enclave-vm/src/double-vm/types.tslibs/enclave-vm/src/enclave.tslibs/enclave-vm/src/safe-runtime.tslibs/enclave-vm/src/types.ts
🧰 Additional context used
📓 Path-based instructions (1)
libs/**
⚙️ CodeRabbit configuration file
libs/**: Contains publishable SDK libraries. Review for API correctness, breaking changes, and consistency with docs. When public APIs change, ensure there is a matching docs/draft/docs/** update (not direct edits under docs/docs/**).
Files:
libs/ast-guard/src/rules/index.tslibs/enclave-vm/src/safe-runtime.tslibs/enclave-vm/src/double-vm/parent-vm-bootstrap.tslibs/enclave-vm/src/__tests__/perf/double-vm.perf.spec.tslibs/enclave-vm/src/__tests__/perf/utils/index.tslibs/enclave-vm/src/adapters/worker-pool/worker-script.tslibs/enclave-vm/src/__tests__/perf/worker-pool.perf.spec.tslibs/enclave-vm/jest.perf.config.tslibs/enclave-vm/src/types.tslibs/ast-guard/src/__tests__/agentscript-transformer.spec.tslibs/enclave-vm/src/__tests__/double-vm.security.spec.tslibs/enclave-vm/src/double-vm/double-vm-wrapper.tslibs/enclave-vm/src/double-vm/suspicious-patterns.tslibs/enclave-vm/src/__tests__/perf/enclave.perf.spec.tslibs/ast-guard/src/agentscript-transformer.tslibs/enclave-vm/src/double-vm/types.tslibs/ast-guard/src/rules/infinite-loop.rule.tslibs/enclave-vm/src/enclave.tslibs/enclave-vm/src/__tests__/perf/utils/benchmark-reporter.tslibs/enclave-vm/src/__tests__/enclave.infinite-loop-attacks.spec.tslibs/ast-guard/src/__tests__/infinite-loop.rule.spec.tslibs/enclave-vm/project.jsonlibs/enclave-vm/src/__tests__/perf/utils/memory-utils.tslibs/enclave-vm/src/__tests__/enclave.multi-tenant.spec.tslibs/enclave-vm/src/__tests__/perf/setup.tslibs/enclave-vm/src/__tests__/perf/utils/perf-utils.tslibs/enclave-vm/src/__tests__/perf/utils/statistics.tslibs/ast-guard/src/presets/agentscript.preset.ts
🧬 Code graph analysis (9)
libs/ast-guard/src/__tests__/agentscript-transformer.spec.ts (1)
libs/ast-guard/src/agentscript-transformer.ts (1)
transformAgentScript(89-191)
libs/enclave-vm/src/__tests__/double-vm.security.spec.ts (2)
libs/enclave-vm/src/types.ts (1)
ToolHandler(518-518)libs/enclave-vm/src/enclave.ts (1)
Enclave(156-705)
libs/enclave-vm/src/double-vm/suspicious-patterns.ts (2)
libs/enclave-vm/src/double-vm/types.ts (2)
SuspiciousPattern(30-44)OperationHistory(15-22)libs/enclave-vm/src/double-vm/index.ts (2)
SuspiciousPattern(16-16)OperationHistory(19-19)
libs/enclave-vm/src/__tests__/perf/utils/benchmark-reporter.ts (1)
libs/enclave-vm/src/__tests__/perf/utils/statistics.ts (1)
BenchmarkReport(41-48)
libs/ast-guard/src/__tests__/infinite-loop.rule.spec.ts (2)
libs/ast-guard/src/rules/infinite-loop.rule.ts (1)
InfiniteLoopRule(32-171)libs/ast-guard/src/presets/agentscript.preset.ts (1)
createAgentScriptPreset(179-434)
libs/enclave-vm/src/__tests__/perf/utils/memory-utils.ts (2)
libs/enclave-vm/src/__tests__/perf/utils/perf-utils.ts (1)
forceGC(80-84)libs/enclave-vm/src/__tests__/perf/utils/statistics.ts (1)
formatBytes(223-231)
libs/enclave-vm/src/__tests__/perf/utils/perf-utils.ts (1)
libs/enclave-vm/src/__tests__/perf/utils/memory-utils.ts (1)
forceGC(44-50)
libs/enclave-vm/src/__tests__/perf/utils/statistics.ts (2)
libs/enclave-vm/src/__tests__/perf/utils/perf-utils.ts (1)
BenchmarkSample(11-18)libs/enclave-vm/src/__tests__/perf/utils/memory-utils.ts (1)
formatBytes(181-189)
libs/ast-guard/src/presets/agentscript.preset.ts (1)
libs/ast-guard/src/rules/infinite-loop.rule.ts (1)
InfiniteLoopRule(32-171)
🔇 Additional comments (60)
.gitignore (1)
100-100: LGTM!Appropriately ignoring the generated performance results file to keep it out of version control.
libs/enclave-vm/src/__tests__/double-vm.security.spec.ts (5)
535-551: LGTM!The test correctly validates the new >30 threshold for rapid enumeration detection. The assertion
expect(callCount).toBeLessThanOrEqual(32)properly accounts for the threshold check happening after some calls succeed.
553-586: LGTM!Good addition of a legitimate pagination test case that verifies normal usage patterns (20 calls) are not blocked by the new 30-call threshold.
621-690: LGTM!Well-structured tests for custom
rapidEnumerationThresholdconfiguration covering both low threshold (triggering detection early) and high threshold (allowing more calls) scenarios.
692-763: LGTM!Thorough testing of per-operation
rapidEnumerationOverridesfeature, verifying that different operations can have independent thresholds while respecting the default for unconfigured operations.
1444-1459: LGTM!Test updated consistently with the new >30 threshold for rapid enumeration detection.
libs/ast-guard/src/rules/infinite-loop.rule.ts (2)
40-82: LGTM!The
validatemethod correctly implements configurable loop detection using acorn-walk, with sensible defaults (all loop types enabled). The handler registration pattern is clean and efficient.
32-38: Remove documentation requirement:InfiniteLoopRuleis not exported as a public API.
InfiniteLoopRuleis used internally by theagentscriptpreset but is not exported fromlibs/ast-guard/src/index.ts. It is therefore not a public API change subject to the documentation guidelines forlibs/**libraries. No documentation update is required.Likely an incorrect or invalid review comment.
libs/enclave-vm/src/__tests__/perf/setup.ts (1)
1-9: LGTM!The intentional no-op is well-documented, explaining that metrics are managed by the reporter to enable cross-file collection during test runs.
libs/enclave-vm/src/enclave.ts (1)
653-655: LGTM!Adding
__maxIterationsto the allowed globals list is necessary for the loop transformation feature. This enables transformed loops to access the iteration limit at runtime while passing AST validation.libs/ast-guard/src/rules/index.ts (1)
22-22: LGTM!Standard barrel export for the new
InfiniteLoopRule, consistent with other rule exports in this file.libs/enclave-vm/src/adapters/worker-pool/worker-script.ts (3)
335-337: LGTM! Consistent iteration limit exposure.The
__maxIterationsglobal is properly exposed to transformed loop code with a sensible default of 10000, matching the fallback used in the safe loop wrappers throughout this file (lines 417, 448, 478).
733-736: LGTM! Inner VM iteration limit properly exposed.The
__maxIterationsvalue is correctly injected into the inner VM context, making the iteration limit available to transformed loop code. The value is sourced from themaxIterationsparameter passed to the bootstrap generator.Note: Line 733 also adds
parallel: createSecureProxy(innerParallel)which appears to be a separate feature addition (parallel execution support). Consider whether this should be mentioned in the PR objectives or split into a separate change for clarity.
182-183: No action needed. TherapidEnumerationThresholdandrapidEnumerationOverridesfields are properly typed in theSerializableParentValidationConfiginterface asnumberandRecord<string, number>respectively.libs/ast-guard/src/presets/agentscript.preset.ts (3)
204-207: LGTM! Proper allowlist addition for runtime global.Adding
__maxIterationsto the allowed globals prevents false positives from theUnknownGlobalRulewhen validating transformed code that references this runtime-injected iteration limit. The comment clearly documents its purpose.
367-372: Excellent defense-in-depth approach!Adding the
InfiniteLoopRuleprovides early detection of obvious infinite loop patterns during static analysis, complementing the runtime iteration guards. The explanatory comment is clear and helpful. Good placement afterForbiddenLoopRule(step 7) as a logical extension (step 7b).
17-17: InfiniteLoopRule is properly exported.The rule is exported from
libs/ast-guard/src/rules/infinite-loop.rule.tsand re-exported via the index file, allowing the import in the preset file to work correctly. No issues found.libs/enclave-vm/src/safe-runtime.ts (1)
572-574: LGTM! Consistent iteration limit exposure across adapters.The
__maxIterationsfield properly exposes the configured iteration limit to the safe runtime, enabling loop transformation logic to reference this value. This is consistent with similar additions inworker-script.ts(lines 335-337) andparent-vm-bootstrap.ts(line 735).libs/enclave-vm/project.json (1)
9-16: Performance test configuration file verified and properly configured.The
libs/enclave-vm/jest.perf.config.tsfile exists and is well-configured for performance benchmarking. The configuration includes appropriate settings: extended timeout (2 minutes per test), serial execution to prevent benchmark interference, and custom JSON reporter for results output.libs/enclave-vm/src/__tests__/enclave.multi-tenant.spec.ts (1)
307-335: LGTM!The
maxIterations: 150000configuration is correctly added to allow the 100k iteration loop in this test to complete without hitting the default iteration limit. This change is necessary given the new loop iteration guards being introduced in this PR.libs/enclave-vm/src/__tests__/enclave.infinite-loop-attacks.spec.ts (3)
24-42: LGTM! Comprehensive AST-level infinite loop detection tests.The test suite effectively validates that obvious infinite loop patterns like
for(;;)are blocked at AST validation time. The test correctly verifies both failure status and error message content.
80-118: Good coverage for ForbiddenLoopRule behavior.These tests correctly document that
while(true)anddo{}while(true)are blocked by the ForbiddenLoopRule (since while/do-while are disabled by default in AgentScript preset), distinct from the InfiniteLoopRule that targets infinite for loops.
139-157: Good runtime iteration limit test.This test validates the runtime enforcement of iteration limits for loops that pass AST validation but exceed the configured
maxIterations(100) at runtime. The bounded 1000-iteration loop is syntactically valid but exceeds the limit.libs/enclave-vm/src/double-vm/suspicious-patterns.ts (1)
35-54: Good documentation of the configurable threshold behavior.The updated docblock clearly explains the default threshold of 30, legitimate use cases, and configuration options. This helps maintainers understand the rationale behind the threshold values.
libs/enclave-vm/src/types.ts (1)
985-995: LGTM!The new
rapidEnumerationThresholdandrapidEnumerationOverridesdefaults are appropriately added toDEFAULT_DOUBLE_VM_CONFIG. The default threshold of 30 aligns with the fallback value in the detection logic, and the empty overrides object provides a clean starting point for per-operation customization.libs/ast-guard/src/__tests__/agentscript-transformer.spec.ts (3)
241-260: Good test for basic for loop iteration counter injection.The test correctly verifies that:
- A counter declaration is added (
let __iter_N = 0)- The
__maxIterationscheck is present- The original loop structure is preserved
This ensures the transformation adds safety without breaking semantics.
305-323: Good test for unique counter names in nested loops.Verifying that nested loops get unique counter names (
__iter_0,__iter_1) is critical to prevent variable shadowing issues that could cause incorrect iteration counting.
338-376: Good tests for break/continue preservation.These tests verify that the iteration counter injection doesn't break the semantics of
breakandcontinuestatements, which is essential for correct loop behavior.libs/ast-guard/src/agentscript-transformer.ts (2)
276-307: Good design: throwing string literal instead of Error object.The decision to throw a string literal rather than constructing an Error object (line 300-302) is clever - it avoids requiring the Error constructor to be accessible in the sandbox while still providing a meaningful error message. The enclave can catch and wrap this appropriately.
401-418: Good approach: reverse-order transformation to preserve indices.Processing loops in reverse order (line 403) is the correct approach when inserting declarations before loops - it ensures that insertions don't affect the indices of loops processed later.
libs/enclave-vm/jest.perf.config.ts (1)
1-30: Well-structured performance test configuration.The Jest config is appropriately designed for performance benchmarks:
maxWorkers: 1ensures consistent measurements without parallel interferencetestTimeout: 120000provides adequate time for long-running benchmarks- Coverage is correctly disabled to avoid instrumentation overhead
- Custom reporter for structured output to
perf-results.json- Setup file properly clears metrics between runs
libs/enclave-vm/src/double-vm/types.ts (2)
102-114: Well-structured addition of rapid enumeration configuration fields.The new
rapidEnumerationThresholdandrapidEnumerationOverridesfields are properly documented with JSDoc comments including default values and usage examples. The types are consistent with the serializable counterpart.As per coding guidelines for
libs/**, when public APIs change, ensure there is a matchingdocs/draft/docs/**update. Please verify documentation is updated to reflect these new configuration options.
136-137: LGTM!The serializable config fields mirror the parent config correctly, maintaining type consistency for cross-VM boundary serialization.
libs/ast-guard/src/__tests__/infinite-loop.rule.spec.ts (5)
1-11: LGTM! Well-organized test file with comprehensive coverage.The test file is well-structured with clear organization covering:
- Infinite patterns for all loop types (for, while, do-while)
- Valid patterns that should not be flagged
- Configuration options testing
- Integration with AgentScript preset
- Edge cases and error location verification
12-124: Comprehensive for-loop test coverage.Good coverage of both infinite patterns (missing test, always-true, truthy literals, negations) and valid patterns (bounded, variable conditions, always-false). The tests correctly expect
for(;0;)andfor(;false;)to pass validation since they never execute.
126-285: LGTM! Good coverage of while and do-while patterns.Correctly tests JavaScript-specific truthy semantics (empty arrays and objects are truthy in JS). The do-while tests properly handle the "executes at least once" behavior with
while(false).
287-392: LGTM! Good configuration and integration coverage.The configuration tests verify all boolean options are respected. Integration tests with the AgentScript preset confirm the rule works within the full validation pipeline, including proper handling of
allowedLoopsconfiguration.
394-475: LGTM! Comprehensive edge case coverage.Good testing of:
- Nested loops reporting multiple issues
- Special identifier handling (Infinity as truthy, undefined/NaN as falsy)
- Empty body loops (
for(;;);)- Error location accuracy with line numbers
These edge cases align with the
isAlwaysTruthy/isAlwaysFalsyimplementation in the rule.libs/enclave-vm/src/__tests__/perf/utils/index.ts (1)
1-5: LGTM! Clean barrel export structure.The file properly centralizes performance utility exports with appropriate separation of runtime and type-only exports.
libs/enclave-vm/src/double-vm/double-vm-wrapper.ts (1)
424-425: LGTM!The new rapid enumeration fields are correctly serialized from the parent validation config. Both types (number and
Record<string, number>) are natively serializable for cross-VM boundary passing.libs/enclave-vm/src/__tests__/perf/worker-pool.perf.spec.ts (4)
1-50: LGTM! Well-structured worker pool performance test suite.Good setup with:
- Extended timeout (60s) for worker spawn overhead
- Configurable helper function for worker enclave creation
- Explicit double VM disabled to isolate worker adapter testing
52-121: LGTM! Startup latency tests are well-designed.Cold and warm startup tests properly isolate the measured behavior with dispose/recreate patterns. The assertions (5s for cold, 3s for first execution) provide reasonable upper bounds.
123-385: LGTM! Comprehensive latency and load testing.Good patterns used:
- Controlled concurrency with
Promise.raceandinFlighttracking- Comparison against VM adapter for overhead analysis
- Queue exhaustion testing with intentionally small pool
The latency assertions are reasonable for the measured operations.
387-578: LGTM! Good coverage of scaling, memory, and stability.Thorough testing of:
- Throughput scaling with different pool sizes
- Memory footprint and leak detection
- Long-running stability with degradation checks
The use of
global.gc()when available helps ensure accurate memory measurements.libs/enclave-vm/src/__tests__/perf/double-vm.perf.spec.ts (2)
1-85: LGTM! Well-designed double VM overhead comparison.The test provides valuable comparative metrics between double VM enabled vs disabled, with proper warmup iterations and cleanup. The assertion that double VM should be "at least 50% of single" (line 82) ensures it's not somehow faster due to measurement variance, while the "less than 10x slower" bound is reasonable for the security benefit.
87-422: LGTM! Comprehensive double VM performance characterization.Good coverage of:
- Performance across all security levels
- Context creation (cold vs warm)
- Tool call proxy overhead and linear scaling verification
- Validation/pattern detection impact
- Memory overhead comparison
The tests provide valuable performance baselines for the double VM feature.
libs/enclave-vm/src/__tests__/perf/enclave.perf.spec.ts (3)
1-46: LGTM! Well-structured cold start latency tests.Good use of
it.eachfor parameterized testing across security levels. Proper create/dispose pattern for cold start measurement with metric recording for JSON output.
48-225: LGTM! Comprehensive latency and throughput testing.Good coverage of:
- Warm start latency with benchmark utility
- Security level comparison
- Throughput for various code patterns (simple, tool calls, loops)
The loop throughput test at line 192-193 correctly uses the
maxIterationsconfiguration introduced in this PR.
227-397: LGTM! Valuable overhead and memory profiling.Good testing of:
- Tool call overhead scaling with per-call overhead calculation
- Code complexity scaling to understand parse/transform costs
- Memory footprint and leak detection
These metrics provide useful baselines for performance monitoring.
libs/enclave-vm/src/__tests__/perf/utils/perf-utils.ts (2)
41-57: LGTM!The
measureAsyncandmeasureSyncimplementations are clean and correctly capture timing withperformance.now().Also applies to: 59-75
121-156: LGTM!The measurement phase correctly handles errors, captures memory when requested, and applies GC/cooldown between iterations.
libs/enclave-vm/src/__tests__/perf/utils/memory-utils.ts (3)
26-50: LGTM!The
captureMemoryandforceGCfunctions are straightforward and correct. Returning a boolean fromforceGCis useful for callers to know if GC was actually performed.
52-132: LGTM!The
trackMemoryfunction properly handles interval-based sampling with cleanup in both success and error paths. The peak tracking and delta calculations are correct.
191-204: LGTM!The
formatMemorySummaryfunction provides a clean, readable output format for memory tracking results.libs/enclave-vm/src/__tests__/perf/utils/statistics.ts (5)
75-83: Population vs. sample standard deviation.This computes population standard deviation (dividing by n). For sample statistics, dividing by (n-1) is typically preferred for unbiased estimation. If this is intentional for benchmark analysis, consider adding a brief comment to clarify.
1-6: LGTM!The interfaces are well-structured and provide comprehensive coverage of latency, throughput, and memory statistics needed for benchmark reporting.
Also applies to: 7-49
85-121: LGTM!The
calculateLatencyStatsfunction correctly handles empty input, sorts the data internally, and computes all relevant percentiles.
123-166: LGTM!The throughput and memory stats calculations are correct, with proper handling of edge cases (empty samples, missing memory data).
233-262: LGTM!The
compareReportsfunction provides a straightforward regression detection mechanism. The 10% threshold for latency increase or throughput decrease is a reasonable default for flagging regressions.libs/enclave-vm/src/double-vm/parent-vm-bootstrap.ts (1)
182-183: Rapid enumeration config fields are properly defined with defaults and handled gracefully.The
rapidEnumerationThresholdandrapidEnumerationOverridesfields are correctly defined as required fields inSerializableParentValidationConfig(types.ts lines 136-137) with sensible defaults inDEFAULT_DOUBLE_VM_CONFIG(rapidEnumerationThreshold: 30andrapidEnumerationOverrides: {}). The suspicious pattern detection code in suspicious-patterns.ts (lines 79-86) defensively handles these fields with nullish coalescing operators, ensuring graceful fallback even though the fields are required in the type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
Fix all issues with AI Agents 🤖
In @libs/ast-guard/src/rules/infinite-loop.rule.ts:
- Around line 131-133: The comment for the case 'Literal' branch is inconsistent
with the condition "return !node.value && node.value !== undefined" because it
lists `undefined` as a falsy value to detect while the code explicitly excludes
it; update the comment next to the case 'Literal' (and/or near the node.value
check) to accurately state that undefined is excluded here (since undefined is
handled elsewhere, e.g., the Identifier branch) so the comment matches the
condition's intent.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
docs/draft/docs/libraries/enclave.mdxlibs/ast-guard/src/rules/infinite-loop.rule.tslibs/enclave-vm/README.mdlibs/enclave-vm/src/__tests__/perf/utils/memory-utils.tslibs/enclave-vm/src/double-vm/suspicious-patterns.ts
🚧 Files skipped from review as they are similar to previous changes (1)
- libs/enclave-vm/src/tests/perf/utils/memory-utils.ts
🧰 Additional context used
📓 Path-based instructions (3)
libs/**
⚙️ CodeRabbit configuration file
libs/**: Contains publishable SDK libraries. Review for API correctness, breaking changes, and consistency with docs. When public APIs change, ensure there is a matching docs/draft/docs/** update (not direct edits under docs/docs/**).
Files:
libs/enclave-vm/src/double-vm/suspicious-patterns.tslibs/enclave-vm/README.mdlibs/ast-guard/src/rules/infinite-loop.rule.ts
docs/draft/docs/**
⚙️ CodeRabbit configuration file
docs/draft/docs/**: This folder holds the draft/source docs that humans are expected to edit. When authors want to add or change documentation, they should do it here. The Codex workflow uses these drafts, together with the code diff, to generate the latest docs under docs/docs/. As a reviewer: - Encourage contributors to add/update content here instead of docs/docs/. - It is fine to do structural/content feedback here (clarity, examples, etc).
Files:
docs/draft/docs/libraries/enclave.mdx
docs/**
⚙️ CodeRabbit configuration file
docs/**: Repository documentation for the SDK, using MDX and hosted by Mintlify. See more specific rules for: - docs/docs/** (latest rendered docs, automation-only) - docs/v/** (archived versions, read-only) - docs/draft/docs/** (human-editable drafts) - docs/blogs/** (blogs, human edited) - docs/docs.json (Mintlify navigation)
Files:
docs/draft/docs/libraries/enclave.mdx
🧬 Code graph analysis (1)
libs/enclave-vm/src/double-vm/suspicious-patterns.ts (2)
libs/enclave-vm/src/double-vm/types.ts (2)
SuspiciousPattern(30-44)OperationHistory(15-22)libs/enclave-vm/src/double-vm/index.ts (2)
SuspiciousPattern(16-16)OperationHistory(19-19)
🔇 Additional comments (12)
libs/ast-guard/src/rules/infinite-loop.rule.ts (6)
7-16: LGTM: Clean API design.The interface provides clear, granular control over which loop types to check, with well-documented defaults.
32-38: LGTM: Appropriate metadata and constructor.The rule metadata is well-defined, and ERROR severity with enabled-by-default is appropriate for infinite loop detection.
40-82: LGTM: Solid implementation of loop detection logic.The conditional handler registration based on options is clean, and the detection logic correctly identifies missing test conditions and always-truthy patterns for each loop type.
87-122: LGTM: Comprehensive truthiness detection.The logic correctly handles literals, unary expressions, special identifiers, and object/array expressions. The comment at lines 98-99 explains that
!!xis parsed as two nested UnaryExpressions, which properly addresses the past review concern about the!!operator check.
151-169: LGTM: Robust error reporting.The method defensively handles optional location data and provides clear, informative error messages with useful debugging context.
7-16: Add InfiniteLoopRule exports to main library index and update documentation.
InfiniteLoopRuleandInfiniteLoopOptionsare exported fromlibs/ast-guard/src/rules/index.tsbut are missing from the main library export file (libs/ast-guard/src/index.ts). All other validation rules follow the pattern of being explicitly exported from the main index. Add these exports to maintain consistency.Additionally, per the coding guidelines for
libs/**, new public APIs must have matching documentation underdocs/draft/docs/**. Theast-guard.mdxdocumentation currently does not mention theInfiniteLoopRuleor document its options. Add documentation covering this new validation rule and its configuration options.libs/enclave-vm/src/double-vm/suspicious-patterns.ts (2)
38-55: LGTM! Excellent documentation of legitimate use cases and configuration.The expanded docblock clearly explains the purpose, legitimate use cases (pagination/iteration), configuration options, and DoS considerations. This will help developers understand when and how to tune the threshold.
63-79: Original review comment is not applicable—the code is working as designed.The parent VM bootstrap (lines 178-184 of parent-vm-bootstrap.ts) properly initializes
validationConfigas a const before the detection functions execute. Since these functions are serialized and injected into the bootstrap scope at runtime,validationConfigis guaranteed to be available in their execution context. The try-catch with @ts-expect-error is a defensive pattern that correctly handles compile-time type constraints while relying on runtime scope availability—this is the intended design, not a workaround for configuration issues.docs/draft/docs/libraries/enclave.mdx (2)
87-87: LGTM! Documentation accurately reflects the new configurable threshold.The updated description correctly shows the new default threshold of 30 and indicates it's configurable.
92-108: Excellent addition! Clear configuration examples for rapid enumeration tuning.This new section provides practical examples of how to configure both the default threshold and per-operation overrides. The examples (search, users.list) demonstrate common pagination scenarios.
libs/enclave-vm/README.md (2)
102-103: LGTM! Appropriate API documentation update.The addition of
doubleVmconfiguration option to the interface documentation with a reference to the detailed section is correct.
240-346: The Double VM documentation is accurate and consistent with the implementation. All configuration types, default values, pattern IDs, and examples match the codebase. The boundary condition described (">30 times in 5 seconds") is mathematically correct and matches the code's use of the>operator (which blocks at 31+ calls).No changes needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (6)
.github/workflows/push.yml (2)
60-65: Remove redundant error handling.The command includes
|| true(line 64) which forces the exit code to 0, makingcontinue-on-error: true(line 65) redundant. Choose one approach for clarity.🔎 Suggested fix
- name: Run performance tests id: perf-tests run: | echo "Running performance benchmarks..." - npx nx run enclave-vm:test-perf --passWithNoTests || true + npx nx run enclave-vm:test-perf --passWithNoTests continue-on-error: true
67-78: Consider adding error handling for the formatting script.If the formatting script fails (e.g., due to malformed JSON), this step will fail without
continue-on-error, potentially blocking the workflow. Consider whether failures in performance summary generation should be non-blocking.🔎 Optional: Add continue-on-error if non-blocking is desired
- name: Generate performance summary if: always() run: | if [ -f "perf-results.json" ]; then echo "## 📊 Performance Test Results" >> $GITHUB_STEP_SUMMARY echo "" >> $GITHUB_STEP_SUMMARY node scripts/format-perf-summary.mjs perf-results.json >> $GITHUB_STEP_SUMMARY else echo "## ⚠️ Performance Tests" >> $GITHUB_STEP_SUMMARY echo "" >> $GITHUB_STEP_SUMMARY echo "No performance results found. Tests may have been skipped or failed to run." >> $GITHUB_STEP_SUMMARY fi + continue-on-error: truescripts/format-perf-summary.mjs (4)
23-23: Add error handling for JSON parsing.If
perf-results.jsoncontains malformed JSON, the script will throw an uncaught exception. Consider wrapping the parse operation in a try-catch block to provide a user-friendly error message.🔎 Suggested error handling
+let report; +try { -const report = JSON.parse(readFileSync(resultsPath, 'utf-8')); + report = JSON.parse(readFileSync(resultsPath, 'utf-8')); +} catch (error) { + console.log('## ⚠️ Invalid Performance Results\n'); + console.log(`Failed to parse \`${resultsFile}\`: ${error.message}`); + process.exit(0); +}
39-47: Validate JSON structure before accessing properties.The code assumes
report.summaryexists and hastotal,passed, andfailedproperties. If the JSON structure is unexpected, this will throw. Consider adding validation or defensive checks.🔎 Suggested validation
// Test summary console.log('### Test Summary\n'); -const { summary } = report; +const summary = report.summary || { total: 0, passed: 0, failed: 0 }; const passRate = summary.total > 0 ? ((summary.passed / summary.total) * 100).toFixed(1) : 0; console.log(`✅ **${summary.passed}/${summary.total}** tests passed (${passRate}%)\n`);
55-60: Add validation for metrics structure.The code iterates over
metricswithout verifying that each value is an array. If the JSON structure is unexpected (e.g.,metrics[testName]is not an array), this will throw.🔎 Suggested defensive check
const allMetrics = []; for (const [testName, metrics] of Object.entries(report.metrics)) { + if (!Array.isArray(metrics)) continue; for (const metric of metrics) { allMetrics.push({ testName, ...metric }); } }
97-98: Add fallback for missing timestamp.If
report.timestampis missing, the footer will display "undefined". Consider using optional chaining with a fallback value.🔎 Suggested fix
// Footer -console.log(`---\n*Generated at ${report.timestamp}*`); +console.log(`---\n*Generated at ${report.timestamp || new Date().toISOString()}*`);
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
.github/workflows/push.ymlscripts/format-perf-summary.mjs
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
🔇 Additional comments (1)
.github/workflows/push.yml (1)
80-87: LGTM!The artifact upload step is well-configured with appropriate error handling (
if-no-files-found: ignore) and retention settings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
Fix all issues with AI Agents 🤖
In @libs/enclave-vm/src/scoring/scorers/local-llm.scorer.ts:
- Around line 178-180: The awaiting call to await this.getEmbedding(prompt) in
local-llm.scorer.ts performs an expensive embedding inference and its result is
discarded; remove this call (since initialize() already validates the pipeline)
or, if you actually need the embedding, assign it to a variable and incorporate
it into the scoring logic (e.g., use the embedding in analyzePrompt or the
scoring function); if the intention was only to verify pipeline readiness,
replace the call with a lightweight readiness check such as using initialize()
or an explicit pipelineLoaded flag instead of running getEmbedding.
- Around line 201-204: The code calls await this.getEmbedding(prompt) but never
uses the result, wasting CPU on every scoring request; either remove that call
from the scoring path in local-llm.scorer.ts (and in scoreWithClassification) or
actually use the embedding: assign const embedding = await
this.getEmbedding(prompt) and feed it into the similarity-based scoring logic
(or a VectoriaDB lookup) instead of the current keyword heuristics;
alternatively replace the call with a lightweight readiness helper (e.g.,
ensurePipelineLoaded or a boolean check) if you only need to warm the model.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
.prettierignore (1)
15-23: Clarify the intent: broaddocs/ignore makes specific patterns redundant.Line 18 adds
docs/as a broad ignore pattern that matches the entiredocs/directory and all its contents. This makes the specific patterns in lines 21–23 (docs/live/docs.json,docs/docs.draft.json,docs/docs.backup.json) redundant—Prettier will skip all files underdocs/once the broader pattern matches.Decide on the intent:
- If all documentation should be ignored: keep line 18 and remove lines 21–23.
- If only specific files should be ignored: remove line 18.
🧹 Nitpick comments (1)
scripts/format-perf-summary.mjs (1)
95-95: Hardcoded threshold comparison operator may be inaccurate.Line 95 assumes all thresholds are upper bounds by hardcoding
<=. However, some performance metrics have lower-bound thresholds (e.g., minimum throughput, minimum operations per second) where the comparison should be>=. Consider including comparison operator information in the metric object or document that only upper-bound thresholds are supported.💡 Possible enhancement
If the upstream performance reporter can provide comparison operators in the metric object, you could render them dynamically:
- const threshold = metric.threshold !== undefined ? `<= ${metric.threshold} ${metric.unit || ''}` : '-'; + const operator = metric.operator || '<='; + const threshold = metric.threshold !== undefined ? `${operator} ${metric.threshold} ${metric.unit || ''}` : '-';
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (11)
.github/workflows/push.yml.prettierignoreeslint.config.mjslibs/enclave-vm/eslint.config.mjslibs/enclave-vm/src/adapters/vm-adapter.tslibs/enclave-vm/src/double-vm/double-vm-wrapper.tslibs/enclave-vm/src/scoring/feature-extractor.tslibs/enclave-vm/src/scoring/scorers/local-llm.scorer.tslibs/enclave-vm/src/scoring/scoring-gate.tslibs/enclave-vm/src/sidecar/reference-resolver.tsscripts/format-perf-summary.mjs
🚧 Files skipped from review as they are similar to previous changes (2)
- .github/workflows/push.yml
- libs/enclave-vm/src/double-vm/double-vm-wrapper.ts
🧰 Additional context used
📓 Path-based instructions (1)
libs/**
⚙️ CodeRabbit configuration file
libs/**: Contains publishable SDK libraries. Review for API correctness, breaking changes, and consistency with docs. When public APIs change, ensure there is a matching docs/draft/docs/** update (not direct edits under docs/docs/**).
Files:
libs/enclave-vm/src/sidecar/reference-resolver.tslibs/enclave-vm/src/scoring/feature-extractor.tslibs/enclave-vm/src/adapters/vm-adapter.tslibs/enclave-vm/eslint.config.mjslibs/enclave-vm/src/scoring/scoring-gate.tslibs/enclave-vm/src/scoring/scorers/local-llm.scorer.ts
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Performance
- GitHub Check: Test
🔇 Additional comments (7)
eslint.config.mjs (1)
14-20: LGTM! Standard best practice for intentional unused variables.The
no-unused-varsrule with underscore ignore patterns is a well-established convention that allows developers to explicitly mark intentional unused variables and arguments. This is particularly useful for runtime placeholders and callback parameters that must match a specific signature but aren't used.libs/enclave-vm/eslint.config.mjs (1)
12-15: LGTM! Appropriate exclusion for test-only dependency.Adding
@jest/reportersto the ignored dependencies list is correct, as test-only dependencies used by benchmark reporters don't need to be treated as production dependencies. The explanatory comment clearly documents the rationale.libs/enclave-vm/src/scoring/feature-extractor.ts (1)
274-274: LGTM! Type-safety improvement.The explicit type cast
as stringis safe and appropriate here, as theisStaticNameguard on line 273 already ensurestoolNameNode.valueis a string. This improves type-safety without changing runtime behavior.libs/enclave-vm/src/sidecar/reference-resolver.ts (1)
46-49: LGTM!The refactored type guard correctly uses an intermediate typed variable with bracket-notation property access. Short-circuit evaluation ensures
obj['__type']andobj['__parts']are only accessed after confirmingvalueis a non-null object. This is a clean, defensive pattern for type guards onunknownvalues with no change to public API behavior.libs/enclave-vm/src/scoring/scoring-gate.ts (2)
17-18: LGTM: Clean import restructuring following TypeScript best practices.The separation of type-only imports (line 17) from value imports (line 18) using the
import typesyntax is excellent for tree-shaking and clearly distinguishes compile-time types from runtime values. The removal of unused imports (ExtractedFeatures,DEFAULT_SCORING_CONFIG) keeps the module boundary clean.
30-40: Configuration normalization consolidates logic effectively.Delegating configuration normalization to
normalizeScoringConfig(line 34) is a good separation of concerns, centralizing the handling of both legacy and new progressive modes in the types module. The constructor remains clean and the public API is unchanged, maintaining backward compatibility for this SDK library.libs/enclave-vm/src/adapters/vm-adapter.ts (1)
411-411: Good type safety improvement.Replacing
anywithRecord<string, unknown>provides better type safety while maintaining the necessary flexibility for dynamic property assignment.Also applies to: 415-415
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Fix all issues with AI Agents 🤖
In @libs/enclave-vm/src/__tests__/perf/worker-pool.perf.spec.ts:
- Around line 85-120: The test disposes the wrong instance inside the loop: it
calls enclave.dispose() (the outer instance) on each iteration which will fail
after the first dispose; remove the repeated enclave.dispose() call from inside
the loop and either dispose the outer enclave once before the loop or eliminate
the outer createWorkerEnclave() entirely and only use the per-iteration
newEnclave created by createWorkerEnclave(); ensure you only call dispose() on
newEnclave inside the loop (and on the outer enclave exactly once if you keep
it).
🧹 Nitpick comments (2)
libs/enclave-vm/src/__tests__/perf/worker-pool.perf.spec.ts (2)
292-339: Consider tightening the p99 threshold.Line 338 uses a 50x multiplier (
stats.p99 < stats.p50 * 50) to account for CI variability. While CI environments can be noisy, this threshold is extremely permissive and may not catch significant performance regressions. For example, if p50 is 10ms, p99 could be up to 500ms and still pass.Consider either:
- Using a tighter multiplier (e.g., 10-20x) if historical data supports it
- Adding an absolute upper bound in addition to the relative check
- Documenting why such a loose threshold is necessary based on observed CI behavior
501-529: Consider documenting the --expose-gc requirement.The test checks for
global.gcavailability (line 518) before forcing garbage collection, which is good defensive coding. However, for this test to be fully effective, Node.js must be run with the--expose-gcflag.Consider adding a comment or console warning when GC is not available, or document this requirement in the test suite setup to ensure consistent CI configuration.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
libs/enclave-vm/src/__tests__/perf/worker-pool.perf.spec.ts
🧰 Additional context used
📓 Path-based instructions (1)
libs/**
⚙️ CodeRabbit configuration file
libs/**: Contains publishable SDK libraries. Review for API correctness, breaking changes, and consistency with docs. When public APIs change, ensure there is a matching docs/draft/docs/** update (not direct edits under docs/docs/**).
Files:
libs/enclave-vm/src/__tests__/perf/worker-pool.perf.spec.ts
🧬 Code graph analysis (1)
libs/enclave-vm/src/__tests__/perf/worker-pool.perf.spec.ts (7)
libs/enclave-vm/src/__tests__/perf/utils/statistics.ts (4)
calculateLatencyStats(88-121)calculateReport(171-182)formatReport(187-218)formatBytes(223-231)libs/enclave-vm/src/__tests__/perf/utils/perf-utils.ts (1)
benchmark(96-159)scripts/format-perf-summary.mjs (1)
report(23-23)libs/enclave-vm/src/__tests__/perf/utils/index.ts (1)
recordMetrics(4-4)libs/enclave-vm/src/__tests__/perf/utils/benchmark-reporter.ts (1)
recordMetrics(76-94)libs/enclave-vm/src/__tests__/perf/utils/memory-utils.ts (2)
trackMemory(55-132)formatBytes(181-189)libs/enclave-vm/src/sidecar/reference-resolver.ts (1)
resolve(167-213)
🔇 Additional comments (9)
libs/enclave-vm/src/__tests__/perf/worker-pool.perf.spec.ts (9)
1-15: LGTM! Appropriate imports and timeout configuration.The 60-second timeout is reasonable for worker thread spawn overhead, and all necessary utilities are imported.
20-50: LGTM! Well-structured helper with appropriate defaults.The configuration correctly isolates the worker adapter for testing by disabling double VM and VM-level memory limits.
54-83: LGTM! Cold startup test correctly measures pool initialization.The test properly disposes the enclave after each measurement iteration.
123-232: LGTM! Comprehensive latency benchmarks with proper cleanup.The tests appropriately compare worker pool overhead against VM adapter baseline and include tool call scenarios.
234-290: LGTM! Proper concurrent load handling with metrics recording.The test correctly maintains concurrency level using
Promise.raceand records metrics for CI reporting.
341-385: LGTM! Queue exhaustion test properly validates backpressure behavior.The test correctly forces queueing by using more concurrent requests than available workers.
388-444: LGTM! Throughput scaling test with appropriate lack of strict assertions.The test correctly avoids making strict scaling factor assertions, which is appropriate given the variability in CI environments.
447-499: LGTM! Memory tracking tests with reasonable thresholds.The tests properly use the
trackMemoryutility which forces GC for accurate baseline measurements.
532-578: LGTM! Stability test appropriately validates performance over time.The test correctly measures latency degradation across batches and uses a reasonable 50% threshold to detect stability issues.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 5
Fix all issues with AI Agents 🤖
In @libs/enclave-vm/README.md:
- Around line 376-432: The README example incorrectly nests scoring under a
`scoring` property; update the Enclave constructor example so `scorer` and
`localLlm` are top-level properties on the EnclaveConfig (not under `scoring`),
keeping `customAnalyzer: myAnalyzer` inside `localLlm`; adjust the example that
constructs `new Enclave({...})` to pass `scorer: 'local-llm'` and `localLlm: {
modelId: 'Xenova/all-MiniLM-L6-v2', customAnalyzer: myAnalyzer }` alongside
`toolHandler`, and ensure references to the CustomAnalyzer interface (analyze,
initialize, dispose) remain unchanged.
- Around line 101-104: Export PartialDoubleVmConfig from the package entry so
consumers can import it: open libs/enclave-vm/src/index.ts and add
PartialDoubleVmConfig to the re-exports from ./types (the same place other types
are exported), ensuring the public API matches CreateEnclaveOptions.doubleVm
which references PartialDoubleVmConfig; update any export list that re-exports
types from ./types to include PartialDoubleVmConfig.
In @libs/enclave-vm/src/scoring/__tests__/local-llm-scorer.spec.ts:
- Around line 601-623: The test name claims the custom analyzer is initialized
but never asserts initializeFn was called; update the test to either (A)
explicitly assert initializeFn was invoked when the model loads by mocking model
load to succeed and adding expect(initializeFn).toHaveBeenCalled() after await
scorer.initialize(), or (B) if model load is non-deterministic keep current
behavior but rename the test to something like "should be ready with custom
analyzer configured" and remove the misleading assertion about initialization;
refer to LocalLlmScorer, LocalLlmConfig, customAnalyzer, and initializeFn in
your changes.
In @libs/enclave-vm/src/scoring/scorers/local-llm.scorer.ts:
- Around line 122-126: The custom analyzer's initialize() should be called
whenever a custom analyzer is provided, even if the HuggingFace model fails to
load; update the logic in the constructor/initialization flow of
local-llm.scorer.ts so that if config.customAnalyzer is present you await
config.customAnalyzer.initialize() (wrap in try/catch so initialization errors
are logged but don't prevent fallbackToRules), and only set ready = true after
either model load or the analyzer initialize attempt completes; this ensures
score() can safely rely on the custom analyzer being initialized when
configured.
In @scripts/format-perf-summary.mjs:
- Around line 95-96: The comment incorrectly references threshold.type; update
the comment to use the correct property name thresholdType (matching how
metric.thresholdType is used in the code and on line 97) so the docstring aligns
with the actual property name and clarifies that for lower-bound thresholds set
thresholdType = 'min'.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
libs/enclave-vm/README.md (1)
82-86: Document the__maxIterationsruntime global in the README.The codebase exposes
__maxIterationsas a runtime global (in safe-runtime.ts:573 and enclave.ts:656), allowing user code to access the iteration limit. However, it is not documented in the README. Add it to the "Custom Globals" section (around line 514) or create a "Runtime Globals" section documenting both__maxIterationsand the loop enforcement wrappers (__safe_for,__safe_forOf, etc.) along with their purpose.
🧹 Nitpick comments (7)
libs/enclave-vm/src/__tests__/perf/worker-pool.perf.spec.ts (1)
334-337: Consider tightening the p99 threshold multiplier.The 50x multiplier between p50 and p99 is quite lenient and may mask performance regressions, especially in cases where the median is very low. While CI variability is a valid concern, a 10-20x multiplier with a lower absolute bound might better catch meaningful degradation.
scripts/format-perf-summary.mjs (1)
72-82: Consider adding validation for metric properties.While the code checks for array and object types, individual metric properties (name, value, unit) aren't validated. If the JSON is malformed, this could result in rendering "Unknown" or "N/A" values without clear indication that the data is invalid.
🔎 Optional: Add property validation
for (const metric of testMetrics) { - if (metric && typeof metric === 'object') { + if (metric && typeof metric === 'object' && typeof metric.name === 'string' && typeof metric.value === 'number') { allMetrics.push({ testName, ...metric }); } }This ensures only well-formed metrics are included in the report.
libs/enclave-vm/src/__tests__/perf/enclave.perf.spec.ts (2)
86-119: Consider asserting on p95 in addition to p50.The test computes both p50 and p95 for each security level but only asserts on p50 (line 118). Since the test logs both metrics, consider adding an assertion for p95 consistency as well.
🔎 Optional: Add p95 assertion
// PERMISSIVE should generally be faster than STRICT expect(results['PERMISSIVE'].p50).toBeLessThanOrEqual(results['STRICT'].p50 * 1.5); + expect(results['PERMISSIVE'].p95).toBeLessThanOrEqual(results['STRICT'].p95 * 1.5); });
195-230: Verify the low throughput threshold and consider testing iteration limit enforcement.Two observations:
Low threshold (line 229): The test expects throughput > 5 exec/sec for loops, which is significantly lower than the 100 exec/sec for simple code and 50 exec/sec for tool calls. While loops may have overhead, this threshold seems very permissive. Consider if this should be higher (e.g., 20-30 exec/sec).
Iteration limit not verified: The test uses
maxIterations: 10000(line 198), which relates to the PR's infinite loop detection feature. However, the test only runs 100 iterations (line 203), well under the limit. Consider adding a test case that verifies the iteration limit is enforced when exceeded.Based on PR objectives mentioning "iteration counter injection and infinite loop detection," ensuring the limit is tested would strengthen confidence in the feature.
libs/enclave-vm/src/__tests__/perf/double-vm.perf.spec.ts (3)
142-194: Reconsider the weak assertion for context creation overhead.The assertion at line 192 requires only that
coldStats.mean >= warmStats.mean * 0.5, allowing cold start to be faster than warm execution. The comment acknowledges "sometimes cold is faster due to CPU caching," which suggests this test may not reliably validate context creation overhead.Suggestions:
- If CPU caching effects are significant, consider increasing the sample size or using a more controlled measurement approach.
- Alternatively, document that this test is primarily observational rather than a strict validation.
- Consider asserting on a percentile (e.g., p50 or p75) which might be more stable than mean.
The current assertion is too permissive and may not catch regression in context creation performance.
265-318: Consider documenting the rationale for the 40x scaling assertion.The test asserts that 20 tool calls should take less than 40x the time of 1 call (line 317). This implies allowing up to 2x overhead beyond linear scaling (20x). While reasonable, the rationale for this specific multiplier isn't documented.
🔎 Optional: Add explanatory comment
// 20 calls should be less than 40x the time of 1 call (accounting for overhead) + // (Linear would be 20x; allowing 2x factor for per-call overhead and base cost) expect(ratio).toBeLessThan(40);
321-383: Consider adding a test for iteration limit enforcement.The validation overhead test includes a loop (lines 338-340) but doesn't verify that the iteration counter injection or infinite loop detection features work correctly. Given the PR objectives mention "iteration counter injection and infinite loop detection," consider adding a test case that verifies:
- Loops exceeding
maxIterationsare detected and terminated- The error message clearly indicates an iteration limit was exceeded
This would provide more complete coverage of the PR's core features.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (12)
.prettierignoreREADME.mdlibs/enclave-vm/README.mdlibs/enclave-vm/src/__tests__/perf/double-vm.perf.spec.tslibs/enclave-vm/src/__tests__/perf/enclave.perf.spec.tslibs/enclave-vm/src/__tests__/perf/utils/benchmark-reporter.tslibs/enclave-vm/src/__tests__/perf/worker-pool.perf.spec.tslibs/enclave-vm/src/scoring/__tests__/local-llm-scorer.spec.tslibs/enclave-vm/src/scoring/index.tslibs/enclave-vm/src/scoring/scorers/local-llm.scorer.tslibs/enclave-vm/src/scoring/types.tsscripts/format-perf-summary.mjs
🚧 Files skipped from review as they are similar to previous changes (2)
- .prettierignore
- libs/enclave-vm/src/tests/perf/utils/benchmark-reporter.ts
🧰 Additional context used
📓 Path-based instructions (1)
libs/**
⚙️ CodeRabbit configuration file
libs/**: Contains publishable SDK libraries. Review for API correctness, breaking changes, and consistency with docs. When public APIs change, ensure there is a matching docs/draft/docs/** update (not direct edits under docs/docs/**).
Files:
libs/enclave-vm/src/__tests__/perf/double-vm.perf.spec.tslibs/enclave-vm/src/scoring/types.tslibs/enclave-vm/src/scoring/index.tslibs/enclave-vm/README.mdlibs/enclave-vm/src/scoring/scorers/local-llm.scorer.tslibs/enclave-vm/src/__tests__/perf/enclave.perf.spec.tslibs/enclave-vm/src/__tests__/perf/worker-pool.perf.spec.tslibs/enclave-vm/src/scoring/__tests__/local-llm-scorer.spec.ts
🧬 Code graph analysis (3)
libs/enclave-vm/src/scoring/types.ts (2)
libs/enclave-vm/src/scoring/index.ts (3)
CustomAnalyzer(71-71)ExtractedFeatures(52-52)RiskSignal(60-60)libs/enclave-vm/src/index.ts (2)
ExtractedFeatures(70-70)RiskSignal(77-77)
libs/enclave-vm/src/__tests__/perf/enclave.perf.spec.ts (5)
libs/enclave-vm/src/__tests__/perf/utils/statistics.ts (4)
calculateLatencyStats(88-121)calculateReport(171-182)formatReport(187-218)formatBytes(223-231)libs/enclave-vm/src/__tests__/perf/utils/benchmark-reporter.ts (2)
recordMetrics(89-117)recordMetric(61-84)libs/enclave-vm/src/__tests__/perf/utils/index.ts (2)
recordMetrics(4-4)recordMetric(4-4)libs/enclave-vm/src/__tests__/perf/utils/perf-utils.ts (1)
benchmark(96-159)libs/enclave-vm/src/__tests__/perf/utils/memory-utils.ts (2)
trackMemory(55-132)formatBytes(181-189)
libs/enclave-vm/src/scoring/__tests__/local-llm-scorer.spec.ts (2)
libs/enclave-vm/src/scoring/types.ts (2)
CustomAnalyzer(379-404)LocalLlmConfig(409-478)libs/enclave-vm/src/scoring/scorers/local-llm.scorer.ts (1)
LocalLlmScorer(65-439)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Test
- GitHub Check: Performance
🔇 Additional comments (32)
libs/enclave-vm/src/__tests__/perf/worker-pool.perf.spec.ts (8)
1-15: LGTM!The imports are well-organized and the extended timeout appropriately accounts for worker thread spawn overhead.
17-50: LGTM!The helper function provides a clean abstraction for creating worker pool enclaves with sensible defaults and appropriate configuration for performance testing.
52-112: LGTM! Previous disposal issue has been resolved.The disposal logic is now correct—each
enclaveinstance is properly created within the loop scope and disposed after use. Both startup tests correctly measure their intended metrics with appropriate thresholds.
114-223: LGTM!All three execution latency tests are well-structured with proper warmup, measurement, cleanup, and reasonable thresholds. The comparison test appropriately measures both adapters under identical conditions.
340-384: LGTM!The queue exhaustion test correctly validates queueing behavior by using a small pool and high concurrency, with appropriate assertions.
387-443: LGTM!The throughput scaling test properly measures and compares performance across different pool sizes with appropriate cleanup.
445-531: LGTM!The memory tests are well-structured, properly clean up resources, and include appropriate safeguards (GC availability check, cleanup delays) for reliable measurements.
533-579: LGTM!The stability test appropriately measures performance consistency over time with a reasonable degradation threshold.
libs/enclave-vm/README.md (2)
240-269: Documentation alignment verified—no action needed.The draft documentation at
docs/draft/docs/libraries/enclave.mdxcorrectly documents bothrapidEnumerationThreshold(line 100) and the__maxIterationsruntime global (lines 158–172) as referenced in this README update. Documentation is aligned and in the proper draft location.
240-295: All documentation claims verified as accurate. TherapidEnumerationThresholddefault of 30, the 5-second detection window, all five suspicious pattern IDs (EXFIL_LIST_SEND, RAPID_ENUMERATION, CREDENTIAL_EXFIL, BULK_OPERATION, DELETE_AFTER_ACCESS), and theRecord<string, number>type forrapidEnumerationOverridesare correctly documented and match the implementation inlibs/enclave-vm/src/double-vm/.README.md (1)
93-97: Security metrics are well-supported by the actual test suite.The security test counts in the README are verified: actual test suite contains 1852 total test blocks with 1380 security-related tests, exceeding the claimed 1184+. The test file structure confirms multiple attack vector suites (attack-matrix, combined-attacks, constructor-obfuscation-attacks, runtime-attack-vectors, helper-escape), validating the 150+ blocked vectors claim. Feature additions (AI Scoring Gate, Reference Sidecar) are properly reflected in test directory structure (scoring and sidecar modules). The metrics are accurate and conservative.
libs/enclave-vm/src/scoring/types.ts (2)
359-404: LGTM! Well-designed extensibility interface.The
CustomAnalyzerinterface provides a clean extension point with appropriate lifecycle hooks. The documentation and example clearly demonstrate usage.
456-477: LGTM! Clear documentation of the custom analyzer integration.The field addition is well-documented with a practical example showing how to integrate external LLM scoring.
libs/enclave-vm/src/scoring/index.ts (1)
69-72: LGTM! Properly exposed in the public API.The
CustomAnalyzertype is correctly exported and well-categorized under extensibility types.libs/enclave-vm/src/scoring/__tests__/local-llm-scorer.spec.ts (7)
565-599: LGTM! Comprehensive test of custom analyzer integration.The test properly verifies that the custom analyzer is invoked and its results are used in the scoring output.
625-643: LGTM! Proper verification of cleanup.The test correctly verifies that the custom analyzer's
dispose()method is invoked during scorer disposal.
645-669: LGTM! Validates default heuristic behavior.The test properly verifies that built-in heuristics are used when no custom analyzer is configured.
671-698: LGTM! Validates proper data flow to custom analyzer.The test confirms that both the generated prompt and extracted features are correctly passed to the custom analyzer.
700-723: LGTM! Proper error handling validation.The test confirms that the scorer gracefully falls back to rule-based scoring when the custom analyzer fails, maintaining system resilience.
725-749: LGTM! Good edge case coverage.The test ensures that custom analyzer scores exceeding the valid range are properly clamped to 100.
751-775: LGTM! Validates risk level calculation.The test confirms that custom analyzer scores are correctly mapped to risk levels.
libs/enclave-vm/src/scoring/scorers/local-llm.scorer.ts (5)
4-11: LGTM! Improved documentation.The updated documentation clearly explains the built-in heuristics and custom analyzer options with practical examples.
Also applies to: 43-63
152-175: LGTM! Robust custom analyzer integration with fallback.The implementation properly uses the custom analyzer when configured and gracefully falls back to rule-based scoring on errors.
210-226: LGTM! Wasteful embedding computation removed.The refactored scoring methods now delegate to
analyzePrompt, which properly routes to either the custom analyzer or built-in heuristics. This addresses the previous review comments about unused embedding computations and improves performance.Also applies to: 231-253
308-324: LGTM! Clean separation of concerns.The new
analyzePromptmethod provides a clear dispatch point between custom analyzer and built-in heuristics. The renamedanalyzeWithHeuristicsmethod makes its purpose explicit.Also applies to: 326-413
432-438: LGTM! Proper resource cleanup.The dispose method correctly cleans up the custom analyzer using optional chaining.
libs/enclave-vm/src/__tests__/perf/enclave.perf.spec.ts (4)
15-46: LGTM!The cold start latency test correctly measures initialization and first execution time by creating fresh Enclave instances for each iteration. The thresholds (p50 < 100ms, p95 < 500ms) are reasonable benchmarks for cold start performance.
233-290: LGTM!The tool call overhead test effectively characterizes the per-call cost by measuring across different call counts (0, 1, 5, 10, 20) and computing the incremental overhead. The assertion (line 288) correctly validates that increased tool calls result in increased execution time.
292-340: LGTM!The code complexity scaling test appropriately measures how execution time scales with code size by varying the number of variables (1, 10, 50, 100). The assertion that the largest code completes under 200ms (line 338) provides a reasonable performance bound.
342-402: LGTM!The memory usage tests provide comprehensive coverage of memory behavior: single instance footprint, repeated execution growth, and dispose cleanup verification. The use of
trackMemoryutility and conditional GC invocation (lines 390-392) ensures accurate measurement. The thresholds (10MB, 100MB, 20MB) are reasonable bounds for detecting memory leaks or excessive usage.libs/enclave-vm/src/__tests__/perf/double-vm.perf.spec.ts (2)
16-85: LGTM - Note on permissive assertions.The test correctly measures the overhead of double VM compared to single VM. The assertions (lines 82, 84) are intentionally permissive to account for the security benefit of double VM:
- Lower bound: double VM >= 0.5x single VM (prevents unrealistic speedup)
- Upper bound: double VM p95 < 10x single VM p95 (prevents excessive overhead)
These sanity-check bounds are reasonable for a performance characterization test.
385-422: LGTM!The memory overhead test correctly compares single VM and double VM memory usage. The test properly measures peak heap usage, disposes of resources, and validates that double VM memory footprint remains under 20MB. The use of
trackMemoryutility ensures accurate measurement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Fix all issues with AI Agents 🤖
In @libs/enclave-vm/src/index.ts:
- Line 34: The export PartialDoubleVmConfig is missing explicit documentation in
the Double VM docs; update docs/draft/docs/libraries/enclave.mdx to add a
dedicated section for PartialDoubleVmConfig that names the exported type,
describes its purpose (a partial/optional config for Double VM initialization),
lists and explains each optional property, shows usage examples (when to pass a
PartialDoubleVmConfig vs the full config), and cross-references the public API
export PartialDoubleVmConfig so consumers understand what they import from the
library.
🧹 Nitpick comments (2)
scripts/format-perf-summary.mjs (1)
65-109: LGTM! Robust metrics flattening and formatting logic.The key metrics section handles the complex flattening, sorting, and formatting well:
- Type validation before array/object iteration (lines 74, 78)
- Failures prioritized in sort (line 86) for better CI visibility
- 30-metric limit (line 93) keeps output manageable
- thresholdType logic (line 97) correctly handles both min and max thresholds
- Comprehensive fallback values throughout
If metric names or values might contain markdown special characters (
|,*, etc.), consider escaping them to prevent table formatting issues. However, this is low-risk for a CI-focused script.🔎 Example escaping helper
Add before line 65:
// Escape markdown special characters in table cells const escapeMarkdown = (str) => String(str).replace(/[|*_`[\]]/g, '\\$&');Then on line 102, wrap dynamic values:
- console.log(`| ${name} | ${value} ${unit} | ${threshold} | ${status} |`); + console.log(`| ${escapeMarkdown(name)} | ${escapeMarkdown(value)} ${escapeMarkdown(unit)} | ${threshold} | ${status} |`);libs/enclave-vm/src/scoring/scorers/local-llm.scorer.ts (1)
343-372: Consider reporting all matching risk keywords, not just the first.The current implementation breaks after detecting the first critical or high-risk keyword. If a prompt contains multiple high-risk terms (e.g., both "password" and "secret"), only one signal is generated. Reporting all matches would provide richer context for security reviewers and more accurately reflect the risk profile.
🔎 Proposed enhancement to report all keywords
// Check for critical keywords + const criticalMatches: string[] = []; for (const keyword of RISK_KEYWORDS.critical) { if (promptLower.includes(keyword)) { - const score = 25; - totalScore += score; - signals.push({ - id: 'ML_CRITICAL_KEYWORD', - score, - description: `Critical security keyword detected: ${keyword}`, - level: 'critical', - context: { keyword }, - }); - break; // Only add once + criticalMatches.push(keyword); } } + if (criticalMatches.length > 0) { + const score = 25; + totalScore += score; + signals.push({ + id: 'ML_CRITICAL_KEYWORD', + score, + description: `Critical security keyword(s) detected: ${criticalMatches.join(', ')}`, + level: 'critical', + context: { keywords: criticalMatches }, + }); + } // Check for high risk keywords + const highRiskMatches: string[] = []; for (const keyword of RISK_KEYWORDS.high) { if (promptLower.includes(keyword)) { - const score = 15; - totalScore += score; - signals.push({ - id: 'ML_HIGH_RISK_KEYWORD', - score, - description: `High risk keyword detected: ${keyword}`, - level: 'high', - context: { keyword }, - }); - break; + highRiskMatches.push(keyword); } } + if (highRiskMatches.length > 0) { + const score = 15; + totalScore += score; + signals.push({ + id: 'ML_HIGH_RISK_KEYWORD', + score, + description: `High risk keyword(s) detected: ${highRiskMatches.join(', ')}`, + level: 'high', + context: { keywords: highRiskMatches }, + }); + }
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
libs/enclave-vm/README.mdlibs/enclave-vm/src/index.tslibs/enclave-vm/src/scoring/__tests__/local-llm-scorer.spec.tslibs/enclave-vm/src/scoring/scorers/local-llm.scorer.tsscripts/format-perf-summary.mjs
🚧 Files skipped from review as they are similar to previous changes (2)
- libs/enclave-vm/README.md
- libs/enclave-vm/src/scoring/tests/local-llm-scorer.spec.ts
🧰 Additional context used
📓 Path-based instructions (1)
libs/**
⚙️ CodeRabbit configuration file
libs/**: Contains publishable SDK libraries. Review for API correctness, breaking changes, and consistency with docs. When public APIs change, ensure there is a matching docs/draft/docs/** update (not direct edits under docs/docs/**).
Files:
libs/enclave-vm/src/index.tslibs/enclave-vm/src/scoring/scorers/local-llm.scorer.ts
🧬 Code graph analysis (2)
libs/enclave-vm/src/scoring/scorers/local-llm.scorer.ts (2)
libs/enclave-vm/src/scoring/scorers/rule-based.scorer.ts (1)
RuleBasedScorer(61-364)libs/enclave-vm/src/scoring/types.ts (3)
ExtractedFeatures(201-226)ScoringResult(261-286)RiskSignal(231-256)
scripts/format-perf-summary.mjs (1)
libs/enclave-vm/src/sidecar/reference-resolver.ts (1)
resolve(167-213)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Test
- GitHub Check: Performance
🔇 Additional comments (5)
scripts/format-perf-summary.mjs (2)
1-63: LGTM! Well-structured script with comprehensive error handling.The setup, file validation, JSON parsing, and initial output sections are all well-implemented:
- Appropriate exit codes (0 for missing file to avoid failing CI, 1 for actual errors)
- Type validation before accessing properties
- Fallback values throughout (e.g., 'N/A', default objects)
- Division-by-zero protection on line 58
111-135: LGTM! Clean detailed results section with good UX.The collapsible detailed results section and footer are well-implemented:
- Collapsible
<details>section improves readability for large result sets- Type validation before accessing result properties (line 120)
- Fallback values for all fields
- Timestamp fallback on line 134 ensures footer always displays
libs/enclave-vm/src/scoring/scorers/local-llm.scorer.ts (3)
138-142: Good addition—custom analyzer now initialized regardless of model load outcome.This addresses the past review comment about ensuring custom analyzer initialization is called consistently. The try-catch wrapper is appropriate since the custom analyzer may not depend on the model, and initialization failures shouldn't prevent fallback scoring from working.
158-210: LGTM—scoring flow refactored correctly with proper fallback handling.The new scoring flow properly prioritizes the custom analyzer when configured, falls back to rule-based scoring when the pipeline is unavailable, and includes error handling at each branch. The delegation to
scoreWithClassificationandscoreWithSimilarityis clean, and all error paths lead to appropriate fallback behavior.
442-442: LGTM—custom analyzer disposal handled correctly.The optional chaining ensures the custom analyzer's
dispose()method is called when present, properly cleaning up resources. This is consistent with howfallbackScorerdisposal is handled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
Fix all issues with AI Agents 🤖
In @.github/workflows/push.yml:
- Around line 135-141: The "Restore build artifacts" cache step (uses:
actions/cache/restore@v4, key: build-${{ github.sha }}) can silently miss and
cause flaky builds; update that step to include fail-on-cache-miss: true to fail
fast on a miss, or add an explicit fallback (e.g., run npm ci / restore
artifacts) immediately after the cache restore to handle cache misses
deterministically; ensure the change is applied to the step named "Restore build
artifacts" in .github/workflows/push.yml.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
.github/workflows/push.yml (1)
46-64: Build artifacts may be cached even when the build fails.The cache save step at lines 58-64 runs unconditionally after the build steps. If both the initial build (line 48) and the retry build (line 56) fail, the workflow will still cache potentially incomplete or broken artifacts, which will then be restored by the
testandperformancejobs.Consider conditioning the cache save on build success:
🔎 Proposed fix
+ - name: Track build success + id: build-status + if: steps.build.outcome == 'success' || success() + run: echo "success=true" >> $GITHUB_OUTPUT + - name: Cache build artifacts + if: steps.build-status.outputs.success == 'true' uses: actions/cache/save@v4 with: path: | node_modules libs/*/dist key: build-${{ github.sha }}Alternatively, fail the build job explicitly when both build attempts fail, so dependent jobs don't run at all.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
.github/workflows/push.yml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Performance
- GitHub Check: Test
🔇 Additional comments (3)
.github/workflows/push.yml (3)
101-114: Good addition for test failure visibility.The failure logging to
$GITHUB_STEP_SUMMARYwith context extraction and output truncation is a practical approach. The explicit failure step at lines 112-114 ensures the job properly signals failure status.
161-168: LGTM!The artifact upload configuration is well-structured with appropriate handling for missing files and reasonable retention.
147-159: The formatting script is properly implemented and handles edge cases.
scripts/format-perf-summary.mjsexists and includes comprehensive error handling: JSON parsing errors are caught with user-friendly messages, missing fields default to safe values, and invalid nested structures are skipped or validated before processing. The script exits gracefully for missing files and with error codes for invalid data, making the workflow'scontinue-on-error: trueappropriate.
…enhance scoring logic for keyword detection
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
scripts/format-perf-summary.mjs (1)
109-109: Consider using nullish coalescing for better falsy value handling.The
||operator on line 109 could incorrectly display 'N/A' for valid falsy non-numeric values like booleanfalse. While performance metrics are unlikely to be boolean values in practice, using??would be more technically correct.🔎 Suggested refinement
- const value = typeof metric.value === 'number' ? metric.value.toFixed(2) : escapeMarkdown(String(metric.value || 'N/A')); + const value = typeof metric.value === 'number' ? metric.value.toFixed(2) : escapeMarkdown(String(metric.value ?? 'N/A'));libs/enclave-vm/src/scoring/scorers/local-llm.scorer.ts (1)
33-38: Remove or implement unused keyword categories.The
mediumandlowrisk keyword categories are defined but never used inanalyzeWithHeuristics. Consider either implementing detection for these categories or removing them to avoid dead code.🔎 Option 1: Remove unused categories
const RISK_KEYWORDS = { critical: ['password', 'secret', 'apikey', 'token', 'credential', 'private_key'], high: ['exfiltration', 'send', 'webhook', 'upload', 'transfer', 'email'], - medium: ['limit:999', 'bulk', 'batch', 'all', 'wildcard'], - low: ['loop', 'iterate', 'list', 'query'], };Option 2: Implement detection (add to
analyzeWithHeuristicsafter line 384)// Check for medium risk keywords const mediumMatches: string[] = []; for (const keyword of RISK_KEYWORDS.medium) { if (promptLower.includes(keyword)) { mediumMatches.push(keyword); } } if (mediumMatches.length > 0) { const score = Math.min(20, 10 + (mediumMatches.length - 1) * 3); totalScore += score; signals.push({ id: 'ML_MEDIUM_RISK_KEYWORD', score, description: `Medium risk keyword${mediumMatches.length > 1 ? 's' : ''} detected: ${mediumMatches.join(', ')}`, level: 'medium', context: { keywords: mediumMatches }, }); }
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
docs/draft/docs/libraries/enclave.mdxlibs/enclave-vm/src/scoring/scorers/local-llm.scorer.tsscripts/format-perf-summary.mjs
🚧 Files skipped from review as they are similar to previous changes (1)
- docs/draft/docs/libraries/enclave.mdx
🧰 Additional context used
📓 Path-based instructions (1)
libs/**
⚙️ CodeRabbit configuration file
libs/**: Contains publishable SDK libraries. Review for API correctness, breaking changes, and consistency with docs. When public APIs change, ensure there is a matching docs/draft/docs/** update (not direct edits under docs/docs/**).
Files:
libs/enclave-vm/src/scoring/scorers/local-llm.scorer.ts
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Test
- GitHub Check: Performance
🔇 Additional comments (11)
scripts/format-perf-summary.mjs (6)
1-22: Well-structured script with good documentation and defensive helper function.The shebang, imports, and
escapeMarkdownhelper are implemented correctly. The nullish coalescing operator on line 20 properly handles null/undefined values, and the escaping covers the necessary markdown special characters for table cells.
24-48: Excellent error handling and validation.The file handling gracefully manages missing files (exit 0) while properly failing on parse errors (exit 1). The validation logic and error messages are clear and appropriate for a CI script.
50-73: Clean output formatting with proper defensive coding.The environment and test summary sections handle missing data gracefully with fallback values, and the pass rate calculation safely avoids division by zero. The conditional output for failed tests keeps the summary clean.
75-108: Metrics processing logic is well-implemented.The flattening, sorting, and threshold formatting logic is correct. The comment on lines 105-106 now correctly references
thresholdType(the previous issue was properly addressed), and the threshold operator logic on line 107 handles both upper-bound and lower-bound thresholds appropriately.
110-119: Good handling of metric limits and formatting.The 30-metric limit is clearly communicated with the overflow message, and the remaining formatting logic (unit, name escaping) is implemented correctly.
121-145: Detailed results and footer are well-implemented.The collapsible details section provides good UX, defensive validation prevents issues with malformed entries, and the timestamp fallback ensures the footer always has a valid date. All values are properly escaped for markdown output.
libs/enclave-vm/src/scoring/scorers/local-llm.scorer.ts (5)
4-8: Documentation improvements look good.The updated documentation clearly describes the flexible scoring architecture with built-in heuristics and custom analyzer options. The examples are helpful for users.
Also applies to: 43-63
158-181: Scoring flow logic is correct.The prioritization of custom analyzer with graceful fallback to rule-based scorer is well-structured. The control flow ensures that when a custom analyzer is configured, it's attempted first, and only falls back on errors.
Also applies to: 183-190
342-384: Enhanced keyword detection logic is well-designed.The multi-match detection with progressive scoring and caps is a good improvement. It provides more detailed signals by reporting all matching keywords while preventing score inflation through the caps.
436-445: Public API additions are well-designed.The
getConfig()andisUsingFallback()methods provide useful introspection capabilities while properly protecting internal state through readonly typing.
454-454: Custom analyzer disposal is correctly implemented.The disposal of custom analyzer resources uses proper optional chaining and follows the same pattern as the fallback scorer disposal.
Summary by CodeRabbit
New Features
Tests
Documentation
Chores
✏️ Tip: You can customize this high-level summary in your review settings.