-
Notifications
You must be signed in to change notification settings - Fork 11
fix: retry VMs init for up to 2 min #1612
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
WalkthroughVmsService lifecycle hook changed from OnModuleInit to OnApplicationBootstrap and now includes a scheduled healInitialization retry routine decorated with @timeout to reattempt hypervisor initialization and watcher setup. Corresponding test updated to call onApplicationBootstrap(). No other public signatures changed. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant App as NestJS App
participant VMS as VmsService
participant Hyp as Hypervisor/Watcher
Note over App,VMS: Application bootstrap
App->>VMS: onApplicationBootstrap()
VMS->>Hyp: attemptHypervisorInitializationAndWatch()
alt Init succeeds
Hyp-->>VMS: Ready
else Init fails
VMS-->>VMS: schedule healInitialization (Timeout 10s)
end
rect rgb(245,245,255)
Note right of VMS: healInitialization (retry up to maxRetries)
loop Retry until success or maxRetries
VMS->>Hyp: attemptHypervisorInitializationAndWatch()
alt Success
Hyp-->>VMS: Ready
VMS-->>VMS: stop retries
else Failure
VMS-->>VMS: wait delay (e.g., 10s)
end
end
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Suggested reviewers
Poem
Tip 🔌 Remote MCP (Model Context Protocol) integration is now available!Pro plan users can now connect to remote MCP servers from the Integrations page. Connect with popular remote MCPs such as Notion and Linear to add more context to your reviews and chats. 📜 Recent review detailsConfiguration used: CodeRabbit UI Review profile: CHILL Plan: Pro 💡 Knowledge Base configuration:
You can enable these sources in your CodeRabbit configuration. 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
✨ Finishing Touches
🧪 Generate unit tests
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR/Issue comments)Type Other keywords and placeholders
CodeRabbit Configuration File (
|
|
Claude finished @pujitm's task —— View job Reviewing PR Changes
✅ No critical issues found in changes The diff shows clean improvements:
No race conditions, infinite loops, security issues, or breaking changes detected in the modified lines. |
318e71e to
8c9eee6
Compare
8c9eee6 to
6664c8b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
api/src/unraid-api/graph/resolvers/vms/vms.service.ts (1)
47-56: Retry loop meets the “up to 2 min” goal; consider early-exit and guard against concurrent inits
- Behavior matches PR intent: 12 retries × 10s = 120s max.
- Minor ergonomics: break immediately after a successful attempt to avoid an unnecessary extra sleep.
- Concurrency: multiple callers (bootstrap, timeout, PID watcher) can race initializeHypervisor. If hypervisor.connectOpen isn’t re-entrant, two overlapping attempts could contend. A lightweight guard avoids this.
Apply this small loop tweak:
- let retries = 1; - while (!this.isVmsAvailable && retries <= maxRetries) { - this.logger.log(`Attempting to initialize VMs service...attempt ${retries}/${maxRetries}`); - await this.attemptHypervisorInitializationAndWatch(); - await new Promise((resolve) => setTimeout(resolve, delay)); - retries++; - } + for (let attempt = 1; !this.isVmsAvailable && attempt <= maxRetries; attempt++) { + this.logger.log(`Attempting to initialize VMs service...attempt ${attempt}/${maxRetries}`); + await this.attemptHypervisorInitializationAndWatch(); + if (this.isVmsAvailable) break; // early-exit on success + await new Promise((resolve) => setTimeout(resolve, delay)); + }And optionally add a simple in-progress guard (outside this hunk) to serialize init:
// class fields private initializing = false; private initializingPromise: Promise<void> | null = null; // wrap initializeHypervisor calls behind a simple latch private async initializeOnce(): Promise<void> { if (this.isVmsAvailable) return; if (this.initializingPromise) return this.initializingPromise; this.initializing = true; this.initializingPromise = (async () => { try { await this.initializeHypervisor(); this.isVmsAvailable = true; } finally { this.initializing = false; this.initializingPromise = null; } })(); return this.initializingPromise; } // then call await this.initializeOnce() from attemptHypervisorInitializationAndWatch and watcher handlers.api/src/unraid-api/graph/resolvers/vms/vms.service.spec.ts (1)
304-306: Avoid asserting exact error messages in promises; use bare .rejects.toThrow()Per our testing guidelines, prefer behavior over message wording to reduce brittleness.
Apply this diff:
- await expect(service.startVm('999')).rejects.toThrow('Failed to set VM state: Invalid UUID'); - await expect(service.stopVm('999')).rejects.toThrow('Failed to set VM state: Invalid UUID'); + await expect(service.startVm('999')).rejects.toThrow(); + await expect(service.stopVm('999')).rejects.toThrow();
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)
api/src/unraid-api/graph/resolvers/vms/vms.service.spec.ts(1 hunks)api/src/unraid-api/graph/resolvers/vms/vms.service.ts(3 hunks)
🧰 Additional context used
📓 Path-based instructions (6)
**/*.{ts,tsx}
📄 CodeRabbit inference engine (CLAUDE.md)
TypeScript source files must use import specifiers with .js extensions for ESM compatibility
Files:
api/src/unraid-api/graph/resolvers/vms/vms.service.spec.tsapi/src/unraid-api/graph/resolvers/vms/vms.service.ts
api/src/unraid-api/**
📄 CodeRabbit inference engine (CLAUDE.md)
Place new API source files in api/src/unraid-api/ rather than legacy locations
Prefer adding new files to the Nest repo at api/src/unraid-api/ instead of legacy code
Files:
api/src/unraid-api/graph/resolvers/vms/vms.service.spec.tsapi/src/unraid-api/graph/resolvers/vms/vms.service.ts
api/**/*.{test,spec}.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (CLAUDE.md)
Prefer not to mock simple dependencies in API tests
Files:
api/src/unraid-api/graph/resolvers/vms/vms.service.spec.ts
**/*.{test,spec}.{js,ts,jsx,tsx}
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.{test,spec}.{js,ts,jsx,tsx}: General testing: use .rejects.toThrow() without arguments when asserting thrown errors
General testing: focus on behavior, not exact error message wording
General testing: avoid brittle tests tied to non-essential details (exact messages, log formats)
General testing: use mocks as nouns (objects), not verbs (behavioral overuse)
Files:
api/src/unraid-api/graph/resolvers/vms/vms.service.spec.ts
**/*.{ts,tsx,js,jsx,vue}
📄 CodeRabbit inference engine (CLAUDE.md)
Avoid unnecessary or obvious comments; only add comments when needed for clarity
Files:
api/src/unraid-api/graph/resolvers/vms/vms.service.spec.tsapi/src/unraid-api/graph/resolvers/vms/vms.service.ts
api/**/*.{test,spec}.{js,jsx,ts,tsx}
📄 CodeRabbit inference engine (.cursor/rules/api-rules.mdc)
api/**/*.{test,spec}.{js,jsx,ts,tsx}: Use Vitest for tests in the api; do not use Jest
Prefer not to mock simple dependencies in tests
For error testing, use.rejects.toThrow()without arguments; avoid asserting exact error messages unless the message format is the subject under test
Files:
api/src/unraid-api/graph/resolvers/vms/vms.service.spec.ts
🔇 Additional comments (4)
api/src/unraid-api/graph/resolvers/vms/vms.service.ts (3)
15-15: Switching to OnApplicationBootstrap is the right lifecycle hook hereThis ensures libvirt availability checks and watcher setup happen after the app is bootstrapped. No issues with the implements change.
42-45: Bootstrap path is clean and minimalDirectly delegating to attemptHypervisorInitializationAndWatch keeps the hook lean. Nice.
1-2: Verified ScheduleModule import: no further action required
api/src/unraid-api/app/app.module.tsincludesScheduleModule.forRoot()(line 27), ensuring@Timeouthandlers will fire as expected.@nestjs/schedulev6.0.0 is declared inapi/package.json, satisfying the dependency requirement.- ESM import style remains compliant with our
.jsalias rules.All prerequisites are met; no changes needed.
api/src/unraid-api/graph/resolvers/vms/vms.service.spec.ts (1)
199-199: Tests aligned with new lifecycle hookSwitching to onApplicationBootstrap keeps integration tests consistent with service changes.
|
This plugin has been deployed to Cloudflare R2 and is available for testing. |
🤖 I have created a release *beep* *boop* --- ## [4.16.0](v4.15.1...v4.16.0) (2025-08-27) ### Features * add `parityCheckStatus` field to `array` query ([#1611](#1611)) ([c508366](c508366)) * generated UI API key management + OAuth-like API Key Flows ([#1609](#1609)) ([674323f](674323f)) ### Bug Fixes * **connect:** clear `wanport` upon disabling remote access ([#1624](#1624)) ([9df6a3f](9df6a3f)) * **connect:** valid LAN FQDN while remote access is enabled ([#1625](#1625)) ([aa58888](aa58888)) * correctly parse periods in share names from ini file ([#1629](#1629)) ([7d67a40](7d67a40)) * **rc.unraid-api:** remove profile sourcing ([#1622](#1622)) ([6947b5d](6947b5d)) * remove unused api key calls ([#1628](#1628)) ([9cd0d6a](9cd0d6a)) * retry VMs init for up to 2 min ([#1612](#1612)) ([b2e7801](b2e7801)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please). Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Resolve #1392 -- where the VMs service would be unavailable following a server restart.
onApplicationBootstrapto avoid blocking application lifecycle prematurely.