- 
                Notifications
    You must be signed in to change notification settings 
- Fork 116
Add barebones but working implementation of model preload (#209) #235
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| WalkthroughAdds startup preload hooks to config and ProxyManager.New, introduces a DiscardWriter for dummy requests, and defines a ModelPreloadedEvent. Updates tests to cover new hooks and configuration parsing. Also ignores .dev/ in .gitignore. Changes
 Sequence Diagram(s)sequenceDiagram
    participant App
    participant ProxyManager
    participant Preloader as Goroutine
    participant ProcGroup as ProcessGroup
    participant Proxy as ProxyRequest
    participant Sink as DiscardWriter
    participant Bus as EventBus
    App->>ProxyManager: New(config)
    ProxyManager-->>App: return (preload runs async)
    ProxyManager->>Preloader: start if hooks.on_startup.preload not empty
    loop for each model
        Preloader->>ProxyManager: swapProcessGroup(model)
        alt group found
            Preloader->>Proxy: ProxyRequest GET "/"
            Proxy->>Sink: write/discard response
            Preloader->>Bus: emit ModelPreloadedEvent{model, true}
        else error/not found
            Preloader->>Bus: emit ModelPreloadedEvent{model, false}
        end
    end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
 Suggested labels
 Tip 🔌 Remote MCP (Model Context Protocol) integration is now available!Pro plan users can now connect to remote MCP servers from the Integrations page. Connect with popular remote MCPs such as Notion and Linear to add more context to your reviews and chats. ✨ Finishing Touches
 🧪 Generate unit tests
 Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit: 
 SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR/Issue comments)Type  Other keywords and placeholders
 Status, Documentation and Community
 | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
🧹 Nitpick comments (1)
proxy/config.go (1)
141-146: Consider making OnStartup a named type for future extensibilityInline struct is fine now, but if hooks grow, a named type (e.g., type OnStartupHooks struct { Preload []string ... }) can improve clarity and reuse across config, tests, and docs.
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- proxy/config.go(3 hunks)
- proxy/config_test.go(1 hunks)
- proxy/discardWriter.go(1 hunks)
- proxy/proxymanager.go(1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
proxy/proxymanager.go (2)
proxy/discardWriter.go (1)
DiscardWriter(6-6)proxy/proxymanager_test.go (6)
TestProxyManager_SwapMultiProcess(45-86)
TestProxyManager_Unload(381-406)
TestProxyManager_SwapMultiProcessParallelRequests(130-186)
TestProxyManager_SwapProcessCorrectly(21-43)
TestProxyManager_PersistentGroupsAreNotSwapped(90-126)
TestProxyManager_RunningEndpoint(409-473)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: run-tests
🔇 Additional comments (1)
proxy/config.go (1)
165-167: Config.Hooks field addition LGTMField wiring and YAML tags are consistent with the new HooksConfig type.
| @coderabbitai review | 
| ✅ Actions performedReview triggered. 
 | 
| @coderrabbitai review please | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (1)
proxy/discardWriter.go (1)
5-28: ResponseWriter contract fixed; Flush support addedHeader is now stable, Write returns len(p), status is tracked, and Flush is provided. This resolves prior issues flagged in earlier reviews about breaking http semantics and panic risks in flusher paths.
🧹 Nitpick comments (2)
proxy/helpers_test.go (1)
72-78: Use the already-initialized simpleResponderPath in YAMLSwitching to the package-level simpleResponderPath makes the YAML generation stable across platforms. One nit: TestMain (Line 24) still recomputes the path via getSimpleResponderPath(). Consider reusing simpleResponderPath there to avoid drift between the path you validate and the one you embed in configs.
proxy/events.go (1)
53-60: Consider documenting Success semanticsThe ModelPreloadedEvent type is clear. Add a brief comment clarifying what “Success” means (e.g., process swap ok vs. dummy request ok) to prevent ambiguous interpretations by subscribers.
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
- .gitignore(1 hunks)
- proxy/config.go(3 hunks)
- proxy/config_posix_test.go(2 hunks)
- proxy/discardWriter.go(1 hunks)
- proxy/events.go(2 hunks)
- proxy/helpers_test.go(2 hunks)
- proxy/proxymanager.go(2 hunks)
- proxy/proxymanager_test.go(2 hunks)
✅ Files skipped from review due to trivial changes (1)
- .gitignore
🚧 Files skipped from review as they are similar to previous changes (2)
- proxy/proxymanager.go
- proxy/config.go
🧰 Additional context used
🧬 Code Graph Analysis (2)
proxy/proxymanager_test.go (5)
proxy/config.go (1)
LoadConfigFromReader(198-361)proxy/proxymanager.go (1)
New(47-130)proxy/process.go (2)
StopWaitForInflightRequest(36-36)
StateReady(25-25)event/default.go (1)
On(16-18)proxy/events.go (1)
ModelPreloadedEvent(53-56)
proxy/helpers_test.go (1)
proxy/logMonitor.go (1)
NewLogMonitorWriter(41-49)
🔇 Additional comments (3)
proxy/helpers_test.go (1)
16-19: LGTM on centralizing test helpersMoving nextTestPort, testLogger, and simpleResponderPath to package-level improves reuse across tests. No functional issues spotted.
proxy/events.go (1)
10-10: Event ID allocation looks consistentModelPreloadedEventID = 0x06 continues the existing sequence without collisions. Good addition.
proxy/config_posix_test.go (1)
103-106: Hooks parsing and expected struct assertion are solidThe YAML hooks.on_startup.preload is exercised and the expected Config.Hooks.OnStartup.Preload assertion reflects canonicalized values. This provides good coverage for the new config surface.
Also applies to: 169-173
- add hooks.on_startup.preload (array) - implement in ProxyManager to load models - todo: tests, docs
e552095    to
    dbfe92a      
    Compare
  
    There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (1)
proxy/proxymanager.go (1)
101-127: Preload event may report success on proxy failure; add cancellation, endpoint targeting, and error handlingCurrent flow emits Success: true even if ProxyRequest fails, doesn’t cancel on shutdown, and always probes “/”. Tighten this:
- Emit Success: false if ProxyRequest fails.
- Use pm.shutdownCtx so preload cancels cleanly.
- Prefer each model’s CheckEndpoint (fallback “/”).
- Handle http.NewRequest error and log proxy errors.
- Copy the preload slice defensively in the goroutine.
// run any startup hooks if len(config.Hooks.OnStartup.Preload) > 0 { - // do it in the background, don't block startup -- not sure if good idea yet + // do it in the background, don't block startup go func() { - discardWriter := &DiscardWriter{} - for _, realModelName := range config.Hooks.OnStartup.Preload { + // defensive copy in case config is ever mutated later + preload := append([]string(nil), config.Hooks.OnStartup.Preload...) + for _, realModelName := range preload { + select { + case <-pm.shutdownCtx.Done(): + pm.proxyLogger.Infof("Preload cancelled due to shutdown") + return + default: + } proxyLogger.Infof("Preloading model: %s", realModelName) processGroup, _, err := pm.swapProcessGroup(realModelName) if err != nil { - event.Emit(ModelPreloadedEvent{ - ModelName: realModelName, - Success: false, - }) + event.Emit(ModelPreloadedEvent{ModelName: realModelName, Success: false}) proxyLogger.Errorf("Failed to preload model %s: %v", realModelName, err) continue - } else { - req, _ := http.NewRequest("GET", "/", nil) - processGroup.ProxyRequest(realModelName, discardWriter, req) - event.Emit(ModelPreloadedEvent{ - ModelName: realModelName, - Success: true, - }) } + // Prefer model-specific check endpoint if configured + endpoint := strings.TrimSpace(pm.config.Models[realModelName].CheckEndpoint) + if endpoint == "" { + endpoint = "/" + } + req, reqErr := http.NewRequestWithContext(pm.shutdownCtx, "GET", endpoint, nil) + if reqErr != nil { + event.Emit(ModelPreloadedEvent{ModelName: realModelName, Success: false}) + proxyLogger.Errorf("Failed to build preload request for %s: %v", realModelName, reqErr) + continue + } + // Use a fresh DiscardWriter per request + if err := processGroup.ProxyRequest(realModelName, &DiscardWriter{}, req); err != nil { + event.Emit(ModelPreloadedEvent{ModelName: realModelName, Success: false}) + proxyLogger.Errorf("Preload request failed for %s: %v", realModelName, err) + continue + } + event.Emit(ModelPreloadedEvent{ModelName: realModelName, Success: true}) } }() }
🧹 Nitpick comments (2)
config.example.yaml (1)
223-231: Fix typo and clarify alias support in startup hooks
- Typo: “dictionar” → “dictionary”.
- Minor grammar: “models sections” → “models section”.
- Add a note that aliases are supported and resolved to real model IDs (this matches the config cleanup logic via RealModelName).
Apply this diff:
- # - optional, default: empty dictionar + # - optional, default: empty dictionary @@ - # - model names must match keys in the models sections + # - model names must match keys in the models section + # - aliases are supported; entries are resolved to real model IDsproxy/proxymanager_test.go (1)
868-885: Count only successful preloads and assert both model IDs observedThe test currently waits for any two events, including failures. Tighten to count successes and track model names to avoid false positives.
- preloadChan := make(chan ModelPreloadedEvent, 2) // buffer for 2 expected events - - unsub := event.On(func(e ModelPreloadedEvent) { - preloadChan <- e - }) + successCh := make(chan string, 2) // buffer for 2 expected successes + unsub := event.On(func(e ModelPreloadedEvent) { + if e.Success { + successCh <- e.ModelName + } + }) @@ - for i := 0; i < 2; i++ { - select { - case <-preloadChan: - case <-time.After(5 * time.Second): - t.Fatal("timed out waiting for models to preload") - } - } + want := map[string]bool{"model1": false, "model2": false} + timeout := time.After(10 * time.Second) + for seen := 0; seen < 2; { + select { + case name := <-successCh: + if _, ok := want[name]; ok && !want[name] { + want[name] = true + seen++ + } + case <-timeout: + t.Fatalf("timed out waiting for successful preloads; seen=%v", want) + } + }
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these settings in your CodeRabbit configuration.
📒 Files selected for processing (10)
- .gitignore(1 hunks)
- README.md(5 hunks)
- config.example.yaml(2 hunks)
- proxy/config.go(3 hunks)
- proxy/config_posix_test.go(2 hunks)
- proxy/discardWriter.go(1 hunks)
- proxy/events.go(2 hunks)
- proxy/helpers_test.go(2 hunks)
- proxy/proxymanager.go(2 hunks)
- proxy/proxymanager_test.go(2 hunks)
🚧 Files skipped from review as they are similar to previous changes (6)
- .gitignore
- proxy/helpers_test.go
- proxy/events.go
- proxy/config.go
- proxy/config_posix_test.go
- proxy/discardWriter.go
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-05-05T23:54:14.205Z
Learnt from: mostlygeek
PR: mostlygeek/llama-swap#117
File: proxy/config.go:0-0
Timestamp: 2025-05-05T23:54:14.205Z
Learning: In the llama-swap project, automatic port assignment does not include port availability checks. It's the user's responsibility to provide an appropriate startPort that doesn't conflict with other services.
Applied to files:
- README.md
🧬 Code Graph Analysis (2)
proxy/proxymanager_test.go (5)
proxy/config.go (1)
LoadConfigFromReader(198-361)proxy/events.go (1)
ModelPreloadedEvent(53-56)event/default.go (1)
On(16-18)proxy/proxymanager.go (1)
New(47-130)proxy/process.go (1)
StateReady(25-25)
proxy/proxymanager.go (4)
proxy/discardWriter.go (1)
DiscardWriter(6-9)event/default.go (1)
Emit(28-30)proxy/events.go (1)
ModelPreloadedEvent(53-56)llama-swap.go (1)
main(27-178)
🔇 Additional comments (4)
README.md (3)
34-36: Docs: Nice addition highlighting container lifecycle and startup preloadThe revised bullets add clarity around container lifecycle (cmdStart/cmdStop) and the new startup preload feature. Good visibility for users.
46-49: Docs: Clearer setup introExplicitly calling out the minimal config path improves onboarding. LGTM.
141-149: Homebrew section reads wellThe flow (tap → install → run) is clear. No issues spotted.
proxy/proxymanager.go (1)
18-21: Event emission import is correct and scoped appropriatelyUsing the event bus here aligns with the existing event system and keeps the preload workflow observable in tests and ops tooling.
Summary by CodeRabbit