Conversation

@sammcj
Contributor

@sammcj sammcj commented Apr 30, 2025

This PR adds a new feature:

  • Hot-reload when the config file changes.

It is disabled by default and, as described in the README additions, can be enabled by running llama-swap with the --watch-config flag.

When the config file changes, it waits for any in-flight requests to complete before reloading the underlying llama server.

Also added some very minor README linting fixes while I was in there 😅

Summary by CodeRabbit

  • New Features

    • Added live configuration reload capability, allowing the server to automatically reload its configuration file without downtime when changes are detected.
    • Introduced a new command-line flag to enable or disable automatic config file watching and reloading.
  • Improvements

    • Enhanced server lifecycle management for more graceful startup, shutdown, and process handling.
    • Logging output is now less verbose and more informative, especially during normal startup, shutdown, and reload events.
    • Documentation updated with clearer instructions, improved code block formatting, and detailed descriptions of available command-line flags.
  • Bug Fixes

    • Reduced risk of port collisions in tests by assigning explicit ports and adding delays where necessary.
  • Tests

    • Updated tests to reflect changes in handler method naming and improved test diagnostics.
    • Adjusted test process commands for clearer output during test runs.

Owner

@mostlygeek mostlygeek left a comment


Overall this looks like it's in the right direction. I think the right design is for all reloading due to configuration changes to happen outside of ProxyManager. It is instantiated with New(config *Config) and I want to keep the contract that the config comes in from the outside and is never modified inside.

@sammcj sammcj force-pushed the config_hot_reload branch from f78a187 to d4f844e on May 1, 2025 03:33
@sammcj sammcj marked this pull request as draft May 1, 2025 03:40
@sammcj sammcj force-pushed the config_hot_reload branch from 2079862 to b3c236f on May 1, 2025 03:50
@sammcj sammcj marked this pull request as ready for review May 1, 2025 03:51
@sammcj
Contributor Author

sammcj commented May 1, 2025

Thanks for the review @mostlygeek,

I've updated the PR to move all config reload logic out of ProxyManager.

Key changes:

  1. ProxyManager now maintains its contract: it's instantiated with a config and never modifies it internally
  2. Config watching/reloading happens entirely in the main package
  3. When config changes:
    • Main package creates a new ProxyManager instance
    • Handles graceful shutdown of the old instance
    • Starts the new instance with updated config

This maintains better separation of concerns while preserving all functionality. The changes also made the reload process more robust by properly coordinating server lifecycle.
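Roughly, the coordination in main now looks like this (a simplified sketch, not the exact diff; watchConfig here is a stand-in for the real watcher code):

  // Sketch of the reload coordination in package main.
  // proxy.New and proxy.LoadConfig exist in the repo; the channel
  // wiring shown here is illustrative only.
  reloadChan := make(chan *proxy.ProxyManager)
  go watchConfig(configPath, reloadChan) // sends a new manager on each config change

  pm := proxy.New(config)
  for newPM := range reloadChan {
      pm.Shutdown() // gracefully stop the old instance and its upstreams
      pm = newPM    // continue with the manager built from the new config
  }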

Let me know if you'd like me to explain any part in more detail or if there are any further changes you'd like made.

@sammcj sammcj requested a review from mostlygeek May 1, 2025 04:02
@mostlygeek
Owner

mostlygeek commented May 1, 2025

It's looking good. I've just pushed a big PR #109 which changes llama-swap.go a bit. I'm planning to land that first, so I think there may be some small merge conflicts. Since ProxyManager's contract hasn't changed, that part will be fine :).

Right now ProxyManager.Run() is used to create a listener. What do you think of changing that a bit so we can replace ProxyManager without dropping the TCP connection?

Maybe something like:

  proxyManager := proxy.New(config)
  srv := &http.Server{
    Addr:    ":8080",
    Handler: proxyManager,
  }

  go func() {
    // service connections
    if err := srv.ListenAndServe(); err != nil && err != http.ErrServerClosed {
      log.Fatalf("listen: %s\n", err)
    }
  }() 

ProxyManager is an http.Handler. Perhaps it's possible to change srv.Handler gracefully? I haven't looked too much into this.
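One option (an untested sketch, not anything in the codebase) would be a thin wrapper that delegates through an atomic pointer, so the http.Server itself never changes and only the delegate does:

  // swappableHandler is hypothetical; it needs "net/http" and
  // "sync/atomic" (atomic.Pointer requires Go 1.19+).
  type swappableHandler struct {
      h atomic.Pointer[http.Handler]
  }

  func (s *swappableHandler) ServeHTTP(w http.ResponseWriter, r *http.Request) {
      (*s.h.Load()).ServeHTTP(w, r) // delegate to whatever handler is current
  }

  func (s *swappableHandler) Swap(h http.Handler) {
      s.h.Store(&h) // atomic replace; open TCP connections are unaffected
  }

Then srv.Handler is set to the wrapper once at startup, and a reload just calls Swap(newProxyManager).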

@sammcj
Contributor Author

sammcj commented May 1, 2025

Yeah that would be nice actually, good thinking.

What I'll do is update this branch first, then once you've merged your PR, I'll pull and fix conflicts etc...

I'll mark it as draft until that's done :)

@sammcj sammcj marked this pull request as draft May 1, 2025 07:15
@sammcj sammcj force-pushed the config_hot_reload branch from 2194f64 to f6e6a1f on May 1, 2025 22:50
@sammcj sammcj marked this pull request as ready for review May 1, 2025 22:50
@sammcj
Contributor Author

sammcj commented May 1, 2025

OK, that was a bit more challenging than I anticipated and pushed the limits of my Go skills, so I'm sorry if I've overcomplicated anything, but this is now working well on my server @mostlygeek

@mostlygeek
Owner

That took me a little bit to review. Most of it looks pretty good. I'm not entirely sure what was going on with the test changes, but I left comments.

BTW: if this is you, thanks for following through 😄 !

@mostlygeek
Owner

mostlygeek commented May 2, 2025

I think this would be a good change to ProxyManager:

func NewWithLoggers(
    config Config, 
    muxLogger *LogMonitor, 
    proxyLogger *LogMonitor, 
    upstreamLogger *LogMonitor,
) *ProxyManager { 
   ... 
}

This way:

  • loggers are defined at the main() level
  • loggers are shared across old and new proxy manager instances
  • the log stream history is maintained across restarts
  • the fmt.Println() in llama-swap.go is replaced by the loggers directly
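Wiring that up in main() could look something like this (sketch only; NewLogMonitorWriter is the existing constructor, the rest is illustrative):

  // Loggers created once in main() and shared across reloads, so the
  // log stream history survives a config reload.
  muxLogger := proxy.NewLogMonitorWriter(os.Stdout)
  proxyLogger := proxy.NewLogMonitorWriter(os.Stdout)
  upstreamLogger := proxy.NewLogMonitorWriter(os.Stdout)

  pm := proxy.NewWithLoggers(config, muxLogger, proxyLogger, upstreamLogger)
  // ... later, on a config reload, reuse the same loggers:
  newPM := proxy.NewWithLoggers(newConfig, muxLogger, proxyLogger, upstreamLogger)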

@sammcj
Contributor Author

sammcj commented May 2, 2025

I'm not entirely sure what was going on with the test changes, but I left comments.

Yeah they gave me a lot of grief after moving the process handling around.

I'll have a read through later today.

BTW: if this is you, thanks for following through 😄 !

Yep that's me!

mostlygeek added a commit that referenced this pull request May 3, 2025
Groups allow more control over swapping behaviour when a model is requested. The new groups feature provides three ways to control swapping: within the group, swapping out other groups, or keeping the models in the group loaded persistently (never swapped out).

Closes #96, #99 and #106.
@sammcj sammcj marked this pull request as draft May 3, 2025 05:39
@sammcj
Contributor Author

sammcj commented May 3, 2025

OK, I've rebased from main now.

I did make a couple of changes to your logging here:

-		p.proxyLogger.Warnf("<%s> cmd or cmd.Process is nil", p.ID)
+		p.proxyLogger.Debugf("<%s> cmd or cmd.Process is nil (normal during config reload)", p.ID)

With how the process handler works now, the old level would cause what looks to be an alarming message every time the config reloads, but I can revert this if you wish @mostlygeek

@sammcj sammcj marked this pull request as ready for review May 3, 2025 06:43
@sammcj
Contributor Author

sammcj commented May 3, 2025

Ah, it seems tests are working locally but not in CI; I'll have to come back to this another day, sorry.

Fixed, but no rush to review :)

@mostlygeek
Owner

Ah, it seems tests are working locally but not in CI; I'll have to come back to this another day, sorry.

Thanks. I'm too tired tonight to review anyways :)

@coderabbitai

coderabbitai bot commented May 9, 2025

Walkthrough

The changes introduce live configuration reload support, improved server lifecycle management, and enhanced logging. A new watch-config flag enables automatic config file watching and reloading using fsnotify. The HTTP server now uses a standard handler interface, and control flow for startup, shutdown, and reloads is refactored for robustness. Documentation and tests are updated accordingly.

Changes

Files / Grouped Files: Change Summary

  • README.md: Improved code block syntax highlighting, expanded binary run instructions, and added detailed command-line flag descriptions.
  • go.mod: Added the github.com/fsnotify/fsnotify dependency.
  • llama-swap.go: Refactored server startup to use http.Server, added a signal/config reload goroutine, implemented live config reload with file watching, debounce, and retry logic. Introduced the watch-config flag.
  • proxy/process.go: Changed several log messages from Info/Warning to Debug, clarified log context for process lifecycle events, and improved error handling for process termination.
  • proxy/proxymanager.go: Extracted Gin engine setup into a private method, renamed the handler to ServeHTTP to implement http.Handler, added StopProcesses, removed the Run method, and simplified Shutdown.
  • proxy/proxymanager_test.go: Updated tests to call ServeHTTP instead of HandlerFunc or direct Gin engine calls.
  • proxy/helpers_test.go: Removed the --silent flag from the test responder command.
  • proxy/process_test.go: Used fixed ports for process tests, added a delay for port release, improved error reporting, and added the strings import.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant Server
    participant ProxyManager
    participant ConfigWatcher
    participant OS

    User->>Server: Start with --watch-config
    Server->>ProxyManager: Initialize with config
    Server->>ConfigWatcher: Start watching config file
    loop On config file change
        ConfigWatcher->>Server: Notify reload channel
        Server->>ProxyManager: Gracefully stop old manager
        Server->>ProxyManager: Initialize new manager with new config
        Server->>Server: Swap in new handler
    end
    OS-->>Server: Send termination signal
    Server->>ProxyManager: Gracefully shutdown
    Server->>Server: Shutdown HTTP server

Suggested labels

enhancement



📜 Recent review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f3e73d3 and 7ff4b0e.

📒 Files selected for processing (3)
  • proxy/helpers_test.go (1 hunks)
  • proxy/process.go (3 hunks)
  • proxy/process_test.go (3 hunks)
✅ Files skipped from review due to trivial changes (1)
  • proxy/helpers_test.go
🚧 Files skipped from review as they are similar to previous changes (2)
  • proxy/process_test.go
  • proxy/process.go



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (2)
proxy/proxymanager.go (1)

209-226: Add clarification on the difference between StopProcesses and Shutdown.

The new StopProcesses method safely stops all running upstream processes concurrently, but the distinction between this method and the existing Shutdown method is not explicitly documented. From the implementation, StopProcesses calls processGroup.stopProcesses() while Shutdown calls processGroup.Shutdown(), suggesting different semantics.

Add a comment to clarify the difference:

// StopProcesses acquires a lock and stops all running upstream processes.
// This is the public method safe for concurrent calls.
+// Unlike Shutdown, this method only stops the processes but doesn't perform
+// a complete shutdown, allowing for process replacement without full termination.
func (pm *ProxyManager) StopProcesses() {
llama-swap.go (1)

148-166: Consider adding retry logic for transient config loading errors.

Currently, if loading the new configuration fails, the error is logged but no retry is attempted. For transient issues (like a partial write that's quickly corrected), this could result in a missed reload.

Consider adding a simple retry mechanism with backoff for config loading failures:

if event.Name == configPath && event.Has(fsnotify.Write) {
    // Reset or start the debounce timer
    if debounceTimer != nil {
        debounceTimer.Stop()
    }
    debounceTimer = time.AfterFunc(debounceDuration, func() {
        log.Printf("Config file modified: %s, reloading...", event.Name)

+       // Try up to 3 times with exponential backoff
+       var newConfig proxy.Config
+       var err error
+       for retries := 0; retries < 3; retries++ {
+           // Load new configuration
+           newConfig, err = proxy.LoadConfig(configPath)
+           if err == nil {
+               break
+           }
+           log.Printf("Error loading new config (attempt %d/3): %v", retries+1, err)
+           if retries < 2 {
+               time.Sleep(time.Duration(1<<retries) * time.Second)
+           }
+       }
+       if err != nil {
+           log.Printf("Failed to load new config after retries: %v", err)
+           return
+       }

-       // Load new configuration
-       newConfig, err := proxy.LoadConfig(configPath)
-       if err != nil {
-           log.Printf("Error loading new config: %v", err)
-           return
-       }

        // Create new ProxyManager with new config
        newPM := proxy.New(newConfig)
        reloadChan <- newPM
        log.Println("Config reloaded successfully")
    })
}
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5c5a5da and 1766292.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (7)
  • README.md (4 hunks)
  • go.mod (1 hunks)
  • llama-swap.go (2 hunks)
  • proxy/logMonitor_test.go (1 hunks)
  • proxy/process.go (2 hunks)
  • proxy/proxymanager.go (4 hunks)
  • proxy/proxymanager_test.go (15 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
llama-swap.go (2)
proxy/proxymanager.go (2)
  • ProxyManager (26-38)
  • New (41-88)
proxy/config.go (1)
  • LoadConfig (90-97)
🪛 LanguageTool
README.md

[uncategorized] ~252-~252: Loose punctuation mark.
Context: ...efault: config.yaml). - --listen: Address and port to listen on (default:...

(UNLIKELY_OPENING_PUNCTUATION)


[uncategorized] ~253-~253: Loose punctuation mark.
Context: ...on (default: :8080). - --version: Show version information and exit. ...

(UNLIKELY_OPENING_PUNCTUATION)


[uncategorized] ~254-~254: Loose punctuation mark.
Context: ...rmation and exit. - --watch-config: Automatically reload the configuration ...

(UNLIKELY_OPENING_PUNCTUATION)

🪛 golangci-lint (1.64.8)
proxy/logMonitor_test.go

27-27: Error return value of logMonitor.Write is not checked

(errcheck)


28-28: Error return value of logMonitor.Write is not checked

(errcheck)


29-29: Error return value of logMonitor.Write is not checked

(errcheck)

proxy/process.go

359-359: Error return value of p.cmd.Process.Kill is not checked

(errcheck)

🔇 Additional comments (38)
go.mod (1)

6-6: Good addition of fsnotify dependency for configuration hot-reload support.

The fsnotify package is an appropriate choice for implementing file system event monitoring to support the configuration hot-reload feature.

proxy/logMonitor_test.go (3)

24-24: LGTM: Waitgroup count correctly adjusted for two goroutines.

This change properly sets up the test for two separate client goroutines.


26-30: Improved test reliability by writing messages before starting goroutines.

Writing the messages before starting the collection goroutines ensures they won't be missed. Good change.

🧰 Tools
🪛 golangci-lint (1.64.8)

27-27: Error return value of logMonitor.Write is not checked

(errcheck)


28-28: Error return value of logMonitor.Write is not checked

(errcheck)


29-29: Error return value of logMonitor.Write is not checked

(errcheck)


31-32: Well-structured concurrent message processing.

The refactoring to use dedicated goroutines for each client improves test reliability and clarity, aligning with the implementation changes in the proxy manager.

Also applies to: 42-51

README.md (2)

198-217: Improved syntax highlighting with shell-specific code blocks.

Converting generic code blocks to shell-specific code blocks enhances readability by providing proper syntax highlighting.

Also applies to: 234-239, 269-287


249-254: Comprehensive documentation of CLI flags including the new hot-reload functionality.

Good addition of the --watch-config flag documentation with a clear explanation of its purpose and behavior. This properly communicates the new feature to users.

🧰 Tools
🪛 LanguageTool

[uncategorized] ~252-~252: Loose punctuation mark.
Context: ...efault: config.yaml). - --listen: Address and port to listen on (default:...

(UNLIKELY_OPENING_PUNCTUATION)


[uncategorized] ~253-~253: Loose punctuation mark.
Context: ...on (default: :8080). - --version: Show version information and exit. ...

(UNLIKELY_OPENING_PUNCTUATION)


[uncategorized] ~254-~254: Loose punctuation mark.
Context: ...rmation and exit. - --watch-config: Automatically reload the configuration ...

(UNLIKELY_OPENING_PUNCTUATION)

proxy/process.go (3)

263-266: Improved logging clarity during startup phase.

Downgrading connection refused and health check error messages to Debug level and marking them as "normal during startup" reduces log noise and prevents users from being alarmed by expected transient conditions.


348-349: Better logging during config reload process.

Changing the log level for nil command/process conditions to Debug and indicating they're normal during config reload helps prevent users from being concerned about expected shutdown events.


353-354: Refined logging during shutdown sequences.

These changes to log levels properly categorize expected shutdown behavior as debug-level events rather than warnings or errors, which makes logs cleaner and more focused on actual issues.

Also applies to: 358-359, 370-373

proxy/proxymanager_test.go (16)

37-37: Method name correctly updated to match http.Handler interface.

The rename from HandlerFunc to ServeHTTP properly aligns the test with the implementation changes in proxymanager.go where the handler now implements the standard Go http.Handler interface.


75-75: Method name correctly updated to match http.Handler interface.

Similar to the previous instance, this correctly uses the renamed method to match the implementation changes.


117-117: Method name correctly updated to match http.Handler interface.

Consistent update to use ServeHTTP instead of the previous HandlerFunc method.


160-160: Method name correctly updated to match http.Handler interface.

Another consistent update to use the standard interface method name.


203-203: Method name correctly updated to match http.Handler interface.

Consistent renaming to standard interface method name.


254-288: Good addition of test for startup failure scenario.

This test case properly verifies that the proxy manager handles process startup failures gracefully by:

  1. Returning an appropriate 502 Bad Gateway status code
  2. Including a descriptive error message in the response
  3. Tracking the process in a failed state

This test is particularly valuable for the hot-reload feature to ensure reliability when loading potentially invalid configurations.


304-304: Method name correctly updated to match http.Handler interface.

Consistent update to use standard interface method.


309-309: Method name correctly updated to match http.Handler interface.

Another consistent update to use the standard method name.


345-345: Method name correctly updated to match http.Handler interface.

Consistently updated method name.


363-363: Method name correctly updated to match http.Handler interface.

Consistent update to use ServeHTTP.


369-369: Method name correctly updated to match http.Handler interface.

Another consistent update to use the standard method name.


423-423: Method name correctly updated to match http.Handler interface.

Consistently updated to use the standard interface method.


493-493: Method name correctly updated to match http.Handler interface.

Consistently updated method name.


504-565: Good reintroduction of UseModelName test functionality.

This test properly verifies that the UseModelName configuration setting correctly overrides the requested model name when proxying requests to upstream services. The test covers both regular JSON requests (/v1/chat/completions) and multipart form data requests (/v1/audio/transcriptions), ensuring comprehensive coverage of this feature.


580-580: Method name correctly updated to match http.Handler interface.

Final update to use the standard interface method name.


601-601: Method name correctly updated to match http.Handler interface.

Consistent update to use standard interface method.

proxy/proxymanager.go (4)

40-88: Good refactoring of constructor to improve code organization.

The extraction of the Gin HTTP engine setup into a separate setupGinEngine method improves the code organization and separation of concerns. This refactoring maintains the same functionality while making the code more maintainable.


90-202: Clean implementation of private setup method.

The extracted setupGinEngine method contains all the previously inline setup code with no functional changes, keeping the same routing and middleware configuration. This is a good example of code reorganization that improves readability without changing behavior.


204-207: Correct implementation of standard http.Handler interface.

Renaming the handler method to ServeHTTP properly implements the standard Go http.Handler interface, which allows the ProxyManager to be used with standard Go HTTP server implementations. This is a good practice that makes the code more idiomatic and flexible.


228-245: Simplified Shutdown method maintains parallel operation.

The Shutdown method has been simplified to focus on stopping all processes under lock while maintaining parallel operation through goroutines and a wait group. This preserves the previous behavior where shutdown operations occur concurrently for all process groups.

llama-swap.go (9)

3-18: Good addition of required imports.

The added imports support the new functionality:

  • context for shutdown context
  • log for structured logging
  • path/filepath for path handling
  • time for timer operations
  • fsnotify for file watching

All these imports are directly used and necessary for the new features.


20-24: Good practice to declare version variables.

Using package-level variables for version information allows them to be set during build time with -ldflags, making it easier to track deployed versions.


31-31: Good addition of watch-config flag.

Adding a CLI flag for watch-config functionality follows the existing pattern for configuration options and gives users explicit control over enabling the hot-reload feature.


58-68: Well-structured server initialization with channels.

The code sets up a proper channel structure for coordinating server lifecycle events:

  • reloadChan for signaling config reloads
  • exitChan for coordinating graceful shutdown
  • sigChan for OS signals
  • Explicitly created HTTP server with the proxy manager as handler

This structure provides clean separation of concerns and proper coordination of different lifecycle events.


70-77: Appropriate server startup with error handling.

The server is started asynchronously with proper error handling that closes the exit channel on fatal errors, allowing the main goroutine to detect and respond to server failures.


79-105: Robust config reload and signal handling.

The handler goroutine properly manages both config reloads and termination signals:

  1. For config reloads, it gracefully stops old processes, performs a full shutdown, and updates the server handler
  2. For termination signals, it gracefully shuts down both the proxy manager and HTTP server with a timeout context

The approach of stopping old processes before updating the handler minimizes potential race conditions during reloads.


107-115: Proper conditional file watching setup.

The file watching is only set up if explicitly requested via the flag, and includes robust error handling for path resolution failures.


117-119: Clean main goroutine blocking.

Having the main goroutine wait on the exit channel until signaled provides a clean way to keep the program running until an explicit shutdown is triggered.


121-177: Well-implemented config file watching with debouncing.

The implementation includes several good practices:

  1. Proper error handling for watcher creation and setup
  2. Debouncing of rapid file changes to prevent unnecessary reloads
  3. Clear logging of file watching status and events
  4. Clean handling of watcher shutdown with deferred Close
  5. Proper validation of file system events before triggering reloads
  6. Error handling during config loading to prevent loading invalid configurations

The 2-second debounce duration is a reasonable default that balances responsiveness with stability.
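For reference, the debounce idiom in question reduces to roughly this (condensed sketch, not the verbatim code; reloadConfig stands in for the reload closure):

  var debounceTimer *time.Timer
  const debounceDuration = 2 * time.Second

  for event := range watcher.Events { // watcher is the fsnotify.Watcher
      if event.Name == configPath && event.Has(fsnotify.Write) {
          if debounceTimer != nil {
              debounceTimer.Stop() // collapse bursts of rapid writes
          }
          debounceTimer = time.AfterFunc(debounceDuration, reloadConfig)
      }
  }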

@sammcj sammcj force-pushed the config_hot_reload branch from 5233585 to cfa784f on May 9, 2025 03:21
@sammcj sammcj force-pushed the config_hot_reload branch from b769080 to e5af39a on May 9, 2025 03:41

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (2)
llama-swap.go (2)

121-189: Well-implemented file watching with debouncing and retry logic.

The watchConfigFileWithReload function effectively monitors the config file and handles reloads with several good practices:

  1. Debouncing to prevent multiple reloads during rapid changes
  2. Retry logic with exponential backoff for loading the new config
  3. Proper error handling throughout the process

Consider standardizing logging approach throughout the codebase.

There's a mix of fmt.Printf and log.Printf/Println being used for logging throughout the file. Consider standardizing on one approach for consistency, preferably using structured logging with appropriate log levels.


98-100: Server shutdown timeout could be configurable.

The 5-second timeout for server shutdown is reasonable, but consider making it configurable for environments with different requirements.

- ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
+ const serverShutdownTimeout = 5 * time.Second  // Move to top-level const or make configurable
+ ctx, cancel := context.WithTimeout(context.Background(), serverShutdownTimeout)
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between bb5ad5b and e5af39a.

📒 Files selected for processing (7)
  • llama-swap.go (2 hunks)
  • proxy/helpers_test.go (1 hunks)
  • proxy/logMonitor_test.go (1 hunks)
  • proxy/process.go (2 hunks)
  • proxy/process_test.go (4 hunks)
  • proxy/proxymanager.go (3 hunks)
  • proxy/proxymanager_test.go (15 hunks)
✅ Files skipped from review due to trivial changes (2)
  • proxy/helpers_test.go
  • proxy/proxymanager_test.go
🚧 Files skipped from review as they are similar to previous changes (3)
  • proxy/process.go
  • proxy/logMonitor_test.go
  • proxy/proxymanager.go
🔇 Additional comments (13)
proxy/process_test.go (6)

21-25: Improved test configuration comment clarity.

The comments have been updated to make it clearer that tests are running with error-level logging rather than debug level. This change improves code readability without altering behavior.


31-33: Good use of specific ports for deterministic testing.

Using a fixed port (12901) for the test instance rather than dynamically assigned ports makes the test more deterministic and repeatable. This is a good practice for improving test reliability.


35-35: Updated constructor call to include logger parameters.

The constructor now includes debug loggers as parameters, which aligns with the updated NewProcess function signature. This change helps with proper logging during config hot-reloading.


52-58: Added sleep delay and port reassignment to prevent race conditions.

This change addresses a potential race condition by:

  1. Adding a sleep delay after stopping the process to ensure the port is released
  2. Using a different specific port (12902) for the second test instance
  3. Explicitly updating the process config to use the new port

These changes improve test reliability by preventing port conflicts that could occur during rapid restart scenarios, which is essential for testing hot-reload functionality.


67-68: Enhanced error message with contextual information.

The error message now includes the request URL path and the proxy port, which makes test failures much easier to diagnose. This is a good practice for improving test maintainability.


76-78: Reduced port collision likelihood with explicit port assignment.

Using a specific high port (12903) for this test reduces the chance of port conflicts with other tests or processes, improving test stability. The clear comment explains the reasoning behind this change.

llama-swap.go (7)

3-18: Good import organization and new dependency addition.

The new imports are appropriately organized and necessary for the hot-reload functionality. The fsnotify package is a good choice for file system event notifications.


20-24: Version variables properly defined.

Good practice to define version information variables. These variables are typically set during the build process with -ldflags.


31-31: Clear flag definition for the new feature.

The --watch-config flag is well-documented and properly defaulted to false, making this feature opt-in.


58-77: Well-structured server setup with proper error handling.

The code creates an HTTP server with the proxy manager as its handler and launches it asynchronously. The error handling appropriately closes the exit channel on fatal errors.


79-105: Robust reload and signal handling implementation.

This goroutine effectively manages both config reloads and termination signals. The graceful shutdown sequence for the proxy manager is well-implemented, ensuring in-flight requests complete before swapping handlers.


107-115: Conditional file watching with proper error handling.

Good implementation of the watch-config option with proper error handling for file path resolution.


167-167: Well-implemented exponential backoff.

Good use of exponential backoff for retries when loading the new configuration.

@sammcj
Contributor Author

sammcj commented May 9, 2025

I think I've addressed the remaining issues, but if my lack of Go skills is causing you more grief than it's worth - I totally understand if you'd rather reject this and do it your own way.

By the way - that CodeRabbit bot is good!

@sammcj sammcj requested a review from mostlygeek May 9, 2025 03:46

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (3)
proxy/process.go (3)

122-126: Consider also allowing Failed → Starting to remove an extra state hop

Allowing Failed → Stopped definitely un-bricks a process, but the subsequent request has to perform a second transition (Stopped → Starting) before the restart actually happens.
If you deem a “retry immediately” path desirable, you could accept Failed → Starting as well. It keeps the state-machine simpler (one transition instead of two) and saves one CAS attempt.

 case StateFailed:
-        return to == StateStopped // Allow resetting a failed process
+        return to == StateStopped || to == StateStarting

263-268: Relying on substring matching for “connection refused” is brittle

strings.Contains(err.Error(), "connection refused") will miss platform-specific phrasings (ECONNREFUSED, translated messages, wrapped errors, etc.).
Using errors.Is (Go 1.13+) with syscall.ECONNREFUSED or checking for *net.OpError provides a future-proof, locale-independent check.

- if strings.Contains(err.Error(), "connection refused") {
+ var opErr *net.OpError
+ if errors.As(err, &opErr) && errors.Is(opErr.Err, syscall.ECONNREFUSED) {

349-363: Differentiate “already terminated” from real termination errors

terminateProcess() may return errors unrelated to a process that has already exited (e.g. permission issues). Treating every error as “already terminated” hides actionable failures.

-if err := p.terminateProcess(); err != nil {
-    p.proxyLogger.Debugf("<%s> Process already terminated: %v (normal during shutdown)", p.ID, err)
+if err := p.terminateProcess(); err != nil {
+    if errors.Is(err, os.ErrProcessDone) {
+        p.proxyLogger.Debugf("<%s> Process already terminated (normal during shutdown)", p.ID)
+    } else {
+        p.proxyLogger.Warnf("<%s> Failed to terminate process: %v", p.ID, err)
+    }
 }
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e5af39a and f3e73d3.

📒 Files selected for processing (2)
  • proxy/process.go (4 hunks)
  • proxy/process_test.go (6 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • proxy/process_test.go
🔇 Additional comments (1)
proxy/process.go (1)

418-431: Recover-on-request path looks good – just verify p.cmd lifecycle

Resetting Failed → Stopped on the first incoming request is a neat way to self-heal.
One corner case: if p.cmd still references a defunct process, start() will overwrite it, but the old Wait() goroutine is still sending on cmdWaitChan. This should be safe (buffered channel), yet double-check for:

  1. Potential goroutine leaks (old Wait() keeps running forever).
  2. A second send on cmdWaitChan after a new command starts, which would deadlock because the buffer size is 1.

If either scenario is possible, you might want to nil out p.cmdWaitChan (or create a fresh channel) when you reset from Failed.

Owner

@mostlygeek mostlygeek left a comment


Hey @sammcj we're almost there.

The main changes are reverting things that are unrelated, or whose rationale is unclear relative to the original intent of the PR. The functional changes to process.go/process_test.go are good examples.

I am getting a bit of review fatigue but the finish line looks close. :)

proxy/process.go Outdated
// prevent new requests from being made while stopping or irrecoverable
currentState := p.CurrentState()
if currentState == StateFailed || currentState == StateShutdown || currentState == StateStopping {

Owner


There is a chance here that the upstream OS process is still running/resident in memory. Simply resetting the state to StateStopped so that p.start() runs again is likely not enough. I think we'd need something that additionally makes sure it is no longer there with a SIGTERM, and if that doesn't work, a SIGKILL.
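Something along these lines, say (sketch only, for that future PR; the interaction with the existing Wait() goroutine would need care):

  // Hypothetical escalating termination: SIGTERM first, SIGKILL if needed.
  // Needs "os/exec", "syscall", and "time".
  func ensureStopped(cmd *exec.Cmd) error {
      if cmd == nil || cmd.Process == nil {
          return nil // nothing left to stop
      }
      _ = cmd.Process.Signal(syscall.SIGTERM) // ask politely first

      done := make(chan error, 1)
      go func() { done <- cmd.Wait() }()

      select {
      case err := <-done:
          return err // exited on its own
      case <-time.After(5 * time.Second):
          _ = cmd.Process.Kill() // SIGKILL as a last resort
          return <-done
      }
  }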

I have been considering a more robust attempted-recovery behaviour for failed processes. That enhancement should be in a different PR, not this one.

Can you revert this behaviour change? That also keeps this PR more focused on its original intent.

Contributor Author


Yeah I hit that issue when I was testing my PRs changes (hence my attempt to fix it), I've reverted this now to leave things as they were.

var (
    debugLogger = NewLogMonitorWriter(os.Stdout)
)

Owner


I don't understand these changes to the tests. I would suggest reverting them along with the behaviour changes in process.go.

Contributor Author


  • Reverted process_test.go (I can no longer make the tests pass locally without the change, so am relying on GitHub Actions feedback now).
  • The only changes left in process.go reduce the info-level logging noise when handling config hot-reloads, where the processes are expected to be interrupted and stopped.

I couldn't get the process tests to pass locally without making changes; I found that TestProcess_AutomaticallyStartsUpstream had issues with port allocation.

=== RUN   TestProcess_AutomaticallyStartsUpstream
    process_test.go:43:
        	Error Trace:	/Users/samm/git/llama-swap/proxy/process_test.go:43
        	Error:      	Not equal:
        	            	expected: "ready"
        	            	actual  : "failed"

        	            	Diff:
        	            	--- Expected
        	            	+++ Actual
        	            	@@ -1,2 +1,2 @@
        	            	-(proxy.ProcessState) (len=5) "ready"
        	            	+(proxy.ProcessState) (len=6) "failed"

        	Test:       	TestProcess_AutomaticallyStartsUpstream
    process_test.go:45:
        	Error Trace:	/Users/samm/git/llama-swap/proxy/process_test.go:45
        	Error:      	Not equal:
        	            	expected: 200
        	            	actual  : 502
        	Test:       	TestProcess_AutomaticallyStartsUpstream
        	Messages:   	Expected status code 200, got 502
    process_test.go:46:
        	Error Trace:	/Users/samm/git/llama-swap/proxy/process_test.go:46
        	Error:      	"unable to start process: upstream command exited unexpectedly: exit status 1\n" does not contain "testing91931"
        	Test:       	TestProcess_AutomaticallyStartsUpstream
    process_test.go:59: Expected status code 200, got 503
--- FAIL: TestProcess_AutomaticallyStartsUpstream (0.25s)

As an aside - if it helps to explain my reasoning - I passed the PR diff with and without the process_test changes to an LLM to challenge my assumptions and got it to explain why the tests were failing without the changes:

Q: "Do the changes in my PR require this test to change?"

A: "No, your PR's core features (config reloading) do not fundamentally break the Process type's existing contract in a way that necessitates a change to the test's logic to make it pass if the test were perfectly stable.

However, to achieve stable and reliable tests in the presence of your PR (and generally), TestProcess_AutomaticallyStartsUpstream (and other similar tests like TestProcess_WaitOnMultipleStarts) do benefit significantly from more robust port management (i.e., using distinct, hardcoded or carefully allocated ports for each simple-responder instance within the test). This makes the test less dependent on global state and timing.

You might need to explain to the reviewer that the reverted version of TestProcess_AutomaticallyStartsUpstream is inherently flaky due to its port allocation strategy, and the changes you had previously made (using getTestSimpleResponderConfigPort with distinct ports) were to ensure test reliability, not to work around a bug introduced by the config reloading feature. The PR might be making this flakiness more apparent, but the fix lies in making the test more robust."

Owner

@mostlygeek mostlygeek left a comment


Looks good!

@mostlygeek mostlygeek merged commit bc65270 into mostlygeek:main May 12, 2025
2 checks passed
@mostlygeek
Owner

Thanks @sammcj! 🥳
