- 
                Notifications
    You must be signed in to change notification settings 
- Fork 547
Enable graceful shutdown of servers #122
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enable graceful shutdown of servers #122
Conversation
f67a926    to
    4ae26c0      
    Compare
  
    | I know the tests have been flaky, but three timeouts seem worse than average. I downloaded testresults-windows-latest-Debug.zip since I saw that SseClientTransportTests.ConnectAsync_Should_Connect_Successfully hung in that run which seems to be the most common culprit even in other PRs, but I couldn't find any rooted instances of SseClientTransport which I found odd. I wonder if we're awaiting a Task that gets GC'd without ever completing somewhere. Do you have any idea what might be going on? While looking at that, I also saw a bunch of unrooted SseClientTransports that appear to be from passed SseServerIntegrationTests that had faulted _receiveTasks due to trying to write to the ITestOutputHelper after the test completed but were still in the IsConnected state. At this point I realized that the IClientTransport created by McpClientFactory.Create() never gets disposed unless the client fails to connect. There's no good way for the caller to dispose the transport. The caller could provide their own createTransportFunc and storing the return value, but they shouldn't have to. Of course, it shouldn't be the responsibility of McpClient to dispose one of its constructor parameters. Maybe it'd be more acceptable given a  I think what is here is good so long as the increased test flakiness is an aberration. | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approving mostly so I can continue my work without too much more rebasing.
| /// Runs the server, listening for and handling client requests. | ||
| /// </summary> | ||
| Task StartAsync(CancellationToken cancellationToken = default); | ||
| Task RunAsync(Func<CancellationToken, Task>? onInitialized = null, CancellationToken cancellationToken = default); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not a fan of the onInitalized callback. I think the far more ergonomic pattern is the one used by generic host where you can do something like this:
await server.StartAsync();
// Code to run after initialization
await server.WaitForShutdownAsync();RunAsync in then built on top of that. It might be different if there was something that needed to run after onInitialized, but that doesn't appear to be the case. I can change this in a follow up though.
| // Wait for transport completion. | ||
| if (serverTransport is not null) | ||
| { | ||
| await serverTransport.Completion; | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should RunAsync wait for the MessageProcessingTask to complete too? It's good to know that when RunAsync exits gracefully you can dispose your logging provider and not truncate.
At least CleanupAsync still awaits the MessageProcessingTask, but RunAsync doesn't call that unless there's an exception. If it's waiting for transport completion, it should probably now always call CleanupAsync in a finally instead.
I plan to remove IServerTransport.Completion in my follow up, but we should be able to get the same effect by separating the listener and connection/session interfaces and disposing both at the appropriate times. That is during the disposal of the IMcpServer and after the completion of the MessageProcessingTask respectively.
| response.Headers.CacheControl = "no-cache"; | ||
|  | ||
| await using var localTransport = transport = new SseResponseStreamTransport(response.Body); | ||
| await using var server = McpServerFactory.Create(transport, mcpServerOptions.Value, loggerFactory, endpoints.ServiceProvider); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't we need to call IMcpServer.RunAsync() somewhere for this sample to continue to work? You can test it out using npx @modelcontextprotocol/inspector
| 
 Do you want to just take this over, incorporating it into your change and changing it to your liking? | 
| return Task.CompletedTask; | ||
| await _shutdownTokenSource.CancelAsync().ConfigureAwait(false); | ||
| _listener.Stop(); | ||
| await _listeningTask.ConfigureAwait(false); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this also await Completed?
| 
 Sure. | 
| ServerInstructions = initializeResponse.Instructions; | ||
|  | ||
| // Validate protocol version | ||
| if (initializeResponse.ProtocolVersion != _options.ProtocolVersion) | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not for this PR - but we need to figure out what "supports" means exactly in https://spec.modelcontextprotocol.io/specification/2025-03-26/basic/lifecycle/
We might risk disconnecting from servers we are compatible with.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you ensure we have an open issue for that? Thanks.
| var logLevelIndex = random.Next(loggingLevels.Count); | ||
| var logLevel = loggingLevels[logLevelIndex]; | ||
| await server.SendMessageAsync(new JsonRpcNotification() | ||
| // Send random log messages every few seconds | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there is value in replicating the everything servers behavior in the TestServer (in this case the 15 second interval). Given it's the official reference server, it allows us to reuse test code. I've found and reported several bugs in the reference server because of this approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
15 seconds is way too long for tests, though. It makes these test cases take upwards of 30 seconds, when they could otherwise finish almost immediately.
| @halter73, I'm going to close this, since you've picked up the commits and incorporated it into your changes. | 
Fixes #10
Fixes #117
I'm not in love with where this landed, but I'm expecting it'll evolve further with @halter73's work, and it's an improvement over where things were.
I cleaned up a few other things along the way.