-
Notifications
You must be signed in to change notification settings - Fork 1.2k
Server Netty. Fix rejected execution during engine stop (#8671) #5183
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
WalkthroughThis pull request modifies the Netty server engine to improve shutdown robustness and response-sending safety. Changes include adding exception handling during engine stop, introducing channel-active checks before responding, updating test infrastructure to use real-time aware test runners, and adjusting test mocks for shutdown completion handling. Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes
Possibly related PRs
Suggested labels
Suggested reviewers
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
📜 Recent review detailsConfiguration used: CodeRabbit UI Review profile: CHILL Plan: Pro 📒 Files selected for processing (1)
🧰 Additional context used🧠 Learnings (6)📚 Learning: 2025-05-30T06:45:52.309ZApplied to files:
📚 Learning: 2025-05-30T06:45:52.309ZApplied to files:
📚 Learning: 2025-11-14T14:11:30.292ZApplied to files:
📚 Learning: 2025-05-14T18:05:02.321ZApplied to files:
📚 Learning: 2025-05-16T13:11:28.416ZApplied to files:
📚 Learning: 2025-06-09T07:08:35.085ZApplied to files:
🔇 Additional comments (2)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (4)
ktor-server/ktor-server-netty/jvm/test/io/ktor/tests/server/netty/NettyEngineTest.kt (1)
26-26: Wildcard import is reasonable but unrelated to PR objectives.The change from explicit imports to a wildcard import for
io.netty.handler.codec.http2.*consolidates multiple imports and is acceptable. Note that the coding guidelines specify star imports forio.ktor.*packages, but this pattern is commonly extended to third-party packages when multiple classes are imported.This change appears unrelated to the main PR objective of fixing shutdown-related
RejectedExecutionException.ktor-server/ktor-server-netty/jvm/src/io/ktor/server/netty/NettyApplicationCallHandler.kt (1)
38-41: LGTM! Proper cleanup on channel unregistration.The implementation correctly cancels the in-flight request coroutine when the channel is unregistered, preventing potential writes to inactive channels and resource leaks. This aligns with the PR's objective to fix shutdown-related issues.
Optional: Consider nulling out
currentJobafter cancellation for minor memory optimization:override fun channelUnregistered(ctx: ChannelHandlerContext?) { - currentJob?.cancel() + currentJob?.cancel() + currentJob = null super.channelUnregistered(ctx) }While not strictly necessary (the handler is typically discarded after unregistration), this ensures the reference is explicitly cleared.
ktor-server/ktor-server-netty/jvm/src/io/ktor/server/netty/NettyApplicationEngine.kt (2)
283-287: Consider removing unnecessarycrossinlinemodifier.The
crossinlinemodifier prevents non-local returns from the lambda, but since the lambda is only passed directly torunCatchingand not stored or passed to another execution context, this modifier is not necessary.Apply this diff if you prefer:
- private inline fun <R> withStopException(crossinline block: () -> R) { + private inline fun <R> withStopException(block: () -> R) { runCatching(block).onFailure { environment.log.error("Exception thrown during engine stop", it) } }
278-281: Consider shutting downcallEventGroupinterminate()for consistency.The
terminate()method only shuts downconnectionEventGroupandworkerEventGroup, but notcallEventGroupwhenshareWorkGroupis false. AlthoughcallEventGroupis lazy and may not be initialized during early bind failures, for completeness and consistency with thestop()method, consider adding conditional shutdown ofcallEventGroup.Apply this diff to add the shutdown:
private fun terminate() { connectionEventGroup.shutdownGracefully().sync() workerEventGroup.shutdownGracefully().sync() + if (!configuration.shareWorkGroup && ::callEventGroup.isInitialized) { + callEventGroup.shutdownGracefully().sync() + } }Note: Use
::callEventGroup.isInitializedto check if the lazy property was accessed before shutting it down.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
ktor-server/ktor-server-netty/jvm/src/io/ktor/server/netty/NettyApplicationCallHandler.kt(1 hunks)ktor-server/ktor-server-netty/jvm/src/io/ktor/server/netty/NettyApplicationEngine.kt(1 hunks)ktor-server/ktor-server-netty/jvm/src/io/ktor/server/netty/NettyApplicationResponse.kt(2 hunks)ktor-server/ktor-server-netty/jvm/test/io/ktor/tests/server/netty/NettyEngineTest.kt(2 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.{kt,kts}
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.{kt,kts}: Follow Kotlin official style guide for all Kotlin source and build scripts
Use star imports for io.ktor.* packages
Max line length is 120 characters
Indent with 4 spaces in Kotlin code
Include a copyright header in new Kotlin files
Files:
ktor-server/ktor-server-netty/jvm/src/io/ktor/server/netty/NettyApplicationEngine.ktktor-server/ktor-server-netty/jvm/src/io/ktor/server/netty/NettyApplicationResponse.ktktor-server/ktor-server-netty/jvm/src/io/ktor/server/netty/NettyApplicationCallHandler.ktktor-server/ktor-server-netty/jvm/test/io/ktor/tests/server/netty/NettyEngineTest.kt
**/*.kt
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.kt: Document all public Kotlin APIs, including parameters, return types, and exceptions
Annotate internal APIs with @internalapi
Follow Kotlin error-handling conventions and use specific Ktor exceptions
Files:
ktor-server/ktor-server-netty/jvm/src/io/ktor/server/netty/NettyApplicationEngine.ktktor-server/ktor-server-netty/jvm/src/io/ktor/server/netty/NettyApplicationResponse.ktktor-server/ktor-server-netty/jvm/src/io/ktor/server/netty/NettyApplicationCallHandler.ktktor-server/ktor-server-netty/jvm/test/io/ktor/tests/server/netty/NettyEngineTest.kt
🧠 Learnings (8)
📚 Learning: 2025-10-22T07:21:51.263Z
Learnt from: bjhham
Repo: ktorio/ktor PR: 5139
File: ktor-server/ktor-server-plugins/ktor-server-di/common/src/io/ktor/server/plugins/di/DependencyInjection.kt:11-11
Timestamp: 2025-10-22T07:21:51.263Z
Learning: In Ktor, `io.ktor.utils.io.CancellationException` is a typealias for `kotlinx.coroutines.CancellationException`, which is a supertype of `JobCancellationException`. Therefore, checking `e !is io.ktor.utils.io.CancellationException` is sufficient to exclude all coroutine cancellation exceptions.
Applied to files:
ktor-server/ktor-server-netty/jvm/src/io/ktor/server/netty/NettyApplicationEngine.kt
📚 Learning: 2025-08-26T06:07:28.352Z
Learnt from: CR
Repo: ktorio/ktor PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-08-26T06:07:28.352Z
Learning: Applies to **/*.kt : Follow Kotlin error-handling conventions and use specific Ktor exceptions
Applied to files:
ktor-server/ktor-server-netty/jvm/src/io/ktor/server/netty/NettyApplicationEngine.kt
📚 Learning: 2025-05-14T18:17:03.059Z
Learnt from: bjhham
Repo: ktorio/ktor PR: 4855
File: ktor-server/ktor-server-plugins/ktor-server-di/common/src/io/ktor/server/plugins/di/DependencyRegistry.kt:37-38
Timestamp: 2025-05-14T18:17:03.059Z
Learning: In the Ktor DI plugin, when a DependencyRegistry implements CoroutineScope, it should be properly cancelled during application shutdown to prevent memory leaks and resource issues. This is handled by calling registry.cancel() in the ApplicationStopped event.
Applied to files:
ktor-server/ktor-server-netty/jvm/src/io/ktor/server/netty/NettyApplicationCallHandler.kt
📚 Learning: 2025-05-14T18:17:03.059Z
Learnt from: bjhham
Repo: ktorio/ktor PR: 4855
File: ktor-server/ktor-server-plugins/ktor-server-di/common/src/io/ktor/server/plugins/di/DependencyRegistry.kt:37-38
Timestamp: 2025-05-14T18:17:03.059Z
Learning: When implementing CoroutineScope in Ktor components like DependencyRegistry, always ensure the scope is properly cancelled during application shutdown to prevent memory leaks and resource issues. In the DI plugin, this is handled by calling registry.cancel() in the ApplicationStopped event handler after all dependency cleanup is complete.
Applied to files:
ktor-server/ktor-server-netty/jvm/src/io/ktor/server/netty/NettyApplicationCallHandler.kt
📚 Learning: 2025-05-30T06:45:52.309Z
Learnt from: rururux
Repo: ktorio/ktor PR: 4896
File: ktor-client/ktor-client-core/jvm/test/FileStorageTest.kt:1-12
Timestamp: 2025-05-30T06:45:52.309Z
Learning: The headersOf() function from io.ktor.http package is available through wildcard imports like `import io.ktor.http.*`, so no explicit import statement is needed when using wildcard imports from that package.
Applied to files:
ktor-server/ktor-server-netty/jvm/test/io/ktor/tests/server/netty/NettyEngineTest.kt
📚 Learning: 2025-08-26T06:07:28.352Z
Learnt from: CR
Repo: ktorio/ktor PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-08-26T06:07:28.352Z
Learning: Applies to **/*.{kt,kts} : Use star imports for io.ktor.* packages
Applied to files:
ktor-server/ktor-server-netty/jvm/test/io/ktor/tests/server/netty/NettyEngineTest.kt
📚 Learning: 2025-05-30T06:45:52.309Z
Learnt from: rururux
Repo: ktorio/ktor PR: 4896
File: ktor-client/ktor-client-core/jvm/test/FileStorageTest.kt:1-12
Timestamp: 2025-05-30T06:45:52.309Z
Learning: In Ktor test files, particularly in the ktor-client/ktor-client-core/jvm/test/ directory, test files follow the convention of not including explicit package declarations. This is consistent across test files like CachingCacheStorageTest.kt and should be maintained for consistency.
Applied to files:
ktor-server/ktor-server-netty/jvm/test/io/ktor/tests/server/netty/NettyEngineTest.kt
📚 Learning: 2025-05-14T18:05:02.321Z
Learnt from: bjhham
Repo: ktorio/ktor PR: 4855
File: ktor-server/ktor-server-plugins/ktor-server-di/api/ktor-server-di.klib.api:334-336
Timestamp: 2025-05-14T18:05:02.321Z
Learning: Breaking changes in constructor parameter order are acceptable for the ktor-server-di module when the code hasn't been released yet, as confirmed by the development team.
Applied to files:
ktor-server/ktor-server-netty/jvm/test/io/ktor/tests/server/netty/NettyEngineTest.kt
🔇 Additional comments (6)
ktor-server/ktor-server-netty/jvm/src/io/ktor/server/netty/NettyApplicationResponse.kt (2)
49-66: Good defensive guard against inactive channels.The addition of
!context.channel().isActiveprevents attempts to send response bytes to an inactive channel, which directly addresses the PR objective to stop writing responses to inactive channels during shutdown. The early return is appropriate since there's no valid destination for the response when the channel is closed.
113-127: Consistent active-channel guard applied.The
!context.channel().isActiveguard mirrors the change inrespondFromBytesand ensures that response setup is skipped when the channel is inactive. This consistency across both response-sending entry points provides comprehensive protection against writing to closed channels during shutdown.ktor-server/ktor-server-netty/jvm/test/io/ktor/tests/server/netty/NettyEngineTest.kt (1)
380-380: More idiomatic Kotlin.The change from
s.substring(0, end + 4)tos.take(end + 4)is functionally equivalent and more idiomatic Kotlin. Sinceendis the index where"\r\n\r\n"was found,end + 4safely includes the complete header terminator.Like the import change, this refactor is unrelated to the main PR objectives.
ktor-server/ktor-server-netty/jvm/src/io/ktor/server/netty/NettyApplicationEngine.kt (3)
290-293: LGTM!The early guard correctly prevents duplicate stop sequences by checking if the cancellation job can be completed. This ensures idempotent stop behavior.
296-298: LGTM!Wrapping each channel sync in
withStopExceptionensures that if one channel fails to close properly, other channels still get processed. This improves shutdown resilience.
300-301: LGTM!Computing shutdown futures upfront before syncing them allows both event loop groups to shut down concurrently, improving shutdown performance.
ktor-server/ktor-server-netty/jvm/src/io/ktor/server/netty/NettyApplicationEngine.kt
Outdated
Show resolved
Hide resolved
78c9b45 to
ee6655e
Compare
bjhham
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, thanks for dealing with this! 🙏
ktor-server/ktor-server-netty/jvm/src/io/ktor/server/netty/NettyApplicationResponse.kt
Outdated
Show resolved
Hide resolved
marychatte
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm 👍
ee6655e to
bed0e8b
Compare
bed0e8b to
4f415e1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (3)
ktor-server/ktor-server-netty/jvm/src/io/ktor/server/netty/NettyApplicationEngine.kt (1)
283-287: Stop exception wrapper is sound; consider reusing interminate()The helper cleanly centralizes logging of shutdown failures. As a minor follow‑up, you could also wrap the
terminate()shutdown calls withwithStopExceptionfor consistent logging on the bind‑failure path.ktor-server/ktor-server-netty/jvm/test/io/ktor/tests/server/netty/NettySpecificTest.kt (2)
39-53: Optionally close the ServerSocket via use {} for cleanliness
testStartOnUsedPortleaves theServerSocketopen for the lifetime of the test method. Wrapping it inuse { ... }(or a try/finally) would ensure the socket is always closed even if an assertion fails, without changing the behavior of the test.
89-172: Startup timeout and header assertions are solid; consider explicit server shutdown instead of relying on cancellationThe combination of
runTestWithRealTime,withTimeout(10.seconds) { appStarted.await() }, and theResponseSenthook with assertions intestScope.launchgives good coverage for strippingContent-LengthandTransfer-Encodingacross the different response types.One small robustness tweak to consider: if
server.start(wait = true)is implemented as a blocking call that doesn’t cooperate with coroutine cancellation,serverJob.cancel()alone might not always guarantee prompt shutdown. If that’s the case in Netty’s engine implementation, explicitly callingserver.stop(...)from the test (e.g., after the client block, before cancelling the job) would make the lifecycle more explicit and reduce the risk of hanging tests or leaked threads.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
ktor-server/ktor-server-netty/jvm/src/io/ktor/server/netty/NettyApplicationEngine.kt(1 hunks)ktor-server/ktor-server-netty/jvm/test/io/ktor/tests/server/netty/NettyConfigurationTest.kt(1 hunks)ktor-server/ktor-server-netty/jvm/test/io/ktor/tests/server/netty/NettySpecificTest.kt(6 hunks)
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-05-30T06:45:52.309Z
Learnt from: rururux
Repo: ktorio/ktor PR: 4896
File: ktor-client/ktor-client-core/jvm/test/FileStorageTest.kt:1-12
Timestamp: 2025-05-30T06:45:52.309Z
Learning: In Ktor test files, particularly in the ktor-client/ktor-client-core/jvm/test/ directory, test files follow the convention of not including explicit package declarations. This is consistent across test files like CachingCacheStorageTest.kt and should be maintained for consistency.
Applied to files:
ktor-server/ktor-server-netty/jvm/test/io/ktor/tests/server/netty/NettySpecificTest.kt
📚 Learning: 2025-06-23T12:49:56.883Z
Learnt from: CR
Repo: ktorio/ktor PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-06-23T12:49:56.883Z
Learning: Error handling should follow Kotlin conventions and use specific Ktor exceptions.
Applied to files:
ktor-server/ktor-server-netty/jvm/test/io/ktor/tests/server/netty/NettySpecificTest.kt
🧬 Code graph analysis (1)
ktor-server/ktor-server-netty/jvm/test/io/ktor/tests/server/netty/NettySpecificTest.kt (4)
ktor-test-dispatcher/common/src/TestCommon.kt (1)
runTestWithRealTime(35-44)ktor-server/ktor-server-core/common/src/io/ktor/server/engine/EmbeddedServer.kt (5)
embeddedServer(106-113)embeddedServer(151-172)embeddedServer(209-229)embeddedServer(254-264)embeddedServer(271-277)ktor-server/ktor-server-core/common/src/io/ktor/server/application/Application.kt (1)
serverConfig(107-112)ktor-network/jvm/src/io/ktor/network/util/Utils.kt (1)
withTimeout(79-90)
🔇 Additional comments (5)
ktor-server/ktor-server-netty/jvm/test/io/ktor/tests/server/netty/NettyConfigurationTest.kt (1)
109-109: LGTM! Correct alignment with new shutdown semantics.Changing the mock from
await()tosync()properly reflects the updated shutdown synchronization pattern introduced in this PR. Usingsync()ensures exceptions are propagated during shutdown, addressing the core issue described in the PR objectives.ktor-server/ktor-server-netty/jvm/src/io/ktor/server/netty/NettyApplicationEngine.kt (1)
293-313: Channel close and event-loop shutdown ordering look correctClosing server channels and syncing their futures before shutting down the connection/worker groups, and only shutting down
callEventGroupwhen!configuration.shareWorkGroup, matches the intended ownership model and should avoid the previousRejectedExecutionExceptions during stop.ktor-server/ktor-server-netty/jvm/test/io/ktor/tests/server/netty/NettySpecificTest.kt (3)
7-29: Imports and test harness wiring match usagesAll newly added imports are exercised in the tests, and adopting
runTestWithRealTimeplusDuration.secondskeeps these Netty tests aligned with the real-time test harness without introducing dead code.
32-37: runTestWithRealTime usage in Netty lifecycle tests looks appropriateWrapping
testNoLeakWithoutStartAndStopandtestStartMultipleConnectorsOnUsedPortinrunTestWithRealTimeis consistent with the real I/O behavior of Netty and doesn’t change the semantics of these tests, since their bodies are synchronous and don’t depend on virtual time.Also applies to: 55-87
174-228: Bad-request test migration to real-time harness looks goodUsing
runTestWithRealTimeplus a boundedwithTimeout(10.seconds)aroundappStarted.await()and the existingwithTimeout(5000)for the read loop keeps this low-level TCP test from hanging while preserving the original 400-response expectations.
Subsystem
Server Netty
Motivation
KTOR-8671 Netty: RejectedExecutionException during shutdown on MacOS when dev mode is enabled
Solution
Actually, the problem is reproducible even without dev mode. The problem is that the channels were closed after the workers:
Other small improvements that didn't cause problems yet, but can be dangerous:
shutdown()methods are called and the exception is logged properly(.await()was ignoring any errors)PS:
I can't reproduce the reported problem in unit tests.
I reproduced another
RejectedExecutionExceptionwhen trying to stop the server, which is writing a response, but the error is only logged (I suppose it's done by Netty's internal logger, which I can't mock properly).That's why the PR doesn't contain any tests, verified by local Maven publication.
Do you have advice here?