feat: add TCP keepalive for MySQL and PostgresSQL.#3559
feat: add TCP keepalive for MySQL and PostgresSQL.#3559xuehaonan27 wants to merge 3 commits intolaunchbadge:mainfrom
Conversation
BREAK CHANGE: [`sqlx_core::net::socket::connect_tcp`]. New parameter added. Add TCP keepalive configuration which could be enabled by [`PgConnectOptions::tcp_keep_alive`] and [`MySqlConnectOptions::tcp_keep_alive`].
CommanderStorm
left a comment
There was a problem hiding this comment.
(not a maintainer)
I thought you might still appreciate an external review ^^
sqlx-core/src/net/socket/mod.rs
Outdated
| #[derive(Debug, Clone)] | ||
| pub struct TcpKeepalive { | ||
| pub time: Duration, | ||
| pub interval: Duration, | ||
| pub retries: u32, | ||
| } |
There was a problem hiding this comment.
https://docs.rs/socket2/latest/socket2/struct.TcpKeepalive.html
I think we should use the builder pattern too and copy the platform support of socket2 given that this is explicitly done in their case.
Also adding documentation via docstrings is really helpful => copying their docs is likely fine.
Likely we should make this Copy too.
There was a problem hiding this comment.
Thank you for giving me a review!
That's reasonable, or we might have to maintain the consistency between TcpKeepalive defined in socket2 and in sqlx_core.
Should I re-export TcpKeepalive definition in socket2?
There was a problem hiding this comment.
Current code is a build failour on the following platforms (assuming one tries to use the code):
any(
target_os = "openbsd",
target_os = "redox",
target_os = "solaris",
target_os = "nto",
target_os = "espidf",
target_os = "vita",
target_os = "haiku",
)
Please use the same platform cfg scoping as the new dependency.
sqlx-core/src/net/socket/mod.rs
Outdated
| host: &str, | ||
| port: u16, | ||
| with_socket: Ws, | ||
| keepalive: &Option<TcpKeepalive>, |
There was a problem hiding this comment.
| keepalive: &Option<TcpKeepalive>, | |
| keepalive: Option<&TcpKeepalive>, |
Would this not be a better API?
https://users.rust-lang.org/t/api-design-option-t-vs-option-t/34139/2
Given that socket2::TcpKeepalive is Copy, I think copying that derive and dropping the reference might be more ergonomic. What do you think?
There was a problem hiding this comment.
I didn't notice that TcpKeepalive in socket2 is Copy.
I will adjust it. Thx.
There was a problem hiding this comment.
I didn't notice the API design preference in Rust. Thank you for pointing this out for me!
|
Should I re-export |
`TcpKeepalive`. fix: `connect_tcp` API design.
sqlx-core/src/net/socket/mod.rs
Outdated
| pub struct TcpKeepalive { | ||
| pub time: Option<Duration>, | ||
| pub interval: Option<Duration>, | ||
| pub retries: Option<u32>, | ||
| } |
There was a problem hiding this comment.
If we want to encapsulate instead of using socket2::TcpKeepalive in the public api (that is something that @abonander should decide), we should use a newtype wrapper instead of just copying the other code.
| pub struct TcpKeepalive { | |
| pub time: Option<Duration>, | |
| pub interval: Option<Duration>, | |
| pub retries: Option<u32>, | |
| } | |
| pub struct TcpKeepalive(socket2::TcpKeepalive) |
There was a problem hiding this comment.
That's reasonable, thx
There was a problem hiding this comment.
That's a good idea. But socket2::TcpKeepalive is not Copy (Sorry I made a mistake before).
So doing so we won't have Copy trait.
There was a problem hiding this comment.
Interesting. I don't know how I came up with the idea that it does. Must have misread a part of the code.
There was a problem hiding this comment.
Yeah, I have mixed feelings about using a bespoke struct here.
These builder methods could just be directly on the ConnectOptions structs.
I suppose being able to define TcpKeepalive separately in a const could be nice for reusability, but at the same time, having to import and build a separate type would also be a little annoying if you only use it once.
I think maybe because interval and retries aren't supported on all platforms, we should only expose the tcp_keepalive_time() on the ConnectOptions builders. We can always add the others later.
There was a problem hiding this comment.
These fields should also be #[doc(hidden)] or just private.
|
After some thought, I'm not entirely sure how much value this actually has. As I worked out in estuary/flow#1676 (comment), keepalive likely won't solve the original reporter's problem. It could be useful for other reasons, such as keeping connections from timing out, or catching when a server disconnected abruptly without sending a FIN packet. However, because the connection state is managed directly, we won't see a keepalive timeout until the next time we try to read from or write to the socket, at which point we're already trying to use it anyway. I suppose that's still preferable to it hanging forever on a read that will never complete, though. |
sqlx-core/src/net/socket/mod.rs
Outdated
| pub struct TcpKeepalive { | ||
| pub time: Option<Duration>, | ||
| pub interval: Option<Duration>, | ||
| pub retries: Option<u32>, | ||
| } |
There was a problem hiding this comment.
Yeah, I have mixed feelings about using a bespoke struct here.
These builder methods could just be directly on the ConnectOptions structs.
I suppose being able to define TcpKeepalive separately in a const could be nice for reusability, but at the same time, having to import and build a separate type would also be a little annoying if you only use it once.
I think maybe because interval and retries aren't supported on all platforms, we should only expose the tcp_keepalive_time() on the ConnectOptions builders. We can always add the others later.
tcp_keepalive.rs and update related code - Moved the TCP keepalive type definition from `mod.rs` to new source file `tcp_keepalive.rs` - Modified method `tcp_keep_alive` to `tcp_keepalive_time` in `MySqlConnectOptions` and `PgConnectOptions` - Updated `socket2` dependency in `sqlx-core` to enable feature `all`.
|
I think this logic is required for |
|
Catching situations where a server disconnects abruptly without sending a FIN packet is a real-world use case for us. Our application subscribes to PostgreSQL notifications via database triggers. After an infrastructure issue, we observed that the application silently stopped receiving notifications while the process itself remained running and unaware of any connection problem. I was able to reproduce this by using When I applied the
the However, relying on the It would be very helpful if
This would allow |
|
We're hitting the same issue in production — long-lived PostgreSQL connections silently go stale after network changes, and without TCP keepalive there's no way to detect it. Would love to see this merged, thank you for the work on this! |
BREAK CHANGE: [
sqlx_core::net::socket::connect_tcp]. New parameter added.Add TCP keepalive configuration which could be enabled by [
PgConnectOptions::tcp_keep_alive] and [MySqlConnectOptions::tcp_keep_alive].Does your PR solve an issue?
fixes #3540