How to detect gRPC connection loss as soon as possible? #2098

Closed
@michalraska

Hello, I'd like to discuss and validate the best approach for detecting channel disconnection (offline) as quickly as possible.

Context:

  • Our iOS/iPadOS app starts listening to a notification stream when it becomes active
  • When the app goes to the background, the channel carrying the notification stream is closed
  • iOS apps' active state typically lasts minutes, while iPadOS apps can remain active for hours or even days
  • The app needs to detect a lost server connection within 5-10 seconds, even on poor networks (EDGE/3G), and promptly inform users that it is offline

So far, the best approach seems to be keepalive pings:

// With `GRPCChannelPool` config
configuration.keepalive = .init(
    interval: .seconds(5),
    timeout: .seconds(3),
    permitWithoutCalls: true
)

// Or with `ClientConnection`
eventLoopGroup = PlatformSupport.makeEventLoopGroup(loopCount: 1)
var configuration = ClientConnection.Configuration.default(
    target: target,
    eventLoopGroup: eventLoopGroup
)
configuration.tlsConfiguration = .makeClientDefault(compatibleWith: eventLoopGroup)
configuration.connectionKeepalive = .init(
    interval: .seconds(5),
    timeout: .seconds(3),
    permitWithoutCalls: true
)
channel = ClientConnection(configuration: configuration)
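
As a rough sanity check on these numbers against the 5-10 second target, here is a minimal sketch of the timing budget. It assumes the worst case is a link that drops right after a ping has been acknowledged, so detection takes about one full interval plus the timeout. `KeepaliveBudget` is a hypothetical helper for illustration only, not part of grpc-swift.

```swift
import Foundation

/// Hypothetical helper (not a grpc-swift type): models a keepalive setup
/// as a ping interval plus an acknowledgement timeout, both in seconds.
struct KeepaliveBudget {
    let interval: TimeInterval
    let timeout: TimeInterval

    /// Assumed worst case: the link drops just after a ping is acknowledged,
    /// so the client waits a full interval before the next ping and then
    /// the full timeout before giving up on the acknowledgement.
    var worstCaseDetection: TimeInterval { interval + timeout }

    /// True when the assumed worst case stays within the given limit.
    func fits(within limit: TimeInterval) -> Bool {
        worstCaseDetection <= limit
    }
}

let budget = KeepaliveBudget(interval: 5, timeout: 3)
print(budget.worstCaseDetection)  // 8.0
print(budget.fits(within: 10))    // true
```

Under that assumption, the 5 s interval and 3 s timeout leave about 2 seconds of headroom below the 10-second limit.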

Current behavior:

  • With this configuration, I get the error "unavailable (14): Transport became inactive" within 5 seconds of the device being disconnected (by turning Airplane mode on)
  • However, with this configuration the client also gets the error "unavailable (14): Too many pings" after a while, even when it is online and active. I guess this can likely be fixed by setting GRPC_ARG_HTTP2_MAX_PINGS_WITHOUT_DATA = 0 on the server side.

Questions:

  1. Is keepalive ping the best solution for my app? What else could be better?
  2. Are there any important considerations before using keepalive in production? The server will handle fewer than a few hundred clients, so keepalive pings shouldn't create significant load.
  3. When using this aggressive keepalive ping configuration, should any other settings also be adjusted?
  4. Which channel type better suits my needs: ClientConnection or PooledChannel?
    • ClientConnection allows observing ConnectivityState via ConnectivityStateDelegate
    • With PooledChannel, the ConnectivityState is not available to my knowledge, but I guess it's better to rely on thrown errors anyway
    • The app will not handle many concurrent requests

Thank you! I hope others who are new to gRPC (like me) will also find the answers useful.
