DNS Resolver timeout no longer honoured #57839

Open
@jwong-ping

Description

Version

v20.17.0 or greater

Platform


Subsystem

No response

What steps will reproduce the bug?

This bug is difficult to reproduce with a simple code snippet because it depends on your system's DNS cache, DNS server performance, etc.

Over time, when resolve4 and/or resolve6 are used to resolve IPs and the DNS server becomes a bit slower to respond, the DNS request times out prematurely, before the timeout configured in the Resolver has elapsed.

Resolver options:

  • timeout Query timeout in milliseconds, or -1 to use the default timeout.
  • tries The number of tries the resolver will try contacting each name server before giving up. Default: 4
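
For reference, a minimal sketch of how these options are passed and how the time to failure can be measured. The helper names `makeResolver` and `timedResolve4` are hypothetical and not part of Node.js; the values 5000 and 4 mirror the configuration described in this report.

```javascript
const { Resolver } = require('node:dns/promises');

// Hypothetical helper: builds a Resolver with the options discussed above.
function makeResolver() {
  return new Resolver({ timeout: 5000, tries: 4 });
}

// Milliseconds elapsed since a process.hrtime.bigint() start value.
function elapsedSince(start) {
  return Number(process.hrtime.bigint() - start) / 1e6;
}

// Hypothetical helper: runs resolve4 and records how long it took,
// whether it succeeded or failed (for example with code 'ETIMEOUT').
async function timedResolve4(resolver, hostname) {
  const start = process.hrtime.bigint();
  try {
    const addresses = await resolver.resolve4(hostname);
    return { ok: true, addresses, elapsedMs: elapsedSince(start) };
  } catch (err) {
    return { ok: false, code: err.code, elapsedMs: elapsedSince(start) };
  }
}

// Usage (performs a real DNS query, so it is left commented out):
// timedResolve4(makeResolver(), 'example.com').then(console.log);
```

Recording `elapsedMs` alongside the error code is what makes a premature timeout visible: a failure with code `ETIMEOUT` well under the configured 5000 ms would match the behaviour described here.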

How often does it reproduce? Is there a required condition?

This occurs when c-ares has observed enough DNS requests and begins to use its own calculated timeout instead of the one provided to it through the Resolver. If the DNS server then responds slowly, the request times out earlier than the configured timeout.

It has been consistently observed in deployments on Node.js >= v20.17.0. Metrics show that the timeouts occur at around 1s when timeout is configured as 5s.
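
As an illustration of how such metrics could flag the problem, here is a small sketch. The function names and the sample shape (`code`, `elapsedMs`) are assumptions made for this example; only `dns.TIMEOUT` comes from Node.js.

```javascript
const dns = require('node:dns');

// Hypothetical check: a failure counts as premature when it is a DNS
// timeout that arrived before the configured per-query timeout elapsed.
function isPrematureTimeout(sample, configuredTimeoutMs) {
  return sample.code === dns.TIMEOUT && sample.elapsedMs < configuredTimeoutMs;
}

// Hypothetical aggregate: counts premature timeouts in a list of samples.
function countPrematureTimeouts(samples, configuredTimeoutMs) {
  return samples.filter((s) => isPrematureTimeout(s, configuredTimeoutMs)).length;
}
```

With a 5000 ms configured timeout, a sample that failed with `ETIMEOUT` after roughly 1000 ms would be counted, which corresponds to the pattern reported above.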

What is the expected behavior? Why is that the expected behavior?

The DNS Resolver's timeout configuration is honoured and behaves as described in the docs.

What do you see instead?

The DNS Resolver's timeout configuration is ignored after enough DNS requests.

Additional information

Looking through the release notes for Node.js and its dependencies, it appears that c-ares v1.32.0 now treats the configured timeout only as a hint and, once it has enough observations, calculates its own.

Relevant c-ares release notes:

Rework query timeout logic to automatically adjust timeouts based on network
conditions. The timeout specified now is only used as a hint until there
is enough history to calculate a more valid timeout. c-ares/c-ares#794

https://github.com/c-ares/c-ares/releases/tag/v1.32.0

Relevant comment in change set:

ARES_OPT_TIMEOUTMS
As of c-ares 1.32.0, this option is only honored on the first successful query
to any given server, after that the timeout is automatically calculated based
on prior query history.
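
To check whether a given Node.js build bundles a c-ares version at or above 1.32.0, something like the following sketch could be used. `process.versions.ares` is provided by Node.js; the helper names and the simple version parsing are assumptions made for illustration.

```javascript
// Parse "major.minor.patch" into numbers; suffixes such as "-DEV" are ignored.
function parseVersion(version) {
  return version.split('.').map((part) => parseInt(part, 10) || 0);
}

// Hypothetical check: true when the c-ares version is 1.32.0 or later,
// the release whose notes describe the adaptive timeout logic.
function usesAdaptiveTimeouts(aresVersion) {
  const [major = 0, minor = 0] = parseVersion(aresVersion);
  return major > 1 || (major === 1 && minor >= 32);
}

const bundled = process.versions.ares;
if (bundled) {
  const verdict = usesAdaptiveTimeouts(bundled) ? 'expected' : 'not expected';
  console.log(`c-ares ${bundled}: adaptive timeouts ${verdict}`);
}
```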

Metadata

    Labels

    dns: Issues and PRs related to the dns subsystem.
