Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a debug level logging message for mismatched DNS Records and IPv4/v6 Addresses #21552

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

keefertaylor
Copy link

Description

I've recently been chasing down a mysterious log in our Consul clusters which sounds quite scary:

[ERROR] agent.dns: error serializing DNS results: error="no data"

The root cause of this error is that a client was making an AAAA query to our consul cluster, which received an answer in the form of IPv4 addresses, which cannot satisfy the request. During message serialization, this caused the answer to be silently dropped, and only manifest as a failure at the top level, which ends up being an error that is too generic to be useful.

There's a discussion and other users who have had this issue and been confused here in the Hashicorp Forum

This PR adds a debug level log message to the code path in serialization. I chose debug so that this shouldn't be too chatty in existing consul deployments, though warn feels slighly more appropriate to me.

It's slightly unfortunate that we have to plumb a logger into the messageSerializer in order to do this. An alternative is that we could create a more descriptive error and bubble it up through the message serializer, which would then be logged out logged here. A new error type felt more invasive and more likely to break something, but I'm happy to take this approach if folks at Hashicorp prefer.

Testing & Reproduction steps

Repro:

  1. Stand up consul: consul agent -dev
  2. Query for AAAA record: dig @127.0.0.1 -p 8600 AAAA consul.service.consul
  3. Observe mysterious log: [ERROR] agent.dns: error serializing DNS results: error="no data"

Test this PR:

  1. Build consul with this PR
  2. Perform above repro steps
  3. Observe helpful debug log: [DEBUG] agent.dns: unable to return DNS AAAA record for for ipv4 address: question=consul.service.consul. query-type=28 answer=127.0.0.1

Links

Discussion on Hashicorp forum with other confused users: https://discuss.hashicorp.com/t/after-updating-to-latest-consul-im-getting-error-serializing-dns-results-errors-in-my-logs/68319

PR Checklist

  • updated test coverage
  • external facing docs updated
  • appropriate backport labels added
  • not a security concern

Copy link

hashicorp-cla-app bot commented Jul 18, 2024

CLA assistant check
All committers have signed the CLA.

Copy link

This pull request has been automatically flagged for inactivity because it has not been acted upon in the last 60 days. It will be closed if no new activity occurs in the next 30 days. Please feel free to re-open to resurrect the change if you feel this has happened by mistake. Thank you for your contributions.

@github-actions github-actions bot added the meta/stale Automatically flagged for inactivity by stalebot label Oct 15, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
meta/stale Automatically flagged for inactivity by stalebot
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant