[bug] vault + consul +dnsmasq = unresolvable ping #3710
Labels
theme/api
Relating to the HTTP API interface
theme/consul-vault
Relating to Consul & Vault interactions
type/bug
Feature does not function as expected
Milestone
vault 0.9.0
consul 1.0.0
This behavior is being observed after attempting to emulate the "c1m" or "container 1 million" challenge configuration. I'm posting it under consul as the vault team feels it's a consul problem (I'm not convinced of this, as the issue seems unique to vault registration, but see issue #3604).
Setting the Stage
Using Ubuntu 16.04 servers, use packer to deploy servers using the scripts from c1m (or take existing boxes and just run the various .sh scripts). This will result not only in consul being installed locally, but also in dnsmasq bound to 127.0.0.1 and forwarding queries for the consul domain to the local consul DNS port (/consul/127.0.0.1#8600).
Once done, introduce an HA Vault solution in the same Datacenter (not the same server). This will, of course, result in the "vault" service being registered in Consul.
Demonstrating the Issue
The easiest way to demonstrate this issue is to break things down into steps, so bear with me:
So the oddity here is that the "active.vault" ping fails when dnsmasq is running, but succeeds when it is stopped. It's also worth mentioning that nslookup and dig both work perfectly fine, and that when you dig vault.service.consul, you get CNAMEs instead of A records:
I'm asking in consul support because it looks like the culprit might be that vault is somehow registering CNAMEs instead of A records, which is technically "bad" (doesn't follow RFC). I suspect this is why dnsmasq refuses to resolve "active.vault", as it's a sub-record.
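For anyone who wants to check the record types without dig, here is a minimal sketch that sends a raw DNS query straight to the consul DNS port and lists the types in the answer section. It assumes consul is answering DNS over UDP on 127.0.0.1:8600 (the port from the dnsmasq setup above); the function names are my own, and it only uses the Python standard library.

```python
import socket
import struct

# Only the record types relevant to this report; anything else is shown as a number.
TYPE_NAMES = {1: "A", 5: "CNAME", 28: "AAAA"}


def build_query(name, qtype=1, txid=0x1234):
    """Build a minimal DNS query packet (recursion desired, one question, class IN)."""
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    labels = name.rstrip(".").split(".")
    qname = b"".join(bytes([len(p)]) + p.encode("ascii") for p in labels) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)


def _skip_name(msg, off):
    """Return the offset just past a (possibly compressed) domain name."""
    while True:
        length = msg[off]
        if length == 0:
            return off + 1
        if length & 0xC0 == 0xC0:  # compression pointer is always two bytes
            return off + 2
        off += 1 + length


def answer_types(msg):
    """Return the numeric record type of every record in the answer section."""
    _, _, qdcount, ancount, _, _ = struct.unpack("!HHHHHH", msg[:12])
    off = 12
    for _ in range(qdcount):
        off = _skip_name(msg, off) + 4  # skip QTYPE and QCLASS
    types = []
    for _ in range(ancount):
        off = _skip_name(msg, off)
        rtype, _cls, _ttl, rdlen = struct.unpack("!HHIH", msg[off:off + 10])
        types.append(rtype)
        off += 10 + rdlen
    return types


def query_types(name, server="127.0.0.1", port=8600, timeout=2.0):
    """Ask the given DNS server for A records and name the answer record types."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name), (server, port))
        reply, _ = sock.recvfrom(4096)
    return [TYPE_NAMES.get(t, str(t)) for t in answer_types(reply)]
```

Calling `query_types("vault.service.consul")` and then the same for a service that pings fine should show whether the answer types actually differ between the two.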
Again, service resolution for anything OTHER THAN vault will work, unless I change the service name of the vault cluster ... then that new name won't work (basically, the vault service fails to ping regardless of name chosen).
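To compare names side by side, a rough sketch like the one below goes through the system resolver, which is the same path ping takes (and therefore through dnsmasq when it is running), unlike dig and nslookup, which talk to a DNS server directly. The helper names are hypothetical, and the fully qualified names in the usage note are my assumption about what "active.vault" and a working service expand to.

```python
import socket


def resolve_like_ping(name):
    """Resolve name through the system resolver; return IPv4 addresses, or [] on failure."""
    try:
        infos = socket.getaddrinfo(name, None, family=socket.AF_INET,
                                   type=socket.SOCK_STREAM)
    except socket.gaierror:
        return []
    # Each entry is (family, type, proto, canonname, (address, port)).
    return sorted({info[4][0] for info in infos})


def report(names):
    """Map each name to its resolved addresses so failures stand out as empty lists."""
    return {name: resolve_like_ping(name) for name in names}
```

Running `report(["consul.service.consul", "vault.service.consul", "active.vault.service.consul"])` once with dnsmasq running and once with it stopped should show the vault names coming back empty only in the first case, if the behavior matches what I'm seeing with ping.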
Before closing and claiming it's a vault problem (old api calls or some such), please be aware they have already closed this and called it a consul problem. There's some finger pointing going on here ...