consul connect: allow "cni/*" network mode #26449
rather than only allowing "bridge"
the cni network must be configured appropriately, or the fancy network feature (Connect) won't work (naturally), so a warning is shown at job submission time for groups with cni networks and connect tasks.
Welp, I caused a compile error trying to fix something "real quick" and in fixing that, it seems to have broken connect+ipv6, so I'm hunting that down.
Edit: Nope, I just goofed the very thing that we are concerned people might goof when using this. I copied the ipv4-only Nomad "bridge" CNI config, so the I will follow up with a second PR that updates the docs for the bridge config, so there's a full example of IPv6 there, too. Doing that separately so I can backport it, while this PR only goes to
Edit edit: docs PR for this: #26456
strings don't "startswith" around these parts
thanks for updating the docs! Minor nit about Consul Connect name change.
all external traffic is provided by the sidecar.
You must set the group's `network` [`mode`][] to `bridge`, or an appropriately
configured `cni/*` network, for network isolation. Using a `cni/*` network
with Consul Connect requires extra care. You may model your network
Suggested change:
- with Consul Connect requires extra care. You may model your network
+ with Consul service mesh requires extra care. You may model your network
Consul Connect became Consul service mesh 3-4 years ago. I just recently updated the Nomad docs to reflect the name change.
whoa! TIL.
while I have you here, I'll admit I was surprised that this was the only page that mentions the restriction that I'm loosening. If you happen to know of any others, please let me know!
Actually, I'm not surprised. Much of the Nomad content RE Consul needs refreshing (several tickets for Nomad IA phase 2). Jeff has scoped out the work, and I think Daniele is going to tackle the refresh since he is a Consul expert.
all external traffic is provided by the sidecar.
You must set the group's `network` [`mode`][] to `bridge`, or an appropriately
configured `cni/*` network, for network isolation. Using a `cni/*` network
with Consul service mesh requires extra care. You may model your network
"extra care" feels like it leaves a lot to the imagination. Can we provide specific guidance or otherwise specifically discourage people from doing this here?
honestly I hadn't thought through all the nuances to give proper guidance, and I thought we wanted to be pretty hands-off on this, which is why I'd added the warning to do it "at your own risk".
but yeah, a little more guidance is probably worthwhile.
I believe these are the high points?
1. the alloc network should be isolated, so traffic between the main task and the sidecar can't be intercepted (because it's probably not encrypted)
2. incoming traffic needs to be able to reach the sidecar at the IP:port which will be advertised on the service
3. traffic needs to be able to flow from different allocs' sidecars to one another (similar, but different from # 2?)
did I get any of that wrong? am I missing anything big?
there are way more ways to screw it up than to get it right, so I hesitate to give very much more in-depth information. 😅
> the alloc network should be isolated

CNI network configs get a network namespace created by Nomad, so that part's given I think?

> incoming traffic needs to be able to reach the sidecar at the IP:port which will be advertised on the service

Where "the service" == "the sidecar service" here, but otherwise 👍

> traffic needs to be able to flow from different allocs' sidecars to one another (similar, but different from # 2?)

👍

> did I get any of that wrong? am I missing anything big?

Hm, what about tproxy mode? In that scenario, we'll create another set of iptables rules inside the network namespace (and these are controlled by the `consul-cni` plugin, not by Nomad directly). So if you're trying to use `cni/*` with tproxy you need to account for that.
> a network namespace created by Nomad

this should be isolated by default, but surely a CNI network could be (mistakenly) crafted to pry it open

> tproxy mode

good call-out, and maybe best communicated with a new tab in the example configs here? (there's "Default" and "IPv6" there nowadays)
Sounds good!
PR documenting tproxy conflist: #26532
I'll follow up next week with a craftily worded description of the "extra care" 👍
Currently, to use Consul Connect (service mesh), you must use `"bridge"` network mode. We've kept it this way so we can be confident that Connect can actually... connect.
But it's entirely possible to configure your own CNI network (Nomad's "bridge" is just a CNI network, after all) to be compatible with Connect. We just can't make guarantees about it, because CNI is highly configurable.
This PR opens up the validation to allow `"cni/*"` network modes in a "use at your own risk" capacity. The CLI even presents a warning on job `plan`/`run` when your jobspec is so configured.
Closes #8953 (from Sep 2020!)
Internal ref: NMD-585
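For readers who want a feel for the loosened rule described in this PR, here is a minimal Go sketch, assuming the check boils down to "accept `bridge`, accept anything prefixed with `cni/` but warn, reject everything else". The function name `checkConnectNetworkMode` and the warning and error wording are hypothetical; this is not the code from this PR, and the real CLI warning text is not reproduced here.

```go
package main

import (
	"fmt"
	"strings"
)

// checkConnectNetworkMode is a hypothetical illustration, not Nomad's
// actual validation. It returns a non-empty warning for cni/* modes and
// an error for modes that are neither "bridge" nor prefixed with "cni/".
func checkConnectNetworkMode(mode string) (warning string, err error) {
	switch {
	case mode == "bridge":
		// Nomad's built-in bridge network: no warning needed.
		return "", nil
	case strings.HasPrefix(mode, "cni/"):
		// Custom CNI network: allowed, but flagged as use-at-your-own-risk.
		// The message below is made up for this sketch.
		return fmt.Sprintf("network mode %q must be configured to be compatible with Connect", mode), nil
	default:
		return "", fmt.Errorf("connect requires network mode \"bridge\" or \"cni/*\", got %q", mode)
	}
}

func main() {
	for _, mode := range []string{"bridge", "cni/mynet", "host"} {
		warning, err := checkConnectNetworkMode(mode)
		fmt.Printf("mode=%q warning=%q err=%v\n", mode, warning, err)
	}
}
```

Returning the warning separately from the error mirrors the behavior described above: a `cni/*` job is still accepted, and the warning is only surfaced to the user at submission time.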