Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stop using ClusterFirstWithHostNet DNSPolicy for antrea-agent #4548

Conversation

antoninbas
Copy link
Contributor

The DaemonSet for a CNI should not use ClusterFirstWithHostNet and should not have a dependency on cluster DNS, since cluster DNS itself depends on the CNI. Therefore, we revert to the "default" DNSPolicy for host-network Pods.

The only dependency we had on cluster DNS was in the Flow Exporter, where we would try to connect to the Flow Aggregator using DNS name "flow-aggregator.flow-aggregator.svc" (by default). This would only work on Linux, and not on Windows (for Windows Nodes, the Cluster IP had to be provided instead). We update the Flow Exporter to resolve the Service Cluster IP using the K8s API instead.

Notable changes:

  • go-ipfix is upgraded to v0.6.0 to support configuring the server name for Flow Aggregator certificate verification in the Flow Exporter
  • the flowAggregatorAddress parameter is no longer required in the Flow Aggregator configuration
  • the format for the flowCollectorAddr parameter in the Flow Exporter configuration (antrea-agent) is changed in a non backwards-compatible way. The "host" name must be provided as "flow-aggregator/flow-aggregator" instead of "flow-aggregator.flow-aggregator.svc".

Fixes #3279

Signed-off-by: Antonin Bas abas@vmware.com

@codecov
Copy link

codecov bot commented Jan 10, 2023

Codecov Report

Merging #4548 (a33037b) into main (8961420) will decrease coverage by 0.03%.
The diff coverage is 77.77%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #4548      +/-   ##
==========================================
- Coverage   68.51%   68.49%   -0.03%     
==========================================
  Files         400      400              
  Lines       58216    58192      -24     
==========================================
- Hits        39887    39858      -29     
- Misses      15564    15582      +18     
+ Partials     2765     2752      -13     
Flag Coverage Δ
e2e-tests 37.89% <0.00%> (-0.45%) ⬇️
integration-tests 34.48% <ø> (-0.08%) ⬇️
kind-e2e-tests 47.23% <66.66%> (-0.25%) ⬇️
unit-tests 57.55% <34.72%> (-0.01%) ⬇️
Impacted Files Coverage Δ
cmd/antrea-agent/options.go 36.47% <ø> (ø)
...agent/flowexporter/connections/conntrack_others.go 0.00% <ø> (ø)
pkg/config/flowaggregator/default.go 100.00% <ø> (ø)
pkg/flowaggregator/exporter/ipfix.go 63.11% <50.00%> (ø)
pkg/flowaggregator/certificate.go 75.00% <53.84%> (-0.53%) ⬇️
pkg/agent/flowexporter/exporter/exporter.go 82.25% <92.50%> (+2.06%) ⬆️
pkg/agent/flowexporter/exporter/certificate.go 50.00% <100.00%> (ø)
pkg/agent/util/ethtool/ethtool_linux.go 0.00% <0.00%> (-70.00%) ⬇️
pkg/agent/controller/trafficcontrol/controller.go 77.24% <0.00%> (-5.55%) ⬇️
pkg/agent/wireguard/client_linux.go 77.07% <0.00%> (-4.46%) ⬇️
... and 32 more

@antoninbas antoninbas force-pushed the stop-using-ClusterFirstWithHostNet-dnsPolicy-for-antrea-agent branch 3 times, most recently from 8df0283 to ffaa9ec Compare January 10, 2023 23:59
tnqn
tnqn previously approved these changes Jan 11, 2023
Copy link
Member

@tnqn tnqn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@antoninbas antoninbas added this to the Antrea v1.11 release milestone Jan 11, 2023
@antoninbas antoninbas added the action/release-note Indicates a PR that should be included in release notes. label Jan 11, 2023
heanlan
heanlan previously approved these changes Jan 14, 2023
Copy link
Contributor

@heanlan heanlan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@antoninbas antoninbas dismissed stale reviews from heanlan and tnqn via d3292b5 January 14, 2023 02:08
@antoninbas antoninbas force-pushed the stop-using-ClusterFirstWithHostNet-dnsPolicy-for-antrea-agent branch from ffaa9ec to d3292b5 Compare January 14, 2023 02:08
@antoninbas
Copy link
Contributor Author

/test-all

heanlan
heanlan previously approved these changes Jan 14, 2023
The DaemonSet for a CNI should not use ClusterFirstWithHostNet and
should not have a dependency on cluster DNS, since cluster DNS itself
depends on the CNI. Therefore, we revert to the "default" DNSPolicy for
host-network Pods.

The only dependency we had on cluster DNS was in the Flow Exporter,
where we would try to connect to the Flow Aggregator using DNS name
"flow-aggregator.flow-aggregator.svc" (by default). This would only work
on Linux, and not on Windows (for Windows Nodes, the Cluster IP had to
be provided instead). We update the Flow Exporter to resolve the Service
Cluster IP using the K8s API instead.

Notable changes:
* go-ipfix is upgraded to v0.6.0 to support configuring the server name
  for Flow Aggregator certificate verification in the Flow Exporter
* the flowAggregatorAddress parameter is no longer required in the Flow
  Aggregator configuration
* the format for the flowCollectorAddr parameter in the Flow Exporter
  configuration (antrea-agent) is changed in a non backwards-compatible
  way. The "host" name must be provided as
  "flow-aggregator/flow-aggregator" instead of
  "flow-aggregator.flow-aggregator.svc".

Fixes antrea-io#3279

Signed-off-by: Antonin Bas <abas@vmware.com>
Signed-off-by: Antonin Bas <abas@vmware.com>
@antoninbas antoninbas force-pushed the stop-using-ClusterFirstWithHostNet-dnsPolicy-for-antrea-agent branch from d3292b5 to a33037b Compare January 17, 2023 20:35
@antoninbas
Copy link
Contributor Author

/test-all

Copy link
Contributor

@heanlan heanlan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@antoninbas antoninbas merged commit 08fa578 into antrea-io:main Jan 18, 2023
@antoninbas antoninbas deleted the stop-using-ClusterFirstWithHostNet-dnsPolicy-for-antrea-agent branch January 18, 2023 00:58
GraysonWu pushed a commit to GraysonWu/antrea that referenced this pull request Jan 27, 2023
…-io#4548)

The DaemonSet for a CNI should not use ClusterFirstWithHostNet and
should not have a dependency on cluster DNS, since cluster DNS itself
depends on the CNI. Therefore, we revert to the "default" DNSPolicy for
host-network Pods.

The only dependency we had on cluster DNS was in the Flow Exporter,
where we would try to connect to the Flow Aggregator using DNS name
"flow-aggregator.flow-aggregator.svc" (by default). This would only work
on Linux, and not on Windows (for Windows Nodes, the Cluster IP had to
be provided instead). We update the Flow Exporter to resolve the Service
Cluster IP using the K8s API instead.

Notable changes:
* go-ipfix is upgraded to v0.6.0 to support configuring the server name
  for Flow Aggregator certificate verification in the Flow Exporter
* the flowAggregatorAddress parameter is no longer required in the Flow
  Aggregator configuration
* the format for the flowCollectorAddr parameter in the Flow Exporter
  configuration (antrea-agent) is changed in a non backwards-compatible
  way. The "host" name must be provided as
  "flow-aggregator/flow-aggregator" instead of
  "flow-aggregator.flow-aggregator.svc".

Fixes antrea-io#3279

Signed-off-by: Antonin Bas <abas@vmware.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
action/release-note Indicates a PR that should be included in release notes.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

DNS failures: Antrea ProxyAll and ClusterFirstWithHostNet DNSPolicy for Antrea Agent
3 participants