Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make disabling TX checksum offload configurable #3832

Merged
merged 1 commit into from
May 27, 2022
Merged

Conversation

tnqn
Copy link
Member

@tnqn tnqn commented May 26, 2022

Add a configuration parameter disableTXChecksumOffload for Antrea Agent to support cases in which the datapath doesn't support TX checksum offloading, which causes packets to be dropped due to bad checksum.

Signed-off-by: Quan Tian qtian@vmware.com

This is required when using TrafficControl Redirect action with threat detection engines' certain packet capture methods, e.g. Suricata AF_PACKET, with which skb csum state is messed up after it's processed by engines' userspace, leading to packet drop on destination.

@tnqn tnqn requested review from antoninbas and jianjuns May 26, 2022 15:10
@tnqn tnqn added the action/release-note Indicates a PR that should be included in release notes. label May 26, 2022
@tnqn tnqn added this to the Antrea v1.7 release milestone May 26, 2022
@tnqn tnqn mentioned this pull request May 26, 2022
5 tasks
@codecov-commenter
Copy link

codecov-commenter commented May 26, 2022

Codecov Report

Merging #3832 (6fd87c5) into main (6445ee1) will decrease coverage by 16.08%.
The diff coverage is 85.71%.

❗ Current head 6fd87c5 differs from pull request most recent head 5196f83. Consider uploading reports for the commit 5196f83 to get more accurate results

Impacted file tree graph

@@             Coverage Diff             @@
##             main    #3832       +/-   ##
===========================================
- Coverage   62.18%   46.10%   -16.09%     
===========================================
  Files         281      247       -34     
  Lines       40091    35913     -4178     
===========================================
- Hits        24931    16556     -8375     
- Misses      13190    17720     +4530     
+ Partials     1970     1637      -333     
Flag Coverage Δ
e2e-tests 46.10% <85.71%> (?)
kind-e2e-tests ?
unit-tests ?

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...g/agent/cniserver/interface_configuration_linux.go 12.77% <75.00%> (-3.98%) ⬇️
pkg/agent/cniserver/pod_configuration.go 44.84% <100.00%> (-9.75%) ⬇️
pkg/agent/cniserver/server.go 41.88% <100.00%> (-25.39%) ⬇️
pkg/controller/networkpolicy/endpoint_querier.go 4.58% <0.00%> (-88.08%) ⬇️
pkg/util/flowexport/flowexport.go 0.00% <0.00%> (-84.62%) ⬇️
pkg/agent/util/iptables/lock.go 0.00% <0.00%> (-80.00%) ⬇️
pkg/controller/networkpolicy/crd_utils.go 13.63% <0.00%> (-78.58%) ⬇️
pkg/cni/client.go 0.00% <0.00%> (-77.78%) ⬇️
...lowaggregator/clickhouseclient/clickhouseclient.go 0.00% <0.00%> (-76.62%) ⬇️
pkg/controller/externalippool/validate.go 0.00% <0.00%> (-75.87%) ⬇️
... and 156 more

@tnqn
Copy link
Member Author

tnqn commented May 26, 2022

/test-all

Copy link
Contributor

@jianjuns jianjuns left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tnqn : does that mean to use Suricata for IPS/FW, we must disable checksum offload? It sounds not ideal.

@@ -48,7 +48,7 @@ type ifConfigurator struct {
epCache *sync.Map
}

func newInterfaceConfigurator(ovsDatapathType ovsconfig.OVSDatapathType, isOvsHardwareOffloadEnabled bool) (*ifConfigurator, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a comment saying disableTXChecksumOffload is ignored on Windows?

@@ -170,6 +170,10 @@ data:
# `trafficEncapMode` is `noEncap`, and `noSNAT` is true.
enableBridgingMode: false

# Disable TX checksum offloading for container network interfaces. It's supposed to be set to true when the
# datapath doesn't support TX checksum offloading, which causes packets to be dropped due to bad checksum.
disableTXChecksumOffload: false
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should say it is on Linux only?

@tnqn
Copy link
Member Author

tnqn commented May 26, 2022

@tnqn : does that mean to use Suricata for IPS/FW, we must disable checksum offload? It sounds not ideal.

Yes. I remember it's also the case of Snort. I saw this issue in several places where packets are passed to userspace and sent back to kernel for transmission. Not sure if it's specific to AF_PACKET, I will try other capture methods.

Add a configuration parameter `disableTXChecksumOffload` for Antrea
Agent to support cases in which the datapath doesn't support TX checksum
offloading, which causes packets to be dropped due to bad checksum.

Signed-off-by: Quan Tian <qtian@vmware.com>
Copy link
Contributor

@jianjuns jianjuns left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. Anyway, let us merge it for this release. And we can think about any better solution (seems not quite possible).

@tnqn
Copy link
Member Author

tnqn commented May 26, 2022

/test-all

Copy link
Contributor

@antoninbas antoninbas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm also fine with this change, but I agree with Jianjun that it is really not ideal if this is required when integrating with IDS solutions.

@tnqn
Copy link
Member Author

tnqn commented May 27, 2022

Sure @jianjuns @antoninbas, I will explore better solutions.

@tnqn tnqn merged commit 05504f0 into antrea-io:main May 27, 2022
@tnqn tnqn deleted the tx-offload branch May 27, 2022 08:09
hongliangl added a commit to hongliangl/antrea that referenced this pull request Jun 30, 2022
This is a supplement to PR antrea-io#3832. When `disableTXChecksumOffload`
is true, TX checksum offload should be also disabled, otherwise
for the cases in which the datapath doesn't support TX checksum
offloading, packets received on Antrea gateway could be dropped
due to bad checksum.

Signed-off-by: Hongliang Liu <lhongliang@vmware.com>
hongliangl added a commit to hongliangl/antrea that referenced this pull request Jul 1, 2022
This is a supplement to PR antrea-io#3832. When `disableTXChecksumOffload`
is true, TX checksum offload of Antrea gateway should be also
disabled, otherwise for the cases in which the datapath doesn't
support TX checksum offloading, packets sent from Antrea gateway
could be dropped due to bad checksum.

Note that, when changing `disableTXChecksumOffload` from true back
to false, TX checksum offload of Antrea gateway will not be enabled
automatically, and TX checksum offload can be enabled manually with
ethtool. Another way is to remove Antrea gateway interface before
updating `disableTXChecksumOffload`.

Signed-off-by: Hongliang Liu <lhongliang@vmware.com>
hongliangl added a commit to hongliangl/antrea that referenced this pull request Sep 27, 2022
This is a supplement to PR antrea-io#3832. When `disableTXChecksumOffload`
is true, TX checksum offload of Antrea gateway should be also
disabled, otherwise for the cases in which the datapath doesn't
support TX checksum offloading, packets sent from Antrea gateway
could be dropped due to bad checksum.

Note that, when changing `disableTXChecksumOffload` from true back
to false, TX checksum offload of Antrea gateway will not be enabled
automatically, and TX checksum offload can be enabled manually with
ethtool. Another way is to remove Antrea gateway interface before
updating `disableTXChecksumOffload`.

Signed-off-by: Hongliang Liu <lhongliang@vmware.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
action/release-note Indicates a PR that should be included in release notes.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants