Fix libOpenflow crash for some Traceflow requests #1883
libOpenflow panics if OVS sends a packet-in message with a NXM_NX_PKT_MARK match field. Since antrea-io#1816, this can happen: Traceflow requests for the Node IP can lead to reply traffic with the packet mark set, which is sent to the Antrea Agent as a PacketIn message. To resolve this issue, we first switch temporarily to a patched libOpenflow version that does not panic. Once the patch is ported to upstream libOpenflow, we can remove the replace directive from go.mod. Fixes antrea-io#1878
The libOpenflow patch: antoninbas/libOpenflow@32f2e57
@gran-vmv @wenyingd do you know why so many switch cases are not handled correctly in https://github.com/contiv/libOpenflow/blob/01db743640b1c89bc14629581dc94e8250f782c4/openflow13/match.go#L219?
Codecov Report
@@ Coverage Diff @@
## main #1883 +/- ##
=======================================
Coverage ? 61.73%
=======================================
Files ? 199
Lines ? 17223
Branches ? 0
=======================================
Hits ? 10633
Misses ? 5491
Partials ? 1099
PR antrea-io#1883 fixes a panic in libOpenflow triggered when OVS receives reply traffic for a Traceflow request with a valid dataplane tag as the ToS field and the Linux packet mark set. However, reply packets for Traceflow requests are generally meaningless and should be ignored. In encapMode, the Traceflow implementation should also not time out when a Traceflow request leaves the overlay: as soon as the request is forwarded through the gateway port, we should consider the request complete and ignore any potential reply packet.

So we include the following changes:

* add a new "ForwardedOutOfOverlay" Traceflow action when a request is forwarded out of the network managed by Antrea in encapMode. The Controller can then mark the request as "succeeded". In theory, something similar could be done for other traffic modes, but it would be much more complex.
* add support for Traceflow requests for which the destination is the gateway's IP, by reporting a "Delivered" action.
* add an OVS flow in charge of dropping reply traffic for Traceflow requests (using the conntrack state to match this traffic), thus ensuring it is not sent to the Agent. In our testing, this is especially useful when the destination IP is the local Node's IP, as the IP ToS field seems to be preserved in that case, causing the reply packet to be treated as a Traceflow request.

We add end-to-end tests for both cases (external destination IP and Antrea gateway destination IP).

See antrea-io#1878
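The ordering of the three changes can be modeled roughly as follows. This is a sketch under stated assumptions, not Antrea's implementation: in the PR the reply traffic is dropped by an OVS flow rather than by Go code, and the `tracePacket` type, the `classify` function, and the "Forwarded" fallback are hypothetical names introduced for the example. Only the "Delivered" and "ForwardedOutOfOverlay" action names come from the description above.

```go
package main

import "fmt"

// action is the Traceflow action reported for a packet.
type action string

const (
	actionDelivered             action = "Delivered"
	actionForwardedOutOfOverlay action = "ForwardedOutOfOverlay"
	actionForwarded             action = "Forwarded" // hypothetical fallback
)

// tracePacket is a hypothetical summary of what is known about a packet
// carrying a Traceflow dataplane tag.
type tracePacket struct {
	IsReply          bool // conntrack state marks it as reply traffic
	DstIsGatewayIP   bool // destination is the Antrea gateway's IP
	OutToGatewayPort bool // packet is output through the gateway port
}

// classify returns the action to report and whether the packet should be
// reported at all. Replies are ignored first, so a reply that still
// carries the tag in its ToS field is never treated as a request.
func classify(p tracePacket, encapMode bool) (action, bool) {
	switch {
	case p.IsReply:
		return "", false
	case p.DstIsGatewayIP:
		return actionDelivered, true
	case encapMode && p.OutToGatewayPort:
		return actionForwardedOutOfOverlay, true
	default:
		return actionForwarded, true
	}
}

func main() {
	a, report := classify(tracePacket{OutToGatewayPort: true}, true)
	fmt.Println(a, report)

	a, report = classify(tracePacket{IsReply: true, OutToGatewayPort: true}, true)
	fmt.Println(a, report)
}
```

Checking the reply condition before anything else mirrors the intent of the drop flow: once a request has left the overlay or been delivered, nothing that comes back should reopen it.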
LGTM.
We need to discuss the libOpenflow cases with @wenyingd and redirect libOpenflow in go.mod back to the original repo.
/test-all