Automated cherry pick of #5739: Store NetworkPolicy in filesystem as fallback data source #5777: Enable Pod network after realizing initial NetworkPolicies #5795: Support Local ExternalTrafficPolicy for Services with #5798: Fix unit test TestReconcile #5833: Enable IPv4/IPv6 forwarding on demand automatically #5861

Commits on Jan 10, 2024

  1. Store NetworkPolicy in filesystem as fallback data source

    In the previous implementation, traffic from/to a Pod may bypass
    NetworkPolicies applied to the Pod in a time window when the agent
    restarts because realizing NetworkPolicies and enabling forwarding are
    asynchronous.
    
    This patch stores NetworkPolicy data in files when they are received,
    and makes antrea-agent fall back to using the files as the data source if it
    can't connect to antrea-controller on startup. This prevents security
    regression: a NetworkPolicy that has been realized on a Node will
    continue to work even if antrea-controller is not available after
    antrea-agent restarts.
    
    The benchmark results of the storage's operations are as below:
    
    BenchmarkFileStoreAddNetworkPolicy-40              70383             16102 ns/op             520 B/op          9 allocs/op
    BenchmarkFileStoreAddAppliedToGroup-40             45382             25880 ns/op            3019 B/op          9 allocs/op
    BenchmarkFileStoreAddAddressGroup-40                7400            180000 ns/op           49538 B/op          9 allocs/op
    BenchmarkFileStoreReplaceAll-40                       10         127088004 ns/op        17815943 B/op      33099 allocs/op
    
    The disk usage when storing 1k NetworkPolicies, AddressGroups, and
    AppliedToGroups created by BenchmarkFileStoreReplaceAll is as below:
    
    16M     /var/run/antrea-test/file-store/address-groups
    4.0M    /var/run/antrea-test/file-store/applied-to-groups
    4.0M    /var/run/antrea-test/file-store/network-policies
    
    Signed-off-by: Quan Tian <qtian@vmware.com>
    tnqn committed Jan 10, 2024
    Commit: c1cabec
  2. Enable Pod network after realizing initial NetworkPolicies

    Pod network should only be enabled after realizing initial
    NetworkPolicies, otherwise traffic from/to Pods could bypass
    NetworkPolicy when antrea-agent restarts.
    
    After commit f9fc979 ("Store NetworkPolicy in filesystem as
    fallback data source"), antrea-agent can realize either the latest
    NetworkPolicies received from antrea-controller or the ones loaded from
    the filesystem as a fallback. Therefore, waiting for NetworkPolicies to
    be realized should not add noticeable delay or make antrea-controller a
    point of failure for the Pod network.
    
    This commit adds an implementation of wait group capable of waiting with
    a timeout, and uses it to wait for common initialization and
    NetworkPolicy realization before installing any flows for Pods. More
    preconditions can be added via the wait group if needed in the future.
    
    Signed-off-by: Quan Tian <qtian@vmware.com>
    tnqn committed Jan 10, 2024
    Commit: 8ce6443
  3. Support Local ExternalTrafficPolicy for Services with ExternalIPs

    Since K8s 1.29, setting Local ExternalTrafficPolicy for ClusterIP
    Services with ExternalIPs is supported.
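
    For readers unfamiliar with the policy, the effect of Local can be
    sketched as an endpoint filter: traffic arriving at an external address
    on a Node is only sent to endpoints on that same Node. The Endpoint type
    and selectEndpoints function below are hypothetical and do not reflect
    how Antrea implements this; only the policy values "Cluster" and "Local"
    come from the Kubernetes API.

```go
package main

import "fmt"

// Endpoint is a hypothetical, simplified Service endpoint.
type Endpoint struct {
	IP       string
	NodeName string
}

const (
	PolicyCluster = "Cluster"
	PolicyLocal   = "Local"
)

// selectEndpoints returns the endpoints eligible for traffic that arrived at
// an external address on the given node. With Local, only endpoints on that
// node qualify; with any other value, all endpoints do.
func selectEndpoints(policy, node string, endpoints []Endpoint) []Endpoint {
	if policy != PolicyLocal {
		return endpoints
	}
	var local []Endpoint
	for _, ep := range endpoints {
		if ep.NodeName == node {
			local = append(local, ep)
		}
	}
	return local
}

func main() {
	eps := []Endpoint{
		{IP: "10.0.0.1", NodeName: "node-a"},
		{IP: "10.0.1.1", NodeName: "node-b"},
	}
	fmt.Println(selectEndpoints(PolicyCluster, "node-a", eps))
	fmt.Println(selectEndpoints(PolicyLocal, "node-a", eps))
}
```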
    
    Signed-off-by: Quan Tian <qtian@vmware.com>
    tnqn committed Jan 10, 2024
    Commit: 3b6f3dc
  4. Fix unit test TestReconcile

    cniServer.reconcile() now installs flows asynchronously.
    
    Signed-off-by: Quan Tian <qtian@vmware.com>
    tnqn committed Jan 10, 2024
    Commit: bba0a25
  5. Enable IPv4/IPv6 forwarding on demand automatically

    Although it has been documented as a prerequisite in [1], some
    platforms do not enable IP forwarding by default. kube-proxy in IPVS
    mode and some CNIs enable it themselves to ensure that Pod networking
    works properly.
    
    As Antrea needs IP forwarding to be enabled, there seems to be no reason
    for it not to do so itself, rather than expecting users or other
    components to do it.
    
    [1] https://kubernetes.io/docs/setup/production-environment/container-runtimes/#install-and-configure-prerequisites
    
    Signed-off-by: Quan Tian <qtian@vmware.com>
    tnqn committed Jan 10, 2024
    Commit: a71c40d