
Fix the flaky test TestWorker in idAllocator #1549

Merged: 1 commit merged into antrea-io:master on Dec 8, 2020

Conversation

srikartati
Member

Changed the name from TestWorker to TestIDAllocatorWorker

Fixes #1546

@antrea-bot
Collaborator

Thanks for your PR.
Unit tests and code linters are run automatically every time the PR is updated.
E2e, conformance and network policy tests can only be triggered by a member of the vmware-tanzu organization. Regular contributors to the project should join the org.

The following commands are available:

  • /test-e2e: to trigger e2e tests.
  • /skip-e2e: to skip e2e tests.
  • /test-conformance: to trigger conformance tests.
  • /skip-conformance: to skip conformance tests.
  • /test-all-features-conformance: to trigger conformance tests with all alpha features enabled.
  • /skip-all-features-conformance: to skip conformance tests with all alpha features enabled.
  • /test-whole-conformance: to trigger all conformance tests on linux.
  • /skip-whole-conformance: to skip all conformance tests on linux.
  • /test-networkpolicy: to trigger networkpolicy tests.
  • /skip-networkpolicy: to skip networkpolicy tests.
  • /test-windows-conformance: to trigger windows conformance tests.
  • /skip-windows-conformance: to skip windows conformance tests.
  • /test-windows-networkpolicy: to trigger windows networkpolicy tests.
  • /skip-windows-networkpolicy: to skip windows networkpolicy tests.
  • /test-hw-offload: to trigger ovs hardware offload test.
  • /skip-hw-offload: to skip ovs hardware offload test.
  • /test-all: to trigger all tests (except whole conformance).
  • /skip-all: to skip all tests (except whole conformance).

@srikartati
Member Author

/test-all

@codecov-io

codecov-io commented Nov 13, 2020

Codecov Report

Merging #1549 (56ca769) into master (9d3d10b) will increase coverage by 0.89%.
The diff coverage is 58.21%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1549      +/-   ##
==========================================
+ Coverage   63.31%   64.21%   +0.89%     
==========================================
  Files         170      181      +11     
  Lines       14250    15344    +1094     
==========================================
+ Hits         9023     9853     +830     
- Misses       4292     4462     +170     
- Partials      935     1029      +94     
Flag Coverage Δ
e2e-tests 47.81% <38.05%> (?)
kind-e2e-tests 52.52% <52.43%> (-2.87%) ⬇️
unit-tests 40.41% <19.59%> (-0.86%) ⬇️

Flags with carried forward coverage won't be shown.

Impacted Files Coverage Δ
cmd/antrea-agent/agent.go 0.00% <0.00%> (ø)
cmd/antrea-agent/options.go 21.69% <0.00%> (+0.97%) ⬆️
.../agent/apiserver/handlers/networkpolicy/handler.go 58.33% <ø> (ø)
...gent/controller/noderoute/node_route_controller.go 61.04% <ø> (+14.58%) ⬆️
pkg/agent/stats/collector.go 97.72% <ø> (ø)
pkg/antctl/antctl.go 100.00% <ø> (ø)
pkg/antctl/transform/appliedtogroup/transform.go 0.00% <0.00%> (ø)
pkg/antctl/transform/controllerinfo/transform.go 0.00% <ø> (ø)
pkg/antctl/transform/version/transform.go 44.82% <ø> (ø)
pkg/controller/networkpolicy/tier.go 90.00% <ø> (ø)
... and 93 more

Contributor

@antoninbas antoninbas left a comment

Not introduced by this PR, but I am having a bit of a hard time understanding the difference between the two subtests "delete-rule-with-async-delete-interval" and "delete-rule-with-flow-poll-interval". I think for "delete-rule-with-flow-poll-interval", you expect the deletion delay to be smaller (1ms instead of 5ms), but at these time scales, and based on the rest of the code, it doesn't seem to be something that you actually check for.

@@ -222,7 +223,8 @@ func TestWorker(t *testing.T) {
 }
 return false, nil
 }
-if err := wait.PollImmediate(time.Millisecond, time.Millisecond*10, conditionFunc); err != nil {
+if err := wait.PollImmediate(time.Millisecond, time.Second, conditionFunc); err != nil {
Contributor

does it make sense to poll every millisecond?

Also shouldn't we check that the rule is not deleted immediately when we call forgetRule (which would probably require increasing the delete interval a bit so that the test is reliable)?

Member Author

@srikartati srikartati Nov 17, 2020

If we poll every millisecond, we could check whether the rule is deleted immediately or whether it stays in the async cache for the expected amount of time.
I was checking the elapsed time (L235) relative to the time just before forgetRule is called, to see whether the rule stays in the cache for at least allocator.deleteInterval. Previously I used a hardcoded value of 5 ms; now I changed it to a.deleteInterval.
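
A minimal sketch of that check (illustrative only; a, forgetRule, getRuleFromAsyncCache, and deleteInterval are the allocator and helpers referenced in this thread, and the poll interval and timeout values are placeholders rather than the exact ones in allocator_test.go):

    start := time.Now()
    a.forgetRule(tt.rule.FlowID) // queues the rule ID for asynchronous deletion

    conditionFunc := func() (bool, error) {
        _, exists, err := a.getRuleFromAsyncCache(tt.expectedID)
        if err != nil {
            return false, err
        }
        // Done once the rule has been removed from the async rule cache.
        return !exists, nil
    }
    if err := wait.PollImmediate(100*time.Millisecond, 5*time.Second, conditionFunc); err != nil {
        t.Fatalf("Expect the rule with id %v to be deleted from async rule cache", tt.expectedID)
    }

    // The rule must have stayed in the async cache for at least the configured delete interval.
    elapsedTime := time.Since(start)
    assert.GreaterOrEqualf(t, int64(elapsedTime), int64(a.deleteInterval), "rule should stay in the cache for at least %v", a.deleteInterval)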

Contributor

I still think we should use longer time intervals to reduce the probability of flakes in CI, e.g. in the 100s of ms.

Member Author

done

Member Author

@srikartati srikartati left a comment

Not introduced by this PR, but I am having a bit of a hard time understanding the difference between the two subtests "delete-rule-with-async-delete-interval" and "delete-rule-with-flow-poll-interval". I think for "delete-rule-with-flow-poll-interval", you expect the deletion delay to be smaller (1ms instead of 5ms), but at these time scales, and based on the rest of the code, it doesn't seem to be something that you actually check for.

Thanks for the question. The idea was to have two cases for allocator.deleteInterval: one for minAsyncDeleteInterval (10ms) and one for asyncDeleteInterval (5ms). Previously, the test picked a value that did not provide complete coverage, because minAsyncDeleteInterval and asyncDeleteInterval were equal. I changed the value to 10ms instead of 5ms to get the corresponding coverage in newIDAllocator, and modified the subtest names to match, replacing the old names that corresponded to the old variables.
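
For context, a rough sketch of the clamping behavior these two cases are meant to cover (hypothetical code; the actual newIDAllocator signature, field names, and minimum value in allocator.go may differ):

    // minAsyncDeleteInterval is a package-level lower bound (overridden in tests);
    // the value here is illustrative.
    var minAsyncDeleteInterval = 5 * time.Second

    type idAllocator struct {
        deleteInterval time.Duration
    }

    // newIDAllocator clamps the configured interval so that a rule ID is never
    // released sooner than minAsyncDeleteInterval after forgetRule is called.
    func newIDAllocator(asyncDeleteInterval time.Duration) *idAllocator {
        deleteInterval := asyncDeleteInterval
        if deleteInterval < minAsyncDeleteInterval {
            deleteInterval = minAsyncDeleteInterval
        }
        return &idAllocator{deleteInterval: deleteInterval}
    }

With this behavior, one subtest exercises the branch where the configured interval wins, and the other the branch where the minimum wins.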

@@ -222,7 +223,8 @@ func TestWorker(t *testing.T) {
 }
 return false, nil
 }
-if err := wait.PollImmediate(time.Millisecond, time.Millisecond*10, conditionFunc); err != nil {
+if err := wait.PollImmediate(time.Millisecond, time.Second, conditionFunc); err != nil {
Member Author

@srikartati srikartati Nov 17, 2020

If we poll every millisecond, we could check whether the rule is deleted immediately or whether it stays in the async cache for the expected amount of time.
I was checking the elapsed time (L235) relative to the time just before forgetRule is called, to see whether the rule stays in the cache for at least allocator.deleteInterval. Previously I used a hardcoded value of 5 ms; now I changed it to a.deleteInterval.

Contributor

@antoninbas antoninbas left a comment

I hope this gets rid of the Windows unit test failures in CI, which seem to be pretty consistent now (50% of the time):

2020-11-21T00:33:25.5638749Z ok  	github.com/vmware-tanzu/antrea/pkg/agent/config	1.044s
2020-11-21T00:33:46.3667470Z ==================
2020-11-21T00:33:46.3668619Z WARNING: DATA RACE
2020-11-21T00:33:46.3669618Z Write at 0x00c000475fe8 by goroutine 90:
2020-11-21T00:33:46.3674024Z   github.com/vmware-tanzu/antrea/pkg/agent/controller/networkpolicy.(*idAllocator).allocateForRule()
2020-11-21T00:33:46.3675458Z       D:/a/antrea/antrea/pkg/agent/controller/networkpolicy/allocator.go:125 +0x3c4
2020-11-21T00:33:46.3676923Z   github.com/vmware-tanzu/antrea/pkg/agent/controller/networkpolicy.TestWorker.func1()
2020-11-21T00:33:46.3678348Z       D:/a/antrea/antrea/pkg/agent/controller/networkpolicy/allocator_test.go:207 +0x155
2020-11-21T00:33:46.3679554Z   testing.tRunner()
2020-11-21T00:33:46.3680655Z       C:/hostedtoolcache/windows/go/1.15.5/x64/src/testing/testing.go:1123 +0x202
2020-11-21T00:33:46.3681300Z 
2020-11-21T00:33:46.3682007Z Previous read at 0x00c000475fe8 by goroutine 88:
2020-11-21T00:33:46.3683346Z   github.com/vmware-tanzu/antrea/pkg/agent/controller/networkpolicy.asyncRuleCacheKeyFunc()
2020-11-21T00:33:46.3684793Z       D:/a/antrea/antrea/pkg/agent/controller/networkpolicy/allocator.go:64 +0x5a
2020-11-21T00:33:46.3685757Z   k8s.io/client-go/tools/cache.(*cache).Delete()
2020-11-21T00:33:46.3686772Z       C:/Users/runneradmin/go/pkg/mod/github.com/tnqn/client-go@v0.18.4-1/tools/cache/store.go:166 +0x76
2020-11-21T00:33:46.3688416Z   github.com/vmware-tanzu/antrea/pkg/agent/controller/networkpolicy.(*idAllocator).processDeleteQueueItem()
2020-11-21T00:33:46.3689892Z       D:/a/antrea/antrea/pkg/agent/controller/networkpolicy/allocator.go:167 +0x303
2020-11-21T00:33:46.3691232Z   github.com/vmware-tanzu/antrea/pkg/agent/controller/networkpolicy.(*idAllocator).worker()
2020-11-21T00:33:46.3692408Z       D:/a/antrea/antrea/pkg/agent/controller/networkpolicy/allocator.go:147 +0x44
2020-11-21T00:33:46.3693727Z   github.com/vmware-tanzu/antrea/pkg/agent/controller/networkpolicy.(*idAllocator).worker-fm()
2020-11-21T00:33:46.3694952Z       D:/a/antrea/antrea/pkg/agent/controller/networkpolicy/allocator.go:146 +0x48
2020-11-21T00:33:46.3696059Z   k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1()
2020-11-21T00:33:46.3697170Z       C:/Users/runneradmin/go/pkg/mod/k8s.io/apimachinery@v0.18.4/pkg/util/wait/wait.go:155 +0x7c
2020-11-21T00:33:46.3698248Z   k8s.io/apimachinery/pkg/util/wait.BackoffUntil()
2020-11-21T00:33:46.3699411Z       C:/Users/runneradmin/go/pkg/mod/k8s.io/apimachinery@v0.18.4/pkg/util/wait/wait.go:156 +0xba
2020-11-21T00:33:46.3700440Z   k8s.io/apimachinery/pkg/util/wait.JitterUntil()
2020-11-21T00:33:46.3701632Z       C:/Users/runneradmin/go/pkg/mod/k8s.io/apimachinery@v0.18.4/pkg/util/wait/wait.go:133 +0x114
2020-11-21T00:33:46.3702833Z   k8s.io/apimachinery/pkg/util/wait.Until()
2020-11-21T00:33:46.3703694Z       C:/Users/runneradmin/go/pkg/mod/k8s.io/apimachinery@v0.18.4/pkg/util/wait/wait.go:90 +0x64
2020-11-21T00:33:46.3704255Z 
2020-11-21T00:33:46.3704747Z Goroutine 90 (running) created at:
2020-11-21T00:33:46.3705356Z   testing.(*T).Run()
2020-11-21T00:33:46.3706027Z       C:/hostedtoolcache/windows/go/1.15.5/x64/src/testing/testing.go:1168 +0x5bb
2020-11-21T00:33:46.3707214Z   github.com/vmware-tanzu/antrea/pkg/agent/controller/networkpolicy.TestWorker()
2020-11-21T00:33:46.3708402Z       D:/a/antrea/antrea/pkg/agent/controller/networkpolicy/allocator_test.go:204 +0x485
2020-11-21T00:33:46.3709163Z   testing.tRunner()
2020-11-21T00:33:46.3709919Z       C:/hostedtoolcache/windows/go/1.15.5/x64/src/testing/testing.go:1123 +0x202
2020-11-21T00:33:46.3710387Z 
2020-11-21T00:33:46.3710914Z Goroutine 88 (running) created at:
2020-11-21T00:33:46.3711995Z   github.com/vmware-tanzu/antrea/pkg/agent/controller/networkpolicy.TestWorker.func1()
2020-11-21T00:33:46.3713223Z       D:/a/antrea/antrea/pkg/agent/controller/networkpolicy/allocator_test.go:213 +0x3b8
2020-11-21T00:33:46.3713977Z   testing.tRunner()
2020-11-21T00:33:46.3714718Z       C:/hostedtoolcache/windows/go/1.15.5/x64/src/testing/testing.go:1123 +0x202
2020-11-21T00:33:46.3715326Z ==================
2020-11-21T00:33:46.3715835Z --- FAIL: TestWorker (0.03s)
2020-11-21T00:33:46.3716844Z     --- FAIL: TestWorker/delete-rule-with-async-delete-interval (0.02s)
2020-11-21T00:33:46.3733030Z ##[error]        allocator_test.go:226: Expect the rule with id 1 to be deleted from async rule cache
2020-11-21T00:33:46.3751176Z     --- FAIL: TestWorker/delete-rule-with-flow-poll-interval (0.01s)
2020-11-21T00:33:46.3753367Z ##[error]        testing.go:1038: race detected during execution of test
2020-11-21T00:33:46.3755479Z ##[error]    testing.go:1038: race detected during execution of test
2020-11-21T00:33:46.3757112Z I1121 00:33:33.287118    5780 networkpolicy_controller.go:363] Waiting for Antrea client to be ready
2020-11-21T00:33:46.3758663Z I1121 00:33:33.327118    5780 networkpolicy_controller.go:374] Antrea client is ready
2020-11-21T00:33:46.3759960Z I1121 00:33:33.328120    5780 networkpolicy_controller.go:383] Waiting for all watchers to complete full sync
2020-11-21T00:33:46.3761030Z I1121 00:33:33.328120    5780 networkpolicy_controller.go:564] Starting watch for AddressGroup
2020-11-21T00:33:46.3762006Z I1121 00:33:33.328120    5780 networkpolicy_controller.go:571] Started watch for AddressGroup
2020-11-21T00:33:46.3763049Z I1121 00:33:33.328120    5780 networkpolicy_controller.go:564] Starting watch for AppliedToGroup
2020-11-21T00:33:46.3764064Z I1121 00:33:33.328120    5780 networkpolicy_controller.go:571] Started watch for AppliedToGroup
2020-11-21T00:33:46.3765108Z I1121 00:33:33.328120    5780 networkpolicy_controller.go:564] Starting watch for NetworkPolicy
2020-11-21T00:33:46.3766122Z I1121 00:33:33.328120    5780 networkpolicy_controller.go:571] Started watch for NetworkPolicy
2020-11-21T00:33:46.3767371Z I1121 00:33:33.328120    5780 networkpolicy_controller.go:601] Received 1 init events for NetworkPolicy
2020-11-21T00:33:46.3769683Z I1121 00:33:33.328120    5780 networkpolicy_controller.go:191] NetworkPolicy K8sNetworkPolicy:ns1/policy1 applied to Pods on this Node
2020-11-21T00:33:46.3770951Z I1121 00:33:33.429127    5780 networkpolicy_controller.go:601] Received 1 init events for AddressGroup
2020-11-21T00:33:46.3772029Z I1121 00:33:33.530129    5780 networkpolicy_controller.go:601] Received 1 init events for AppliedToGroup
2020-11-21T00:33:46.3773242Z I1121 00:33:33.530129    5780 networkpolicy_controller.go:385] All watchers have completed full sync, installing flows for init events
2020-11-21T00:33:46.3774370Z I1121 00:33:33.530129    5780 networkpolicy_controller.go:389] Starting NetworkPolicy workers now
2020-11-21T00:33:46.3775680Z I1121 00:33:33.531131    5780 networkpolicy_controller.go:395] Starting IDAllocator worker to maintain the async rule cache
2020-11-21T00:33:46.3776787Z I1121 00:33:33.532131    5780 networkpolicy_controller.go:363] Waiting for Antrea client to be ready
2020-11-21T00:33:46.3777700Z I1121 00:33:33.532131    5780 networkpolicy_controller.go:374] Antrea client is ready
2020-11-21T00:33:46.3778751Z I1121 00:33:33.533131    5780 networkpolicy_controller.go:383] Waiting for all watchers to complete full sync
2020-11-21T00:33:46.3779743Z I1121 00:33:33.533131    5780 networkpolicy_controller.go:564] Starting watch for AddressGroup
2020-11-21T00:33:46.3780756Z I1121 00:33:33.533131    5780 networkpolicy_controller.go:571] Started watch for AddressGroup
2020-11-21T00:33:46.3781741Z I1121 00:33:33.533131    5780 networkpolicy_controller.go:564] Starting watch for AppliedToGroup
2020-11-21T00:33:46.3782806Z I1121 00:33:33.533131    5780 networkpolicy_controller.go:571] Started watch for AppliedToGroup
2020-11-21T00:33:46.3783860Z I1121 00:33:33.533131    5780 networkpolicy_controller.go:601] Received 1 init events for AppliedToGroup
2020-11-21T00:33:46.3784943Z I1121 00:33:33.533131    5780 networkpolicy_controller.go:564] Starting watch for NetworkPolicy
2020-11-21T00:33:46.3785928Z I1121 00:33:33.533131    5780 networkpolicy_controller.go:571] Started watch for NetworkPolicy
2020-11-21T00:33:46.3786986Z I1121 00:33:33.533131    5780 networkpolicy_controller.go:601] Received 1 init events for NetworkPolicy
2020-11-21T00:33:46.3788197Z I1121 00:33:33.533131    5780 networkpolicy_controller.go:191] NetworkPolicy K8sNetworkPolicy:ns1/policy1 applied to Pods on this Node
2020-11-21T00:33:46.3789471Z I1121 00:33:33.533131    5780 networkpolicy_controller.go:601] Received 1 init events for AddressGroup
2020-11-21T00:33:46.3790658Z I1121 00:33:33.533131    5780 networkpolicy_controller.go:385] All watchers have completed full sync, installing flows for init events
2020-11-21T00:33:46.3791834Z E1121 00:33:33.534130    5780 networkpolicy_controller.go:498] Rule 8af48c355979fb68 is not complete or does not exist in cache
2020-11-21T00:33:46.3794063Z I1121 00:33:33.534130    5780 networkpolicy_controller.go:389] Starting NetworkPolicy workers now
2020-11-21T00:33:46.3795192Z I1121 00:33:33.534130    5780 networkpolicy_controller.go:395] Starting IDAllocator worker to maintain the async rule cache
2020-11-21T00:33:46.3796335Z I1121 00:33:33.534130    5780 status_controller.go:209] Starting NetworkPolicy StatusController
2020-11-21T00:33:46.3797373Z I1121 00:33:33.535131    5780 status_controller.go:209] Starting NetworkPolicy StatusController
2020-11-21T00:33:46.3798512Z I1121 00:33:33.535131    5780 status_controller.go:216] Shutting down NetworkPolicy StatusController
2020-11-21T00:33:46.3799573Z I1121 00:33:33.734142    5780 status_controller.go:216] Shutting down NetworkPolicy StatusController
2020-11-21T00:33:46.3800629Z I1121 00:33:33.735144    5780 networkpolicy_controller.go:363] Waiting for Antrea client to be ready
2020-11-21T00:33:46.3801592Z I1121 00:33:33.735144    5780 networkpolicy_controller.go:374] Antrea client is ready
2020-11-21T00:33:46.3802553Z I1121 00:33:33.735144    5780 networkpolicy_controller.go:383] Waiting for all watchers to complete full sync
2020-11-21T00:33:46.3803625Z I1121 00:33:33.735144    5780 networkpolicy_controller.go:564] Starting watch for AppliedToGroup
2020-11-21T00:33:46.3804669Z I1121 00:33:33.735144    5780 networkpolicy_controller.go:571] Started watch for AppliedToGroup
2020-11-21T00:33:46.3805694Z I1121 00:33:33.735144    5780 networkpolicy_controller.go:564] Starting watch for AddressGroup
2020-11-21T00:33:46.3806659Z I1121 00:33:33.736144    5780 networkpolicy_controller.go:571] Started watch for AddressGroup
2020-11-21T00:33:46.3807682Z I1121 00:33:33.736144    5780 networkpolicy_controller.go:564] Starting watch for NetworkPolicy
2020-11-21T00:33:46.3808695Z I1121 00:33:33.736144    5780 networkpolicy_controller.go:571] Started watch for NetworkPolicy
2020-11-21T00:33:46.3809862Z I1121 00:33:33.736144    5780 networkpolicy_controller.go:601] Received 1 init events for AppliedToGroup
2020-11-21T00:33:46.3810930Z I1121 00:33:33.736144    5780 networkpolicy_controller.go:601] Received 1 init events for AddressGroup
2020-11-21T00:33:46.3812020Z I1121 00:33:33.736144    5780 networkpolicy_controller.go:601] Received 1 init events for NetworkPolicy
2020-11-21T00:33:46.3813271Z I1121 00:33:33.736144    5780 networkpolicy_controller.go:191] NetworkPolicy K8sNetworkPolicy:ns1/policy1 applied to Pods on this Node
2020-11-21T00:33:46.3814563Z I1121 00:33:33.736144    5780 networkpolicy_controller.go:385] All watchers have completed full sync, installing flows for init events
2020-11-21T00:33:46.3815729Z I1121 00:33:33.736144    5780 networkpolicy_controller.go:389] Starting NetworkPolicy workers now
2020-11-21T00:33:46.3816814Z I1121 00:33:33.737142    5780 networkpolicy_controller.go:395] Starting IDAllocator worker to maintain the async rule cache
2020-11-21T00:33:46.3818152Z I1121 00:33:33.737142    5780 networkpolicy_controller.go:175] NetworkPolicy K8sNetworkPolicy:ns1/policy1 no longer applied to Pods on this Node
2020-11-21T00:33:46.3819366Z I1121 00:33:33.738142    5780 networkpolicy_controller.go:363] Waiting for Antrea client to be ready
2020-11-21T00:33:46.3820332Z I1121 00:33:33.738142    5780 networkpolicy_controller.go:374] Antrea client is ready
2020-11-21T00:33:46.3821285Z I1121 00:33:33.739141    5780 networkpolicy_controller.go:383] Waiting for all watchers to complete full sync
2020-11-21T00:33:46.3822330Z I1121 00:33:33.739141    5780 networkpolicy_controller.go:564] Starting watch for NetworkPolicy
2020-11-21T00:33:46.3823311Z I1121 00:33:33.739141    5780 networkpolicy_controller.go:571] Started watch for NetworkPolicy
2020-11-21T00:33:46.3824364Z I1121 00:33:33.739141    5780 status_controller.go:209] Starting NetworkPolicy StatusController
2020-11-21T00:33:46.3825455Z I1121 00:33:33.739141    5780 status_controller.go:216] Shutting down NetworkPolicy StatusController
2020-11-21T00:33:46.3829319Z I1121 00:33:33.739141    5780 networkpolicy_controller.go:564] Starting watch for AddressGroup
2020-11-21T00:33:46.3830552Z I1121 00:33:33.739141    5780 networkpolicy_controller.go:571] Started watch for AddressGroup
2020-11-21T00:33:46.3831563Z I1121 00:33:33.739141    5780 networkpolicy_controller.go:564] Starting watch for AppliedToGroup
2020-11-21T00:33:46.3832629Z I1121 00:33:33.739141    5780 networkpolicy_controller.go:571] Started watch for AppliedToGroup
2020-11-21T00:33:46.3834388Z I1121 00:33:33.739141    5780 networkpolicy_controller.go:601] Received 1 init events for NetworkPolicy
2020-11-21T00:33:46.3835668Z I1121 00:33:33.739141    5780 networkpolicy_controller.go:191] NetworkPolicy K8sNetworkPolicy:ns1/policy1 applied to Pods on this Node
2020-11-21T00:33:46.3836861Z I1121 00:33:33.740141    5780 networkpolicy_controller.go:601] Received 2 init events for AddressGroup
2020-11-21T00:33:46.3837973Z I1121 00:33:33.740141    5780 networkpolicy_controller.go:601] Received 1 init events for AppliedToGroup
2020-11-21T00:33:46.3839232Z I1121 00:33:33.740141    5780 networkpolicy_controller.go:385] All watchers have completed full sync, installing flows for init events
2020-11-21T00:33:46.3840354Z I1121 00:33:33.740141    5780 networkpolicy_controller.go:389] Starting NetworkPolicy workers now
2020-11-21T00:33:46.3841499Z I1121 00:33:33.740141    5780 networkpolicy_controller.go:395] Starting IDAllocator worker to maintain the async rule cache
2020-11-21T00:33:46.3842583Z I1121 00:33:33.741141    5780 status_controller.go:209] Starting NetworkPolicy StatusController
2020-11-21T00:33:46.3843677Z I1121 00:33:33.741141    5780 status_controller.go:216] Shutting down NetworkPolicy StatusController
2020-11-21T00:33:46.3844688Z I1121 00:33:33.742141    5780 networkpolicy_controller.go:363] Waiting for Antrea client to be ready
2020-11-21T00:33:46.3845641Z I1121 00:33:33.742141    5780 networkpolicy_controller.go:374] Antrea client is ready
2020-11-21T00:33:46.3846721Z I1121 00:33:33.742141    5780 networkpolicy_controller.go:383] Waiting for all watchers to complete full sync
2020-11-21T00:33:46.3847797Z I1121 00:33:33.743141    5780 networkpolicy_controller.go:564] Starting watch for AppliedToGroup
2020-11-21T00:33:46.3848838Z I1121 00:33:33.743141    5780 networkpolicy_controller.go:571] Started watch for AppliedToGroup
2020-11-21T00:33:46.3849868Z I1121 00:33:33.743141    5780 networkpolicy_controller.go:564] Starting watch for AddressGroup
2020-11-21T00:33:46.3850831Z I1121 00:33:33.743141    5780 networkpolicy_controller.go:571] Started watch for AddressGroup
2020-11-21T00:33:46.3851851Z I1121 00:33:33.743141    5780 networkpolicy_controller.go:564] Starting watch for NetworkPolicy
2020-11-21T00:33:46.3852874Z I1121 00:33:33.743141    5780 networkpolicy_controller.go:571] Started watch for NetworkPolicy
2020-11-21T00:33:46.3853908Z I1121 00:33:33.743141    5780 networkpolicy_controller.go:601] Received 1 init events for AppliedToGroup
2020-11-21T00:33:46.3855017Z I1121 00:33:33.743141    5780 networkpolicy_controller.go:601] Received 1 init events for NetworkPolicy
2020-11-21T00:33:46.3856229Z I1121 00:33:33.743141    5780 networkpolicy_controller.go:191] NetworkPolicy K8sNetworkPolicy:ns1/policy1 applied to Pods on this Node
2020-11-21T00:33:46.3857471Z I1121 00:33:33.743141    5780 networkpolicy_controller.go:601] Received 1 init events for AddressGroup
2020-11-21T00:33:46.3858625Z I1121 00:33:33.743141    5780 networkpolicy_controller.go:385] All watchers have completed full sync, installing flows for init events
2020-11-21T00:33:46.3859789Z I1121 00:33:33.744144    5780 networkpolicy_controller.go:389] Starting NetworkPolicy workers now
2020-11-21T00:33:46.3860882Z I1121 00:33:33.744144    5780 networkpolicy_controller.go:395] Starting IDAllocator worker to maintain the async rule cache
2020-11-21T00:33:46.3862039Z I1121 00:33:33.746144    5780 status_controller.go:209] Starting NetworkPolicy StatusController
2020-11-21T00:33:46.3863082Z E1121 00:33:33.752143    5780 networkpolicy_controller.go:640] Unknown event: {BOOKMARK <nil>}
2020-11-21T00:33:46.3864211Z I1121 00:33:33.752143    5780 networkpolicy_controller.go:575] Stopped watch for AddressGroup, total items received: 2
2020-11-21T00:33:46.3865312Z E1121 00:33:33.752143    5780 networkpolicy_controller.go:640] Unknown event: {BOOKMARK <nil>}
2020-11-21T00:33:46.3866377Z I1121 00:33:33.752143    5780 networkpolicy_controller.go:575] Stopped watch for AppliedToGroup, total items received: 2
2020-11-21T00:33:46.3867695Z I1121 00:33:33.753142    5780 networkpolicy_controller.go:148] NetworkPolicy K8sNetworkPolicy:ns1/policy2 applied to Pods on this Node
2020-11-21T00:33:46.3869105Z I1121 00:33:33.757142    5780 networkpolicy_controller.go:175] NetworkPolicy K8sNetworkPolicy:ns1/policy1 no longer applied to Pods on this Node
2020-11-21T00:33:46.3870564Z I1121 00:33:33.761142    5780 networkpolicy_controller.go:175] NetworkPolicy K8sNetworkPolicy:ns1/policy2 no longer applied to Pods on this Node
2020-11-21T00:33:46.3871796Z I1121 00:33:33.765142    5780 status_controller.go:216] Shutting down NetworkPolicy StatusController
2020-11-21T00:33:46.3872822Z I1121 00:33:42.465954    5780 reconciler.go:710] Forgetting rule unknown-rule-id
2020-11-21T00:33:46.3873670Z I1121 00:33:42.465954    5780 reconciler.go:710] Forgetting rule foo
2020-11-21T00:33:46.3874389Z I1121 00:33:42.466954    5780 reconciler.go:710] Forgetting rule foo
2020-11-21T00:33:46.3875157Z I1121 00:33:42.466954    5780 reconciler.go:710] Forgetting rule foo
2020-11-21T00:33:46.3876156Z I1121 00:33:42.468958    5780 reconciler.go:234] Reconciling rule ingress-rule of NetworkPolicy K8sNetworkPolicy:ns1/name1
2020-11-21T00:33:46.3877483Z I1121 00:33:42.469958    5780 reconciler.go:234] Reconciling rule ingress-rule of NetworkPolicy K8sNetworkPolicy:ns1/name1
2020-11-21T00:33:46.3878594Z I1121 00:33:42.469958    5780 reconciler.go:750] Can't find interface for Pod ns1/pod2, skipping
2020-11-21T00:33:46.3879800Z I1121 00:33:42.469958    5780 reconciler.go:234] Reconciling rule ingress-rule of NetworkPolicy K8sNetworkPolicy:ns1/name1
2020-11-21T00:33:46.3881072Z I1121 00:33:42.470959    5780 reconciler.go:234] Reconciling rule ingress-rule of NetworkPolicy K8sNetworkPolicy:ns1/name1
2020-11-21T00:33:46.3882401Z I1121 00:33:42.471959    5780 reconciler.go:234] Reconciling rule ingress-rule of NetworkPolicy K8sNetworkPolicy:ns1/name1
2020-11-21T00:33:46.3883925Z W1121 00:33:42.471959    5780 reconciler.go:925] Can not resolve port http for endpoints &GroupMember{Pod:&PodReference{Name:pod1,Namespace:ns1,},ExternalEntity:nil,IPs:[],Ports:[]NamedPort{},}
2020-11-21T00:33:46.3885486Z I1121 00:33:42.472959    5780 reconciler.go:234] Reconciling rule ingress-rule of NetworkPolicy K8sNetworkPolicy:ns1/name1
2020-11-21T00:33:46.3886804Z I1121 00:33:42.472959    5780 reconciler.go:234] Reconciling rule ingress-rule of NetworkPolicy K8sNetworkPolicy:ns1/name1
2020-11-21T00:33:46.3888081Z I1121 00:33:42.473959    5780 reconciler.go:234] Reconciling rule ingress-rule of NetworkPolicy K8sNetworkPolicy:ns1/name1
2020-11-21T00:33:46.3889437Z I1121 00:33:42.474959    5780 reconciler.go:234] Reconciling rule egress-rule of NetworkPolicy K8sNetworkPolicy:ns1/name1
2020-11-21T00:33:46.3890703Z I1121 00:33:42.474959    5780 reconciler.go:234] Reconciling rule egress-rule of NetworkPolicy K8sNetworkPolicy:ns1/name1
2020-11-21T00:33:46.3892019Z I1121 00:33:42.475960    5780 reconciler.go:234] Reconciling rule egress-rule of NetworkPolicy K8sNetworkPolicy:ns1/name1
2020-11-21T00:33:46.3893278Z E1121 00:33:42.478000    5780 reconciler.go:336] rule ingress-rule already realized during the initialization phase
2020-11-21T00:33:46.3894494Z I1121 00:33:42.478960    5780 reconciler.go:234] Reconciling rule ingress-rule of NetworkPolicy K8sNetworkPolicy:ns1/name1
2020-11-21T00:33:46.3895827Z I1121 00:33:42.479959    5780 reconciler.go:234] Reconciling rule ingress-rule of NetworkPolicy K8sNetworkPolicy:ns1/name1
2020-11-21T00:33:46.3897107Z I1121 00:33:42.479959    5780 reconciler.go:234] Reconciling rule egress-rule of NetworkPolicy K8sNetworkPolicy:ns1/name1
2020-11-21T00:33:46.3898525Z I1121 00:33:42.479959    5780 reconciler.go:234] Reconciling rule egress-rule of NetworkPolicy K8sNetworkPolicy:ns1/name1
2020-11-21T00:33:46.3899814Z I1121 00:33:42.480960    5780 reconciler.go:234] Reconciling rule ingress-rule of NetworkPolicy K8sNetworkPolicy:ns1/name1
2020-11-21T00:33:46.3901155Z I1121 00:33:42.480960    5780 reconciler.go:234] Reconciling rule ingress-rule of NetworkPolicy K8sNetworkPolicy:ns1/name1
2020-11-21T00:33:46.3902236Z I1121 00:33:42.480960    5780 reconciler.go:750] Can't find interface for Pod ns1/pod4, skipping
2020-11-21T00:33:46.3903368Z I1121 00:33:42.481960    5780 reconciler.go:234] Reconciling rule egress-rule of NetworkPolicy K8sNetworkPolicy:ns1/name1
2020-11-21T00:33:46.3904685Z I1121 00:33:42.481960    5780 reconciler.go:234] Reconciling rule egress-rule of NetworkPolicy K8sNetworkPolicy:ns1/name1
2020-11-21T00:33:46.3905815Z I1121 00:33:42.481960    5780 reconciler.go:770] Can't find interface for Pod ns1/pod4, skipping
2020-11-21T00:33:46.3906944Z I1121 00:33:42.482960    5780 reconciler.go:234] Reconciling rule egress-rule of NetworkPolicy K8sNetworkPolicy:ns1/name1
2020-11-21T00:33:46.3908210Z I1121 00:33:42.482960    5780 reconciler.go:234] Reconciling rule egress-rule of NetworkPolicy K8sNetworkPolicy:ns1/name1
2020-11-21T00:33:46.3909687Z I1121 00:33:42.483960    5780 reconciler.go:234] Reconciling rule ingress-rule of NetworkPolicy AntreaClusterNetworkPolicy:name1
2020-11-21T00:33:46.3911220Z I1121 00:33:42.483960    5780 reconciler.go:234] Reconciling rule ingress-rule of NetworkPolicy AntreaClusterNetworkPolicy:name1
2020-11-21T00:33:46.3912811Z I1121 00:33:42.484959    5780 reconciler.go:234] Reconciling rule ingress-rule of NetworkPolicy AntreaClusterNetworkPolicy:name1
2020-11-21T00:33:46.3914379Z I1121 00:33:42.484959    5780 reconciler.go:234] Reconciling rule ingress-rule of NetworkPolicy AntreaClusterNetworkPolicy:name1
2020-11-21T00:33:46.3916684Z W1121 00:33:42.485956    5780 reconciler.go:925] Can not resolve port http for endpoints &GroupMember{Pod:nil,ExternalEntity:nil,IPs:[[0 0 0 0 0 0 0 0 0 0 255 255 1 1 1 4]],Ports:[]NamedPort{NamedPort{Port:443,Name:https,Protocol:TCP,},},}
2020-11-21T00:33:46.3918463Z W1121 00:33:42.485956    5780 reconciler.go:925] Can not resolve port http for endpoints &GroupMember{Pod:nil,ExternalEntity:nil,IPs:[[0 0 0 0 0 0 0 0 0 0 255 255 1 1 1 5]],Ports:[]NamedPort{},}
2020-11-21T00:33:46.3920009Z W1121 00:33:42.485956    5780 reconciler.go:925] Can not resolve port https for endpoints &GroupMember{Pod:nil,ExternalEntity:nil,IPs:[[0 0 0 0 0 0 0 0 0 0 255 255 1 1 1 5]],Ports:[]NamedPort{},}
2020-11-21T00:33:46.3921676Z W1121 00:33:42.485956    5780 reconciler.go:925] Can not resolve port http for endpoints &GroupMember{Pod:nil,ExternalEntity:nil,IPs:[[0 0 0 0 0 0 0 0 0 0 255 255 1 1 1 6]],Ports:[]NamedPort{NamedPort{Port:443,Name:foo,Protocol:TCP,},},}
2020-11-21T00:33:46.3923551Z W1121 00:33:42.485956    5780 reconciler.go:925] Can not resolve port https for endpoints &GroupMember{Pod:nil,ExternalEntity:nil,IPs:[[0 0 0 0 0 0 0 0 0 0 255 255 1 1 1 6]],Ports:[]NamedPort{NamedPort{Port:443,Name:foo,Protocol:TCP,},},}
2020-11-21T00:33:46.3925424Z W1121 00:33:42.485956    5780 reconciler.go:925] Can not resolve port https for endpoints &GroupMember{Pod:nil,ExternalEntity:nil,IPs:[[0 0 0 0 0 0 0 0 0 0 255 255 1 1 1 1]],Ports:[]NamedPort{NamedPort{Port:80,Name:http,Protocol:TCP,},},}
2020-11-21T00:33:46.3927307Z W1121 00:33:42.486959    5780 reconciler.go:925] Can not resolve port https for endpoints &GroupMember{Pod:nil,ExternalEntity:nil,IPs:[[0 0 0 0 0 0 0 0 0 0 255 255 1 1 1 2]],Ports:[]NamedPort{NamedPort{Port:80,Name:http,Protocol:TCP,},},}
2020-11-21T00:33:46.3929191Z W1121 00:33:42.486959    5780 reconciler.go:925] Can not resolve port https for endpoints &GroupMember{Pod:nil,ExternalEntity:nil,IPs:[[0 0 0 0 0 0 0 0 0 0 255 255 1 1 1 3]],Ports:[]NamedPort{NamedPort{Port:8080,Name:http,Protocol:TCP,},},}
2020-11-21T00:33:46.3931124Z I1121 00:33:42.487958    5780 reconciler.go:234] Reconciling rule ingress-rule of NetworkPolicy K8sNetworkPolicy:ns1/name1
2020-11-21T00:33:46.3932487Z I1121 00:33:42.488959    5780 reconciler.go:234] Reconciling rule ingress-rule of NetworkPolicy K8sNetworkPolicy:ns1/name1
2020-11-21T00:33:46.3933583Z I1121 00:33:42.488959    5780 reconciler.go:750] Can't find interface for Pod ns1/pod2, skipping
2020-11-21T00:33:46.3934715Z I1121 00:33:42.488959    5780 reconciler.go:234] Reconciling rule ingress-rule of NetworkPolicy K8sNetworkPolicy:ns1/name1
2020-11-21T00:33:46.3935991Z I1121 00:33:42.489958    5780 reconciler.go:234] Reconciling rule ingress-rule of NetworkPolicy K8sNetworkPolicy:ns1/name1
2020-11-21T00:33:46.3937310Z I1121 00:33:42.490956    5780 reconciler.go:234] Reconciling rule ingress-rule of NetworkPolicy K8sNetworkPolicy:ns1/name1
2020-11-21T00:33:46.3938878Z W1121 00:33:42.490956    5780 reconciler.go:925] Can not resolve port http for endpoints &GroupMember{Pod:&PodReference{Name:pod1,Namespace:ns1,},ExternalEntity:nil,IPs:[],Ports:[]NamedPort{},}
2020-11-21T00:33:46.3940455Z I1121 00:33:42.491956    5780 reconciler.go:234] Reconciling rule ingress-rule of NetworkPolicy K8sNetworkPolicy:ns1/name1
2020-11-21T00:33:46.3941769Z I1121 00:33:42.491956    5780 reconciler.go:234] Reconciling rule ingress-rule of NetworkPolicy K8sNetworkPolicy:ns1/name1
2020-11-21T00:33:46.3943051Z I1121 00:33:42.492956    5780 reconciler.go:234] Reconciling rule ingress-rule of NetworkPolicy K8sNetworkPolicy:ns1/name1
2020-11-21T00:33:46.3944378Z I1121 00:33:42.493956    5780 reconciler.go:234] Reconciling rule egress-rule of NetworkPolicy K8sNetworkPolicy:ns1/name1
2020-11-21T00:33:46.3945655Z I1121 00:33:42.493956    5780 reconciler.go:234] Reconciling rule egress-rule of NetworkPolicy K8sNetworkPolicy:ns1/name1
2020-11-21T00:33:46.3946967Z I1121 00:33:42.494957    5780 reconciler.go:234] Reconciling rule egress-rule of NetworkPolicy K8sNetworkPolicy:ns1/name1
2020-11-21T00:33:46.3948382Z I1121 00:33:42.495956    5780 reconciler.go:234] Reconciling rule egress-rule of NetworkPolicy K8sNetworkPolicy:ns1/name1
2020-11-21T00:33:46.3949522Z I1121 00:33:42.495956    5780 reconciler.go:851] IPBlock 10.10.0.0/16 is using unsupported address family, skip it
2020-11-21T00:33:46.3950706Z I1121 00:33:42.496957    5780 reconciler.go:234] Reconciling rule ingress-rule of NetworkPolicy K8sNetworkPolicy:ns1/name1
2020-11-21T00:33:46.3951976Z I1121 00:33:42.497957    5780 reconciler.go:234] Reconciling rule ingress-rule of NetworkPolicy K8sNetworkPolicy:ns1/name1
2020-11-21T00:33:46.3953108Z I1121 00:33:42.497957    5780 reconciler.go:750] Can't find interface for Pod ns1/pod2, skipping
2020-11-21T00:33:46.3954189Z I1121 00:33:42.497957    5780 reconciler.go:234] Reconciling rule ingress-rule of NetworkPolicy K8sNetworkPolicy:ns1/name1
2020-11-21T00:33:46.3955511Z I1121 00:33:42.498957    5780 reconciler.go:234] Reconciling rule ingress-rule of NetworkPolicy K8sNetworkPolicy:ns1/name1
2020-11-21T00:33:46.3956790Z I1121 00:33:42.499958    5780 reconciler.go:234] Reconciling rule ingress-rule of NetworkPolicy K8sNetworkPolicy:ns1/name1
2020-11-21T00:33:46.3958386Z W1121 00:33:42.499958    5780 reconciler.go:925] Can not resolve port http for endpoints &GroupMember{Pod:&PodReference{Name:pod1,Namespace:ns1,},ExternalEntity:nil,IPs:[],Ports:[]NamedPort{},}
2020-11-21T00:33:46.3959962Z I1121 00:33:42.500958    5780 reconciler.go:234] Reconciling rule ingress-rule of NetworkPolicy K8sNetworkPolicy:ns1/name1
2020-11-21T00:33:46.3961239Z I1121 00:33:42.500958    5780 reconciler.go:234] Reconciling rule ingress-rule of NetworkPolicy K8sNetworkPolicy:ns1/name1
2020-11-21T00:33:46.3962554Z I1121 00:33:42.501957    5780 reconciler.go:234] Reconciling rule ingress-rule of NetworkPolicy K8sNetworkPolicy:ns1/name1
2020-11-21T00:33:46.3963824Z I1121 00:33:42.502957    5780 reconciler.go:234] Reconciling rule egress-rule of NetworkPolicy K8sNetworkPolicy:ns1/name1
2020-11-21T00:33:46.3965156Z I1121 00:33:42.502957    5780 reconciler.go:234] Reconciling rule egress-rule of NetworkPolicy K8sNetworkPolicy:ns1/name1
2020-11-21T00:33:46.3966474Z I1121 00:33:42.503957    5780 reconciler.go:234] Reconciling rule egress-rule of NetworkPolicy K8sNetworkPolicy:ns1/name1
2020-11-21T00:33:46.3967788Z I1121 00:33:42.504957    5780 reconciler.go:234] Reconciling rule egress-rule of NetworkPolicy K8sNetworkPolicy:ns1/name1
2020-11-21T00:33:46.3969013Z I1121 00:33:42.505957    5780 status_controller.go:209] Starting NetworkPolicy StatusController
2020-11-21T00:33:46.3970070Z I1121 00:33:43.007003    5780 status_controller.go:216] Shutting down NetworkPolicy StatusController
2020-11-21T00:33:46.3971170Z I1121 00:33:43.008006    5780 status_controller.go:209] Starting NetworkPolicy StatusController
2020-11-21T00:33:46.3972195Z I1121 00:33:43.511050    5780 status_controller.go:209] Starting NetworkPolicy StatusController
2020-11-21T00:33:46.3973292Z I1121 00:33:43.511050    5780 status_controller.go:216] Shutting down NetworkPolicy StatusController
2020-11-21T00:33:46.3974356Z I1121 00:33:44.011092    5780 status_controller.go:216] Shutting down NetworkPolicy StatusController
2020-11-21T00:33:46.3975449Z I1121 00:33:44.011092    5780 status_controller.go:209] Starting NetworkPolicy StatusController
2020-11-21T00:33:46.3976489Z I1121 00:33:46.316811    5780 status_controller.go:216] Shutting down NetworkPolicy StatusController
2020-11-21T00:33:46.3977232Z FAIL

The warning about a race condition is not very reassuring.

t.Fatalf("Expect the rule with id %v to be deleted from async rule cache", tt.expectedID)
}
_, exists, err := a.getRuleFromAsyncCache(tt.expectedID)
assert.Falsef(t, exists, "Expect rule to be not present in asyncRuleCache")
assert.NoErrorf(t, err, "getRuleFromAsyncCache should not return any error")

elapsedTime := time.Since(start)
assert.GreaterOrEqualf(t, int64(elapsedTime)/int64(time.Millisecond), int64(5), "rule should be there for at least 5ms")
assert.GreaterOrEqualf(t, int64(elapsedTime)/int64(time.Millisecond), int64(a.deleteInterval)/int64(time.Millisecond), "rule should be there for at least 5ms")
Contributor

the message still says 5ms

@@ -222,7 +223,8 @@ func TestWorker(t *testing.T) {
 }
 return false, nil
 }
-if err := wait.PollImmediate(time.Millisecond, time.Millisecond*10, conditionFunc); err != nil {
+if err := wait.PollImmediate(time.Millisecond, time.Second, conditionFunc); err != nil {
Contributor

I still think we should use longer time intervals to reduce the probability of flakes in CI, e.g. in the 100s of ms.

Comment on lines 187 to 197
"delete-rule-with-async-delete-interval",
"delete-rule-with-min-async-delete-interval",
nil,
5 * time.Millisecond,
10 * time.Millisecond,
rule,
1,
nil,
},
{
"delete-rule-with-flow-poll-interval",
"delete-rule-with-async-delete-interval",
nil,
1 * time.Millisecond,
Contributor

I think you should rename testDeleteInterval to testMinDeleteInterval and then add a comment for each test:

  • delete-rule-with-min-async-delete-interval: 10ms is larger than testMinDeleteInterval, rule should take 10ms to be deleted
  • delete-rule-with-async-delete-interval: 1ms is smaller than testMinDeleteInterval, rule should take testMinDeleteInterval (5ms) to be deleted
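
Roughly what that would look like, reusing the values from the snippet above (the field layout mirrors that snippet and is illustrative, not the exact struct in allocator_test.go):

    {
        // 10ms is larger than testMinDeleteInterval (5ms), so the rule should
        // stay in the async cache for ~10ms before being deleted.
        "delete-rule-with-min-async-delete-interval",
        nil,
        10 * time.Millisecond,
        rule,
        1,
        nil,
    },
    {
        // 1ms is smaller than testMinDeleteInterval (5ms), so the interval is
        // clamped to the minimum and the rule should stay for ~5ms.
        "delete-rule-with-async-delete-interval",
        nil,
        1 * time.Millisecond,
        rule,
        1,
        nil,
    },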

Member Author

Done. Thanks for pointing out the inconsistency with the test global variables. I modified the code to make it consistent with allocator.go.

@@ -222,15 +223,16 @@ func TestWorker(t *testing.T) {
 }
 return false, nil
 }
-if err := wait.PollImmediate(time.Millisecond, time.Millisecond*10, conditionFunc); err != nil {
+if err := wait.PollImmediate(time.Millisecond, time.Second, conditionFunc); err != nil {
 t.Fatalf("Expect the rule with id %v to be deleted from async rule cache", tt.expectedID)
Contributor

should we validate the time it takes for the rule to be deleted? With the current code the test will pass even if the deletion takes 1s. That's also why I suggested using longer time intervals.

Member Author

Used epsilonf to validate the time.

@srikartati
Member Author

srikartati commented Nov 21, 2020

I hope this gets rid of the Windows unit test failures in CI, which seem to be pretty consistent now (50% of the time):

2020-11-21T00:33:25.5638749Z ok  	github.com/vmware-tanzu/antrea/pkg/agent/config	1.044s
2020-11-21T00:33:46.3667470Z ==================
2020-11-21T00:33:46.3668619Z WARNING: DATA RACE

...

2020-11-21T00:33:46.3977232Z FAIL


The warning about a race condition is not very reassuring.

The data race will be fixed by this patch, as mentioned in the comment on issue #1549.
Thanks for pointing out the issue with polling at small intervals in CI tests when the race detector is enabled. Will fix this in a follow-up patch.

a.forgetRule(tt.rule.FlowID)

startTime := time.Time{}
Contributor

Seems like you should just set startTime := time.Now() right after calling forgetRule?

Member Author

I was playing with the time variable and changed it there, but forgot to revert it. Thanks for catching that.
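
In other words, roughly (a sketch of the agreed fix):

    a.forgetRule(tt.rule.FlowID)
    // Record the start right after forgetRule so that elapsedTime measures how
    // long the rule actually stays in the async cache.
    startTime := time.Now()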

elapsedTime := time.Since(start)
assert.GreaterOrEqualf(t, int64(elapsedTime)/int64(time.Millisecond), int64(5), "rule should be there for at least 5ms")
// Delta accounts for both deletion time of rule after adding it to deleteQueue and time taken to run the condition function once the poll happens.
assert.InDeltaf(t, int64(elapsedTime)/int64(time.Millisecond), int64(a.deleteInterval)/int64(time.Millisecond), 100, "rule should be there for at least %vms", int64(a.deleteInterval)/int64(time.Millisecond))
Contributor

Suggested change:
-assert.InDeltaf(t, int64(elapsedTime)/int64(time.Millisecond), int64(a.deleteInterval)/int64(time.Millisecond), 100, "rule should be there for at least %vms", int64(a.deleteInterval)/int64(time.Millisecond))
+assert.InDeltaf(t, int64(elapsedTime)/int64(time.Millisecond), int64(a.deleteInterval)/int64(time.Millisecond), 100, "rule should be in cache for about %v", a.deleteInterval)

@srikartati
Member Author

/test-all

antoninbas previously approved these changes Dec 4, 2020
elapsedTime := time.Since(start)
assert.GreaterOrEqualf(t, int64(elapsedTime)/int64(time.Millisecond), int64(5), "rule should be there for at least 5ms")
// Delta accounts for both deletion time of rule after adding it to deleteQueue and time taken to run the condition function once the poll happens.
assert.InDeltaf(t, int64(elapsedTime)/int64(time.Millisecond), int64(a.deleteInterval)/int64(time.Millisecond), 100, "rule should be there for at least %v", a.deleteInterval)
Contributor

As I pointed out before, I think the message here may be a bit misleading. We check that the rule is in the cache for long enough, but we also check that it's not there for too long. I suggested "rule should be in cache for about %v" before.

Member Author

Sure. After changing the logic, the error message should have been updated as well; I missed the message part of the suggestion.

@srikartati
Member Author

/test-all

@srikartati
Member Author

/test-all-features-conformance
/test-ipv6-ds-e2e
/test-ipv6-ds-networkpolicy
/test-ipv6-only-conformance

Changed the name from TestWorker to TestIDAllocatorWorker.
This also fixed the data race issue when running the test.
@srikartati
Member Author

/test-all

@srikartati
Member Author

The e2e test failures on kind clusters with noEncap will be fixed through PR #1626.
Merging this patch.

@srikartati srikartati merged commit 424bae0 into antrea-io:master Dec 8, 2020
@srikartati srikartati deleted the fix_test branch December 8, 2020 06:30
antoninbas pushed a commit that referenced this pull request Dec 23, 2020
Changed the name from TestWorker to TestIDAllocatorWorker.
This also fixes the data race issue when running the test along with the issue of flakiness in the test.
Successfully merging this pull request may close these issues.

TestWorker/delete-rule-with-async-delete-interval is flaky when run in Github CI