Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Umbrella Issue] Many tests use Thread.sleep instead of synchronization primitives #1223

Open
brendandburns opened this issue Sep 3, 2020 · 14 comments
Assignees
Labels
good first issue Denotes an issue ready for a new contributor, according to the "help wanted" guidelines. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness.

Comments

@brendandburns
Copy link
Contributor

Many of our tests simply have a Thread.sleep(duration) when testing multi-threaded code. This is inherently flaky.

Instead we should use synchronization classes (generally Semaphore) to synchronized the tests which is both faster in the general case and not flaky in the extreme case.

See for instance #1218 and #1219

@brendandburns
Copy link
Contributor Author

brendandburns commented Sep 3, 2020

  • util/src/test/java/io/kubernetes/client/informer/cache/ReflectorRunnableTest.java
  • extended/src/test/java/io/kubernetes/client/extended/workqueue/DefaultDelayingQueueTest.java
  • util/src/test/java/io/kubernetes/client/informer/cache/ProcessorListenerTest.java
  • util/src/test/java/io/kubernetes/client/informer/cache/SharedProcessorTest.java
  • util/src/test/java/io/kubernetes/client/CopyTest.java
  • util/src/test/java/io/kubernetes/client/informer/cache/ControllerTest.java
  • extended/src/test/java/io/kubernetes/client/extended/workqueue/DefaultWorkQueueTest.java
  • extended/src/test/java/io/kubernetes/client/extended/leaderelection/LeaderElectionTest.java
  • util/src/test/java/io/kubernetes/client/informer/impl/DefaultSharedIndexInformerTest.java
  • extended/src/test/java/io/kubernetes/client/extended/controller/LeaderElectingControllerTest.java
  • util/src/test/java/io/kubernetes/client/PortForwardTest.java
  • util/src/test/java/io/kubernetes/client/AttachTest.java
  • util/src/test/java/io/kubernetes/client/ExecTest.java
  • extended/src/test/java/io/kubernetes/client/extended/workqueue/ratelimiter/BucketRateLimiterTest.java
  • spring/src/test/java/io/kubernetes/client/spring/extended/controller/KubernetesInformerCreatorTest.java
  • spring/src/test/java/io/kubernetes/client/spring/extended/controller/KubernetesReconcilerCreatorTest.java
  • extended/src/test/java/io/kubernetes/client/extended/event/EventCorrelatorTest.java

@yue9944882
Copy link
Member

@brendandburns thanks for sorting out these potentially flaking tests.. will fix them later!

/assign

@sarveshkaushal
Copy link
Contributor

I would like to help :-)

@brendandburns
Copy link
Contributor Author

brendandburns commented Sep 3, 2020

@sarveshkaushal we'd welcome the help. In most cases the Thread.sleep(foo) is to wait for some other thread to complete it's work. Instead, we should create a Semaphore, then acquire() one or more permits for the executions we're waiting for, and then release() them in the Thread, then attempt to acquire() in the test thread to implement the wait.

See an example PR #1219

The reason that Thread.sleep() is flaky is because in a CI/CD system like GitHub actions the machines can be massively overloaded, which leads to much longer delays than you might otherwise see when running tests, so the sleep(..) ends up not being long enough. You can always make the sleep longer, but that makes the tests run longer (and it's still flaky in extreme cases) so using synchronization primitives is (almost) always the right answer.

@fejta-bot
Copy link

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 3, 2020
@fejta-bot
Copy link

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jan 2, 2021
@brendandburns
Copy link
Contributor Author

/lifecycle frozen

@k8s-ci-robot k8s-ci-robot added lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. and removed lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. labels Jan 9, 2021
@brendandburns brendandburns added the good first issue Denotes an issue ready for a new contributor, according to the "help wanted" guidelines. label Jan 9, 2021
@Priya103
Copy link

If still available I would like to contribute

@Vishal-thakur-01
Copy link

well i am new to java and i would like to get in touch with you guys and try to do my best in solving all the issues hope you all will understand my curiosity and get in touch with me and help me to grow and learn.........

@Vyom-Yadav
Copy link

Vyom-Yadav commented Aug 18, 2022

@brendandburns I have one doubt, I was working on util/src/test/java/io/kubernetes/client/informer/cache/SharedProcessorTest.java, particularly on:

@Test
public void testListenerAddition() throws InterruptedException {
SharedProcessor<V1Pod> sharedProcessor = new SharedProcessor<>();
V1Pod foo1 = new V1Pod().metadata(new V1ObjectMeta().name("foo1").namespace("default"));
ProcessorListener.Notification<V1Pod> addNotification =
new ProcessorListener.AddNotification<V1Pod>(foo1);
ProcessorListener.Notification<V1Pod> updateNotification =
new ProcessorListener.UpdateNotification<V1Pod>(null, foo1);
ProcessorListener.Notification<V1Pod> deleteNotification =
new ProcessorListener.DeleteNotification<V1Pod>(foo1);
ExpectingNoticationHandler<V1Pod> expectAddHandler =
new ExpectingNoticationHandler<V1Pod>(addNotification);
ExpectingNoticationHandler<V1Pod> expectUpdateHandler =
new ExpectingNoticationHandler<V1Pod>(updateNotification);
ExpectingNoticationHandler<V1Pod> expectDeleteHandler =
new ExpectingNoticationHandler<V1Pod>(deleteNotification);
sharedProcessor.addAndStartListener(expectAddHandler);
sharedProcessor.addAndStartListener(expectUpdateHandler);
sharedProcessor.addAndStartListener(expectDeleteHandler);
sharedProcessor.distribute(addNotification, false);
sharedProcessor.distribute(updateNotification, false);
sharedProcessor.distribute(deleteNotification, false);
// sleep 1s for notification distribution completing
Thread.sleep(1000);
assertTrue(expectAddHandler.isSatisfied());
assertTrue(expectUpdateHandler.isSatisfied());
assertTrue(expectDeleteHandler.isSatisfied());
}

I can create a semaphore and acquire it, blocking the main thread, but how will I release so that blocked main thread will continue the execution, I looked at many tests, facing similar problem in many of them.

Edit-
I cannot access ExecutorService instance variable which would have been useful over here, please suggest.

@vinayak912002
Copy link

what are the things I need to learn in order to start contributiing to this project?

@Mohdcode
Copy link

what are the things I need to learn to start contributing to this project?

DSA, API, and Kubernetes or just contribute by documentation correction or making better changes

@Vyom-Yadav
Copy link

@brendandburns ping

@tmrpavan
Copy link

/assign

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
good first issue Denotes an issue ready for a new contributor, according to the "help wanted" guidelines. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness.
Projects
None yet
Development

No branches or pull requests