-
Notifications
You must be signed in to change notification settings - Fork 197
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix: Deletes scan jobs if their pods have been deleted and logs can't be c… #956
Fix: Deletes scan jobs if their pods have been deleted and logs can't be c… #956
Conversation
84ee6b9
to
138de88
Compare
Codecov Report
@@ Coverage Diff @@
## main #956 +/- ##
==========================================
- Coverage 64.73% 64.27% -0.47%
==========================================
Files 59 59
Lines 6970 7014 +44
==========================================
- Hits 4512 4508 -4
- Misses 1999 2046 +47
- Partials 459 460 +1
Continue to review full report at Codecov.
|
75ad698
to
ad246fc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for working on this enhancement @jw-s Overall the controllers code looks good to me.
The only thing I'm not comfortable with are integration tests that you wrote for "relatively rare" corner case when the operator's cache cannot catch up with K8s API Server state and parse logs.
What's more, I'm afraid that the integration tests might became flaky because we cannot reliably control Pods managed by JobController, which is part of K8s controller-manager. (See failed PR checks.)
To sum up, I'd just add unit tests whenever possible to keep integration (~ e2e) test nice and lean so we can afford running them on each PR. See this nice and detailed explanation why I think we should not validate everything at the integration tests phase https://martinfowler.com/articles/practical-test-pyramid.html.
Yeah I completely agree, this area is definitely flake provoking and i'm happy to remove these tests (Though the failure is not because of a flake but actually something else). Do we have an area where we can add unit tests to cover these cases? I couldn't seem to find it. |
We'd have to add _test.go files to implement unit tests for reconcilers. So far we only have integration tests for controllers. If this requires too much effort I'm also happy to approve this PR without very strict requirement for unit testing. |
ad246fc
to
5e192a9
Compare
…ollected Signed-off-by: Joel Whittaker-Smith <jdws.dev@gmail.com>
Signed-off-by: Joel Whittaker-Smith <jdws.dev@gmail.com>
5e192a9
to
d5ffbe8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @jw-s for the contribution. The code LGTM!
Fixes #873 and for other scan types which are not mentioned in this issue
The problem which is fixed in this PR is the following:
When a scan job creates the pod which handles scans, if this pod has been deleted before starboard parses the pod's logs, these jobs are never deleted which means when the max concurrent job setting is hit, part of this threshold is eaten up by jobs which never get deleted.
Signed-off-by: Joel Whittaker-Smith jdws.dev@gmail.com