-
Notifications
You must be signed in to change notification settings - Fork 39.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Advanced Audit tests flaking #60719
Comments
/milestone clear |
/milestone v1.10 |
/milestone v1.10 |
@BenTheElder: You must be a member of the kubernetes-milestone-maintainers github team to set the milestone. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
From apiserver log for run ci-kubernetes-e2e-gci-gce/22334:
I'm pretty sure this is what's happening: the default audit mode changed to batch for the logging backend, but the audit test expects the logs to be there immediately. Recommended Action:
If we rollback #60237 it will cause problems with the scale tests that were re-enabled in kubernetes/test-infra#7000 |
http://k8s-testgrid.appspot.com/sig-release-master-blocking#gci-gce |
Actually, the forward fix would break scalability anyway. I'll send a rollback. |
/status in-progress |
No, because audit logging tests are not enabled in scalability tests, there's |
I think the e2e test fix is better in this case |
Filed #60794 to fix the problem I still believe that using buffered audit logging is a better default, since slowing down all api requests is a pretty serious drawback and should be enabled consciously |
@liggitt @tallclair If you're fine with forward-fixing the problem in e2es, I'd prefer this way, it's the easiest. Fix is in #60794 |
If we make batch mode the default, I think we should adjust the default parameters to include:
I'm OK moving forward with the test fix, but I'd like to fix those defaults before 1.10 is cut, if possible. |
I agree that defaults should change. What about changing the batch size to 1, so that each log entry is written asynchronously in its own goroutine? Anyways, this or very low timeout SGTM, since I don't think that the second option is feasible right now. Anyways, let's continue the discussion in #60739, since it's unrelated to the flaking. |
Automatic merge from submit-queue (batch tested with PRs 60630, 60794). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>. Add retrying to audit logging e2e tests Fixes #60719 Adds retrying to the audit logging e2e tests so it can work when audit logging is in batch mode and actual writing is delayed. ```release-note NONE ``` /cc @tallclair @liggitt @sttts
still failing in upgrade test... e2e fix likely needs backport to 1.9 e2es |
Good point! Created #61134 to address it |
[MILESTONENOTIFIER] Milestone Issue Needs Attention @crassirostris @liggitt @tallclair @kubernetes/sig-auth-misc Action Required: This issue has not been updated since Mar 13. Please provide an update. Note: This issue is marked as Example update:
Issue Labels
|
…-#60794-upstream-release-1.9 Automatic merge from submit-queue. Automated cherry pick of #60794: Add retrying to audit logging e2e tests Cherry pick of #60794 on release-1.9. Fixes #60719, since audit logging behavior has changed in 1.10. Purely e2e change, so no release note #60794: Add retrying to audit logging e2e tests ```release-note NONE ``` /cc @sttts @liggitt
fixed in upgrade test by #61134 |
Started flaking yesterday: https://storage.googleapis.com/k8s-gubernator/triage/index.html?test=Advanced%20Audit
Seems to coincide with #60237
/assign @tallclair @crassirostris
need to triage this ASAP to know if we need to roll back that PR
The text was updated successfully, but these errors were encountered: