Increased flakiness on Travis over the last 1-2 weeks #14800

Closed
bradfrizzell opened this issue Apr 23, 2018 · 5 comments

Comments

@bradfrizzell
Contributor

My team and I have all been experiencing greatly increased flakiness on Travis over the last 2 weeks or so. In addition to integration tests being perhaps a bit more flaky than usual, the unit tests have also started to flake frequently, whereas previously they never flaked at all. Is there any understanding of why this is happening now, and is there any way we can help remedy the situation?

@cathyxz
Contributor

cathyxz commented Apr 23, 2018

I'm not sure about the root cause, but I know that @rsimha is working on increasing our resource limits for both Travis and Saucelabs. One part of the flakiness issue may simply be that our PR count and CI load have increased a lot, so longer queue times are making flakiness extra painful.

A few things that we can do:

  1. Aggressively skip flaky tests and leave issues (assigned to component owners) to fix them. Basically if you see a flaky test, skip it.
  2. If you are a test owner and your test flakes, fix it.
  3. There is also follow-up work to be done from #14406 (Fix all tests that were silently passing in spite of AMP runtime errors). Basically, there are a lot of tests with console errors that need debugging / fixing (some of these are actually quite trivial and are just a matter of adding allowConsoleError).

So basically, if you happen to come across tests that flake (yours or not), feel free to skip the flaky ones and file an issue. Also feel free to fix, or even just mark with a TODO + issue number, any console errors in a test, and fix tests where possible.
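To make the console-error idea concrete, here is a minimal sketch of how a test harness might separate expected `console.error` calls from unexpected ones. It is not the AMP implementation of `allowConsoleError`, which this thread does not show. `createConsoleErrorGuard`, `allow`, and the fake console object are hypothetical names used only for illustration.

```javascript
// Hypothetical stand-in for a console.error guard; not AMP's actual test setup.
function createConsoleErrorGuard(consoleLike) {
  const original = consoleLike.error;
  const unexpected = [];
  let allowDepth = 0;

  // Replace console.error so unexpected calls are recorded instead of lost.
  consoleLike.error = (...args) => {
    if (allowDepth === 0) {
      unexpected.push(args.map(String).join(' '));
    }
  };

  return {
    // Run `fn` with any console.error calls treated as expected.
    allow(fn) {
      allowDepth++;
      try {
        return fn();
      } finally {
        allowDepth--;
      }
    },
    // Copy of the unexpected messages recorded so far.
    unexpected() {
      return unexpected.slice();
    },
    // Put the original console.error back.
    restore() {
      consoleLike.error = original;
    },
  };
}

// Usage with a fake console so the example has no side effects.
const fakeConsole = {error() {}};
const guard = createConsoleErrorGuard(fakeConsole);
guard.allow(() => fakeConsole.error('expected failure path'));
fakeConsole.error('something actually went wrong');
const unexpectedErrors = guard.unexpected();
guard.restore();
```

Only the call made outside `allow` ends up in `unexpectedErrors`, which is roughly the distinction a test owner would be making when deciding whether an error is a real bug or an intentionally exercised failure path.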

Tagging @rsimha for more context.

@cvializ
Contributor

cvializ commented Apr 23, 2018

Logs that reproduce the error: https://travis-ci.org/ampproject/amphtml/jobs/370168309

@rsimha
Contributor

rsimha commented Apr 23, 2018

We're tracking this via a ticket I've opened with Sauce Labs. Meanwhile, I've merged #14814, which should mitigate the issue on our side by force-quitting the gulp test process on Travis after all browsers have reported test completion.

Occurrences of Travis builds timing out due to no activity during gulp test should go away after this.
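For readers unfamiliar with the approach, the following is a rough sketch of the general idea of exiting once every browser has reported completion. It is not the code from #14814, which is not shown in this thread. `createCompletionTracker`, `reportComplete`, and the browser names are hypothetical. In a real Node script the callback would presumably call `process.exit`; here it only records the request so the example stays side-effect free.

```javascript
// Hypothetical sketch; names are illustrative and not taken from #14814.
function createCompletionTracker(browserNames, onAllDone) {
  const pending = new Set(browserNames);
  let fired = false;

  return {
    // Call when a browser reports that its test run has finished.
    reportComplete(browserName) {
      pending.delete(browserName);
      // Fire the callback once, when no browsers remain pending.
      if (pending.size === 0 && !fired) {
        fired = true;
        onAllDone();
      }
    },
    // Number of browsers that have not reported completion yet.
    pendingCount() {
      return pending.size;
    },
  };
}

// Usage: record the exit request instead of really exiting the process.
const exitCodes = [];
const tracker = createCompletionTracker(['chrome', 'firefox'], () => {
  exitCodes.push(0);
});
tracker.reportComplete('chrome');
const pendingAfterFirst = tracker.pendingCount();
tracker.reportComplete('firefox');
tracker.reportComplete('firefox'); // A duplicate report does not fire again.
```

Exiting explicitly avoids depending on every open handle (sockets, tunnels, timers) shutting down cleanly, which is one plausible way a finished run could otherwise sit idle until the CI no-activity timeout.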

@rsimha
Contributor

rsimha commented Apr 23, 2018

@bradfrizzell In addition to my comment above, a good way to help reduce unit test flakiness is to pay attention to the console.error output that's printed at the end. Right now, we're seeing the AMP runtime report errors from several tests. We can't suddenly start failing all those tests, so we're printing warnings for now. It would greatly improve things if owners of tests were to run them and fix / address the errors printed during their tests.

For example, I'm seeing ~50 errors being reported when I run the a4a unit tests. See https://gist.github.com/rsimha/df05d98a891c850a067ef17b56f13cd8
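As an illustration of the "print warnings for now" approach described above, here is a small sketch that groups recorded errors by test and produces a summary at the end instead of failing the run. It is an assumption about the general shape of such reporting, not the project's actual code. `createErrorReport`, the test names, and the messages are all made up for the example.

```javascript
// Hypothetical warn-only reporter; test names and messages are made up.
function createErrorReport() {
  const byTest = new Map();

  return {
    // Record one console error against the test that produced it.
    record(testName, message) {
      if (!byTest.has(testName)) {
        byTest.set(testName, []);
      }
      byTest.get(testName).push(message);
    },
    // Total number of recorded errors across all tests.
    totalErrors() {
      let total = 0;
      for (const messages of byTest.values()) {
        total += messages.length;
      }
      return total;
    },
    // One warning header per test, followed by its messages.
    summaryLines() {
      const lines = [];
      for (const [testName, messages] of byTest) {
        lines.push(`WARNING: "${testName}" logged ${messages.length} console error(s)`);
        for (const message of messages) {
          lines.push(`  - ${message}`);
        }
      }
      return lines;
    },
  };
}

// Usage: collect errors during the run, then print the summary at the end.
const report = createErrorReport();
report.record('renders the ad', 'Error: missing attribute');
report.record('renders the ad', 'Error: layout failed');
report.record('handles resize', 'Error: unexpected state');
const summary = report.summaryLines();
```

Grouping by test makes it easier for a test owner to find the errors that belong to them, and switching from warning to failing later would only require checking `totalErrors()` at the end of the run.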

@rsimha
Contributor

rsimha commented Apr 25, 2018

I'm closing this because the issue is being tracked via #14848.

Meanwhile, I'd encourage you to look into the console errors in https://gist.github.com/rsimha/df05d98a891c850a067ef17b56f13cd8

@rsimha rsimha closed this as completed Apr 25, 2018