Increase timeout of Windows test job #24881

Merged 2 commits on Aug 14, 2020

dougbu
Contributor

@dougbu dougbu commented Aug 13, 2020

- no consistency in failures, maybe we have more tests now?
  - both of the runs below involved test failures but results were not uploaded
- https://dev.azure.com/dnceng/public/_build/results?buildId=768850
- https://dev.azure.com/dnceng/public/_build/results?buildId=768703
@dougbu dougbu requested a review from a team August 13, 2020 18:55
@ghost ghost added the area-infrastructure Includes: MSBuild projects/targets, build scripts, CI, Installers and shared framework label Aug 13, 2020
@dougbu
Contributor Author

dougbu commented Aug 13, 2020

Surprised to see the job that had been timing out succeeded in only 49 minutes because the two failing builds this morning logged test failures as well as the timeout. Will look at the *.dmp files from the failures before my final decision on this PR.

@dougbu
Contributor Author

dougbu commented Aug 13, 2020

I'm not seeing any test assemblies in the *.dmp files. The smaller dump for https://dev.azure.com/dnceng/public/_build/results?buildId=768850 shows code in VBCSCompiler.dll was running while that for https://dev.azure.com/dnceng/public/_build/results?buildId=768703 was running Microsoft.Build.dll code (with the compiler nowhere in sight).

@HaoK, @BrennanConroy, and @davidfowl: will dumps be collected just because a process happens to be running when a build is cancelled❔ Alternatively, is there a way to determine what the dotnet process is really waiting for❔

@dougbu
Contributor Author

dougbu commented Aug 13, 2020

Note the most recent rolling build was successful but didn't include any changes except to doc comments: fa2d86e...9f0ae10

@HaoK
Member

HaoK commented Aug 13, 2020

The new stuff for hangs and dumps was Helix-specific, so it's not going to help for Windows test jobs, unfortunately.

@dougbu
Contributor Author

dougbu commented Aug 13, 2020

🆗 Since the job usually finishes well before its current 180 minute timeout and just uploading a *.dmp file can take 15 minutes, I'm going to increase (double) the cancelTimeoutInMinutes instead. The hope is that this will give the job enough time to upload test results as well as any dumps.
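
For readers unfamiliar with the setting, here is a rough sketch of where `cancelTimeoutInMinutes` sits in an Azure Pipelines job definition. Only the 180 minute job timeout and the `artifacts/logs/` path come from this thread; the job name, the placeholder test step, the artifact name, and the value `30` are assumptions for illustration, not the actual change in this PR.

```yaml
# Illustrative Azure Pipelines job, not the actual change in this PR.
# The job name, placeholder script, artifact name, and cancel-timeout
# value are assumptions.
jobs:
- job: Windows_Test                  # hypothetical job name
  timeoutInMinutes: 180              # job timeout mentioned in this thread, left unchanged
  cancelTimeoutInMinutes: 30         # assumed value; the PR doubles the previous setting
  steps:
  - script: echo "run tests here"    # placeholder for the real test invocation
    displayName: Run tests
  # Steps with condition always() still run after a cancellation, but only
  # for as long as cancelTimeoutInMinutes allows before the agent kills them.
  - publish: $(Build.SourcesDirectory)/artifacts/logs/
    artifact: Windows_Test_Logs      # hypothetical artifact name
    displayName: Upload logs and dumps
    condition: always()
```

The idea is that a longer cancel window should leave upload steps like the last one enough time to finish, even when a single *.dmp upload takes around 15 minutes.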

@dougbu
Contributor Author

dougbu commented Aug 13, 2020

I meant artifacts/logs/

@dougbu
Contributor Author

dougbu commented Aug 13, 2020

@MattGal @ilyas1974 is something odd happening w/ https://dev.azure.com/dnceng/public/_build/results?buildId=769815❔ The job w/ the increased cancel timeout has been sitting in Queued state for a couple of hours now. Later builds seem to be getting Windows.Server.Amd64.VS2019.Open machines ahead of it.

@dougbu
Contributor Author

dougbu commented Aug 13, 2020

More to the point, is my best option cancelling that build and starting the pipeline again❔

@ilyas1974

It appears that the build is currently in progress, so I do not believe any additional action needs to be taken.

@dougbu
Contributor Author

dougbu commented Aug 14, 2020

Weird. I'm glad the job finally started running❕

@dougbu dougbu merged commit 9408ed8 into master Aug 14, 2020
@dougbu dougbu deleted the dougbu/update.test.job.timeout branch August 14, 2020 04:05
@MattGal
Member

MattGal commented Aug 17, 2020

> More to the point, is my best option cancelling that build and starting the pipeline again❔

If it's been > 90 minutes, definitely so. In those scenarios you're hitting https://github.com/dotnet/core-eng/issues/10136; in this situation we just don't have enough machines and AzDO has a hard 90 minute timeout they've offered to eventually extend, but who knows when...

Labels
area-infrastructure Includes: MSBuild projects/targets, build scripts, CI, Installers and shared framework
5 participants