
testing azdo vm memory - wasm leg timeouts #111662


Closed
wants to merge 4 commits

Conversation

@pavelsavara (Member) commented Jan 21, 2025

AzDO builds of the WASM legs are slow, killed, or time out about 50% of the time.

@pavelsavara (Member, Author) commented

Observations from the AzDO build

Right now the build page shows "Nothing to show. Final logs are missing. This can happen when the job is cancelled or times out." because AzDO killed the VM when it ran out of memory.

But it was showing the following messages during the build:
[screenshot: build log messages]

The reason is probably that there are multiple MSBuild processes in memory, eating gigabytes:
[screenshot: per-process memory usage]

The RSS (resident set size) of the MSBuild processes totals about 13 GB, while the real work we are doing at that moment is running the emcc/llvm/wasm-opt child processes.

This is after the runtime was already built and we are building the samples (and maybe the tests). At that point we probably don't care about the MSBuild caches populated during the libraries build.
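
For reference, a minimal sketch of how such a measurement could be reproduced on the agent (the process-name filters below are my assumption of what to match, not the exact command used for the numbers above):

```csharp
using System;
using System.Diagnostics;
using System.Linq;

class MsBuildMemoryProbe
{
    static void Main()
    {
        // Sum the resident memory of processes that look like MSBuild worker nodes.
        // The name filters ("MSBuild", "dotnet") are assumptions; on a CI agent the
        // long-lived worker nodes typically show up as dotnet processes hosting MSBuild.
        var candidates = Process.GetProcesses()
            .Where(p => p.ProcessName.Contains("MSBuild", StringComparison.OrdinalIgnoreCase)
                     || p.ProcessName.Contains("dotnet", StringComparison.OrdinalIgnoreCase))
            .ToList();

        long totalBytes = 0;
        foreach (var p in candidates)
        {
            totalBytes += p.WorkingSet64; // working set (resident) size in bytes
            Console.WriteLine($"{p.ProcessName,-20} pid={p.Id,-8} RSS={p.WorkingSet64 / (1024 * 1024)} MB");
        }

        Console.WriteLine($"Total across {candidates.Count} processes: {totalBytes / (1024.0 * 1024 * 1024):F1} GB");
    }
}
```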

@rainersigwald is there a way to tell MSBuild to flush its caches?

@lewing @steveisok any advice on moving the AzDO agent to a larger SKU? At the moment we have 16 GB of RAM.

@pavelsavara changed the title from testing azdo vm to testing azdo vm memory - wasm leg timeouts on Jan 23, 2025
@rainersigwald (Member) commented

> is there a way to tell MSBuild to flush its caches?

No, we don't have a hint for this, just normal GC triggers. We do keep a bunch of memory in weak-handle caches so GC should be able to free stuff up when it does trigger.
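
To illustrate with a minimal sketch (a plain WeakReference entry, not MSBuild's actual cache code, and the 100 MB buffer size is arbitrary): memory reachable only through weak handles stays resident until a collection actually runs.

```csharp
using System;

class WeakCacheIllustration
{
    static void Main()
    {
        // Hold a large buffer only through a weak reference, the way a
        // weak-handle cache keeps entries alive without rooting them.
        var entry = new WeakReference(new byte[100 * 1024 * 1024]);

        // Typically no GC has run yet at this point, so the 100 MB buffer is still resident.
        Console.WriteLine($"Before GC: alive = {entry.IsAlive}");

        // A full, blocking collection reclaims anything reachable only through
        // weak handles.
        GC.Collect(2, GCCollectionMode.Forced, blocking: true);
        GC.WaitForPendingFinalizers();

        Console.WriteLine($"After GC:  alive = {entry.IsAlive}");
    }
}
```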

@pavelsavara (Member, Author) commented

> is there a way to tell MSBuild to flush its caches?

> No, we don't have a hint for this, just normal GC triggers. We do keep a bunch of memory in weak-handle caches so GC should be able to free stuff up when it does trigger.

"Have You Tried Turning It Off And On Again ?"

Is that what we need to do? Run the different subsets separately from a top-level shell? Is there a better way?

@pavelsavara force-pushed the browser_troubleshoot_vm branch from 2359d42 to b430871 on January 23, 2025 16:46
@pavelsavara (Member, Author) commented

Now I'm testing `export DOTNET_GCHeapHardLimit=1610612736` on this PR.

And `GC.Collect(2)` here.
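
For context, 1610612736 bytes is exactly 1.5 GiB, so the hard limit caps each .NET process's GC heap well below the agent's 16 GB. Below is a minimal sketch of what the forced-collection half of the experiment amounts to; the method name and the place it would be called from are hypothetical, since the "here" link is not reproduced above.

```csharp
using System;

static class BuildNodeMemoryTrim
{
    // Hypothetical helper: force a full, blocking, compacting collection once the
    // libraries build is finished, so caches held only through weak handles can be
    // reclaimed before the samples/tests build starts.
    public static void TrimAfterLibrariesBuild()
    {
        GC.Collect(2, GCCollectionMode.Forced, blocking: true, compacting: true);
        GC.WaitForPendingFinalizers();
        // A second pass picks up anything that only became unreachable after
        // finalizers ran.
        GC.Collect(2, GCCollectionMode.Forced, blocking: true, compacting: true);
    }
}
```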

@github-actions github-actions bot locked and limited conversation to collaborators Feb 27, 2025
Labels: arch-wasm (WebAssembly architecture), needs-area-label (an area label is needed to ensure this gets routed to the appropriate area owners)
Projects: none yet

2 participants