-
Notifications
You must be signed in to change notification settings - Fork 12.9k
Prevent excessive compilation of files within a cycle #33061
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Prevent excessive compilation of files within a cycle #33061
Conversation
Cycles within a project previously would emit (to both memory and then later to disk) multiple times per file. This would not have been possible without the large amount of time that @bjlaub put into debugging.
Just to provide some more background on our testing and debugging with this change: For our tests, we typically ran the compiler in watch mode, with project references At this point our developers have been using a 6GB max heap (because more memory To trigger the behavior we were seeing, we had to do the following:
We started testing with Node 8.15.1 (the version our codebase is currently building We then switched to Node 12.9.0, which has a newer V8 which seems to perform differently; We ran our experiments with the
until it eventually crashes. We discovered that tsc was calling the In a "clean" state (i.e. one without any incremental build info), our problematic
To test the theory that tsc was re-emitting the same source files multiple times and
After running with this change, we were able to run our test case successfully without Diagnostics:
And memory usage:
We removed the "caching" hack and instead added the change proposed on this PR, which Diagnostics:
And memory usage:
|
@ericanderson the repro steps you have mentioned don't seem to work:
Then built changed a.ts as mentioned and here is the result and you can see a.js and b.js is built only once.
|
Sorry. I forgot to push. One second. Good to go! |
I walked the debugger and can assure you the extra pass is doing real work. Full on re-emits. And its holding the memory until its time to put the files to disk. And @bjlaub said in his testing it was even writing multiple times. |
@bjlaub says he added a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was able to repro the issue .. And The fix isnt corrrect. The fix is to set add seenEmittedFiles.set(affectedFile.path, true)
at https://github.com/microsoft/TypeScript/blob/master/src/compiler/builder.ts#L384
@@ -997,7 +997,21 @@ namespace ts { | |||
} | |||
|
|||
function addToAffectedFilesPendingEmit(state: BuilderProgramState, affectedFilesPendingEmit: readonly Path[]) { | |||
state.affectedFilesPendingEmit = concatenate(state.affectedFilesPendingEmit, affectedFilesPendingEmit); | |||
const actualAffectedFilesPendingEmit = affectedFilesPendingEmit.filter((f) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was able to repro the issue. The current fix isnt corrrect.
The fix is to set add seenEmittedFiles.set(affectedFile.path, true)
at https://github.com/microsoft/TypeScript/blob/master/src/compiler/builder.ts#L384
return; | ||
} | ||
|
||
state.affectedFilesPendingEmit = concatenate(state.affectedFilesPendingEmit, actualAffectedFilesPendingEmit); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add test in src/testRunner/unittests/tscWatch/emitAndErrorUpdates.ts
that has host.writeFile overriding the original one that also checks that file is emitted only once in the repro test case you have mentioned.
Actually looking into this, i realise the correct fix is might need more investigations and ensuring that its enough. So i will just take this over. Will report back when i have fix and get everyone here to see if that fixes the issue. Thanks for looking into this. |
Am I reading this right? It looks like |
That's correct and that needs to be handled in multiple scenarios like emitting through getNextAffectedFile , getnextAffectedFilePendingEmit and getNextAffectedFile when semantic diagnostics postpone the emit.. |
Happy to have you chase down the rest of the bug. Thank you very much! |
I would love to learn how all of this works in more detail. Like how the diagnostics can postpone the emit? Perhaps at tsconf? Would PRs be accepted that clean up some of this code? I found certain bits to be hard to follow do to the extra null checking and delayed creation of certain arrays. |
@sheetalkamat any luck so far? |
#33145 handles this. Thanks everyone for helping with the repro and investigation. |
…as complete (#33145) * Add test that fails because file is written multiple times Reported from #33061 * Handle seenEmittedFiles which was not being set when emit of a file was complete. It was issue only when errors are reported before emitting (which puts the files into pendingEmit that needs to check only in seenEmittedFiles) If emit happens before semantic diagnostics query this issue is not repro, because the affected files come into play and those are being set correctly Fixes #31398 * make baselining source map optional * Handle emitDeclarationOnly in --build scenario * Ensure we are using d.ts emit as signature even when --declarationMap is on (map files are emitted before d.ts) * Move module specifiers to verifyTsBuildOutput * implement create Hash to be default hashing plus data so we can verify it easily in baseline * Remove failing baseline * Accept correct baseline name
…as complete (microsoft#33145) * Add test that fails because file is written multiple times Reported from microsoft#33061 * Handle seenEmittedFiles which was not being set when emit of a file was complete. It was issue only when errors are reported before emitting (which puts the files into pendingEmit that needs to check only in seenEmittedFiles) If emit happens before semantic diagnostics query this issue is not repro, because the affected files come into play and those are being set correctly Fixes microsoft#31398 * make baselining source map optional * Handle emitDeclarationOnly in --build scenario * Ensure we are using d.ts emit as signature even when --declarationMap is on (map files are emitted before d.ts) * Move module specifiers to verifyTsBuildOutput * implement create Hash to be default hashing plus data so we can verify it easily in baseline * Remove failing baseline * Accept correct baseline name
Cycles within a project previously would emit (to both memory and then later to disk) multiple times per file.
This would not have been possible without the large amount of time that @bjlaub put into debugging.
This was affecting us on versions 3.4 and 3.5 and master.
Fixes #31398
Nature: When using a tsbuildinfo file, a project with code that was circular in nature, would often OOM the compiler and would take far longer to rebuild for a tiny change vs just rebuilding from scratch.
Repro:
emitter.ts
in the fileprintSourceFileOrBundle
, first line addconsole.log(
=== printSourceFileOrBundle: jsFilePath=${jsFilePath} sourceMapFilePath=${sourceMapFilePath});
node ../locationOfTypescriptCheckout/built/local/tsc.js --build packages/*
a.ts
to addfoo: any
to the interfaceA
.node ../locationOfTypescriptCheckout/built/local/tsc.js --build packages/*
Internally to our company, we have a project that was building in about 110s when there were no tsbuildinfo files. When starting with tsbuildinfo files in place and making a trivial change it would oom 4gb and if we set it to 6gb it would take over 20m to complete.
After this change we can get back to a 3gb heap and the rebuild from a tsbuildinfo file takes 108 seconds.
cc/ @ahejlsberg @weswigham @sheetalkamat @RyanCavanaugh @DanielRosenwasser
Shoutout to @bjlaub for the countless hours of debugging to track this down.