This repository was archived by the owner on Jan 23, 2023. It is now read-only.

Conversation

@mikedn

@mikedn mikedn commented Nov 9, 2017

This improves the memory usage (and speed) of the DF/IDF computation performed by SsaBuilder. The existing implementation relies too much on expensive hashtables when cheaper data structures (vectors) could be used instead.

SSA memory usage decreases by 29% and there's also a 0.24% drop in instructions retired:

PIN data: https://1drv.ms/x/s!Av4baJYSo5pjgrsEI2JcsoUdj3Bh8g

JitMemStats diff: https://gist.github.com/mikedn/a7fad8a509e414d6d9c9101e46818f62

No jit diffs.

@BruceForstall

@dotnet/jit-contrib

  // add 1.
- m_pDomPreOrder = jitstd::utility::allocate<int>(m_allocator, bbArrSize);
- m_pDomPostOrder = jitstd::utility::allocate<int>(m_allocator, bbArrSize);
+ m_pDomPreOrder = new (&m_allocator) int[bbArrSize];

@rartemev rartemev Nov 10, 2017


Shouldn't we introduce Allocator* GetAllocator() in order to avoid using & everywhere and make code a bit more readable?

Author


Hmm, maybe. Or new operator overloads that take CompAllocator& instead of CompAllocator*. But note that this comes from a commit in another PR - #14953; I only added it here so I can keep working until that PR is merged.

Author

@mikedn mikedn Nov 14, 2017


That other PR has been merged but I plan to clean up this and some other CompAllocator related issues in a future PR.

@mikedn mikedn force-pushed the ssa-mem-dom branch 2 times, most recently from e290108 to e25cb34 on November 12, 2017 11:28
@mikedn
Author

mikedn commented Nov 12, 2017

The initial version of this PR also optimized dominator trees (they too make unnecessary use of hashtables) but I left that change out.

Changing hashtables to vectors changes the block visiting order; copy propagation is affected by that order, and some differences appear in the generated code. The differences look correct (they should be; after all, the original hashtable order isn't special in any way), but I'd like to understand why the order matters to copy propagation.

@mikedn
Author

mikedn commented Nov 12, 2017

There may still be room for improvement:

  • It's not clear if using a hashtable to map blocks to their DF/IDF sets is the best solution. The premise is that not all blocks will have a non-empty DF/IDF and a hashtable is (sort of) a sparse data structure. But some quick instrumentation of this code indicates that half of the blocks end up having a non-empty DF/IDF, which may mean that a vector indexed by bbNum would be a better choice.
  • Using vectors for DF/IDF is easy but may also waste some memory due to vector resizing. A singly linked list might be better, especially if nodes are pooled so they can be reused.

However, for now I'd rather fix other obvious inefficiencies (e.g. SsaRenameState is another memory hog due to its use of jitstd::list for stacks) than put a lot of effort into measuring and possibly come back empty-handed.

Makes the code more readable and avoids the duplicated IsShort test that a separate IsMember/AddElemD may generate.
DF(b) can be stored in a vector instead of a hashtable (set). Nothing needs an O(1) membership test, and the duplicates that may be generated during DF construction can easily be avoided by observing that:
* each block in the graph is processed exactly once
* during processing, the block is appended at the end of the relevant DF vectors. If the same DF vector is encountered multiple times, the only possible duplicate is the vector's last element, so checking whether the block is already a member is trivial.
Like DF(b), IDF(b) can be stored in a vector if duplicates are avoided. This can be done by using a BitVector like TopologicalSort and ComputeImmediateDom already do. It's cheaper than using a hashtable (even though it requires a "clear" for every block that has a frontier).

Also, storing IDF(b) into a vector makes it easier to track newly added blocks - they're at the end of the vector. Using a separate "delta" set is no longer needed.
@mikedn mikedn changed the title [WIP] Improve SSA dominator computation memory usage Improve SSA dominator computation memory usage Nov 14, 2017
@mikedn
Author

mikedn commented Nov 14, 2017

Tizen armel failed due to some docker repository connectivity issues: https://ci.dot.net/job/dotnet_coreclr/job/master/job/armel_cross_checked_tizen_prtest/771/consoleFull#-8536224876a086b3e-df04-41d2-bc4d-43e8f9406d07

@dotnet-bot test Tizen armel Cross Checked Innerloop Build and Test


@BruceForstall BruceForstall left a comment


Thanks for the great work!

@BruceForstall BruceForstall merged commit a7cf54f into dotnet:master Nov 15, 2017
@mikedn mikedn deleted the ssa-mem-dom branch December 16, 2017 09:15
@mikedn mikedn mentioned this pull request Feb 1, 2018