Separated steering vector construction for active, other cells #415
This PR separates the melting/activation of cells above/below the liquidus from the iteration over the list of undercooled active cells. The melting/activation step is now faster because only the relevant cells are iterated over on a given time step, and thread divergence is significantly reduced. Speedup is around 3-4x on CPU and around 10-25% on GPU, with larger speedups for larger problems, particularly those with many melting/solidification events in the same cell on a given time step. This change also reduces memory usage for problems where the number of times a cell melts/solidifies is highly variable across a given domain.
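
As a rough illustration of the idea (not the code in this PR), the sketch below builds two separate steering vectors in plain Python: one for cells that cross the liquidus on the current time step, and one for undercooled active cells. All names here (`build_steering_vectors`, `melt_step`, `cool_step`, and the cell type constants) are hypothetical and chosen only for this example; the actual data layout and kernels may differ.

```python
# Hypothetical cell type labels, used only for this sketch.
SOLID, LIQUID, ACTIVE = 0, 1, 2


def build_steering_vectors(cell_type, melt_step, cool_step, step):
    """Split cell indices into two work lists for one time step.

    cell_type: per-cell type label (SOLID, LIQUID, or ACTIVE)
    melt_step: per-cell time step at which the cell goes above the liquidus
    cool_step: per-cell time step at which the cell drops below the liquidus
    step:      current time step

    Returns (transition_cells, active_cells). Both layouts are assumptions
    made for illustration, not the PR's actual data structures.
    """
    transition_cells = []  # cells melting or activating on this step
    active_cells = []      # undercooled active cells needing growth updates
    for i, ctype in enumerate(cell_type):
        if melt_step[i] == step or cool_step[i] == step:
            # Liquidus crossing happens now: handled by the melt/activate pass.
            transition_cells.append(i)
        elif ctype == ACTIVE:
            # Already active and undercooled: handled by the growth pass.
            active_cells.append(i)
        # All other cells need no work on this step and are skipped.
    return transition_cells, active_cells
```

Each list can then be processed by its own loop, so every iteration within a loop follows the same branch. That uniformity is what reduces thread divergence when the loops run in parallel.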