@MattRolchigo
Collaborator

This change separates the melting/activation of cells above/below the liquidus from the iteration over the list of undercooled active cells, which significantly reduces thread divergence. The melting/activation step is also faster now, since only the relevant cells are iterated over on a given time step. Speedup is around 3-4x on CPU and around 10-25% on GPU, with larger speedups for larger problems, particularly those with many melting/solidification events in the same cell on a given time step. This change also reduces memory usage for problems where the number of times a cell melts/solidifies varies widely across the domain.
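For readers unfamiliar with the pattern, the two-pass structure described above can be sketched roughly as follows. This is a minimal, serial C++ illustration and not the code in this PR: `Domain`, `CellState`, `make_domain`, `apply_temperature_events`, and `grow_active_cells` are all hypothetical names, the per-step event lists are an assumed data layout, and the growth update is a stand-in counter.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical cell states for illustration only; not the project's actual types.
enum class CellState { Solid, Liquid, Active };

struct Domain {
    std::vector<CellState> state;
    // Assumed layout: events bucketed by time step, so a step visits only
    // the cells that actually cross the liquidus on that step.
    std::vector<std::vector<int>> melt_events;     // cells rising above the liquidus
    std::vector<std::vector<int>> activate_events; // cells dropping below the liquidus
    std::vector<int> active_list;                  // compact list of undercooled active cells
    std::vector<char> in_active_list;              // guards against duplicate list entries
};

Domain make_domain(std::size_t n_cells, std::size_t n_steps) {
    Domain d;
    d.state.assign(n_cells, CellState::Solid);
    d.melt_events.resize(n_steps);
    d.activate_events.resize(n_steps);
    d.in_active_list.assign(n_cells, 0);
    return d;
}

// Pass 1: every iteration does the same kind of work (a state change),
// and only cells with an event on this step are touched.
void apply_temperature_events(Domain &d, std::size_t step) {
    assert(step < d.melt_events.size());
    for (int c : d.melt_events[step])
        d.state[c] = CellState::Liquid;
    for (int c : d.activate_events[step]) {
        if (d.state[c] != CellState::Liquid) continue;
        d.state[c] = CellState::Active;
        if (!d.in_active_list[c]) {
            d.active_list.push_back(c);
            d.in_active_list[c] = 1;
        }
    }
}

// Pass 2: every iteration does the same kind of work (a growth update),
// over the active list only. Returns the number of cells still active.
std::size_t grow_active_cells(Domain &d, std::vector<int> &growth_steps) {
    std::vector<int> still_active;
    for (int c : d.active_list) {
        if (d.state[c] != CellState::Active) { // remelted in pass 1
            d.in_active_list[c] = 0;
            continue;
        }
        growth_steps[c] += 1; // stand-in for the real growth/capture update
        still_active.push_back(c);
    }
    d.active_list.swap(still_active);
    return d.active_list.size();
}
```

In a parallel version each pass would presumably become its own kernel over its own index list, so that neighboring threads follow the same branch. Calling `apply_temperature_events(d, t)` followed by `grow_active_cells(d, growth)` once per time step `t` is the intended usage of this sketch.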

@MattRolchigo MattRolchigo added the performance Simulation performance related label Sep 18, 2025
Collaborator

@streeve streeve left a comment

First look

Collaborator

@streeve streeve left a comment

It may be worth removing the FutureActive type later, since there are so few left (though that would probably mean worse performance).
