Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Short circuiting mod1 to speed offsetc #184

Merged
merged 2 commits into from
Nov 1, 2023
Merged

Short circuiting mod1 to speed offsetc #184

merged 2 commits into from
Nov 1, 2023

Conversation

kbarros
Copy link
Member

@kbarros kbarros commented Nov 1, 2023

For systems with very simple interactions, wrapping indices to the system's periodic boundaries can absorb a significant chunk of the total simulation time. The culprit is the modulo operation (used as mod1 in Julia). In the common case, the index will already be in-bounds, so no wrapping is needed. If this situation is detected, we can short-circuit the mod1 operation and immediately return the original index. Doing so reduces the overhead for index-wrapping from 16% to about 8% of the total run-time in the dynamics benchmark below. It should also provide speedups for the LocalSampler use-case.

using Sunny, Random, BenchmarkTools

# Simple model on square lattice
latvecs = lattice_vectors(1, 1, 10, 90, 90, 90)
cryst = Crystal(latvecs, [[0,0,0]])
sys = System(cryst, (10,10,1), [SpinInfo(1, S=1, g=2)], :dipole; seed=1)
J = -1.0
set_exchange!(sys, J, Bond(1, 1, (1, 0, 0)))

Random.seed!(0)
randomize_spins!(sys)

Δt = 0.1
integrator = ImplicitMidpoint(Δt)

function f(sys, integrator)
    for _ in 1:100_000
        step!(sys, integrator)
    end
end

@btime f(sys, integrator)

# Before PR: 4.31s
# After PR:  4.02s

@profview f(sys, integrator)

# Before PR: 16% in offsetc
# After PR:  8% in offsetc

@kbarros
Copy link
Member Author

kbarros commented Nov 1, 2023

Note that we previously avoided mod1 operations through a nested closure scheme involving a custom variant of the CartesianIndices iterator. That code was removed in #176. However, the shifted CartesianIndices scheme was not very effective. On Sunny@0.5.4 (with the shifted CartesianIndices) I find that the benchmark posted above takes 4.29s. So with the current PR, there is now a significant boost over all previous Sunny versions.

@kbarros kbarros merged commit efd5ce6 into main Nov 1, 2023
4 checks passed
@kbarros kbarros deleted the speedup branch November 1, 2023 03:26
kbarros added a commit that referenced this pull request Nov 13, 2023
#184 was mistakenly merged
prior to completing the optimization. Specifically, `offsetc` should
call `altmod1` instead of `mod1`. The former returns early if it detects
that there is no wrapping.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant