@thowell thowell commented Jan 1, 2026

improve EPA memory utilization by computing a dot product of 3-vectors inline where it is needed, instead of storing the result in a separate buffer
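a minimal plain-Python sketch of the general idea, not the actual kernel code: every name here (`dot3`, `face_v`, `closest_face_stored`, `closest_face_inline`) is hypothetical, and it assumes the stored value is a per-face squared distance derived from a per-face 3-vector, which this description does not confirm

```python
from typing import List, Tuple

# Hypothetical names throughout; this is an illustration, not MuJoCo Warp code.
Vec3 = Tuple[float, float, float]


def dot3(a: Vec3, b: Vec3) -> float:
    """Dot product of two 3-vectors."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def closest_face_stored(face_v: List[Vec3]) -> int:
    """Variant A: precompute and store one squared distance per face."""
    dist2 = [dot3(v, v) for v in face_v]  # extra buffer: one float per face
    best = 0
    for i in range(1, len(dist2)):
        if dist2[i] < dist2[best]:
            best = i
    return best


def closest_face_inline(face_v: List[Vec3]) -> int:
    """Variant B: recompute the dot product where it is used; no buffer."""
    best = 0
    best_d2 = dot3(face_v[0], face_v[0])
    for i in range(1, len(face_v)):
        d2 = dot3(face_v[i], face_v[i])  # three multiplies, two adds
        if d2 < best_d2:
            best, best_d2 = i, d2
    return best


if __name__ == "__main__":
    faces = [(1.0, 0.0, 0.0), (0.5, 0.5, 0.0), (0.0, 0.0, 2.0)]
    # Both variants pick the same face; only the memory footprint differs.
    print(closest_face_stored(faces), closest_face_inline(faces))
```

the tradeoff is a few extra floating-point operations per lookup in exchange for dropping one stored value per face, which tends to be a good trade on a GPU when the buffer is allocated per world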

aloha_pot

mjwarp-testspeed benchmark/aloha_pot/scene.xml --nconmax=24 --njmax=128

this pr

Summary for 8192 parallel rollouts

Total JIT time: 0.69 s
Total simulation time: 4.17 s
Total steps per second: 1,962,558
Total realtime factor: 3,925.12 x
Total time per step: 509.54 ns
Total converged worlds: 8192 / 8192

main (24e07de)

Summary for 8192 parallel rollouts

Total JIT time: 0.66 s
Total simulation time: 4.18 s
Total steps per second: 1,959,876
Total realtime factor: 3,919.75 x
Total time per step: 510.24 ns
Total converged worlds: 8192 / 8192

throughput remains about the same: 1,962,558 vs. 1,959,876 steps per second, a difference of roughly 0.1%
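for reference, the two summaries above can be compared with a few lines of Python. this is only a sanity check on the reported numbers; the helper name `relative_change` is made up for the example

```python
def relative_change(new: float, old: float) -> float:
    """Fractional change of `new` relative to `old`."""
    return (new - old) / old


# Values copied from the two benchmark summaries above.
steps_per_sec_pr = 1_962_558
steps_per_sec_main = 1_959_876
ns_per_step_pr = 509.54
ns_per_step_main = 510.24

throughput_change = relative_change(steps_per_sec_pr, steps_per_sec_main)
step_time_change = relative_change(ns_per_step_pr, ns_per_step_main)

# Time per step and steps per second should be reciprocals (up to rounding).
implied_steps_per_sec_pr = 1e9 / ns_per_step_pr

if __name__ == "__main__":
    print(f"throughput change: {throughput_change:+.3%}")
    print(f"step time change:  {step_time_change:+.3%}")
    print(f"implied steps/s:   {implied_steps_per_sec_pr:,.0f}")
```

a change this small is hard to separate from run-to-run variation without repeated runs, so it should not be read as a measured speedup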

@thowell thowell requested a review from kbayes January 2, 2026 14:23
@thowell thowell linked an issue Jan 6, 2026 that may be closed by this pull request
Development

Successfully merging this pull request may close these issues.

Optimize GJK device memory usage
