improve ccd memory utilization with arena memory #966

thowell · 2026-01-03T19:37:05Z

improve memory utilization for ccd by implementing arena memory.

memory for ccd (the epa and multicontact arrays) is preallocated in a single wp.array that is added to Data as arena
within kernels, slices of the arena back the individual epa and multicontact arrays
naccdmax is added to Data and specifies the arena size. this value is the maximum number of contacts expected for any single ccd collision pair type. since a kernel is launched for each ccd collision pair type, it is only necessary to allocate enough memory for the pair type with the most contacts. (for scenes with many different types of ccd collision pairs, these memory savings are likely to be significant)
only a subset of the epa arrays is needed for multicontact. as a result, epa arrays that multicontact does not use can be overwritten during multicontact, potentially reducing memory utilization further.
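A minimal sketch of the arena idea described above, in plain Python rather than Warp. The field names come from this PR, but the sizes, the single float dtype, and the helpers `pack` and `view` are assumptions for illustration only and are not the PR's implementation.

```python
from array import array

# Hypothetical per-collision field sizes in float elements (not the PR's values).
EPA_SHARED = {"epa_vert1": 24, "epa_face": 48}  # EPA outputs read by multicontact
EPA_ONLY = {"epa_map": 48, "epa_horizon": 24}   # not needed once EPA finishes
MULTICCD = {"polygon": 30, "clipped": 30}       # multicontact scratch


def pack(fields, start):
  """Packs fields back to back from `start`; returns ({name: (offset, size)}, end)."""
  layout, offset = {}, start
  for name, size in fields.items():
    layout[name] = (offset, size)
    offset += size
  return layout, offset


shared, shared_end = pack(EPA_SHARED, 0)
epa_only, epa_end = pack(EPA_ONLY, shared_end)
multiccd, multiccd_end = pack(MULTICCD, shared_end)  # reuses the EPA-only region
per_collision_size = max(epa_end, multiccd_end)

naccdmax = 4  # number of arena slots
arena = memoryview(array("f", [0.0]) * (naccdmax * per_collision_size))


def view(arenaid, entry):
  """Returns the slice of the arena backing one field of one collision slot."""
  offset, size = entry
  base = arenaid * per_collision_size
  return arena[base + offset : base + offset + size]


# Writing multicontact scratch in slot 1 overwrites the EPA-only data of slot 1.
polygon = view(1, multiccd["polygon"])
polygon[0] = 1.5
print(per_collision_size, view(1, epa_only["epa_map"])[0])
```

With these made-up sizes, overlapping the multicontact scratch with the EPA-only region gives 144 elements per slot instead of 204 if every field had its own storage. Other slots are untouched by the write.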

aloha pot

this pr

mjwarp-testspeed benchmark/aloha_pot/scene.xml --nconmax=24 --njmax=128 --memory

Total JIT time: 0.70 s
Total simulation time: 4.13 s
Total steps per second: 1,983,104
Total realtime factor: 3,966.21 x
Total time per step: 504.26 ns
Total converged worlds: 8192 / 8192

Total memory: 2002.56 MB / 48640.12 MB (4.12%)
Model memory (0.27%):
 (no field >= 1% of utilized memory)
Data memory (99.73%):
 geom_xmat: 57.38 MB (2.87%)
 efc.J: 96.00 MB (4.79%)
 arena: 1569.75 MB (78.39%)

this pr with --nccdmax=12

mjwarp-testspeed benchmark/aloha_pot/scene.xml --nconmax=24 --njmax=128 --memory --nccdmax=12

Total JIT time: 0.70 s
Total simulation time: 4.13 s
Total steps per second: 1,982,124
Total realtime factor: 3,964.25 x
Total time per step: 504.51 ns
Total converged worlds: 8192 / 8192

Total memory: 1217.69 MB / 48640.12 MB (2.50%)
Model memory (0.44%):
 (no field >= 1% of utilized memory)
Data memory (99.56%):
 geom_xpos: 19.12 MB (1.57%)
 geom_xmat: 57.38 MB (4.71%)
 qM: 18.00 MB (1.48%)
 qLD: 16.53 MB (1.36%)
 efc.J: 96.00 MB (7.88%)
 arena: 784.88 MB (64.46%)

the reduction in ccd arena memory is ~50% (1569.75 MB to 784.88 MB)

main (e3bd7c6)

note: ccd memory utilization is not currently reported on the main branch

mjwarp-testspeed benchmark/aloha_pot/scene.xml --nconmax=24 --njmax=128 --memory

Total JIT time: 0.67 s
Total simulation time: 4.16 s
Total steps per second: 1,968,630
Total realtime factor: 3,937.26 x
Total time per step: 507.97 ns
Total converged worlds: 8192 / 8192

Total memory: 432.81 MB / 48640.12 MB (0.89%)
Model memory (1.24%):
 (no field >= 1% of utilized memory)
Data memory (98.76%):
 xfrc_applied: 4.88 MB (1.13%)
 xmat: 7.31 MB (1.69%)
 ximat: 7.31 MB (1.69%)
 geom_xpos: 19.12 MB (4.42%)
 geom_xmat: 57.38 MB (13.26%)
 site_xmat: 4.78 MB (1.10%)
 cinert: 8.12 MB (1.88%)
 actuator_moment: 10.06 MB (2.32%)
 crb: 8.12 MB (1.88%)
 qM: 18.00 MB (4.16%)
 qLD: 16.53 MB (3.82%)
 cvel: 4.88 MB (1.13%)
 cacc: 4.88 MB (1.13%)
 cfrc_int: 4.88 MB (1.13%)
 cfrc_ext: 4.88 MB (1.13%)
 contact.frame: 6.75 MB (1.56%)
 contact.efc_address: 4.50 MB (1.04%)
 efc.J: 96.00 MB (22.18%)
 efc.quad: 12.00 MB (2.77%)
 subtree_bodyvel: 4.88 MB (1.13%)
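The figures in the three reports above can be cross-checked with a few lines of arithmetic. The numbers below are copied from the reports; the variable names are only for this sketch.

```python
# Values in MB, copied from the mjwarp-testspeed reports above.
total_pr, arena_pr = 2002.56, 1569.75
total_nccd12, arena_nccd12 = 1217.69, 784.88
total_main = 432.81  # main does not report ccd memory

arena_reduction = 1 - arena_nccd12 / arena_pr
total_reduction = 1 - total_nccd12 / total_pr

# Memory outside the arena, for comparison with the total reported on main.
non_arena_pr = total_pr - arena_pr
non_arena_nccd12 = total_nccd12 - arena_nccd12

print(f"arena reduction: {arena_reduction:.1%}")
print(f"total reduction: {total_reduction:.1%}")
print(f"non-arena memory: {non_arena_pr:.2f} MB, {non_arena_nccd12:.2f} MB")
```

The non-arena remainder in both runs of this pr matches the total reported on main, which suggests the difference between the reports is the newly reported arena and nothing else.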

Kenny-Vilella

Very good work!

I just have a few minor comments.

Kenny-Vilella · 2026-01-07T05:15:49Z

mujoco_warp/testspeed.py

+    print(
+      f"Data\n  nworld: {d.nworld} naconmax: {d.naconmax} njmax: {d.njmax}" + f" naccdmax: {d.naccdmax}\n"
+      if d.naccdmax != d.naconmax
+      else "\n"


Is there a reason to not print anything if naccdmax == naconmax?
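For context on the snippet as quoted: a Python conditional expression binds more loosely than `+`, so `a + b if cond else c` parses as `(a + b) if cond else c`. The sketch below reproduces that expression shape in a standalone function. `data_line` and `data_line_always` are hypothetical names, and the second function is only one possible alternative, not the PR's fix.

```python
def data_line(nworld, naconmax, njmax, naccdmax):
  # Same expression shape as the quoted diff: the condition covers the whole
  # concatenation, so only "\n" is returned when the two values are equal.
  return (
    f"Data\n  nworld: {nworld} naconmax: {naconmax} njmax: {njmax}" + f" naccdmax: {naccdmax}\n"
    if naccdmax != naconmax
    else "\n"
  )


def data_line_always(nworld, naconmax, njmax, naccdmax):
  # Assumed alternative: parenthesize so that only the naccdmax suffix is optional.
  return f"Data\n  nworld: {nworld} naconmax: {naconmax} njmax: {njmax}" + (
    f" naccdmax: {naccdmax}\n" if naccdmax != naconmax else "\n"
  )


print(repr(data_line(1, 10, 5, 10)))
print(repr(data_line_always(1, 10, 5, 10)))
```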

Kenny-Vilella · 2026-01-07T05:21:58Z

mujoco_warp/_src/types.py

    ncollision: collision count from broadphase                 (1,)
+    naccdmax: Maximum number of CCD contacts
+    nccd: geom-geom pair counter for arena slots                (len(GeomType)*(len(GeomType)+1)/2,)
+    arena: Arena memory for CCD                                 (narena,)


[nitpick] Not sure if it is clear what narena is.
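The shape given for nccd in the docstring above, len(GeomType)*(len(GeomType)+1)/2, is the number of unordered geom-type pairs, including pairs of the same type. A minimal sketch of one common way to index such pairs follows. `pair_index` and the value of `ntype` are assumptions for illustration and may not match how the PR indexes nccd.

```python
def pair_index(i, j, ntype):
  """Index of the unordered pair (i, j) in a flattened upper triangle."""
  if i > j:
    i, j = j, i
  # Row i starts after the i previous rows, which hold ntype, ntype - 1, ... entries.
  return i * ntype - i * (i - 1) // 2 + (j - i)


ntype = 9  # hypothetical number of geom types
npair = ntype * (ntype + 1) // 2

# Every unordered pair maps to a distinct slot in range(npair).
indices = {pair_index(i, j, ntype) for i in range(ntype) for j in range(ntype)}
print(npair, len(indices))
```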

Kenny-Vilella · 2026-01-07T05:24:05Z

mujoco_warp/_src/types.py

  ncollision: array(1, int)
+
+  # warp only: preallocated arena for convex collision scratch memory
+  naccdmax: int  # max number of CCD contacts


[nitpick] I would remove the comments, as this is the only place in the file with such comments.

Kenny-Vilella · 2026-01-07T05:42:40Z

mujoco_warp/_src/io.py

    njmax: Number of constraints to allocate per world. Constraint arrays are
           batched by world: no world may have more than njmax constraints.
    naconmax: Number of contacts to allocate for all worlds. Overrides nconmax.
+    naccdmax: Maximum number of CCD contacts. Defaults to naconmax.


Should we state clearly that the naccdmax value takes priority over the nccdmax value?
The same applies to nconmax/naconmax, actually.
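A sketch of the priority rule the comment asks to document. `resolve_contact_sizes` is a hypothetical helper, and scaling the per-world values by nworld is an assumption rather than something shown in this diff. The docstring above only states that naconmax overrides nconmax and that naccdmax defaults to naconmax.

```python
def resolve_contact_sizes(nworld, nconmax=None, naconmax=None, nccdmax=None, naccdmax=None):
  """Hypothetical helper: totals across worlds take priority over per-world values."""
  if naconmax is None:
    if nconmax is None:
      raise ValueError("either nconmax or naconmax must be given")
    naconmax = nworld * nconmax  # assumption: per-world value scaled by nworld
  if naccdmax is None:
    # Defaults to naconmax unless a per-world nccdmax is given.
    naccdmax = naconmax if nccdmax is None else nworld * nccdmax
  return naconmax, naccdmax


# Settings from the benchmark above: 8192 worlds, --nconmax=24, --nccdmax=12.
print(resolve_contact_sizes(8192, nconmax=24, nccdmax=12))
```

Under these assumptions, --nccdmax=12 yields half as many arena slots as the default, which is consistent with the ~50% arena reduction reported in the description.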

Kenny-Vilella · 2026-01-07T05:45:50Z

mujoco_warp/_src/io.py

+  epa_vert1, epa_vert2, epa_vert_index1, epa_vert_index2, epa_face.
+  """
+  MJ_MAX_EPAFACES = 5
+  MJ_MAX_EPAHORIZON = 12


These two values should probably be imported from types.

Kenny-Vilella · 2026-01-07T05:59:52Z

mujoco_warp/_src/io.py

+    return naccdmax * epa_total_per_collision
+
+  # multiccd arrays
+  # polygon, clipped: 2 * nmaxpolygon vec3s each


Is there a reason to use "vec3s" with an "s"?

Kenny-Vilella · 2026-01-07T06:11:36Z

mujoco_warp/_src/collision_convex.py

+  epa_map, epa_horizon). The multicontact inputs are placed first:
+  epa_vert1, epa_vert2, epa_vert_index1, epa_vert_index2, epa_face.
+  """
+  epa_vert_dim = 5 + epa_iterations


[nitpick] It is a bit strange to sometimes use epa_iterations and sometimes ccd_iterations.
I assume that these two values will always be equal. If that is not the case, I would double-check that they are used appropriately throughout the code.

Kenny-Vilella · 2026-01-07T06:24:48Z

mujoco_warp/_src/collision_convex.py

-  # epa_horizon: index pair (i j) of edges on horizon
-  epa_horizon = wp.empty(shape=(d.naconmax, 2 * MJ_MAX_EPAHORIZON), dtype=int)
+  # reset ccd arena counter
+  d.nccd.zero_()


Is this actually needed?

Kenny-Vilella · 2026-01-07T06:53:20Z

mujoco_warp/_src/collision_convex.py

-    epa_horizon = epa_horizon_in[tid]
+    # construct epa arrays from arena
+    # multicontact inputs first (epa_vert1, epa_vert2, epa_vert_index1, epa_vert_index2, epa_face)
+    base_offset = arenaid * wp.static(per_collision_size)


This is interesting.
We are now allocating per collision within a single array, whereas the former approach allocated one array per field.

I wonder whether we see any perf difference between the two memory layouts.
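To make the two layouts concrete, the sketch below compares flat element indices for a layout with one array per field against the per-collision arena layout. The field sizes and both index functions are assumptions for illustration. The former code used separate arrays, which are laid end to end here only so the indices can be compared.

```python
# Hypothetical field sizes (elements per collision); not the PR's values.
FIELDS = {"epa_vert1": 24, "epa_face": 48, "epa_horizon": 24}
PER_COLLISION_SIZE = sum(FIELDS.values())


def per_array_index(field, cid, k, ncollision):
  """Former layout: one (ncollision, size) array per field."""
  base = 0
  for name, size in FIELDS.items():
    if name == field:
      return base + cid * size + k
    base += ncollision * size
  raise KeyError(field)


def per_collision_index(field, cid, k):
  """Arena layout: all fields of one collision are contiguous."""
  offset = 0
  for name, size in FIELDS.items():
    if name == field:
      return cid * PER_COLLISION_SIZE + offset + k
    offset += size
  raise KeyError(field)


# Distance between the same element of the same field in consecutive collisions.
stride_per_array = per_array_index("epa_face", 1, 0, 8) - per_array_index("epa_face", 0, 0, 8)
stride_per_collision = per_collision_index("epa_face", 1, 0) - per_collision_index("epa_face", 0, 0)
print(stride_per_array, stride_per_collision)
```

In the per-collision layout the stride between neighboring collisions grows from the field size to the whole slot size. Whether that affects performance would depend on the kernels' access patterns, which this sketch does not model.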

Kenny-Vilella · 2026-01-07T07:02:53Z

mujoco_warp/_src/collision_convex.py

+  # epa arrays used by multicontact
+
+  # epa_vert1: vertices in EPA polytope in geom 1 space
+  layout["epa_vert1"] = (offset, epa_vert_dim)


[nitpick] I am not totally convinced that we should keep the dim here, since it makes the code below a bit inconsistent.
This is quite subjective, though, so please feel free to do what you think is best.

ccd arena

22d8af7

thowell force-pushed the ccd_arena branch from 89cf2c0 to 22d8af7 Compare January 3, 2026 19:43

update comments

43360d8

thowell requested review from adenzler-nvidia, erikfrey and kbayes January 4, 2026 16:09

This was referenced Jan 6, 2026

CCD dataclass for ccd memory #925

Closed

NewtonSolver dataclass for newton solver memory #927

Closed

thowell linked an issue Jan 6, 2026 that may be closed by this pull request

Optimize GJK device memory usage #816

Open

thowell requested a review from Kenny-Vilella January 6, 2026 19:12

erikfrey mentioned this pull request Jan 6, 2026

Report other memory, e.g. inline allocation. #945

Merged

Kenny-Vilella approved these changes Jan 7, 2026
