Description
Recently, we updated the Ahrefs codebase to the latest version of Dune (to be more specific, commit 71024ef) from the previous version we were using, from back in April (7bb6de7), which was working fine.
After upgrading, all our Linux CI agents work perfectly fine, but on the macOS agents we have noticed that dune processes stay running forever, with CPU usage over 90%.
Here is an example of a dune process that has been running for 55+ minutes with CPU at 100%:
$ ps aux | grep dune
USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND
user 3759 100.0 0.0 34302304 448 ?? R 3:20PM 55:12.57 /Users/user/builds/macvm100/ahrefs/monorepo-extra/_opam/bin/dune build -p dune-private-libs -j 7 @install
user 3737 0.0 1.1 34302560 93844 ?? S 3:19PM 0:04.48 /Users/user/builds/macvm100/ahrefs/monorepo-extra/_opam/bin/dune build -p dune-private-libs -j 7 @install
user 36569 0.0 0.0 34122844 856 s000 S+ 4:15PM 0:00.00 grep dune
Looking at different occurrences of the issue, I couldn't find any pattern in the packages where it appears. What I can say is that it happens on different versions of macOS. In particular, the agent used in the command above runs 12.3.1, but it also happens on 12.6.1 and 13.0.1.
I have tried to gather some information about what exactly the hanging dune process is doing. Calling sample 3759 (with the process ID taken from the ps aux output above) shows a lot of nested callbacks with camlDune_engine__Process__fun_4514; I am not sure whether this is expected. You can find the full output of that command here:
Is there anything I could do to help diagnose the problem?
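
For anyone who wants to collect the same data on another macOS agent, the steps above (find the busy dune process with ps aux, then run sample on its PID) could be scripted roughly as follows. This is only a sketch: the helper names busy_dune_pids and sample_busy_dune, the 90% threshold, the 5-second sampling duration and the output file names are all illustrative assumptions.

```shell
# Print the PIDs of "dune build" processes whose %CPU is at or above a threshold.
# Reads `ps aux`-style output on stdin: column 2 is the PID, column 3 is %CPU.
busy_dune_pids() {
  threshold="${1:-90}"
  awk -v t="$threshold" 'NR > 1 && ($3 + 0) >= t && /dune build/ { print $2 }'
}

# macOS only: sample each busy dune process for 5 seconds and save the report
# to a file named after the PID (hypothetical naming scheme).
sample_busy_dune() {
  ps aux | busy_dune_pids "${1:-90}" | while read -r pid; do
    sample "$pid" 5 -file "dune-sample-$pid.txt"
  done
}
```

Running sample_busy_dune 90 on an affected agent should leave one dune-sample-<pid>.txt report per spinning process, which can then be attached to this issue.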