Is your feature request related to a problem? Please describe.
A recurring issue with alchemical free energy calculations in SOMD2 is that trajectories occasionally terminate early because a 'NaN' is generated after an integration step. We have also seen trajectories with transient spikes in non-bonded energies, which we would expect to cause a numerical integration error.
Because the issue is stochastic and rare, it is difficult to isolate the source of the error.
Describe the solution you'd like
A 'debug' mode that buffers coordinates and energies for the past N integration time-steps would be helpful. The code could be updated to write this information in molecular file formats, allowing visualisation of the trajectory in the few steps immediately before a crash occurs.
Describe alternatives you've considered
In principle, this could be implemented at the Python API level by adding extra logic to save or overwrite snapshots after every MD time-step. However, this would likely be very slow, making it difficult to reproduce NaN crashes in a timely manner.
We could, however, buffer coordinates and forces internally and write them to disk only when a crash has been triggered. There is already low-level logic in the code that attempts to deal with NaN errors by performing energy minimisation. Some compromise on speed (a few fold) would be acceptable for troubleshooting purposes.
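To make the buffering idea concrete, here is a minimal sketch of a fixed-size ring buffer that is only written to disk when a NaN is detected. It is not SOMD2 code: `FrameBuffer`, `has_nan`, `run_with_buffer` and the `step_fn` callback are hypothetical names used for illustration, energies stand in for whatever per-step data (energies or forces) is buffered, and the multi-frame XYZ output is just one possible molecular file format.

```python
import math
from collections import deque


class FrameBuffer:
    """Hold the most recent n_frames snapshots in memory (hypothetical helper)."""

    def __init__(self, n_frames):
        # A deque with maxlen silently discards the oldest frame when full.
        self._frames = deque(maxlen=n_frames)

    def record(self, step, coords, energy):
        # Copy the coordinates so later in-place updates cannot alter the buffer.
        self._frames.append((step, [tuple(xyz) for xyz in coords], energy))

    def steps(self):
        return [step for step, _, _ in self._frames]

    def dump_xyz(self, path, elements):
        """Write all buffered frames to a multi-frame XYZ file."""
        with open(path, "w") as fh:
            for step, coords, energy in self._frames:
                fh.write(f"{len(coords)}\n")
                fh.write(f"step={step} energy={energy}\n")
                for element, (x, y, z) in zip(elements, coords):
                    fh.write(f"{element} {x:.6f} {y:.6f} {z:.6f}\n")


def has_nan(coords, energy):
    """Return True if the energy or any coordinate is NaN."""
    return math.isnan(energy) or any(math.isnan(c) for xyz in coords for c in xyz)


def run_with_buffer(step_fn, n_steps, buffer, elements, crash_path):
    """Run step_fn(step) -> (coords, energy) for up to n_steps steps.

    Every frame is buffered in memory; the buffer is written to crash_path
    only if a NaN appears. Returns the failing step, or None if none failed.
    """
    for step in range(n_steps):
        coords, energy = step_fn(step)
        # Record before checking so the failing frame is part of the dump.
        buffer.record(step, coords, energy)
        if has_nan(coords, energy):
            buffer.dump_xyz(crash_path, elements)
            return step
    return None
```

In this sketch the per-step cost is one in-memory copy of the coordinates, and disk I/O happens only on failure. A real implementation would presumably live closer to the integrator, which is where the speed compromise mentioned above would come from.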