Different outputs with nequip 0.5.3 vs 0.5.0 for structure relaxation #173
-
Hello, I was getting good results for structure relaxation with nequip 0.5.0 (e3nn 0.3.5). After switching to nequip 0.5.3 (e3nn 0.4.4) and using a new .yaml file with hyperparameters that are, as far as I can tell, as close as possible to those of my previous runs, I haven't been able to get good results. Could you look at my old and new .yaml files and tell me if you have any recommendations?

I have tried modifying the loss coefficients and other hyperparameters, e.g. using `MSELoss` instead of `PerAtomMSELoss` for the energy, or changing the relative ratios of the force and energy loss coefficients, but this hasn't helped, so I don't think the problem stems from there. I've also changed the number of training points quite significantly and still can't get the same results as with nequip 0.5.0. Could something internal have changed enough to cause a difference in the predicted outputs? I trust the results from nequip 0.5.0 since they agree with the literature for my particular system.

By the way, as suggested in a prior discussion, I have also noticed that structure minimization works better if you call `model = model.double()` before `torch.jit.freeze(model)` in `nequip/scripts/deploy.py`. Without this, structure relaxation wasn't finding the correct minima back when I was using nequip 0.5.0. Perhaps you have some comments on that as well; maybe there is a way to deploy the model as float64 without having to modify your code. I think the way to do this is to set `float64` as the value of the `default_dtype` field in the .yaml file. I've tried this too, and I still can't reproduce my results from nequip 0.5.0.

Attached are the old vs. new config .yaml files. Thanks!
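For concreteness, the yaml route mentioned above amounts to a single option in the training config (a sketch, with the rest of the file omitted):

```yaml
# Run training, and hence the deployed model, in double precision,
# instead of converting the model with model.double() in deploy.py afterwards.
default_dtype: float64   # nequip's example configs ship with float32 here
```

Note that this trains the model in float64 from the start, which is not necessarily equivalent to training in float32 and only converting the frozen model to double precision at deploy time.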
Replies: 2 comments
-
Hi @davidleocadio,

A few questions:

Re float32 vs. float64: yes, this is something that has been discussed before. Would you mind splitting "deploy for minimization in float64" into a separate discussion, and maybe a "Feature Request" issue as well, so we can keep the conversation organized? I don't doubt what people are seeing here, but I want to understand this issue a little better before going for a solution... in particular, I want to confirm that it really is enough to convert a trained float32 model to float64 at deploy time.
-
Just a quick look makes clear that the hyperparameters are not identical; are you trying to reproduce your past run, or just get similarly good results with different hyperparameters?

In particular, the values you set in `newconfig.yaml` are NOT the defaults that `oldconfig.yaml` will use (since it doesn't provide any of these options). The default is `dataset_forces_rms`. This will make a difference, possibly a significant one, especially if your dataset contains little diversity of total energies but large forces on some atoms.
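For context, here is a sketch of what the relevant block looks like when written out explicitly, assuming the options in question are the per-species rescaling settings from nequip's example configs (the poster's actual files are not shown here):

```yaml
# Hypothetical illustration, not copied from the attached configs.
# Omitting these keys entirely (as oldconfig.yaml apparently does)
# lets nequip fall back to its dataset-derived defaults.
per_species_rescale_shifts: dataset_per_atom_total_energy_mean
per_species_rescale_scales: dataset_forces_rms   # the default scale mentioned above
```

If `newconfig.yaml` sets these to fixed numbers or different statistics, the model is rescaled differently than in the old run, which by itself can change the predictions.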