-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make sure that supercell_matrices of single-atom-displaced and rattled supercells are the same per default #258
Conversation
…from hyperparameter header
… (hyper)params and full for Database type if not defined otherwise
This comment was marked as outdated.
This comment was marked as outdated.
self.supercell_settings[mp_id][ | ||
"supercell_matrix" | ||
] = supercell_matrix_job.output |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JaGeo , I restructured the code in a way that the reduce_supercell_size_job
is only called once and the matrix is passed like that. But I have left the reduce_supercell_size_job
(which only will be triggered if no matrix is gieven) in the other functions as it was before, too, in case a user wants to use the single-atom displ. and rattled supercell constructors separately.
I have thought a lot about it and I don't see a reason why single-atom displ. and rattled supercells of the same wf should have different settings. Do you think I shall still add the possibility?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be that someone wants it. At the moment, I also do not see a clear use case. Let's, for now, keep it with this single job. Just make sure everything works with this functionality for now and we just keep in mind that we don't have it implemented yet :).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can see if I can maybe put some "NotImplemented" warning. In principle the user can still at the moment pass different settings if they call the supercell constructors separately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a simple user info instead
*Si like silicon in the commit 🙈 |
src/autoplex/data/phonons/flows.py
Outdated
@@ -393,7 +393,7 @@ def make( | |||
supercell_matrix_job = reduce_supercell_size_job( | |||
structure=structure, | |||
min_length=self.supercell_settings.get("min_length", 12), | |||
max_length=self.supercell_settings.get("max_length", 25), | |||
max_length=self.supercell_settings.get("max_length", 20), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How is the default max length of 20 determined? Is this a commonly used value in the filed? If a lattice constant of a unit cell already exceeds 20, what would happen? Would the scale factor be set to 1?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the length of 20 appears to cause the least problems for the run time and yields good supercells. If a lattice parameter of a unit cells exceeds this value, it won't be scaled and stay like it is. But for the phonon part, so large unit cells are not intended to be used (on a regular basis).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For some large systems like MOFs, such situations are common, but since we can customize them, this shouldn't be an issue.
tests/auto/test_auto_flows.py
Outdated
@@ -584,7 +584,7 @@ def test_complete_dft_vs_ml_benchmark_workflow_m3gnet( | |||
assert complete_workflow_m3gnet.jobs[4].name == "complete_benchmark_mp-22905" | |||
assert responses[complete_workflow_m3gnet.jobs[-1].output.uuid][1].output[0][0][ | |||
"benchmark_phonon_rmse"] == pytest.approx( | |||
5.2622804443539355, abs=1.0 # bad fit data | |||
5.2622804443539355, abs=3.0 # bad fit data, fluctuates between 4 and 7 | |||
) | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this slight fluctuation result from the fitting, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, we would need more fit data, but then the unit tests would also take longer.
src/autoplex/auto/phonons/flows.py
Outdated
@@ -442,7 +442,6 @@ def make( | |||
dft_references=dft_references, | |||
supercell_settings=self.supercell_settings, | |||
displacement=displacement, | |||
# TODO add a hyper parameter here for the benchmark | |||
atomwise_regularization_parameter=atomwise_reg_parameter, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the docstrings of CompleteDFTvsMLBenchmarkWorkflow
, should we unify the formatting since some start with lowercase letters while others start with uppercase letters?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should indeed agree on a convention here. This is something that I can fix in the documentation pull-request.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good.
@@ -2,15 +2,16 @@ | |||
"GAP": { | |||
"general": { | |||
"at_file": "train.extxyz", | |||
"default_sigma": "{0.0001 0.05 0.05 0}", | |||
"default_sigma": "{0.0001 0.1 0.05 0}", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For accurate phonon calculations, achieving high precision in force evaluations is crucial. While I understand that specific settings have been made for each structure, I believe a default force sigma of 0.05 would be more appropriate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As before, from the Si JCP2020 paper settings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still recommend to use 0.05
. 0.1
still seems too large. For most phonon calculations, when I don't use single-atom rattling, I prefer not to enable the automatic sigma setting. In this case, I can use the default value of 0.05 for fitting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, then I will change this back! :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thank you
"energy_parameter_name": "REF_energy", | ||
"force_parameter_name": "REF_forces", | ||
"virial_parameter_name": "REF_virial", | ||
"sparse_jitter": 1.0e-8, | ||
"do_copy_at_file": "F", | ||
"openmp_chunk_size": 10000, | ||
"gp_file": "gap_file.xml", | ||
"two_body": true, | ||
"e0_offset": 2.0, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we force e0_offset to equal 2 instead of using the default value of 0? This setting is not common in GAP fitting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because I put the Si settings from the JCP2020 paper. But I can also set it to 0 if you think that this is better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, please set it to 0. The program will automatically calculate this value, or the user can customize it. A value of 2 might be a specific setting for Si.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok!
"l_max": 10, | ||
"n_max": 12, | ||
"l_max": 12, | ||
"n_max": 10, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would increase n_max before increasing l_max. Or we can stick to n_max=l_max=10.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here, I also use Si settings from JCP2020 paper. But I can doublecheck that I didn't mix it up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The default values here, including those mentioned earlier, are intended to be applicable to most systems. Increasing l_max would greatly impact computational efficiency.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, then I'll change it back too.
As discussed here #243 (comment)
Another todo: