On the construction of high-dimensional feature space #67

Open
souno1218 opened this issue Aug 6, 2024 · 5 comments

Comments

@souno1218

souno1218 commented Aug 6, 2024

When constructing a high-dimensional feature space from primary features, the SISSO Fortran code appears to work recursively: it applies the operator set to arbitrary elements of the current feature space and expands the space with the results.

We believe that the code may potentially omit a certain form of the feature.
If that is the case, we would be grateful for any feedback you could provide regarding the omitted features.
We hope you will forgive us if the above question stems from our careless mistakes.

We would like to respectfully bring the following concerns to your attention:
Despite our best efforts, we have been unable to find the features, such as A/(B+(C+D)), A*(B+(B+C)) and so on, which we believe should be included in the feature space when expanding it using the provided data set (numerical values have been generated randomly).
Furthermore, we would be grateful for any insight you could provide regarding the features that may have been omitted during the expansion of the feature space.

train.dat
========================================
name, A, B, C, D, E,
data_0 0.3565 0.5772 0.2283 0.9890 0.1637
data_1 0.5920 0.8401 0.4583 0.0922 0.8506
data_2 ….
========================================

SISSO.in
========================================
nsf= 5
ops='(+)(-)(*)(/)'
fcomplexity=3
funit=(1:5)
nf_sis=200000
========================================

@rouyang2017
Owner

rouyang2017 commented Aug 7, 2024

Hi, this is a detail that we did not note in the paper PRM 2, 083802 (2019).

To see the expressions A/(B+(C+D)) and A*(B+(B+C)), please increase fcomplexity to 4 (fcomplexity=4).
[Attached images: 1723001037878, 1723001037895]

Following the feature generation scheme in PRM 2, 083802 (2019), we have feature spaces Phi of different rungs. Phi_0 contains all the primary features (fcomplexity=0); Phi_1 contains all the features with fcomplexity<=1; Phi_2 contains all the features with fcomplexity<=2 and MOST features with fcomplexity=3; Phi_3 contains features with higher fcomplexity ...

Unfortunately, the expression A/(B+(C+D)) appears in Phi_3, but not in Phi_2, even though its fcomplexity is 3 (3 operators). The reason is that it involves 3 recursive calls (Phi_3), i.e.:

  1. C+D
  2. B+(C+D)
  3. A/(B+(C+D))

This should answer your question.
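
The distinction between operator count and number of recursive calls can be checked with a small script. This is only a sketch and not part of the SISSO code: it assumes binary operators only, and it assumes that the rung of an expression equals the nesting depth of its expression tree, which is how the three steps above read. The function name `complexity_and_rung` is made up for illustration.

```python
import ast


def complexity_and_rung(expr):
    """Return (operator count, nesting depth) of an arithmetic expression.

    Assumption: the operator count plays the role of fcomplexity and the
    nesting depth plays the role of the rung (Phi_n). Only names and
    binary operators are handled.
    """

    def walk(node):
        if isinstance(node, ast.Name):
            # A primary feature: no operators, depth 0 (Phi_0).
            return 0, 0
        if isinstance(node, ast.BinOp):
            left_ops, left_depth = walk(node.left)
            right_ops, right_depth = walk(node.right)
            # One more operator; one level deeper than the deeper operand.
            return left_ops + right_ops + 1, max(left_depth, right_depth) + 1
        raise ValueError("unsupported syntax: " + type(node).__name__)

    return walk(ast.parse(expr, mode="eval").body)


print(complexity_and_rung("A/(B+(C+D))"))   # (3, 3): three operators, needs Phi_3
print(complexity_and_rung("A*(B+(B+C))"))   # (3, 3): three operators, needs Phi_3
print(complexity_and_rung("(A+B)/(C+D)"))   # (3, 2): three operators, fits in Phi_2
```

Under these assumptions, (A+B)/(C+D) is an example of a 3-operator feature that Phi_2 can hold, while the chain-shaped A/(B+(C+D)) has the same operator count but needs one more rung.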

@rouyang2017
Owner

Here I used just 4 samples and a small nf_sis, and so I do not see the A*(B+(B+C)). It can be found by increasing the nf_sis.

@souno1218
Author

souno1218 commented Aug 7, 2024

I'm very grateful for your quick reply.
I think I may now have a better grasp on what Phi_N signifies in the output file.
If I've understood correctly, Phi_3 represents the upper limit, and even with fcomplexity = 7, Phi_4 was not calculated.
Could I just check whether I've understood correctly that in this case, a calculation like A+(B/(C+(D+E))) is difficult in principle?
I'm not proficient in English, so I'm using DeepL. I apologize if I've been impolite or if I've misunderstood.

@rouyang2017
Owner

Yes. Phi_4 and higher rungs require too much memory to be feasible in the current code. Future versions will make Phi_4 possible.

That's a good question. A+(B/(C+(D+E))) seems to be in Phi_4, which requires computing a high-rung feature space. We are planning to work on this in the near future.
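
A rough way to see how operator count and rung relate is to treat each feature as a binary expression tree. The sketch below is an illustration under stated assumptions, not the SISSO implementation: it assumes binary operators only and that a rung-n feature is built from two features of rung n-1 or lower. The helper names `max_ops_for_rung` and `min_rung_for_ops` are hypothetical.

```python
def max_ops_for_rung(rung):
    """Largest operator count reachable at a given rung.

    A fully balanced binary tree of depth `rung` has 2**rung - 1
    internal nodes, i.e. operators.
    """
    return 2 ** rung - 1


def min_rung_for_ops(n_ops):
    """Smallest rung that can hold some expression with n_ops operators.

    This is the minimum height of a binary tree with n_ops internal
    nodes, which equals the bit length of n_ops.
    """
    return n_ops.bit_length()


# Operator counts reachable per rung under these assumptions.
for rung in range(5):
    print(rung, max_ops_for_rung(rung))

# A chain such as A+(B/(C+(D+E))) is the worst case: each of its
# 4 operators adds one nesting level, so it needs rung 4, although
# some other 4-operator shapes already fit in rung 3.
print(min_rung_for_ops(4))
```

Under these assumptions, rung 3 tops out at 7 operators, which agrees with fcomplexity = 7 not going beyond Phi_3 as noted above, while a fully nested chain needs as many rungs as it has operators.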

@souno1218
Author

I'm grateful for your help in clarifying this matter. I'm pleased to see the ongoing progress!
