Similarity search using bingo-elastic #1196

Open
projectssimm opened this issue Jul 26, 2023 · 1 comment

Comments

@projectssimm

Hi guys,
I've been using bingo-elastic (Python) to find compounds similar to a given query compound, but the results confused me. Here is the situation.
I ran a similarity search with the following code:
# Load the query structure
compound = indigo.loadMoleculeFromFile("CH1.mol")
# Tanimoto similarity match with a minimum threshold of 0.5
sim = TanimotoSimilarityMatch(compound, 0.5)
similar_records = repository.filter(query_subject=[sim])
# Print each hit's SMILES and its similarity to the query
for record in similar_records:
    print(record.as_indigo_object(indigo).smiles(), indigo.similarity(compound, record.as_indigo_object(indigo)))
I wanted the most similar compounds together with their similarity values, but I got the following output:
C:...\python.exe D:...\test.py
CC(=O)C12OC1CC1C3CC=C4CC(O)CCC4(C)C3CCC21C 0.5970149040222168
O=C1CCCCCCCCCOCCCCCO1 0.38333332538604736
CCCC/C=C\O/C=C/C=C\CCCCCCCC(=O)[O-] 0.3636363744735718
COc1ccc(OC)c(/C=C2\Oc3c(C)c4c(cc3C\2=O)CNH+CO4)c1 0.25641027092933655
O=C1N=C([O-])NC(=O)C1(Cc1ccc2c(OCO2)c1)CN1CC2C[n]3c(=O)cccc3C(C2)C1 0.16249999403953552
COc1ccc(C(=O)COc2ccc3c(O/C(=C\c4cc(Br)ccc4OC)/C3=O)c2C)cc1 0.22727273404598236
COc1cc(/C=C2/Oc3c(CN4CCN(c5cccc[nH+]5)CC4)c([O-])cc(C)c3C/2=O)cc(OC)c1OC 0.21167883276939392
COc1ccc(/C=C2\C(=O)N=C([O-])NC\2=O)cc1OC 0.1818181872367859
CCOC(=O)c1c(C)oc2cc(Br)c(OC(=O)c3ccccc3Cl)cc12 0.15267175436019897
Cc1c(OCC#N)ccc2c1O/C(=C\c1cccc(Br)c1)/C2=O 0.2232142835855484

  1. As far as I understand, I set the minimum similarity threshold (0.5 in this case), yet I got many compounds with similarities below that threshold. What happened? Did I do something wrong?
  2. How can I retrieve the similar compounds already ordered by similarity? That is, can I pass some parameter to TanimotoSimilarityMatch, or use some other function, so that the results come back sorted by similarity (descending or ascending)?
    Best,
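
For reference, the behaviour asked for in both questions (drop hits below the threshold, then order by similarity) can be approximated on the client side. The following is only a minimal sketch of such a workaround, not a bingo-elastic feature: `rank_by_similarity` is a hypothetical helper name, and the scores are three of the values copied from the output above rather than recomputed. It does not explain why the repository returned hits below 0.5.

```python
from typing import Callable, Iterable, List, Tuple, TypeVar

T = TypeVar("T")


def rank_by_similarity(
    items: Iterable[T],
    score: Callable[[T], float],
    threshold: float,
) -> List[Tuple[float, T]]:
    """Hypothetical helper: score each item once, drop items whose score is
    below `threshold`, and return (score, item) pairs sorted by score,
    highest first."""
    scored = [(score(item), item) for item in items]
    kept = [pair for pair in scored if pair[0] >= threshold]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return kept


# Three SMILES and their similarity values, copied from the output above.
observed = {
    "CC(=O)C12OC1CC1C3CC=C4CC(O)CCC4(C)C3CCC21C": 0.5970149040222168,
    "O=C1CCCCCCCCCOCCCCCO1": 0.38333332538604736,
    "CCOC(=O)c1c(C)oc2cc(Br)c(OC(=O)c3ccccc3Cl)cc12": 0.15267175436019897,
}

# Keep only hits at or above 0.5, most similar first.
ranked = rank_by_similarity(observed, observed.get, 0.5)
for value, smiles in ranked:
    print(f"{value:.4f}  {smiles}")
```

With the objects from the code above, the same helper could presumably be called as `rank_by_similarity(similar_records, lambda r: indigo.similarity(compound, r.as_indigo_object(indigo)), 0.5)`, which reuses only the calls already shown in this post.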
@PythonZhao

Looking forward to resolving this issue
