Description
An essential component of our package is a submodule that computes dissimilarity, distance, diversity, and similarity measures, where we normally assume diversity = 1 - similarity.
A good summary of these measures is Table 2 in Drug Dev. Res., 72(1):74-84, 2011. Since this is a public repo, I am not going to share a screenshot of it here.
The Tanimoto coefficient is the classic, gold-standard similarity metric for molecular fingerprints, and we should support it. However, it has been found to favor small molecules when used for molecule subset selection, and a modification was proposed to address this. My guess is that we should make this the default for molecular fingerprint inputs.
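For concreteness, here is a minimal sketch of the classic Tanimoto coefficient on binary fingerprints, together with the corresponding dissimilarity under the diversity = 1 - similarity convention above. The function names are hypothetical and not part of the package, and returning 1.0 for two all-zero fingerprints is an assumption made only for this example. The modified version is not shown.

```python
def tanimoto_similarity(fp_a, fp_b):
    """Classic Tanimoto similarity c / (a + b - c) for two binary fingerprints.

    a and b are the numbers of on-bits in each fingerprint and c is the
    number of on-bits they share. Names here are illustrative only.
    """
    if len(fp_a) != len(fp_b):
        raise ValueError("fingerprints must have the same length")
    a = sum(1 for x in fp_a if x)
    b = sum(1 for x in fp_b if x)
    c = sum(1 for x, y in zip(fp_a, fp_b) if x and y)
    union = a + b - c
    if union == 0:
        # Assumption for this sketch: two empty fingerprints count as identical.
        return 1.0
    return c / union


def tanimoto_dissimilarity(fp_a, fp_b):
    """Dissimilarity under the convention diversity = 1 - similarity."""
    return 1.0 - tanimoto_similarity(fp_a, fp_b)


fp_1 = [1, 1, 0, 1, 0, 0]
fp_2 = [1, 0, 0, 1, 1, 0]
print(tanimoto_similarity(fp_1, fp_2))     # 2 shared bits / 4 bits in the union
print(tanimoto_dissimilarity(fp_1, fp_2))
```

Plain Python lists keep the sketch dependency-free; an actual implementation would presumably work on NumPy arrays so that whole distance matrices can be computed at once.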
I will keep adding new items here. Any ideas would be appreciated.
@PaulWAyers @JuansaCollins @RichRick1 @Khaleeh @alnaba1