Description
Is your feature request related to a problem? Please describe.
This is not a problem per se, but AgglomerativeClustering and SpectralClustering in sklearn.cluster are not always a good choice, especially for large datasets, because of how they scale (see the benchmark in the HDBSCAN docs). For example, I usually use genieclust and would like to use it instead of the sklearn clusterers, which is not possible in the current implementation.
Describe the solution you'd like
A Clusterer base class that interfaces both sklearn and other types of clusterers through inheritance could be implemented, and an instance of it (or the class itself) could be passed as an argument when splitting. Alternatively, this could be handled with some if-else statements in datasail.cluster.clustering.additional_clustering(), but that might be less elegant.
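To make the base-class idea concrete, here is a rough sketch of what I have in mind. Every name in it (Clusterer, FitPredictClusterer, RoundRobinEstimator, split_with) is hypothetical and not part of the current datasail API. The toy estimator only stands in for a real one so the snippet runs without extra dependencies.

```python
from abc import ABC, abstractmethod
from typing import Any, Callable, List, Sequence

Matrix = Sequence[Sequence[float]]


class Clusterer(ABC):
    """Hypothetical common interface for all clustering backends."""

    @abstractmethod
    def cluster(self, matrix: Matrix, n_clusters: int) -> List[int]:
        """Return one integer cluster label per row of `matrix`."""


class FitPredictClusterer(Clusterer):
    """Adapter for any estimator with a scikit-learn-style `fit_predict`."""

    def __init__(self, make_estimator: Callable[[int], Any]) -> None:
        # make_estimator receives n_clusters and returns a fresh estimator.
        self.make_estimator = make_estimator

    def cluster(self, matrix: Matrix, n_clusters: int) -> List[int]:
        estimator = self.make_estimator(n_clusters)
        return [int(label) for label in estimator.fit_predict(matrix)]


class RoundRobinEstimator:
    """Toy stand-in for a real estimator so the sketch has no dependencies."""

    def __init__(self, n_clusters: int) -> None:
        self.n_clusters = n_clusters

    def fit_predict(self, X: Matrix) -> List[int]:
        # Assign row i to cluster i mod n_clusters; not a real algorithm.
        return [i % self.n_clusters for i in range(len(X))]


def split_with(clusterer: Clusterer, matrix: Matrix, n_clusters: int) -> List[int]:
    """Hypothetical call site: the splitting code only sees `Clusterer`."""
    return clusterer.cluster(matrix, n_clusters)


clusterer = FitPredictClusterer(RoundRobinEstimator)
matrix = [
    [0.0, 0.9, 0.8, 0.1],
    [0.9, 0.0, 0.2, 0.7],
    [0.8, 0.2, 0.0, 0.6],
    [0.1, 0.7, 0.6, 0.0],
]
labels = split_with(clusterer, matrix, n_clusters=2)
print(labels)
```

With this shape, the sklearn clusterers and something like genieclust could presumably go through the same adapter, as long as they expose a scikit-learn-style fit_predict, and anything else could subclass Clusterer directly.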
Describe alternatives you've considered
Alternatively, the sklearn clusterers could be replaced with ones from the fastclust package.