HVDM in knn #10

@dourrr


Hello!

I'm trying to use the HVDM metric with KNN from sklearn, but it keeps raising errors during fitting. I've added print statements inside the code to understand what is going on, but that only raises new questions.

I'm working on the "heart" dataset and doing feature selection: I want to see how the model's accuracy changes as the number of features increases.
So I initialize the metric as follows:

hvdm_metric = hvdm(np.concatenate((X[:,features_subset],np.array([Y]).T),axis=1),[len(features_subset)], ind_map, nan_equivalents = [nan_eqv])

(I first concatenate the data and the output, since HVDM needs the output in the data; ind_map is just the mapping of the categorical features to the feature subset.)
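To make the index bookkeeping concrete, here is a rough sketch of the mapping described above. The column indices are made up for illustration (they are not the real "heart" columns), and `map_categorical_indices` is a hypothetical helper, not something provided by the package.

```python
def map_categorical_indices(features_subset, categorical_full):
    """Return the positions, within the subset, of the categorical columns."""
    categorical = set(categorical_full)
    return [pos for pos, col in enumerate(features_subset) if col in categorical]


# Made-up indices, for illustration only.
features_subset = [2, 5, 8, 11]      # columns kept by feature selection
categorical_full = [1, 2, 6, 8, 10]  # categorical columns in the full dataset

ind_map = map_categorical_indices(features_subset, categorical_full)

# After concatenating Y as the last column, its index equals the subset size.
y_ix = [len(features_subset)]

print(ind_map, y_ix)
```

With this layout, the categorical indices refer to positions in the reduced array rather than in the full dataset, and the output column sits right after the selected features.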
Then I get an error when calling nn.fit(X_T[:,features_subset],Y_T): it says it tries to divide by 0. From everything I had tried before, I thought it came from a numerical feature that was wrongly marked as categorical, but unfortunately that is not the cause. It tries to compute the distance from each of my outputs to the mean of the outputs, but the output is categorical, so that's really weird. I guess it comes from the fact that my output Y is included in the data when I initialize the metric...
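One thing that may be worth ruling out: in the standard definition of HVDM, the difference on a numerical attribute is divided by four standard deviations of that attribute, so a column treated as numerical that happens to be constant in the data gives a zero denominator. Below is a minimal sketch of such a check, using only the standard library and toy rows. `zero_spread_columns` is a hypothetical helper, and whether this is the actual cause of the error here is only a guess.

```python
from statistics import pstdev


def zero_spread_columns(rows, numeric_columns):
    """Return the numeric column indices whose standard deviation is zero."""
    flagged = []
    for col in numeric_columns:
        values = [row[col] for row in rows]
        if pstdev(values) == 0:
            flagged.append(col)
    return flagged


# Toy rows, for illustration only: column 1 is constant.
rows = [
    [63.0, 1.0, 0.0],
    [41.0, 1.0, 1.0],
    [57.0, 1.0, 0.0],
]
numeric_columns = [0, 1, 2]  # every column not listed as categorical

constant = zero_spread_columns(rows, numeric_columns)
print(constant)
```

For a NumPy array, the rows could be obtained with `X.tolist()`. Any column flagged here would need to be dropped or declared categorical before a four-standard-deviation normalisation can be applied to it.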

Do you have a working example of using HVDM in KNN? Or any idea how to avoid this error?
Thanks a lot for your package and for introducing me to these metrics!
