Syllabus | Slides and Assignments | Project | Lecturer

K Nearest Neighbor

Make sure your repo is up-to-date

Assignment code may be modified during the semester, so pull from this repo first and overwrite the NearestNeighbor folder in your own repo with the updated one.

Build your own kNN classifier (with continuous input)

Implement my_KNN.fit() function in my_KNN.py

Inputs:

X: pd.DataFrame, independent variables, each value is a continuous number of float type
y: list, np.array or pd.Series, dependent variable, each value is a category of int or str type
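Because kNN does no training beyond remembering the data, fit() can be very small. The following is a minimal sketch under assumptions, not the reference solution: the class name SketchKNN, the n_neighbors parameter, and the attribute names X_train and y_train are hypothetical, and only classes_ is a name taken from this assignment.

```python
import numpy as np
import pandas as pd


class SketchKNN:
    """Hypothetical stand-in for my_KNN; names and defaults are assumptions."""

    def __init__(self, n_neighbors=5):
        self.n_neighbors = n_neighbors

    def fit(self, X, y):
        # kNN is a lazy learner: fitting only memorizes the training set.
        self.X_train = X.to_numpy(dtype=float)
        self.y_train = np.asarray(y)
        # Distinct class labels, kept in a fixed (sorted) order so they can
        # later serve as the column names of the predict_proba output.
        self.classes_ = sorted(set(self.y_train.tolist()))
        return self


X = pd.DataFrame({"a": [0.0, 1.0, 5.0], "b": [0.0, 1.0, 5.0]})
model = SketchKNN(n_neighbors=1).fit(X, ["low", "low", "high"])
print(model.classes_)  # ['high', 'low']
```

Storing the labels through np.asarray means a list, an np.array, or a pd.Series are all handled the same way.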

Implement my_KNN.predict() function in my_KNN.py

Input:

X: pd.DataFrame, independent variables, each value is a continuous number of float type

Output:

Predicted categories of each input data point. List of str or int.
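One way to produce these predictions is a brute-force search followed by a majority vote. The sketch below assumes Euclidean distance and breaks ties by first occurrence; the assignment text does not specify either choice, and the function names nearest_labels and predict are hypothetical helpers rather than part of my_KNN.py.

```python
from collections import Counter

import numpy as np
import pandas as pd


def nearest_labels(X_train, y_train, x, k):
    """Return the labels of the k training rows closest to x (Euclidean)."""
    distances = np.sqrt(((X_train - x) ** 2).sum(axis=1))
    # A stable sort keeps the earlier training row first when distances tie.
    nearest = np.argsort(distances, kind="stable")[:k]
    return [y_train[i] for i in nearest]


def predict(X_train, y_train, X_test, k=5):
    """Majority vote over the k nearest neighbors of each test row."""
    predictions = []
    for x in X_test.to_numpy(dtype=float):
        votes = Counter(nearest_labels(X_train, y_train, x, k))
        # most_common(1) returns the label with the highest count.
        predictions.append(votes.most_common(1)[0][0])
    return predictions


X_train = np.array([[0.0], [1.0], [2.0], [10.0], [11.0]])
y_train = ["a", "a", "a", "b", "b"]
X_test = pd.DataFrame({"f": [0.5, 10.5]})
print(predict(X_train, y_train, X_test, k=3))  # ['a', 'b']
```

For the second test point the three nearest labels are b, b, a, so the vote returns b even though one neighbor disagrees.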

Implement my_KNN.predict_proba() function in my_KNN.py

Input:

X: pd.DataFrame, independent variables, each value is a continuous number of float type

Output:

Prediction probabilities of each input data point belonging to each category. pd.DataFrame(list of prob, columns = self.classes_).

Example:

self.classes_ = {"2", "1"}
the 5 nearest neighbors for the test data point have labels of {"1":4, "2":1}
then the prob for that data point is {"1": 4/5, "2": 1/5}
return probs = pd.DataFrame(list of prob, columns = self.classes_)
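The example above can be worked through in a few lines. This is a sketch under assumptions: the variable names classes and neighbor_labels are hypothetical, and the classes are held in a list rather than a set so that the column order of the DataFrame is fixed.

```python
from collections import Counter

import pandas as pd

classes = ["1", "2"]  # assumed fixed column order
neighbor_labels = ["1", "1", "2", "1", "1"]  # labels of the 5 nearest neighbors

# Count each label among the neighbors, then divide by k to get a probability.
counts = Counter(neighbor_labels)
row = {label: counts[label] / len(neighbor_labels) for label in classes}

# One dict per test data point; here there is a single point.
probs = pd.DataFrame([row], columns=classes)
print(probs)
```

A Counter returns 0 for labels that never appear among the neighbors, so classes with no votes still get a column with probability 0.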

Test my_KNN classifier with A5.py

It is expected to produce the same results as sklearn.neighbors.KNeighborsClassifier with algorithm = 'brute'.
Expected output:

(base) zhe@Zhe-Yus-MacBook-Pro NearestNeighbor % python A5.py 
Iris-setosa     1.000000
Iris-setosa     1.000000
Iris-setosa     1.000000
Iris-setosa     1.000000
Iris-setosa     1.000000
Iris-versicolor 1.000000
Iris-versicolor 1.000000
Iris-versicolor 1.000000
Iris-versicolor 1.000000
Iris-versicolor 1.000000
Iris-virginica  1.000000
Iris-virginica  0.600000
Iris-virginica  1.000000
Iris-virginica  1.000000
Iris-virginica  0.800000

Do not forget to push your local changes to the GitHub server.

Grading Policy

Importing additional packages such as sklearn is not allowed.
4 (out of 7) points will be received if A5.py runs successfully and makes predictions.
The remaining 3 points will be given based on the percentage of predictions that match those of the correct implementation.

Hint

If my_KNN.py is too difficult to implement, you can try to complete my_KNN_hint.py instead.
Then, remember to rename it to my_KNN.py before submitting.
