Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MRG] Reorganise under-sampling methods #277

Merged
merged 8 commits into from
Apr 29, 2017

Conversation

glemaitre
Copy link
Member

Reference Issue

Fixes #275

What does this implement/fix? Explain your changes.

Any other comments?

@codecov
Copy link

codecov bot commented Mar 29, 2017

Codecov Report

Merging #277 into master will increase coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #277      +/-   ##
==========================================
+ Coverage   98.18%   98.19%   +<.01%     
==========================================
  Files          60       62       +2     
  Lines        3531     3544      +13     
==========================================
+ Hits         3467     3480      +13     
  Misses         64       64
Impacted Files Coverage Δ
...election/tests/test_instance_hardness_threshold.py 100% <ø> (ø)
...ampling/prototype_selection/tests/test_nearmiss.py 100% <ø> (ø)
...n/tests/test_repeated_edited_nearest_neighbours.py 100% <ø> (ø)
...election/tests/test_condensed_nearest_neighbour.py 100% <ø> (ø)
..._selection/tests/test_edited_nearest_neighbours.py 100% <ø> (ø)
..._sampling/prototype_selection/tests/test_allknn.py 100% <ø> (ø)
...otype_selection/tests/test_random_under_sampler.py 100% <ø> (ø)
...ling/prototype_selection/tests/test_tomek_links.py 100% <ø> (ø)
...totype_selection/tests/test_one_sided_selection.py 100% <ø> (ø)
...election/tests/test_neighbourhood_cleaning_rule.py 100% <ø> (ø)
... and 15 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3c54ea6...caed659. Read the comment docs.

@glemaitre glemaitre changed the title [WIP] Reorganise under-sampling methods [MRG] Reorganise under-sampling methods Mar 29, 2017
@glemaitre
Copy link
Member Author

@chkoar Here it comes

@chkoar
Copy link
Member

chkoar commented Apr 2, 2017

Could we discard these?

image

The :mod:`imblearn.under_sampling.prototype_generation` submodule
contains the method in which a new samples are generated such that the
dataset become more balanced.
"""
Copy link
Member

@chkoar chkoar Apr 2, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would write something like that: The :mod:imblearn.under_sampling.prototype_generation submodule
contains methods that generate new samples in order to balance the dataset.

"""
The :mod:`imblearn.under_sampling.prototype_selection` submodule
contains the method in which a subset of the original data a selected
to create a new dataset.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here

from . import prototype_generation
from .prototype_generation import ClusterCentroids

from . import prototype_selection
Copy link
Member

@chkoar chkoar Apr 2, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And this one?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, it is bringing the file from the each level undernith to under_sampling level

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you add this in order to be able to perfrom the following import?

from imblearn.under_sampling.prototype_generation import ClusterCentroids

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no you can import from imblearn.under_sampling import ClusterCentroids

from .edited_nearest_neighbours import RepeatedEditedNearestNeighbours
from .edited_nearest_neighbours import AllKNN
from .instance_hardness_threshold import InstanceHardnessThreshold
from . import prototype_generation
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this line?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep

@glemaitre
Copy link
Member Author

glemaitre commented Apr 6, 2017

Could we discard these?

I kinda like it. I think that it make sense to make it appears in the doc. This is a bit the same with the cluster metrics in scikit-learn. Untilt example and the API and the user guide show you the import, people can get that they are slightly different semantically.

@glemaitre
Copy link
Member Author

@chkoar are you in favor to merge this one

@chkoar
Copy link
Member

chkoar commented Apr 28, 2017

@glemaitre lets merge this

@glemaitre glemaitre merged commit 5b765e3 into scikit-learn-contrib:master Apr 29, 2017
glemaitre added a commit to glemaitre/imbalanced-learn that referenced this pull request Jun 15, 2017
* MAINT change organisation under-sampling

* FIX conflict api doc

* DOC add entry in whats new

* FIX indent in the documentation

* FIX add current module for linking the documentation

* FIX addres christos comments

* ENH Move the tests

* FIX remove useless import
glemaitre added a commit to glemaitre/imbalanced-learn that referenced this pull request Jun 15, 2017
* MAINT change organisation under-sampling

* FIX conflict api doc

* DOC add entry in whats new

* FIX indent in the documentation

* FIX add current module for linking the documentation

* FIX addres christos comments

* ENH Move the tests

* FIX remove useless import
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

RFC Reorganisation of the module
2 participants