Skip to content

MiniSom wrapper for seamless integration with SciKit-learn library.

License

Notifications You must be signed in to change notification settings

koloiyolo/sklearn_minisom

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

11 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

PyPI version fury.io Downloads

sklearn_minisom

MiniSom is Numpy based implementation of the Self Organizing Maps (SOM). SOM is a type of Artificial Neural Network able to convert complex, nonlinear statistical relationships between high-dimensional data items into simple geometric relationships on a low-dimensional display. Minisom is designed to allow researchers to easily build on top of it and to give students the ability to quickly grasp its details.

This is MiniSom library wrapper for seamless integration with SciKit-learn package.

Credits to:

This wrapper aims to integrate MiniSOM library into SciKit-learn ecosystem. It enables easy integration with Scikit-learn pipelines and tools like GridSearchCV for hyperparameter optimization. It also provides easy, scikit-learn like API for developers to interact with while aiming to sustain high flexibility and capabilities of MiniSom library.

Example clustering datasets from Comparation of clustering algorithms on SciKit-learn

image

This is separate project and not part of MiniSom library due to creator's of the original project aim to keep their as lightweight as possible.

Table of content

Installation

Just use pip:

pip install sklearn_minisom

Dependencies:

  • minisom>=2.3.3
  • scikit-learn
  • numpy
  • scipy

Examples

Just use it like any other scikit-learn cluster algorithm.

Let's start with importing required libraries and dataset.

from sklearn.datasets import load_wine
from sklearn_minisom import MiniSOM
from sklearn.preprocessing import StandardScaler

data = load_wine()
X = data.data
X = StandardScaler().fit_transform(X)

You can use fit and predict separately.

som = MiniSOM(3, 1, random_seed=40)
som.fit(X)
y = som.predict(X)

Or simply use convenient function.

som = MiniSOM(3, 1, random_seed=40)
y = som.fit_predict(X)

Alternatively you can also use SciKit-learn pipelines.

from sklearn.pipeline import Pipeline

pipeline = ([
    ('scaler', StandardScaler()),
    ('classifier', MiniSOM(3, 1, random_seed=40))
])

y = pipeline.fit_predict(X)

Now let's take a look at what we've got.

image-1

Overview

Class Description
MiniSOM() Main class of the library, used to initialize the Self-Organizing Maps algorithm
Parameters Description
x : int x dimension of the SOM.
y : int y dimension of the SOM.
sigma : float Spread of the neighborhood function.
learning_rate : float Initial learning rate.
num_iteration : int Number of iterations.
decay_function : string or callable Function that reduces learning_rate at each iteration.
neighborhood_function : string Function that weights the neighborhood of a position in the map.
topology : string Topology of the map.
activation_distance : string Distance used to activate the map.
random_seed : int y dimension of the SOM.
verbose : bool If True the status of the training will be printed each time the weights are updated.
use_epochs : bool If True the SOM will be trained for num_iteration epochs.
fixed_points : dict A dictionary k : (c_1, c_2), that will force the training algorithm to use the neuron with coordinates (c_1, c_2) as winner for the sample k instead of the best matching unit.
sigma_decay_function : string Function that reduces sigma at each iteration.
Attributes Description
labels_ : ndarray of shape (n_samples,) Labels of each point.
weights_ : ndarray of shape (grid_size_x, grid_size_y, feature_size) Returns the weights of the neural network.
n_features_in_ : int Number of features seen during fit.
inertia_ : float Sum of squared distances of samples to their closest neuron weight vector.
Functions Description
fit(X) Initializes SOM algorithm from minisom library and fits on it data matrix.
transform(X) Transform the data by finding the best matching unit (BMU) for each sample. Returns the BMU coordinates for each input sample.
predict(X) Predict the cluster assignment (BMU) for each sample. Returns the grid position (BMU) for each sample as the cluster assignment.
fit_transform(X) Fit the SOM and return the transformed BMU coordinates for each input sample.
fit_predict(X) Fit the SOM and return the predicted cluster assignments.
get_params(deep=True) Get parameters of the estimator.
set_params(**params) Set the parameters of this estimator.

About

MiniSom wrapper for seamless integration with SciKit-learn library.

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Languages