grad

The first thing anybody doing their bachelor's or master's in machine learning does is implement SGD.

This is my implementation of the basic stuff you do when you start learning machine learning.

Notebooks

For practical examples, see notebook.ipynb.

Installation

Just install the requirements with pip install -r requirements.txt.

Usage

The project consists of the following components:

  • grad.py: Contains my implementation of a ComputeGraph that builds networks, compiles them, and then evaluates them.
  • functions.py: Contains ready-made functions commonly used in ML (linear mx + b, sigmoid, Gaussian); see the sketch after this list.
  • layers.py: Contains a ready-made dense layer implementation.
  • losses.py: Contains ready-made loss functions (MSE, CrossEntropy).
  • optimizers.py: Contains ready-made optimizers (SGD).
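
As a rough illustration of what these building blocks compute, here is a minimal sketch of a sigmoid activation, the MSE loss, and a plain SGD update step. It is an illustrative stand-in under assumed names and shapes, not the repository's actual code:

import numpy as np

def sigmoid(x):
    # squashes inputs into (0, 1); its derivative is s * (1 - s)
    return 1.0 / (1.0 + np.exp(-x))

def mse(y_true, y_pred):
    # mean squared error; the gradient w.r.t. y_pred is 2 * (y_pred - y_true) / n
    return np.mean((y_true - y_pred) ** 2)

def sgd_step(param, grad, lr=0.001):
    # vanilla SGD: step the parameter against the gradient
    return param - lr * grad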

Computation graph

The ComputeGraph is not only used to formalize how a network is computed; it can also render the network as a graph.

Take a look at an example graph of a network: graph.png

The arrows flow from the variable x to each node where its value is used. The arrows pointing back to the nodes or to x are the gradients.
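
To make this flow concrete, here is a tiny self-contained sketch of a single compute-graph node. It illustrates the general idea only and is not the actual grad.py API: forward caches its inputs and passes the value along the arrows, while backward sends gradients back to the inputs via the chain rule.

import numpy as np

class Mul:
    # toy compute-graph node computing z = a * b
    def forward(self, a, b):
        self.a, self.b = a, b   # cache inputs for the backward pass
        return a * b

    def backward(self, dz):
        # chain rule: dz/da = b, dz/db = a
        return dz * self.b, dz * self.a

node = Mul()
z = node.forward(np.array(3.0), np.array(4.0))  # value flows forward: z = 12.0
da, db = node.backward(1.0)                     # gradients flow back: da = 4.0, db = 3.0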

Example

Let's define a simple dense network for the Iris dataset. It takes a 4-dimensional feature vector, transforms it to 10 dimensions, then back down to 3 so that we can predict the logits of our 3 target classes:

# Imports assume the module layout listed above; exact paths may differ.
from grad import Placeholder, ComputeGraph
from layers import dense
from functions import Sigmoid

x = Placeholder('x')
net = dense(x, 4, 10, activation=Sigmoid)
net = dense(net, 10, 3, activation=Sigmoid)

The network can be trained by binding a target variable y to our ground truth, then minimizing an MSE loss with the SGD optimizer:

from losses import mse        # again assuming the module layout above
from optimizers import SGD

y = Placeholder('y')
loss = mse(y, net)
train = ComputeGraph().compile(loss, optimizer=SGD(lr=0.001))

Then we train the network for 10k iterations:

import numpy as np

for i in range(10):
    # average the loss over 1000 single training steps
    avg_loss = np.mean([train.run({x: iris_x, y: iris_y}) for _ in range(1000)])
    print((i + 1) * 1000, 'it, loss:', avg_loss)

We can observe the output loss decreasing:

1000 it, loss: 0.09217271446448472
2000 it, loss: 0.03474766448223313
3000 it, loss: 0.023682791951110133
4000 it, loss: 0.019925369264752663
5000 it, loss: 0.017983636937195995
6000 it, loss: 0.01675917131624988
7000 it, loss: 0.015897457591654455
8000 it, loss: 0.015248857488413243
9000 it, loss: 0.014738584255826649
10000 it, loss: 0.014324533454920541

Finally, we evaluate on a test dataset by feeding unseen data into the trunk of the network (without the training head):

predict = ComputeGraph().compile(net)   # compile the trunk only, without the loss
test_acc = np.mean(np.argmax(predict.run({x: iris_x_test}), -1) == np.argmax(iris_y_test, -1))
print('acc:', round(test_acc * 100, 2), '%')
# acc: 100.0 %
