An implementation of Andrej Karpathy's micrograd
Reimplementing this as a warm-up to implementing it in Rust! Two blog posts accompany this implementation:
This repo was indexed by DeepWiki, which generated documentation for it - take a look at the API reference generated for this tiny codebase :).
To experiment with a very simple binary classifier model that has randomly generated inputs and targets:
```shell
git clone git@github.com:msyvr/micrograd-python.git
cd micrograd-python
python train.py
```
Parameters for step size, number of epochs, activation function, etc., can be updated in `train.py` to see the effect of each on model convergence to the targets.
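For instance, a parameter block near the top of `train.py` might look like the sketch below (the names here are illustrative assumptions, not necessarily the variables the script actually uses):

```python
# Illustrative hyperparameters - the actual variable names in train.py may differ.
STEP_SIZE = 0.05             # gradient-descent learning rate
MAX_EPOCHS = 200             # upper bound on training epochs
ACTIVATION = "tanh"          # hidden-node nonlinearity, e.g. "tanh" or "relu"
LAYER_SIZES = [3, 4, 4, 1]   # input dimension followed by nodes per layer
```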
Model performance can be evaluated as a function of:
- number of layers
- nodes per layer
- loss function
Separately, given a configuration of the above parameters, training efficiency can be evaluated as a function of:
- step size
- (maximum) training epochs
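As a rough sketch of where these two knobs enter a micrograd-style training loop, the snippet below uses placeholder model, loss, and data following Karpathy's micrograd API (`.data`, `.grad`, `.backward()`, `parameters()`), which may not match this repo's exact interface:

```python
def train(model, xs, ys, step_size=0.05, max_epochs=200, tol=1e-3):
    """Plain gradient descent; stops early once the loss falls below tol."""
    for epoch in range(max_epochs):
        # forward pass: mean squared error over the training set
        preds = [model(x) for x in xs]
        loss = sum((p - y) ** 2 for p, y in zip(preds, ys)) * (1.0 / len(ys))

        # backward pass: reset accumulated gradients, then backpropagate
        for p in model.parameters():
            p.grad = 0.0
        loss.backward()

        # parameter update: a larger step_size converges faster but may overshoot
        for p in model.parameters():
            p.data -= step_size * p.grad

        if loss.data < tol:
            break
    return loss
```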
Ideally, validate the following in comparison to outputs using PyTorch:
- confirm that the `Value` structure's methods work as intended
- validate that the forward pass and backpropagation work as intended
- validate that training results closely approximate those obtained with PyTorch
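A simple gradient check is to build the same small expression with this repo's `Value` objects and with PyTorch tensors, then compare the gradients produced by backpropagation. The `Value` import path below is an assumption; adjust it to wherever the class lives in this repo:

```python
import torch
from engine import Value  # assumed module name; adjust to this repo's layout

# micrograd-style expression
a, b = Value(2.0), Value(-3.0)
c = a * b + b ** 2
c.backward()

# the same expression in PyTorch
at = torch.tensor(2.0, requires_grad=True)
bt = torch.tensor(-3.0, requires_grad=True)
ct = at * bt + bt ** 2
ct.backward()

# gradients should agree to within floating-point tolerance
assert abs(a.grad - at.grad.item()) < 1e-6
assert abs(b.grad - bt.grad.item()) < 1e-6
```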
Evals might be split into two categories:
- For given training data, model architecture, loss function, and activation function, identify optimal combinations of step size and number of epochs.
- Investigate the relationship between the quality of results and each of { training data, model architecture, loss function, activation function }.
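For the first category, a small grid search over step size and epoch budget is usually sufficient at this scale. In the sketch below, `build_model` and `run_training` are hypothetical stand-ins for whatever `train.py` actually exposes:

```python
import itertools

step_sizes = [0.01, 0.05, 0.1]
epoch_budgets = [50, 100, 200]

results = {}
for step_size, max_epochs in itertools.product(step_sizes, epoch_budgets):
    model = build_model()  # hypothetical: fresh model per run for a fair comparison
    final_loss = run_training(model, step_size=step_size, max_epochs=max_epochs)  # hypothetical
    results[(step_size, max_epochs)] = final_loss

best = min(results, key=results.get)
print(f"best (step_size, max_epochs): {best} with final loss {results[best]:.4f}")
```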