Add forward forward app (nebuly-ai#126)
* add forward forward app

* implement comments

* import nebullvm as a dependency in apps

* add readme

* Add figs

* add architecture description

* rename title

* Add links and change image format

* Updated readme and added image

* rename matrixmaster and modify readme

Co-authored-by: diegofiori <d.fiori@nebuly.ai>
Co-authored-by: Nebuly <83510798+nebuly-ai@users.noreply.github.com>
3 people authored Dec 20, 2022
1 parent 8bbe607 commit 5fb48f6
Showing 20 changed files with 1,685 additions and 14 deletions.
7 changes: 4 additions & 3 deletions README.md
@@ -27,10 +27,11 @@ Achieve sub-10ms response time for any AI application, including generative and


- [x] [Speedster](https://github.com/nebuly-ai/nebullvm/blob/main/apps/accelerate/speedster): Automatically apply SOTA optimization techniques to achieve the maximum inference speed-up on your hardware.
- [x] [Forward-Forward](https://github.com/nebuly-ai/nebullvm/tree/main/apps/accelerate/forward_forward): Test the performance of the Forward-Forward algorithm in PyTorch.
- [ ] [OpenAlphaTensor](https://github.com/nebuly-ai/nebullvm/tree/main/apps/accelerate/open_alpha_tensor): Boost your DL model's performance with OpenAlphaTensor's custom-generated matrix multiplication algorithms (AlphaTensor open-source).
- [ ] [LargeSpeedster](https://github.com/nebuly-ai/nebullvm/blob/main/apps/accelerate/large_speedster): Automatically apply SOTA optimization techniques on large AI models to achieve the maximum acceleration on your hardware.
- [ ] [CloudSurfer](https://github.com/nebuly-ai/nebullvm/blob/main/apps/accelerate/cloud_surfer): Discover the optimal inference hardware and cloud platform to run an optimized version of your AI model.
- [ ] [OptiMate](https://github.com/nebuly-ai/nebullvm/blob/main/apps/accelerate/optimate): Interactive tool guiding savvy users in achieving the best inference performance out of a given model / hardware setup.

## Maximize Apps
Make your Kubernetes GPU infrastructure efficient. Simplify cluster management, maximize hardware utilization and minimize costs.
@@ -45,7 +45,7 @@ Don’t settle on generic AI-models. Extract domain-specific knowledge from larg
## Simulate Apps
The time for trial and error is over. Simulate the performances of large models on different computing architectures to reduce time-to-market, maximize accuracy and minimize costs.
- [ ] [Simulinf](https://github.com/nebuly-ai/nebullvm/blob/main/apps/simulate/simulinf): Simulate inference performances of your AI model on different hardware and cloud platforms.
- [ ] [TrainingSim](https://github.com/nebuly-ai/nebullvm/blob/main/apps/simulate/training_sim): Easily simulate and optimize the training of large AI models on a distributed infrastructure.


Couldn't find the optimization app you were looking for? Please open an issue or contact us at info@nebuly.ai and we will be happy to develop it together.
80 changes: 80 additions & 0 deletions apps/accelerate/forward_forward/README.md
@@ -0,0 +1,80 @@
# Forward-Forward Algorithm App

This app implements a complete open-source version of [Geoffrey Hinton's Forward-Forward](https://www.cs.toronto.edu/~hinton/FFA13.pdf) algorithm, an alternative to backpropagation.

The Forward-Forward algorithm is a method for training deep neural networks that replaces backpropagation's forward and backward passes with two forward passes: one with positive (i.e., real) data and the other with negative data that could be generated by the network itself.

Unlike backpropagation, forward-forward does not require computing the gradient of the loss function with respect to all the network parameters. Instead, each optimization step is performed locally, and the weights of each layer can be updated immediately after the layer has performed its forward pass.
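
To make the local-update idea concrete, here is a minimal sketch of how a single fully connected layer could be trained with the forward-forward rule. It is illustrative only and does not reproduce the app's internal modules; the `FFLayer` class and its softplus-based loss follow the goodness formulation in Hinton's paper (goodness = sum of squared activations, compared against a threshold `theta`).

```python
import torch
import torch.nn.functional as F


class FFLayer(torch.nn.Module):
    """Illustrative sketch of a single forward-forward layer (not the app's code)."""

    def __init__(self, in_features: int, out_features: int, lr: float = 0.03, theta: float = 2.0):
        super().__init__()
        self.linear = torch.nn.Linear(in_features, out_features)
        self.opt = torch.optim.Adam(self.linear.parameters(), lr=lr)
        self.theta = theta  # goodness threshold

    def forward(self, x):
        # Normalize the input so only the orientation of the activity vector
        # is passed on, then apply the linear layer and a ReLU.
        x = x / (x.norm(dim=1, keepdim=True) + 1e-8)
        return torch.relu(self.linear(x))

    def train_step(self, x_pos, x_neg):
        # Goodness = sum of squared activations; push it above `theta` for
        # positive data and below `theta` for negative data.
        g_pos = self.forward(x_pos).pow(2).sum(dim=1)
        g_neg = self.forward(x_neg).pow(2).sum(dim=1)
        loss = (
            F.softplus(-(g_pos - self.theta)).mean()
            + F.softplus(g_neg - self.theta).mean()
        )
        self.opt.zero_grad()
        loss.backward()  # the gradient never leaves this layer
        self.opt.step()
        # Detach so the next layer trains on this layer's output without
        # backpropagating through it.
        return self.forward(x_pos).detach(), self.forward(x_neg).detach()
```

Stacking several such layers and calling `train_step` on each of them in turn is, in essence, what the `progressive` architecture described below does.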

If you appreciate the project, show it by [leaving a star ⭐](https://github.com/nebuly-ai/nebullvm/stargazers)

<img width="1012" alt="Screenshot 2022-12-20 at 14 45 22" src="https://user-images.githubusercontent.com/83510798/208681462-2d8fc8f8-b24e-41a3-978a-72101f7f6392.png">

## Installation

The forward-forward app is built on top of nebullvm, a framework for efficiency-based apps, and can easily be installed from source. First, clone the repository and navigate to the app directory:

```bash
git clone https://github.com/nebuly-ai/nebullvm.git
cd nebullvm/apps/accelerate/forward_forward
```

Then install the app:

```bash
pip install .
```

This installs only the minimum requirements for running the app. If you want to run the app on a GPU, you also need a CUDA-enabled build of PyTorch; you can find the installation instructions on the official PyTorch website.
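
For example, on a Linux machine with CUDA 11.8 the GPU build can typically be installed with the command below; treat it as an illustration only, and take the exact command for your CUDA version and platform from the selector on [pytorch.org](https://pytorch.org):

```bash
# Example only: install a CUDA 11.8 build of PyTorch via pip.
pip install torch --index-url https://download.pytorch.org/whl/cu118
```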

## Usage
At the current stage, this implementation supports the main architectures discussed by Hinton in his paper. Each architecture can be trained with the following command:

```python
from forward_forward import train_with_forward_forward_algorithm


trained_model = train_with_forward_forward_algorithm(
    model_type="progressive",
    n_layers=3,
    hidden_size=2000,
    lr=0.03,
    device="cuda",
    epochs=100,
    batch_size=5000,
    theta=2.0,
)
```
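
The `nlp` model type additionally expects a `predicted_tokens` keyword argument, which is forwarded through `**kwargs` and checked in `forward_forward/api/functions.py`. A call might look like the sketch below, where all hyperparameter values are illustrative:

```python
from forward_forward import train_with_forward_forward_algorithm

# Illustrative values only; `predicted_tokens` is required for the NLP model.
language_model = train_with_forward_forward_algorithm(
    model_type="nlp",
    n_layers=2,
    hidden_size=2000,
    lr=0.03,
    device="cpu",
    epochs=10,
    batch_size=1024,
    theta=2.0,
    predicted_tokens=1,
)
```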

Three architectures are currently supported:
* `progressive`: the simplest architecture described in the paper. It has a pipeline-like structure, and each layer can be trained independently of the following ones. Our implementation differs from the original one in that the labels are injected into the image by concatenating them with the flattened tensor, instead of replacing the first `n_classes` pixel values with a one-hot representation of the label (a minimal sketch of this label-injection step is shown after this list).

* `recurrent`: the recurrent architecture described in the paper. It has a recurrent-like structure and is based on the `GLOM` architecture proposed by Hinton.

* `nlp`: a simple network that can be used as a language model.
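
As a minimal sketch of the label-injection step used by the `progressive` model (assuming MNIST inputs; `inject_label` is a hypothetical helper, not part of the app's API):

```python
import torch
import torch.nn.functional as F


def inject_label(images: torch.Tensor, labels: torch.Tensor, n_classes: int = 10) -> torch.Tensor:
    # Hypothetical helper: concatenate a one-hot label to each flattened image,
    # instead of overwriting the first `n_classes` pixels as in the original paper.
    flat = images.view(images.shape[0], -1)  # (batch, 28 * 28)
    one_hot = F.one_hot(labels, num_classes=n_classes).float()  # (batch, n_classes)
    return torch.cat([flat, one_hot], dim=1)  # (batch, 28 * 28 + n_classes)
```

In the supervised setting of the paper, positive samples pair an image with its true label, while negative samples pair it with an incorrect one.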

The `recurrent` and `nlp` network architectures are explained in more detail below.

## Recurrent Architecture
The recurrent architecture is based on the `GLOM` architecture for videos, proposed by Hinton in the paper [How to represent part-whole hierarchies in a neural network](https://arxiv.org/pdf/2102.12627.pdf). Its application to the forward-forward algorithm aims at letting each layer learn not just from the previous layer's output, but from the following layers as well. This is done by concatenating the output of the previous layer with the outputs of the following layers computed at the previous time-step. A learned representation of the label (positive or negative) is given as input to the last layer. The following figure shows the structure of the network:

<p align="center">
<img width="500" alt="recurrent_net" src="https://user-images.githubusercontent.com/38586138/208651417-498c4bd4-f2dc-4613-a376-0b69317c73d4.png">
</p>
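
A rough sketch of a single time-step of this scheme (hypothetical code, written only to illustrate the concatenation described above):

```python
import torch


def recurrent_step(layers, prev_acts, image, label_embedding):
    # Hypothetical illustration: layer i at time t receives the previous
    # time-step's activity of the layer below and of the layer above,
    # concatenated. The bottom layer sees the image; the top layer sees a
    # learned representation of the (positive or negative) label.
    new_acts = []
    for i, layer in enumerate(layers):
        below = image if i == 0 else prev_acts[i - 1]
        above = label_embedding if i == len(layers) - 1 else prev_acts[i + 1]
        new_acts.append(layer(torch.cat([below, above], dim=1)))
    return new_acts
```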

## NLP Architecture
The forward-forward architecture developed for NLP is a simple network that can be used as a language model. The network is composed of a few normalized fully connected layers, each followed by a ReLU activation. All hidden representations are then concatenated and given as input to a softmax layer that predicts the next token. The network can be trained progressively, i.e. each layer can be trained separately from the following ones. The following figure shows the structure of the network:

<p align="center">
<img width="500" class="center" alt="nlp_net" src="https://user-images.githubusercontent.com/38586138/208651624-c159b230-f903-4e13-aaa7-b39a0d1c52fc.png">
</p>
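
The following is a small, hypothetical sketch of that structure (it does not mirror the app's `LMFFNet` implementation): each normalized fully connected layer is followed by a ReLU, and the concatenated hidden representations feed a final linear layer whose softmax gives the next-token distribution.

```python
import torch


class FFLanguageModelSketch(torch.nn.Module):
    # Hypothetical sketch, not the app's LMFFNet implementation.
    def __init__(self, layer_sizes, vocab_size):
        super().__init__()
        self.layers = torch.nn.ModuleList(
            torch.nn.Linear(i, o) for i, o in zip(layer_sizes[:-1], layer_sizes[1:])
        )
        self.out = torch.nn.Linear(sum(layer_sizes[1:]), vocab_size)

    def forward(self, x):
        hidden = []
        for layer in self.layers:
            # Normalize, project, ReLU: a "normalized fully connected layer".
            x = x / (x.norm(dim=1, keepdim=True) + 1e-8)
            x = torch.relu(layer(x))
            hidden.append(x)
        # Concatenate all hidden representations and map them to logits;
        # a softmax over these logits predicts the next token.
        return self.out(torch.cat(hidden, dim=1))
```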

## What is missing
This app implements the main architectures described by Hinton in his paper. However, some features are not implemented yet. In particular, the following are missing:

* [ ] Implementation of unsupervised training.
* [ ] Implementation of the `progressive` architecture using local receptive fields instead of fully connected layers.
* [ ] Training on CIFAR-10 for CV-based architectures.

And don't forget to [leave a star ⭐](https://github.com/nebuly-ai/nebullvm/stargazers) if you appreciate the project!
If you have any questions about the implementation, [open an issue](https://github.com/nebuly-ai/nebullvm/issues) or contact us in the [community chat](https://discord.gg/RbeQMu886J).

3 changes: 3 additions & 0 deletions apps/accelerate/forward_forward/forward_forward/__init__.py
@@ -0,0 +1,3 @@
from forward_forward.api.functions import (  # noqa F401
    train_with_forward_forward_algorithm,
)
File renamed without changes.
52 changes: 52 additions & 0 deletions apps/accelerate/forward_forward/forward_forward/api/functions.py
@@ -0,0 +1,52 @@
from torchvision import datasets

from forward_forward.root_op import (
    ForwardForwardRootOp,
    ForwardForwardModelType,
)


def train_with_forward_forward_algorithm(
    n_layers: int = 2,
    model_type: str = "progressive",
    device: str = "cpu",
    hidden_size: int = 2000,
    lr: float = 0.03,
    epochs: int = 100,
    batch_size: int = 5000,
    theta: float = 2.0,
    shuffle: bool = True,
    **kwargs,
):
    """Build and train a forward-forward model and return the trained model.

    ``model_type`` selects one of ``progressive``, ``recurrent`` or ``nlp``;
    the NLP model additionally requires a ``predicted_tokens`` keyword
    argument.
    """
    model_type = ForwardForwardModelType(model_type)
    root_op = ForwardForwardRootOp(model_type)

    # Input/output sizes depend on the selected architecture: the image-based
    # models are set up for MNIST, the NLP model for a small character-level
    # vocabulary.
    output_size = None
    if model_type is ForwardForwardModelType.PROGRESSIVE:
        input_size = 28 * 28 + len(datasets.MNIST.classes)
    elif model_type is ForwardForwardModelType.RECURRENT:
        input_size = 28 * 28
        output_size = len(datasets.MNIST.classes)
    else:  # model_type is ForwardForwardModelType.NLP
        input_size = 10  # number of characters
        output_size = 30  # length of vocabulary
        assert (
            kwargs.get("predicted_tokens") is not None
        ), "predicted_tokens must be specified for NLP model"

    root_op.execute(
        input_size=input_size,
        n_layers=n_layers,
        hidden_size=hidden_size,
        optimizer_name="Adam",
        optimizer_params={"lr": lr},
        loss_fn_name="alternative_loss_fn",
        batch_size=batch_size,
        epochs=epochs,
        device=device,
        shuffle=shuffle,
        theta=theta,
        output_size=output_size,
    )

    return root_op.get_result()
12 changes: 12 additions & 0 deletions apps/accelerate/forward_forward/forward_forward/app.py
@@ -0,0 +1,12 @@
from nebullvm.apps.base import App

from forward_forward.root_op import ForwardForwardRootOp


class ForwardForwardApp(App):
    def __init__(self):
        super().__init__()
        self.root_op = ForwardForwardRootOp()

    def execute(self, *args, **kwargs):
        return self.root_op.execute(*args, **kwargs)
Empty file.
@@ -0,0 +1,114 @@
from abc import ABC, abstractmethod

import torch

from nebullvm.operations.base import Operation

from forward_forward.utils.modules import (
    FCNetFFProgressive,
    RecurrentFCNetFF,
    LMFFNet,
)


class BaseModelBuildOperation(Operation, ABC):
    def __init__(self):
        super().__init__()
        self.model = None

    @abstractmethod
    def execute(
        self,
        input_size: int,
        n_layers: int,
        hidden_size: int,
        optimizer_name: str,
        optimizer_params: dict,
        loss_fn_name: str,
        output_size: int = None,
    ):
        raise NotImplementedError

    def get_result(self):
        return self.model


class FCNetFFProgressiveBuildOperation(BaseModelBuildOperation):
    def __init__(self):
        super().__init__()

    def execute(
        self,
        input_size: int,
        n_layers: int,
        hidden_size: int,
        optimizer_name: str,
        optimizer_params: dict,
        loss_fn_name: str,
        output_size: int = None,
    ):
        layer_sizes = [input_size] + [hidden_size] * n_layers
        model = FCNetFFProgressive(
            layer_sizes=layer_sizes,
            optimizer_name=optimizer_name,
            optimizer_kwargs=optimizer_params,
            loss_fn_name=loss_fn_name,
            epochs=-1,
        )
        if output_size is not None:
            # Attach a final linear readout when an output size is requested.
            output_layer = torch.nn.Linear(layer_sizes[-1], output_size)
            model = torch.nn.Sequential(model, output_layer)

        self.model = model


class RecurrentFCNetFFBuildOperation(BaseModelBuildOperation):
    def __init__(self):
        super().__init__()

    def execute(
        self,
        input_size: int,
        n_layers: int,
        hidden_size: int,
        optimizer_name: str,
        optimizer_params: dict,
        loss_fn_name: str,
        output_size: int = None,
    ):
        layer_sizes = [input_size] + [hidden_size] * n_layers + [output_size]
        model = RecurrentFCNetFF(
            layer_sizes=layer_sizes,
            optimizer_name=optimizer_name,
            optimizer_kwargs=optimizer_params,
            loss_fn_name=loss_fn_name,
        )
        self.model = model


class LMFFNetBuildOperation(BaseModelBuildOperation):
    def __init__(self):
        super().__init__()

    def execute(
        self,
        input_size: int,
        n_layers: int,
        hidden_size: int,
        optimizer_name: str,
        optimizer_params: dict,
        loss_fn_name: str,
        output_size: int = None,
    ):
        model = LMFFNet(
            token_num=output_size,
            hidden_size=hidden_size,
            n_layers=n_layers,
            seq_len=input_size,
            optimizer_name=optimizer_name,
            optimizer_kwargs=optimizer_params,
            loss_fn_name=loss_fn_name,
            epochs=-1,
            predicted_tokens=-1,
        )
        self.model = model
