This project provides a flexible and extensible implementation of a multithreaded feedforward neural network in Java, with popular optimizers included, wrapped in a console user interface that displays a realtime accuracy graph during training. The neural network is designed to be easy to use, import, and customize, making it a valuable tool for a variety of machine learning and deep learning tasks. The library was written without any external machine learning or math libraries (pure Java, at least for the base class).
Releases are hosted on Maven Central. To add a dependency on this library, use:

```xml
<dependency>
    <groupId>io.github.spaceshark123</groupId>
    <artifactId>neuralnetwork</artifactId>
    <version>1.0.0</version>
</dependency>
```
Key features:

- User-friendly Console Interface: Everything is wrapped in an easy-to-use console interface that uses commands to perform tasks, similar to a shell terminal. This allows for no-code experimentation with neural networks.
- Realtime Accuracy Graph While Training: Displays a realtime graph of accuracy over epochs for MNIST networks during training, letting users visualize improvements as they occur.
- Realtime MNIST Drawing Tool: Includes a drawing tool that lets users draw on a 28x28 canvas and receive realtime predictions from the network of which MNIST digit it is. This allows for easy debugging and evaluation.
- Parallelism and Multithreading: Uses parallel computing and multithreading to dramatically accelerate evaluation and mini-batch training.
- Training Callback Interface: Provides an interface with a callback method that can be passed into the train function to execute custom code on each mini-batch iteration of training.
- Customizable Topology: Define the number of neurons and the activation function for each layer.
- Multiple Activation Functions: Supports popular activation functions including linear, sigmoid, tanh, ReLU, leaky ReLU, binary, and softmax.
- Multiple Loss Functions: Supports major loss/error functions (MSE, MAE, sum squared error, categorical cross-entropy, and binary cross-entropy), with easy extensibility: add custom loss functions by implementing the `LossFunction` interface from the `io.github.spaceshark123.neuralnetwork.loss` package.
- Multiple Optimizers: Includes popular optimizers like stochastic gradient descent (SGD), SGD with momentum, AdaGrad, RMSProp, and Adam. Supports additional custom optimizers via the `Optimizer` interface from the `io.github.spaceshark123.neuralnetwork.optimizer` package.
- Weight Initialization: Uses weight initialization techniques like Xavier (for linear, sigmoid, and tanh) and He (for ReLU) for better convergence.
- Data Augmentation: Includes an option to augment the MNIST dataset on import, applying random affine transformations (translation, rotation, scaling) to improve the generalizability of networks.
- Training: Train your neural network using mini-batched gradient descent with a customizable optimizer, loss function, learning rate, mini-batch size, and learning rate decay.
- Train/Test Set Cross-Validation: During training, train and test set accuracy are measured for each mini-batch, giving a complete picture of network performance and helping prevent overfitting. The console interface allows users to specify a train/test split ratio.
- Gradient Clipping: Helps prevent exploding gradients during training by setting a gradient clipping threshold.
- Regularization Techniques: Supports popular regularization techniques like L1 and L2 to reduce overfitting by penalizing parameter complexity.
- Save and Load Models: Easily save trained models to disk and load them for future use.
The following packages are available for use:
- `io.github.spaceshark123.neuralnetwork`: Contains the main `NeuralNetwork` class and related interfaces for optimizers and training callbacks.
- `io.github.spaceshark123.neuralnetwork.cli`: Contains the console interface for user interaction with the neural network. Intended for running the program directly rather than for import.
- `io.github.spaceshark123.neuralnetwork.optimizer`: Contains optimizer implementations and the `Optimizer` interface for creating custom optimizers.
- `io.github.spaceshark123.neuralnetwork.activation`: Contains the `ActivationFunction` interface and various activation function implementations (Linear, Sigmoid, Tanh, ReLU, LeakyReLU, Binary, Softmax) for use in the neural network.
- `io.github.spaceshark123.neuralnetwork.loss`: Contains the `LossFunction` interface and various loss function implementations (MSE, MAE, Sum Squared Error, Categorical Cross-Entropy, Binary Cross-Entropy) for use in training and evaluation.
- `io.github.spaceshark123.neuralnetwork.callback`: Contains the `TrainingCallback` interface for creating custom training callbacks and the `ChartUpdater` class for visualizing accuracy data in realtime during training.
- `io.github.spaceshark123.neuralnetwork.util`: Contains utility classes like `RealTimeSoftDrawCanvas` for the MNIST drawing tool.
- `io.github.spaceshark123.neuralnetwork.experimental`: Contains experimental and not fully implemented classes like `ConvolutionalNeuralNetwork`. DO NOT USE THESE CLASSES.
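For orientation, here is an illustrative set of imports for a typical training setup like the examples below; the class names are drawn from the lists elsewhere in this README, so only import the ones you actually use:

```java
import io.github.spaceshark123.neuralnetwork.NeuralNetwork;
import io.github.spaceshark123.neuralnetwork.activation.ActivationFunction;
import io.github.spaceshark123.neuralnetwork.activation.ReLU;
import io.github.spaceshark123.neuralnetwork.activation.LeakyReLU;
import io.github.spaceshark123.neuralnetwork.activation.Softmax;
import io.github.spaceshark123.neuralnetwork.loss.LossFunction;   // concrete loss classes live in the same package
import io.github.spaceshark123.neuralnetwork.optimizer.Optimizer;
import io.github.spaceshark123.neuralnetwork.optimizer.Adam;
import io.github.spaceshark123.neuralnetwork.callback.TrainingCallback;
```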
To use the library in your own Java projects, simply import the relevant packages/classes. The following sections cover the proper syntax for:
- Initialize the Neural Network: Create a neural network by specifying the topology (number of neurons in each layer) and the activation functions for each layer (from the `io.github.spaceshark123.neuralnetwork.activation` package).

```java
int[] topology = {inputSize, hiddenLayerSize, outputSize};
ActivationFunction[] activations = {
    new ReLU(),          // example of relu activation
    new LeakyReLU(0.01), // example of leaky relu with alpha = 0.01
    new Softmax()        // example of softmax activation
};
NeuralNetwork network = new NeuralNetwork(topology, activations);
network.init(0.1); // initializes weights and biases according to spread amount
```
- Training: Train the neural network using your train and test/validation datasets and the desired hyperparameters, taking advantage of multiple CPU cores to speed up training.

```java
double[][] trainInputs = {...};
double[][] trainOutputs = {...};
double[][] testInputs = {...};
double[][] testOutputs = {...};
int epochs = 100;
double learningRate = 0.01;
int batchSize = 32;
LossFunction lossFunction = new MSE(); // specify loss function for training (io.github.spaceshark123.neuralnetwork.loss)
double decay = 0.1;    // learning rate decay
double momentum = 0.9;
network.clipThreshold = 1; // default gradient clipping threshold

// set regularization of network
network.setRegularizationType(NeuralNetwork.RegularizationType.L2);
network.setRegularizationLambda(0.001);

// specify optimizer for training (io.github.spaceshark123.neuralnetwork.optimizer)
Optimizer optimizer = new Adam(0.9, 0.999);

// train the network with no callback
network.train(trainInputs, trainOutputs, testInputs, testOutputs, epochs, learningRate, batchSize, lossFunction, decay, optimizer, null);
```
Optimizers implement the `Optimizer` interface, and the included optimizers are found in the `io.github.spaceshark123.neuralnetwork.optimizer` package:

- `SGD()`
- `SGDMomentum(double momentum)`
- `AdaGrad()`
- `RMSProp(double decayRate)`
- `Adam(double beta1, double beta2)`
Custom optimizers can be made by creating a class implementing the `Optimizer` interface from the `io.github.spaceshark123.neuralnetwork.optimizer` package. This can be used to create optimizers not already included with the library.
```java
public static class CustomOptimizer implements Optimizer {
    // assign to the elements of biases and weights in the step function
    private double[][] biases;
    private double[][][] weights;
    private int[] neuronsPerLayer;

    @Override
    public void initialize(int[] neuronsPerLayer, double[][] biases, double[][][] weights) {
        this.biases = biases;
        this.weights = weights;
        this.neuronsPerLayer = neuronsPerLayer;
        // other initializations ...
    }

    @Override
    public void step(double[][] avgBiasGradient, double[][][] avgWeightGradient, double learningRate) {
        for (int i = 1; i < neuronsPerLayer.length; i++) {
            for (int j = 0; j < neuronsPerLayer[i]; j++) {
                // set biases
                biases[i][j] = ...
                for (int k = 0; k < neuronsPerLayer[i - 1]; k++) {
                    // set weights
                    weights[i][j][k] = ...
                }
            }
        }
    }
}
```
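For instance, plain gradient descent could be written against this interface roughly as follows. This is a minimal sketch based on the template above, not an implementation taken from the library:

```java
// Minimal sketch: vanilla gradient descent via the Optimizer interface.
// Interface methods and array layout follow the template above; the class itself is illustrative.
public class PlainGradientDescent implements Optimizer {
    private double[][] biases;
    private double[][][] weights;
    private int[] neuronsPerLayer;

    @Override
    public void initialize(int[] neuronsPerLayer, double[][] biases, double[][][] weights) {
        this.biases = biases;
        this.weights = weights;
        this.neuronsPerLayer = neuronsPerLayer;
    }

    @Override
    public void step(double[][] avgBiasGradient, double[][][] avgWeightGradient, double learningRate) {
        for (int i = 1; i < neuronsPerLayer.length; i++) {
            for (int j = 0; j < neuronsPerLayer[i]; j++) {
                // move each bias against its averaged gradient
                biases[i][j] -= learningRate * avgBiasGradient[i][j];
                for (int k = 0; k < neuronsPerLayer[i - 1]; k++) {
                    // move each incoming weight against its averaged gradient
                    weights[i][j][k] -= learningRate * avgWeightGradient[i][j][k];
                }
            }
        }
    }
}
```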
Loss functions used during training implement the `LossFunction` interface and are found in the `io.github.spaceshark123.neuralnetwork.loss` package. The included loss functions are:

- `MeanSquaredError()`
- `MeanAbsoluteError()`
- `SumSquaredError()`
- `CategoricalCrossEntropy()`
- `BinaryCrossEntropy()`
Custom loss functions can be created by implementing the `LossFunction` interface from the `io.github.spaceshark123.neuralnetwork.loss` package. This can be used to create loss functions not already included with the library.
```java
public class CustomLoss implements LossFunction {
    @Override
    public double computeLoss(double[] predicted, double[] actual) {
        double loss = 0.0;
        // custom loss code ...
        return loss;
    }

    @Override
    public double[] computeGradient(double[] predicted, double[] actual) {
        double[] gradient = new double[predicted.length];
        // custom gradient code ...
        return gradient;
    }

    @Override
    public String getName() {
        return "CUSTOM"; // display name of loss function
    }
}
```
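As a concrete illustration, mean squared error could be expressed through this interface roughly as follows. This is a sketch written against the interface above, not the library's own `MeanSquaredError` class, and the exact scaling convention the library uses may differ:

```java
// Sketch: mean squared error via the LossFunction interface shown above.
public class MyMeanSquaredError implements LossFunction {
    @Override
    public double computeLoss(double[] predicted, double[] actual) {
        double sum = 0.0;
        for (int i = 0; i < predicted.length; i++) {
            double diff = predicted[i] - actual[i];
            sum += diff * diff;
        }
        return sum / predicted.length; // average squared error over the outputs
    }

    @Override
    public double[] computeGradient(double[] predicted, double[] actual) {
        double[] gradient = new double[predicted.length];
        for (int i = 0; i < predicted.length; i++) {
            // derivative of the averaged squared error w.r.t. predicted[i]
            gradient[i] = 2.0 * (predicted[i] - actual[i]) / predicted.length;
        }
        return gradient;
    }

    @Override
    public String getName() {
        return "MY_MSE"; // hypothetical display name
    }
}
```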
Activation functions implement the `ActivationFunction` interface and are found in the `io.github.spaceshark123.neuralnetwork.activation` package. The included activation functions are:

- `Linear()`
- `Sigmoid()`
- `Tanh()`
- `ReLU()`
- `LeakyReLU(double alpha)`
- `Binary()`
- `Softmax()`
Custom activation functions can be created by implementing the `ActivationFunction` interface and registering them as a valid activation function with the `ActivationFunctionFactory` using its `register` method. The following example shows how to create a custom activation function with one parameter `param1` and register it with the factory.
```java
public class CustomActivation implements ActivationFunction {
    private final double param1;

    public CustomActivation(double param1) {
        this.param1 = param1;
    }

    @Override
    public double activate(double x) {
        // custom activation code
        return x; // placeholder
    }

    @Override
    public double derivative(double x) {
        // custom derivative code
        return 1; // placeholder
    }

    @Override
    public String getName() {
        return "CUSTOM"; // display name of activation function
    }

    @Override
    public String toConfigString() {
        return "CUSTOM(param1=" + param1 + ")"; // string used in config/saved files to uniquely identify this activation function
    }
}

// register the custom activation function with the factory
ActivationFunctionFactory.register("CUSTOM", params -> {
    double param1 = params.containsKey("param1")
            ? Double.parseDouble(params.get("param1"))
            : 1.0;
    return new CustomActivation(param1);
});
```
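Once registered, the factory can reconstruct the activation whenever a saved model or config string references `CUSTOM(param1=...)`, and the class can be used directly like any built-in activation when constructing a network. A small usage sketch (the topology and parameter values here are arbitrary):

```java
int[] topology = {4, 8, 3};
ActivationFunction[] activations = {
    new Linear(),              // input layer entry (one activation per layer is expected, though the input one is never applied)
    new CustomActivation(0.5), // hidden layer uses the custom activation
    new Softmax()              // output layer
};
NeuralNetwork network = new NeuralNetwork(topology, activations);
network.init(0.1);
```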
Optionally, provide a custom training callback by passing a class implementing the `TrainingCallback` interface from the `io.github.spaceshark123.neuralnetwork.callback` package as an argument. This can be used to build your own training add-ons, like a graph visualization of the data. The `ChartUpdater` class is provided in the same package to visualize accuracy data using this callback interface.

```java
public class Callback implements TrainingCallback {
    // testAccuracy is -1 if the current mini-batch doesn't have a test accuracy
    @Override
    public void onEpochUpdate(int epoch, int batch, double progress, double trainAccuracy, double testAccuracy) {
        System.out.println("this statement is run for every mini-batch in training");
    }
}

class Main {
    public static void main(String[] args) {
        // ... set up the network, data, and hyperparameters ...
        Callback callback = new Callback();
        network.train(inputs, outputs, epochs, learningRate, batchSize, lossFunction, decay, optimizer, callback);
    }
}
```

- Mutation: Mutate the neural network for a genetic algorithm (evolution).
```java
network.mutate(c, v); // mutates the network with chance c and variation v
```
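As a rough illustration of how this might fit into an evolutionary loop, the sketch below keeps the current best network and accepts a mutated copy only when it lowers the loss. It uses only calls documented in this README (`evaluate`, `loss`, `mutate`, `save`, `load`); the class, topology, and data are made up, and copying via the save/load round-trip is just one way to duplicate a network:

```java
public class EvolveExample {
    public static void main(String[] args) throws Exception {
        int[] topology = {2, 4, 1};
        ActivationFunction[] acts = {new Linear(), new ReLU(), new Linear()};
        NeuralNetwork best = new NeuralNetwork(topology, acts);
        best.init(0.1);

        double[][] inputs = {{0.1, 0.2}, {0.4, 0.5}}; // toy data: add two numbers
        double[][] targets = {{0.3}, {0.9}};

        double bestScore = score(best, inputs, targets);
        for (int generation = 0; generation < 100; generation++) {
            NeuralNetwork.save(best, "best.nn");                     // snapshot the current best
            NeuralNetwork candidate = NeuralNetwork.load("best.nn"); // copy via the save/load round-trip
            candidate.mutate(0.1, 0.05);                             // chance 0.1, variation 0.05
            double candidateScore = score(candidate, inputs, targets);
            if (candidateScore < bestScore) { // keep the mutation only if it improves
                best = candidate;
                bestScore = candidateScore;
            }
        }
    }

    // total loss over the dataset (lower is better)
    static double score(NeuralNetwork net, double[][] inputs, double[][] targets) {
        double sum = 0;
        for (int c = 0; c < inputs.length; c++) {
            sum += net.loss(net.evaluate(inputs[c]), targets[c], "mse");
        }
        return sum;
    }
}
```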
- Evaluation: Use the trained model to make predictions and evaluate the loss.

```java
double[] input = {...};
double[] prediction = network.evaluate(input);
double[] expected = {...};
String lossFunction = "mse"; // or "sse" or "categorical_crossentropy"
double loss = network.loss(prediction, expected, lossFunction);
```

- Save and Load: Save the trained model to disk and load it for future use, either as a Java object, which is not human readable and does not transfer between programming languages but is faster, or as a plain text file of parameters, which is human readable and transferable between programming languages.

```java
// Save the model as a java object
NeuralNetwork.save(network, "my_model_java.nn");
// Load the model from a file formatted as a java object
NeuralNetwork loadedNetwork = NeuralNetwork.load("my_model_java.nn");
// Save the model as a plain text file
NeuralNetwork.saveParameters(network, "my_model.txt");
// Load the model from a plain text file
NeuralNetwork loadedTxtNetwork = NeuralNetwork.load("my_model.txt");
```

The plain text file is separated into lines, each containing one set of parameters identified by its first token and followed by the corresponding values, all separated by spaces. For example, one line could contain `topology 784 512 10`, which describes a neural network with 3 layers of those sizes. The headings and their contents are as follows:
- `numlayers`: an integer giving the number of layers in the network. Usually the first line.
- `topology`: `numlayers` integers giving the number of neurons in each layer. Usually the second line.
- `activations`: `numlayers` strings naming the activation function for each layer, including the input layer even though its activation is never used. Valid names are the activation functions registered with the `ActivationFunctionFactory` class; by default these are LINEAR, SIGMOID, TANH, RELU, LEAKYRELU(param1={value}), BINARY, and SOFTMAX.
- `regularization`: an all-caps string giving the regularization mode, followed by a decimal for the lambda value (regularization strength).
- `biases`: all the biases for all the layers, in order from input to output layer and from first to last neuron within each layer. Includes the input layer, even though its biases are never used.
- `weights`: all the weights. Internally, weights are stored as a 3D array whose 1st dimension is the layer, 2nd dimension is the neuron index, and 3rd dimension is the index of the incoming neuron from the previous layer. All weights are flattened in order from input to output layer, first to last neuron within each layer, and first to last incoming neuron from the previous layer.
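As an illustration (not a real saved model; the numbers are made up), a parameters file for a tiny 2-3-1 network following the layout above might look like:

```
numlayers 3
topology 2 3 1
activations LINEAR RELU SIGMOID
regularization L2 0.001
biases 0.0 0.0 0.013 -0.207 0.094 0.118
weights 0.412 -0.155 0.208 0.371 -0.033 0.290 0.517 -0.441 0.102
```

Here the `biases` line has one value per neuron including the (unused) input layer, and the `weights` line is the flattened 3D array described above, assuming the input layer contributes no incoming weights.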
- Access/Modify Parameters: Get/set the parameters of, and information about, the network.

```java
int numLayers = network.numLayers;
int[] topology = network.getTopology(); //topology[i] is # of neurons of layer i
double[][][] weights = network.getWeights();
network.setWeight(L,n2,n1,w); //sets the weight between layer L neuron n2 and layer L-1 neuron n1 to w
double[][] biases = network.getBiases();
network.setBias(L,n,b); //sets the bias of layer L neuron n to b
ActivationFunction[] activations = network.getActivations();
network.setActivation(L,new ReLU()); //sets the activation of layer L to relu
double[][] neurons = network.getNeurons();
String info = network.toString();
//the following two lines do the same thing
System.out.println(info);
System.out.println(network);
```

To use this neural network implementation, you can interact with the custom console provided by the program. A Maven wrapper has been included. If you do not have Maven installed (check with `mvn -v`), you can use the wrapper scripts to build and run the project by replacing `mvn` with `./mvnw` (Linux/Mac) or `mvnw.cmd` (Windows) in the commands below:
- Build the JAR: First, make sure you are working in the project directory. If you are running the full project with the console interface, run the following commands to compile and run the program.

Compile:

```sh
mvn package -Pcli   # builds the project with the cli profile (allows for runnable JAR)
```

Run:

```sh
# Run the produced JAR (replace <version> with the actual version in target/)
java -jar target/neuralnetwork-cli-<version>.jar

# Or use a glob to avoid typing the version (works if only one matching JAR exists)
java -jar target/neuralnetwork-cli-*.jar
```
This will launch the program's custom console with all dependencies included (using shading), allowing you to control and modify neural networks.
Or, if you are using the library, running `mvn package` by itself builds the package as a library. You can then compile and run your own Java files that import the `NeuralNetwork` class directly via the `import io.github.spaceshark123.neuralnetwork.NeuralNetwork;` statement.
The available commands are:

- `help`: Display a list of available commands and their descriptions.
- `create`: Create a new neural network by specifying its topology and activation functions.
- `init`: Initialize the neural network parameters (weights and biases) with the specified bias spread and weight initialization method ('he' or 'xavier').
- `load`: Load a pre-trained neural network from a saved model file.
- `train`: Train the current neural network using your dataset, loss function, optimizer, and desired hyperparameters.
- `evaluate`: Use the trained neural network to make predictions on new data.
- `mutate`: Mutate the parameters of the network for a genetic algorithm/implementation.
- `mnist`: Initialize/import the MNIST dataset for use in training/evaluation, or draw your own handwritten digits for the network to evaluate in realtime.
- `info`: Display information about the neural network's parameters.
- `reset`: Reset the current neural network to its defaults.
- `modify`: Change parameters of the neural network.
- `regularization`: Change the regularization type (L1, L2, none) and lambda (strength) of the neural network.
- `magnitude`: Display information about the magnitudes of the network's parameters (min/max/average).
- `loss`: Calculate the loss/accuracy of the network on a test dataset.
- `save`: Save the current neural network to a file for future use.
- `exit`: Exit the program.
- Type a command and press Enter to execute it. Follow the prompts to provide the required information.
- For example, to create a new neural network, you would type `create` and then follow the prompts to specify the topology and activation functions.
- To train a network, use the `train` command and provide details like the path to the training file (or specify `mnist`), the number of epochs, learning rate, mini-batch size, loss function, optimizer, and learning rate decay.
- For evaluation, use the `evaluate` command and input the data you want to predict on, or `mnist [case #]`.
- The MNIST dataset is built into the program so you can build and evaluate hand-drawn digit recognition networks directly from the console and receive a realtime accuracy visualization to assist with training. You can run `mnist test` to draw your own handwritten digits for the network to evaluate in realtime. The `mnist` command that imports the MNIST dataset has an optional `augmented` modifier that randomly applies affine transformations to the data, making the network more generalizable at the cost of some accuracy and convergence speed. MNIST networks MUST HAVE an input size of 784 and an output size of 10. In most cases, the output layer has softmax activation and is trained using the categorical_crossentropy loss function.
- You can save the trained model using the `save` command. This will save the model to a file for future use.
- To load a pre-trained model, use the `load` command and specify the file path of the saved model.
All training and test dataset files must be formatted in the following way:
- The first line has 3 numbers specifying the number of cases, the input size, and the output size.
- Every following line is a separate training case.
- On each line, the input values are separated by spaces, followed by an equals sign, followed by the output values separated by spaces.
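For example, a tiny (made-up) dataset file with 3 cases for a network with 2 inputs and 1 output would look like:

```
3 2 1
0.1 0.2 = 0.3
0.4 0.5 = 0.9
1.0 2.0 = 3.0
```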
- To exit the program, simply type `exit`, and the program will terminate.
Unit tests have been provided in the `src/test/java/com/github/spaceshark123/neuralnetwork/` directory using the JUnit testing framework. To run the tests, use the following command:
```sh
mvn test
```

These tests will automatically run on each build to verify the correctness of the code.
A few neural networks and their training sets have been pre-included in the project, in the `models` and `datasets` directories respectively, ready to be loaded in:

Models:

- MNISTNetwork: an untrained neural network with the correct topology to evaluate MNIST cases (digit recognition). Accuracy ≈ 10% (object mode).
- MNISTNetworkTrained: a trained neural network that evaluates MNIST cases (digit recognition) with high, generalized accuracy from training on an augmented dataset. Good for testing your own digits. Accuracy ≈ 98.14% (parameters mode).

Datasets:

- TrainSet1: training/test dataset for adding 2 numbers (2 inputs, 1 output) (plain text mode).
- MNIST: the MNIST dataset for training/evaluating handwritten digits (stored as `mnist_train.txt` and `mnist_test.txt` in the `data` directory), 784 inputs, 10 outputs.
- testTrainSet: training/test dataset with a single datapoint, for testing purposes (plain text mode).