Introducing the GeneticML module, an advancement in reinforcement learning for applications where the loss function is not locally differentiable. Traditional optimization techniques rely on gradients to navigate complex landscapes, but highly nonlinear or discontinuous systems render such methods ineffective. In response, this genetic optimization module harnesses genetic algorithms to tackle these challenges head-on: by emulating the principles of natural selection and evolution, it offers a versatile solution for optimizing complex, non-differentiable functions in domains where gradient-based approaches fall short. The module's generic design accommodates a wide array of problem domains, and its multithreaded architecture enables efficient parallel execution, significantly accelerating the optimization process for real-time or resource-intensive applications.
The library is designed around two main traits.
The Agent is the object interacting with the simulation. It is the object to be optimized.
pub trait Agent: Clone + Send + Sync + 'static {
    // One interaction step: maps an observation to an action.
    fn step(&mut self, input: &Vec<Vec<f64>>) -> Vec<Vec<f64>>;
    // Clear any internal state before a new evaluation.
    fn reset(&mut self);
    // Return a mutated copy of the agent.
    fn mutate(&self, mutation_rate: f64) -> Self;
}
Step method dimensions: both the input and the output are two-dimensional (`Vec<Vec<f64>>`).
- Input: the observation provided by the simulation.
- Output: the action returned by the agent.
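As an illustration (this is a sketch, not code shipped with the library), a minimal implementation of the trait could be an agent that always answers with a single constant value and whose mutation jitters that value; the randomness here assumes the `rand` crate:

```rust
// Sketch only: a hand-written agent, not a type provided by the library.
#[derive(Clone)]
struct ConstantAgent {
    value: f64,
}

impl Agent for ConstantAgent {
    // Ignore the observation and always answer with the same value.
    fn step(&mut self, _input: &Vec<Vec<f64>>) -> Vec<Vec<f64>> {
        vec![vec![self.value]]
    }

    // This agent keeps no internal state, so there is nothing to reset.
    fn reset(&mut self) {}

    // Return a copy whose value is jittered in proportion to the mutation rate.
    fn mutate(&self, mutation_rate: f64) -> Self {
        ConstantAgent {
            value: self.value + mutation_rate * (rand::random::<f64>() - 0.5),
        }
    }
}
```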
The Simulation is the object responsible for simulating the environment and evaluating the agents.
pub trait Simulation: Clone + Send + Sync + 'static {
    // Run an agent through the environment and return its fitness.
    fn evaluate_agent<A>(&self, agent: &mut A) -> f64
    where
        A: Agent;
    // Hook called once per generation.
    fn on_generation(&mut self, generation_number: usize);
}
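To show how the two traits fit together, here is a matching simulation sketch (illustrative only, not part of the library): it asks the agent for one value and scores it by its distance to a fixed target.

```rust
// Sketch only: a toy simulation, not a type provided by the library.
#[derive(Clone)]
struct TargetSimulation {
    target: f64,
}

impl Simulation for TargetSimulation {
    fn evaluate_agent<A>(&self, agent: &mut A) -> f64
    where
        A: Agent,
    {
        agent.reset();
        // One step with a dummy observation; the agent's output is its answer.
        let output = agent.step(&vec![vec![0.0]]);
        let answer = output[0][0];
        // The closer the answer is to the target, the higher the fitness.
        -(answer - self.target).abs()
    }

    fn on_generation(&mut self, _generation_number: usize) {
        // Nothing to update between generations in this toy environment.
    }
}
```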
The agents can take any form, but since it is common to use neural networks, some layers and activation functions are already implemented.
The layers currently available are:
- Linear
- GRU
All layers are serializable and deserializable.
And the activation functions:
- Relu
- Sigmoid
- Tanh
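As a reminder, these activations are the standard element-wise functions (shown here with their usual definitions, not the library's internal code):

```rust
// Standard definitions, applied element-wise.
fn relu(x: f64) -> f64 {
    x.max(0.0)
}

fn sigmoid(x: f64) -> f64 {
    1.0 / (1.0 + (-x).exp())
}

fn tanh(x: f64) -> f64 {
    x.tanh()
}
```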
For more information, see the xornot or timeseries forecasting examples.
Training starts from a checkpoint, which is a vector of Agent. It can be used with random agents for training from scratch, or with trained agents for transfer learning.
pub fn training_from_checkpoint<A, S>(
    population: Vec<A>,   // starting checkpoint (random or pre-trained agents)
    simulation: &mut S,   // environment used to evaluate the agents
    nb_individus: usize,  // number of agents per generation
    nb_generation: usize, // number of generations to run
    survival_rate: f64,   // fraction of agents kept between generations
    mutation_rate: f64,   // initial mutation rate
    mutation_decay: f64,  // decay applied to the mutation rate over generations
) -> Vec<A>
where
    A: Agent,
    S: Simulation
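A minimal end-to-end sketch of a training run, reusing the ConstantAgent and TargetSimulation sketches above (the hyperparameter values are purely illustrative):

```rust
fn main() {
    // Random checkpoint: training from scratch.
    let population: Vec<ConstantAgent> = (0..100)
        .map(|_| ConstantAgent { value: rand::random::<f64>() })
        .collect();

    let mut simulation = TargetSimulation { target: 42.0 };

    let trained = training_from_checkpoint(
        population,
        &mut simulation,
        100,  // nb_individus
        50,   // nb_generation
        0.2,  // survival_rate
        0.1,  // mutation_rate
        0.99, // mutation_decay
    );

    println!("checkpoint returned with {} agents", trained.len());
}
```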
Here are some implementation examples. They are only meant to show how to use the library; for all of them, a gradient-based or analytical approach would be better.
This is a trivial example meant to help understand the package.
cargo run --example random_number_guess
Each Agent has its own guess that cannot change. The Simulation contains the target number to guess, and the observation is the target with added noise. The fitness is evaluated using the observation.
struct TestAgent {
    guess: f64,
}

struct TestSimulation<A: Agent> {
    agent: A,
    target: f64,
    obs: f64,
}
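A possible sketch of the corresponding trait implementations (the actual example in the repository may differ in details such as how the noise is generated; derives like Clone are omitted for brevity):

```rust
impl Agent for TestAgent {
    // The guess is fixed: step returns it whatever the observation is.
    fn step(&mut self, _input: &Vec<Vec<f64>>) -> Vec<Vec<f64>> {
        vec![vec![self.guess]]
    }

    fn reset(&mut self) {}

    // Mutation nudges the guess by a small random amount.
    fn mutate(&self, mutation_rate: f64) -> Self {
        TestAgent {
            guess: self.guess + mutation_rate * (rand::random::<f64>() - 0.5),
        }
    }
}

impl<A: Agent> Simulation for TestSimulation<A> {
    fn evaluate_agent<B: Agent>(&self, agent: &mut B) -> f64 {
        // The agent only ever sees the noisy observation, never the target itself.
        let guess = agent.step(&vec![vec![self.obs]])[0][0];
        // Fitness: the smaller the error against the true target, the better.
        -(guess - self.target).abs()
    }

    fn on_generation(&mut self, _generation_number: usize) {}
}
```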
The control system example is an inverted pendulum. The Agent is the controller and the Simulation is the inverted pendulum's simulation.
cargo run --example control_system
The controller output is a linear combination of the four state variables (cart position and velocity, pendulum angle and angular velocity), with one coefficient per state variable:
pub struct Controller {
    x_coeff: f64,         // gain on the cart position
    x_dot_coeff: f64,     // gain on the cart velocity
    theta_coeff: f64,     // gain on the pendulum angle
    theta_dot_coeff: f64, // gain on the pendulum angular velocity
}
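A sketch of how such a controller could implement the Agent trait, assuming the observation holds the state as [x, x_dot, theta, theta_dot] (the real example may order or scale the state differently; derives are omitted for brevity):

```rust
impl Agent for Controller {
    // The control force is a weighted sum of the four state variables.
    fn step(&mut self, input: &Vec<Vec<f64>>) -> Vec<Vec<f64>> {
        let state = &input[0]; // assumed layout: [x, x_dot, theta, theta_dot]
        let force = self.x_coeff * state[0]
            + self.x_dot_coeff * state[1]
            + self.theta_coeff * state[2]
            + self.theta_dot_coeff * state[3];
        vec![vec![force]]
    }

    // The controller is stateless.
    fn reset(&mut self) {}

    // Mutate each coefficient independently.
    fn mutate(&self, mutation_rate: f64) -> Self {
        let jitter = |c: f64| c + mutation_rate * (rand::random::<f64>() - 0.5);
        Controller {
            x_coeff: jitter(self.x_coeff),
            x_dot_coeff: jitter(self.x_dot_coeff),
            theta_coeff: jitter(self.theta_coeff),
            theta_dot_coeff: jitter(self.theta_dot_coeff),
        }
    }
}
```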
struct InvertedPendulum<A: Agent> {
    agent: A,
    m: f64,         // Mass of the pendulum
    m_kart: f64,    // Mass of the cart
    l: f64,         // Length of the pendulum
    g: f64,         // Acceleration due to gravity
    x: f64,         // Cart position
    theta: f64,     // Pendulum angle
    x_dot: f64,     // Cart velocity
    theta_dot: f64, // Pendulum angular velocity
    theta_acc: f64, // Pendulum angular acceleration
    x_acc: f64,     // Cart acceleration
}
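For reference, one common frictionless cart-pole formulation of the two accelerations is sketched below, with theta measured from the upright position and force being the controller output; the method name update_accelerations is hypothetical, and the example's actual equations of motion may differ:

```rust
impl<A: Agent> InvertedPendulum<A> {
    // Sketch of standard frictionless cart-pole dynamics (point-mass pendulum).
    fn update_accelerations(&mut self, force: f64) {
        let (sin_t, cos_t) = (self.theta.sin(), self.theta.cos());
        let total_mass = self.m_kart + self.m;

        // Angular acceleration of the pendulum.
        self.theta_acc = (total_mass * self.g * sin_t
            - cos_t * (force + self.m * self.l * self.theta_dot.powi(2) * sin_t))
            / (self.l * (total_mass - self.m * cos_t.powi(2)));

        // Linear acceleration of the cart.
        self.x_acc = (force
            + self.m * self.l * (self.theta_dot.powi(2) * sin_t - self.theta_acc * cos_t))
            / total_mass;
    }
}
```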
A trivial example showing the use of the neural network module.
cargo run --example xornot
Timeseries forecasting example:
cargo run --release --example timeseries_forecasting