Name	Name	Last commit message	Last commit date
Latest commit History 97 Commits
bin	bin
experiments	experiments
img	img
src	src
.gitignore	.gitignore
LICENSE-2.0.html	LICENSE-2.0.html
README.md	README.md
pom.xml	pom.xml

genetic-programming

Symbolic regression solver, based on genetic programming methodology.

Description
1.1 Crossover
1.2 Mutation
1.3 Optimization of coefficients
1.4 Reducing complexity of syntax trees
Example
2.1 f(x,y,z) - ?
2.2 Try it in command line
Quick start
3.1 Just download jar
3.2 Try it with Maven
3.3 Hello world

Description

Each mathematical expression can be represented in form of syntax tree:

Actually, it worth to keep in mind, that there exists infinite number of different syntax trees, which corresponds to semantically equivalent expressions. For example:

In practice, on of the most generic problems - is reconstruction of original function, having the information about its values in some specific points.

It is possible to apply genetic algorithm - for solving of given problem:

In terms of Genetic Algorithm - each syntax tree can be treated as a "chromosome" (an entity, which can "mutate" and change by "crossover" with other "chromosome")
It is needed to define fitness function: the function, which will calculate, how good each formula (which was encoded by syntax tree) - can represent existing data (e.g.: using mean squared error value).

Crossover

During "crossover" - syntax tree is modified by substituion of its subtree, with some subtree from other syntax tree.

Following image explains implementation of "crossover" operation over syntax trees:

Mutation

Currently implemented following "mutation" operations:

Substituion of some node of syntax tree - with node, which corresponds to different arithmetical operation:
Substituion of some subtree with randomly generated subtree:
Removing of some intermediate node from syntax tree:
Expanding tree from root:
Swapping subtrees for non-commutative oparations:

Optimization of coefficients

Actually, some syntax tree might represent correct structure of searchable function, but due to some non-optimal values of coefficients - given syntax tree can be considered as non-optimal by fitness function.

For example, following image displays target values of searchable function (red crosses) - and two functions-candidates (green and blue):

Blue line has smaller value of mean squared error, but, actually - green parabola - would be a better candidate for the final solution.

By this reason, current implementation of Symbolic Regression Solver - uses second pass of Genetic Algorithm - for optimizing of coefficients of each syntax tree. On the picture below - represented the way, how coefficients of each syntax tree - could be transformed to "chromosome":

Reducing complexity of syntax trees

Simplification of subtrees, which contains only constant-valued:
Pruning trees, which has height larger then threshold:

Example

Lets try to reconstruct original function, by following target values.

f(x,y,z) - ?

x	y	z	f(x,y,z)
26	35	1	830
8	24	-11	130
20	1	10	477
33	11	2	1217
37	16	7	1524

Below presented picture, which shows dynamics of changes of mean squared error, according to the best "evolved" syntax trees (the axis "x" - represents number of iteration):

Try it in command line

Download http://github.com/lagodiuk/genetic-programming/tree/master/bin/symbolic_regression_1.0.jar

Create file xyz.txt with following content:

# allowed functions are: ADD SUB MUL DIV SQRT POW LN SIN COS
# set which functions to use:
  ADD MUL SUB

# looking for:
f(x, y, z) — ?

# define training set:
f(26, 35, 1) = 830
f(8, 24, -11) = 130
f(20, 1, 10) = 477
f(33, 11, 2) = 1217
f(37, 16, 7) = 1524

(this file can be downloaded from: https://github.com/lagodiuk/genetic-programming/blob/master/bin/xyz.txt)

Launch: java -jar symbolic_regression_1.0.jar xyz.txt - and observe output for each iteration (output will be in format: iteration numbre, value of mean squared error, and evolved formula).

Quick start

just download jar

The most simple way is download http://github.com/lagodiuk/genetic-programming/tree/master/bin/symbolic_regression_1.0.jar and add it to your classpath.

try it with maven

This project depends on Generic Genetic Algorithm project (as a maven dependency).

git clone https://github.com/lagodiuk/genetic-algorithm.git
git clone https://github.com/lagodiuk/genetic-programming.git
mvn -f genetic-algorithm/pom.xml install
mvn -f genetic-programming/pom.xml install

Now you can add following maven dependencies to your project:

<dependency>
	<groupId>com.lagodiuk</groupId>
	<artifactId>ga</artifactId>
	<version>1.0.1</version>
</dependency>

<dependency>
	<groupId>com.lagodiuk</groupId>
	<artifactId>gp</artifactId>
	<version>1.0</version>
</dependency>

hello world

import java.util.LinkedList;
import java.util.List;

import com.lagodiuk.gp.symbolic.SymbolicRegressionEngine;
import com.lagodiuk.gp.symbolic.SymbolicRegressionIterationListener;
import com.lagodiuk.gp.symbolic.TabulatedFunctionFitness;
import com.lagodiuk.gp.symbolic.Target;
import com.lagodiuk.gp.symbolic.interpreter.Expression;
import com.lagodiuk.gp.symbolic.interpreter.Functions;

/**
 * f(x) - ? <br/>
 * 
 * f(0) = 0 <br/>
 * f(1) = 11 <br/>
 * f(2) = 24 <br/>
 * f(3) = 39 <br/>
 * f(4) = 56 <br/>
 * f(5) = 75 <br/>
 * f(6) = 96 <br/>
 * 
 * (target function is f(x) = x^2 + 10*x)
 */
public class HelloSymbolicRegression {

	public static void main(String[] args) {

		// define training set
		TabulatedFunctionFitness fitness =
				new TabulatedFunctionFitness(
						new Target().when("x", 0).targetIs(0),
						new Target().when("x", 1).targetIs(11),
						new Target().when("x", 2).targetIs(24),
						new Target().when("x", 3).targetIs(39),
						new Target().when("x", 4).targetIs(56),
						new Target().when("x", 5).targetIs(75),
						new Target().when("x", 6).targetIs(96));

		SymbolicRegressionEngine engine =
				new SymbolicRegressionEngine(
						fitness,
						// define variables
						list("x"),
						// define base functions
						list(Functions.ADD, Functions.SUB, Functions.MUL, Functions.VARIABLE, Functions.CONSTANT));

		addListener(engine);

		// 200 iterations
		engine.evolve(200);
	}

	/**
	 * Track each iteration
	 */
	private static void addListener(SymbolicRegressionEngine engine) {
		engine.addIterationListener(new SymbolicRegressionIterationListener() {
			@Override
			public void update(SymbolicRegressionEngine engine) {

				Expression bestSyntaxTree = engine.getBestSyntaxTree();

				double currFitValue = engine.fitness(bestSyntaxTree);

				// log to console
				System.out.println(
						String.format("iter = %s \t fit = %s \t func = %s",
								engine.getIteration(), currFitValue, bestSyntaxTree.print()));

				// halt condition
				if (currFitValue < 5) {
					engine.terminate();
				}
			}
		});
	}

	private static <T> List<T> list(T... items) {
		List<T> list = new LinkedList<T>();
		for (T item : items) {
			list.add(item);
		}
		return list;
	}
}

Some additional information can be found in the following article (in Russian language).

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

genetic-programming

Table of contents

Description

Crossover

Mutation

Optimization of coefficients

Reducing complexity of syntax trees

Example

f(x,y,z) - ?

Try it in command line

Quick start

just download jar

try it with maven

hello world

About

Releases

Packages

Languages

mohammedri/genetic-programming

Folders and files

Latest commit

History

Repository files navigation

genetic-programming

Table of contents

Description

Crossover

Mutation

Optimization of coefficients

Reducing complexity of syntax trees

Example

f(x,y,z) - ?

Try it in command line

Quick start

just download jar

try it with maven

hello world

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages