symbolic_regression_part4
Evolving multiple programs at the same time is great but my problem requires multiple variables. How should I proceed?
If you only need multiple variables (without multiple programs), src_search is enough: stop reading and go back to the wiki / source.
In general, prefer src_search because it directly supports model metrics and validation strategies.
The combination of multiple variables and multiple programs cannot be supported in a single, general way, so the user has to customize the generic search class to match their requirements.
A painstaking extension of the previous example is technically viable but we have a better option.
Instead of a user-defined terminal (c), we can use the predefined vita::variable terminal. Variables are convenient placeholders that are filled with user-provided values at the beginning of program/individual execution.
In the main() function:
prob.sset.insert<c>();
has been replaced with:
prob.sset.insert<vita::variable>("x1", 0);
prob.sset.insert<vita::variable>("x2", 1);
prob.sset.insert<vita::variable>("x3", 2);
The constructor of a variable takes two parameters:
- the name of the variable (e.g. "x1");
- an index used to retrieve the value of the variable at execution time (e.g. 0). More about this point follows below.
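To make the role of the index concrete, here is a minimal, library-independent sketch. The names toy_variable and toy_program are hypothetical and are not part of Vita; the sketch only assumes that a variable reads its value from a per-example vector at the position given by its index.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a variable terminal (NOT Vita's implementation):
// it stores a display name and the position of its value in the input vector.
struct toy_variable
{
  std::string name;
  std::size_t index;

  // Fetches the value bound to this variable for the current example.
  double eval(const std::vector<double> &values) const
  {
    assert(index < values.size());
    return values[index];
  }
};

// Evaluates the fixed expression x1 + x2 * x3 on one example.
inline double toy_program(const std::vector<double> &values)
{
  const toy_variable x1{"x1", 0}, x2{"x2", 1}, x3{"x3", 2};
  return x1.eval(values) + x2.eval(values) * x3.eval(values);
}
```

The same program can be evaluated on any number of examples simply by passing a different vector of values, which is what makes variables convenient for training sets.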
A training case / example can be represented with a simple structure:
struct example
{
  example(const std::vector<double> &ex_a, const vita::matrix<double> &ex_b,
          const std::vector<double> &ex_x)
    : a(ex_a), b(ex_b), x()
  {
    std::copy(ex_x.begin(), ex_x.end(), std::back_inserter(x));
  }

  std::vector<double> a;
  vita::matrix<double> b;
  std::vector<vita::value_t> x;
};
x contains the values of the variables for a given example (x[i] is the value of the i-th variable).
Our problem crunches real numbers, so the constructor takes vectors of doubles.
Vita, however, tries to support many use cases by adopting vita::value_t for storing / passing values. This forces a conversion from a vector of doubles (ex_x) to a vector of value_ts (x).
std::copy performs the conversion once and for all (deferring the conversion to parameter-passing time would be less efficient).
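The one-off conversion can be illustrated in isolation. In the sketch below toy_value is a hypothetical std::variant standing in for a variant-like value type (an assumption made for illustration, not Vita's actual definition of value_t), and to_values / as_double are made-up helper names.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <string>
#include <variant>
#include <vector>

// Hypothetical stand-in for a variant-based value type (an assumption for
// this sketch, not Vita's actual definition of value_t).
using toy_value = std::variant<std::monostate, double, int, std::string>;

// Converts the doubles once, so that later evaluations can reuse the result.
inline std::vector<toy_value> to_values(const std::vector<double> &in)
{
  std::vector<toy_value> out;
  out.reserve(in.size());
  // Each double is implicitly converted to a toy_value holding a double.
  std::copy(in.begin(), in.end(), std::back_inserter(out));
  return out;
}

// Reads back a value that is expected to hold a double.
inline double as_double(const toy_value &v)
{
  assert(std::holds_alternative<double>(v));
  return std::get<double>(v);
}
```

Storing the converted vector in the example means the conversion cost is paid once per training case rather than once per program evaluation.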
The training set is a collection of examples:
using training_set = std::vector<example>;
Almost any iterable container could be used (e.g. std::list instead of std::vector).
Now we can take advantage of the existing sum_of_errors_evaluator class (see src/evaluator.h) to quickly write our evaluator.
sum_of_errors_evaluator is a template class that, given an error functor (ERRF) and a training set (DAT):
- calculates the sum of the errors of a model/program over the training set;
- converts the total error into a standardized fitness.
template<class T, class ERRF, class DAT>
class sum_of_errors_evaluator : public src_evaluator<T, DAT>
{
public:
  static_assert(std::is_class_v<ERRF>);
  static_assert(detail::is_iterable_v<DAT>);
  static_assert(detail::is_error_functor_v<ERRF, DAT>);

  explicit sum_of_errors_evaluator(DAT &);

  fitness_t operator()(const T &) override;

  // ...
};
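The two steps listed above (summing the per-example errors, then turning the total into a fitness) can be sketched without the library. Everything below is hypothetical: toy_sum_of_errors is a simplified stand-in rather than Vita's class, and negating the total error is only an assumed "higher is better" convention.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified evaluator (NOT Vita's sum_of_errors_evaluator):
// ERRF is built from a program and returns a non-negative error per example.
template<class PROGRAM, class ERRF, class DAT>
class toy_sum_of_errors
{
public:
  explicit toy_sum_of_errors(const DAT &d) : dat_(&d) {}

  // Assumption: fitness is "higher is better", so the total error is negated
  // and a perfect model scores 0.
  double operator()(const PROGRAM &p) const
  {
    const ERRF err(p);
    double total(0.0);
    for (const auto &example : *dat_)
      total += err(example);

    assert(total >= 0.0);
    return -total;
  }

private:
  const DAT *dat_;
};

// Toy program: a constant guess. Each example is simply the target value.
struct toy_constant { double value; };

class toy_abs_error
{
public:
  explicit toy_abs_error(const toy_constant &p) : p_(p) {}

  double operator()(double target) const
  {
    const double diff(target - p_.value);
    return diff < 0.0 ? -diff : diff;
  }

private:
  toy_constant p_;
};

// Convenience wrapper: fitness of a constant guess over a set of targets.
inline double toy_fitness(double guess, const std::vector<double> &targets)
{
  const toy_sum_of_errors<toy_constant, toy_abs_error, std::vector<double>>
    eva(targets);
  return eva(toy_constant{guess});
}
```

The point of the design is the split of responsibilities: the functor knows how to score one example, while the evaluator owns the iteration and the conversion to a fitness.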
The error functor object (ERRF) acquires a program via its constructor and calculates the error on a specific example:
class error_functor
{
public:
  error_functor(const PROGRAM &);

  double operator()(const EXAMPLE &) const;

  // ...
};
Implementing ERRF::operator() isn't hard, since the code from the previous example can be reused almost unchanged:
class error_functor
{
public:
  error_functor(const candidate_solution &s) : s_(s) {}

  double operator()(const example &ex) const
  {
    // Run every program of the candidate solution on the example's variables.
    std::vector<double> f(N);
    std::transform(s_.begin(), s_.end(), f.begin(),
                   [&ex](const auto &i)
                   {
                     const auto ret(vita::run(i, ex.x));

                     return vita::has_value(ret) ? std::get<vita::D_DOUBLE>(ret)
                                                 : 0.0;
                   });

    // model = b * f (matrix-vector product).
    std::vector<double> model(N, 0.0);
    for (unsigned i(0); i < N; ++i)
      for (unsigned j(0); j < N; ++j)
        model[i] += ex.b(i, j) * f[j];

    // Sum of the absolute differences between a and model.
    double delta(std::inner_product(ex.a.begin(), ex.a.end(),
                                    model.begin(), 0.0,
                                    std::plus<>(),
                                    [](auto v1, auto v2)
                                    {
                                      return std::fabs(v1 - v2);
                                    }));

    return delta;
  }

private:
  candidate_solution s_;
};
Two important remarks:
- vita::run(i) has been replaced with vita::run(i, ex.x), which passes the values from the training case to the variables;
- the functor returns delta directly, leaving the conversion to a standardized fitness to sum_of_errors_evaluator.
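For a feel of the numbers involved, the error computed by the functor (the sum over i of |a[i] - sum over j of b(i, j) * f[j]|) can be reproduced with plain standard containers. In this sketch toy_matrix and toy_delta are hypothetical names, a vector of vectors replaces vita::matrix, and the outputs f of the programs are passed in directly instead of being obtained from vita::run.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <numeric>
#include <vector>

// Hypothetical stand-in for vita::matrix<double> (row-major, square).
using toy_matrix = std::vector<std::vector<double>>;

// Error on one example: sum_i |a[i] - sum_j b[i][j] * f[j]|.
inline double toy_delta(const std::vector<double> &a, const toy_matrix &b,
                        const std::vector<double> &f)
{
  assert(a.size() == b.size());

  // model = b * f (matrix-vector product).
  std::vector<double> model(a.size(), 0.0);
  for (std::size_t i(0); i < b.size(); ++i)
  {
    assert(b[i].size() == f.size());
    for (std::size_t j(0); j < f.size(); ++j)
      model[i] += b[i][j] * f[j];
  }

  // Accumulate the absolute differences between a and model.
  return std::inner_product(a.begin(), a.end(), model.begin(), 0.0,
                            std::plus<>(),
                            [](double v1, double v2)
                            {
                              return std::fabs(v1 - v2);
                            });
}
```

With a = {3, 7} and b = {{1, 2}, {3, 4}}, the outputs f = {1, 1} reproduce a exactly (error 0), while f = {1, 0} give model = {1, 3} and an error of 2 + 4 = 6.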
(For convenience, all the code is in the examples/symbolic_regression05.cc file.)