The code here handles the following task: select the best optimization algorithm from {a1, a2, a3, ..., an}, given N sample results for each of them.

For example, suppose there are 3 optimization methods, a1(start_statement), a2(start_statement), and a3(start_statement), and you need to decide which one is best for your task. To do this, run each algorithm N times (the more runs, the better) and build a table with N rows (one per simulation) and 3 columns (one per algorithm):

a1 | a2 | a3 |
---|---|---|
number11 | number12 | number13 |
number21 | number22 | number23 |
... | ... | ... |
numberN1 | numberN2 | numberN3 |

You can use this table as an R data frame or export it to .csv if you want.
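
As a rough illustration of how such a table could be assembled, here is a minimal sketch. The bodies of a1, a2 and a3 below are hypothetical stand-ins that just draw random numbers (the real optimizers are not shown in this document), and the value of start_statement, N = 30 and the file name results.csv are likewise assumptions made for the example.

```r
# Hypothetical stand-ins for the real optimizers: each call returns the
# numeric result of one run. Replace them with your own algorithms.
set.seed(42)
a1 <- function(start_statement) rnorm(1, mean = 10, sd = 1)
a2 <- function(start_statement) rnorm(1, mean = 9,  sd = 1)
a3 <- function(start_statement) rnorm(1, mean = 12, sd = 1)

N <- 30                   # number of runs per algorithm (assumed value)
start_statement <- NULL   # placeholder for the real starting point

# One row per run, one column per algorithm.
results <- data.frame(
  a1 = replicate(N, a1(start_statement)),
  a2 = replicate(N, a2(start_statement)),
  a3 = replicate(N, a3(start_statement))
)

# Optional: export to .csv and read it back later.
csv_path <- file.path(tempdir(), "results.csv")
write.csv(results, csv_path, row.names = FALSE)
restored <- read.csv(csv_path)
```

The row.names = FALSE argument keeps the file to exactly one column per algorithm, so reading it back yields the same shape as the original data frame.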
The file abtest.R provides the function ab_test(dataframe), which compares these algorithms using statistical analysis.
The function performs these steps:
- Check each column for normality using the Anderson-Darling test for normality
- If all columns are normal, check for significant differences using analysis of variance; otherwise, use the Kruskal-Wallis rank sum test
- If there are significant differences, compare each pair using:
  - Student's t-test if both columns are normal
  - Wilcoxon rank sum and signed rank tests otherwise
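
The steps above can be sketched roughly as follows. This is not the code from abtest.R, only an illustration of the same decision procedure under stated assumptions: the names is_normal and compare_sketch are hypothetical, the significance level 0.05 is assumed, the pairwise tests are run unpaired with R's defaults (so t.test is the Welch variant), and when the nortest package is not installed the sketch substitutes the base-R Shapiro-Wilk test for Anderson-Darling.

```r
# Normality check for one column. Uses Anderson-Darling from the 'nortest'
# package when available (needs more than 7 observations); otherwise falls
# back to Shapiro-Wilk, which is a substitution, not the documented test.
is_normal <- function(x, alpha = 0.05) {
  if (requireNamespace("nortest", quietly = TRUE)) {
    p <- nortest::ad.test(x)$p.value
  } else {
    p <- shapiro.test(x)$p.value
  }
  p > alpha
}

# Hypothetical sketch of the procedure; returns a matrix of -1 / 0 / 1.
compare_sketch <- function(df, alpha = 0.05) {
  normal <- vapply(df, is_normal, logical(1), alpha = alpha)
  long <- stack(df)  # long format with columns 'values' and 'ind'

  # Omnibus test: ANOVA if every column looks normal, else Kruskal-Wallis.
  if (all(normal)) {
    omnibus_p <- summary(aov(values ~ ind, data = long))[[1]][["Pr(>F)"]][1]
  } else {
    omnibus_p <- kruskal.test(values ~ ind, data = long)$p.value
  }

  n <- ncol(df)
  result <- matrix(0, n, n, dimnames = list(names(df), names(df)))
  if (omnibus_p >= alpha) return(result)  # no significant differences found

  # Pairwise comparisons, only reached when the omnibus test is significant.
  for (i in seq_len(n - 1)) {
    for (j in (i + 1):n) {
      x <- df[[i]]
      y <- df[[j]]
      if (normal[i] && normal[j]) {
        p <- t.test(x, y)$p.value
        direction <- sign(mean(x) - mean(y))
      } else {
        p <- wilcox.test(x, y)$p.value
        direction <- sign(median(x) - median(y))
      }
      if (p < alpha) {
        result[i, j] <- direction
        result[j, i] <- -direction
      }
    }
  }
  result
}
```

Calling compare_sketch(df) on a data frame with one numeric column per algorithm returns a square matrix indexed by the column names. The sketch applies no multiple-comparison correction to the pairwise p-values; whether ab_test does is not stated here.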
At the end, the function produces a matrix like:
a1 a2 a3
a1 0 1 -1
a2 -1 0 -1
a3 1 1 0

Here [a1, a2] == 1 means that mean(a1) is significantly greater than mean(a2), while [a1, a3] == -1 means that mean(a1) is significantly less than mean(a3).
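
One possible way to read such a matrix programmatically is to count, for each algorithm, how many pairwise comparisons it wins. The sketch below rebuilds the example matrix by hand; the variable names are hypothetical, and treating the algorithm with the most wins as "best" assumes that a larger result is better for your task.

```r
# The example matrix from above, entered by hand.
algos <- c("a1", "a2", "a3")
m <- matrix(c( 0, 1, -1,
              -1, 0, -1,
               1, 1,  0),
            nrow = 3, byrow = TRUE,
            dimnames = list(algos, algos))

# Count significant wins (1) and losses (-1) per row.
wins   <- rowSums(m == 1)
losses <- rowSums(m == -1)

# Algorithm(s) with the most wins; assumes larger values are better.
best <- names(wins)[wins == max(wins)]
print(best)
```

If smaller values are better for your task (for example, a cost being minimized), pick the algorithm with the most -1 entries in its row instead.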

While running, the function also draws a boxplot for comparing the results. The example results below come from testing several hyperparameter values of a genetic algorithm implementation for ration search in Nutrient planner.