Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Das-Dennis weight initialization method #295

Merged
merged 19 commits into from
Jun 23, 2021
Merged
Show file tree
Hide file tree
Changes from 17 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions HISTORY.md
Original file line number Diff line number Diff line change
Expand Up @@ -37,6 +37,8 @@
* Introduce Policy Methods for MOEA/D-DE
([#293](https://github.com/mlpack/ensmallen/pull/293)).

* Add Das-Dennis weight initialization method
([#295](https://github.com/mlpack/ensmallen/pull/295)).
jonpsy marked this conversation as resolved.
Show resolved Hide resolved
### ensmallen 2.16.1: "Severely Dented Can Of Polyurethane"
###### 2021-03-02
* Fix test compilation issue when `ENS_USE_OPENMP` is set
Expand Down
7 changes: 5 additions & 2 deletions doc/optimizers.md
Original file line number Diff line number Diff line change
Expand Up @@ -1581,6 +1581,7 @@ initialize the reference directions.

The following types are available:

* **`Uniform`**
* **`BayesianBootstrap`**

The _`DecompPolicyType`_ template parameter refers to the strategy used to
Expand All @@ -1594,9 +1595,11 @@ The following types are available:

For convenience the following types can be used:

* **`DefaultMOEAD`** (equivalent to `MOEAD<BayesianBootstrap, Tchebycheff>`): utilizes BayesianBootstrap for weight init
* **`DefaultMOEAD`** (equivalent to `MOEAD<Uniform, Tchebycheff>`): utilizes Uniform method for weight initialization
and Tchebycheff for weight decomposition.

* **`BBSMOEAD`** (equivalent to `MOEAD<BayesianBootstrap, Tchebycheff>`): utilizes Bayesian Bootstrap method for weight initialization and Tchebycheff for weight decomposition.

#### Attributes

| **type** | **name** | **description** | **default** |
Expand Down Expand Up @@ -1629,7 +1632,7 @@ Attributes of the optimizer may also be changed via the member methods
SchafferFunctionN1<arma::mat> SCH;
arma::vec lowerBound("-10 -10");
arma::vec upperBound("10 10");
DefaultMOEAD opt(150, 300, 1.0, 0.9, 20, 20, 0.5, 2, 1E-10, lowerBound, upperBound);
DefaultMOEAD opt(300, 300, 1.0, 0.9, 20, 20, 0.5, 2, 1E-10, lowerBound, upperBound);
typedef decltype(SCH.objectiveA) ObjectiveTypeA;
typedef decltype(SCH.objectiveB) ObjectiveTypeB;
arma::mat coords = SCH.GetInitialPoint();
Expand Down
15 changes: 9 additions & 6 deletions include/ensmallen_bits/moead/moead.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -22,6 +22,8 @@

//! Weight initialization policies.
#include "weight_init_policies/bbs_init.hpp"
#include "weight_init_policies/uniform_init.hpp"

namespace ens {

/**
Expand All @@ -45,7 +47,7 @@ namespace ens {
* year={2008},
* @endcode
*/
template<typename InitPolicyType = BayesianBootstrap,
template<typename InitPolicyType = Uniform,
typename DecompPolicyType = Tchebycheff>
class MOEAD {
public:
Expand All @@ -72,8 +74,8 @@ class MOEAD {
* @param upperBound The upper bound on each variable of a member
* of the variable space.
*/
MOEAD(const size_t populationSize = 150,
const size_t maxGenerations = 300,
MOEAD(const size_t populationSize = 300,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was the test failing with populationSize = 150?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

image

This is because the binomial coefficient value (H) should equate to the size of the population. This algorithm is restrictive in that way. You can read Section IV (If I recall correctly) of the cited paper to know more. To ensure this, you can find I've added an if statement inside Generate method to catch a mismatch.

The main point is, there is only a certain number of points that can be generated uniformly given the number of partitions(gaps between points) and the number of objectives.

const size_t maxGenerations = 500,
const double crossoverProb = 1.0,
const double neighborProb = 0.9,
const size_t neighborSize = 20,
Expand Down Expand Up @@ -111,8 +113,8 @@ class MOEAD {
* @param upperBound The upper bound on each variable of a member
* of the variable space.
*/
MOEAD(const size_t populationSize = 150,
const size_t maxGenerations = 300,
MOEAD(const size_t populationSize = 300,
const size_t maxGenerations = 500,
const double crossoverProb = 1.0,
const double neighborProb = 0.9,
const size_t neighborSize = 20,
Expand Down Expand Up @@ -324,7 +326,8 @@ class MOEAD {
DecompPolicyType decompPolicy;
};

using DefaultMOEAD = MOEAD<BayesianBootstrap, Tchebycheff>;
using DefaultMOEAD = MOEAD<Uniform, Tchebycheff>;
using BBSMOEAD = MOEAD<BayesianBootstrap, Tchebycheff>;
} // namespace ens

// Include implementation.
Expand Down
207 changes: 207 additions & 0 deletions include/ensmallen_bits/moead/weight_init_policies/uniform_init.hpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,207 @@
/**
* @file uniform_init.hpp
* @author Nanubala Gnana Sai
*
* The Uniform (Das Dennis) methodology of Weight Initialization.
*
* ensmallen is free software; you may redistribute it and/or modify it under
* the terms of the 3-clause BSD license. You should have received a copy of
* the 3-clause BSD license along with ensmallen. If not, see
* http://www.opensource.org/licenses/BSD-3-Clause for more information.
*/
#ifndef ENSMALLEN_MOEAD_UNIFORM_HPP
#define ENSMALLEN_MOEAD_UNIFORM_HPP

namespace ens {

/**
* The Uniform (Das Dennis) method for initializing weights. This algorithm guarantees
* that the distance between adjacent points would be uniform.
*
* For more information, see the following:
jonpsy marked this conversation as resolved.
Show resolved Hide resolved
*
* @code
* article{zhang2007moea,
* title={MOEA/D: A multiobjective evolutionary algorithm based on decomposition},
* author={Zhang, Qingfu and Li, Hui},
* journal={IEEE Transactions on evolutionary computation},
* pages={712--731},
* year={2007}
* @endcode
*/
class Uniform
{
public:
/**
* Constructor for Uniform Weight Initializatoin Policy.
*/
Uniform()
{
/* Nothing to do. */
}

/**
* Generate the reference direction matrix.
*
* @tparam MatType The type of the matrix used for constructing weights.
* @param numObjectives The dimensionality of objective space.
* @param numPoints The number of reference directions requested.
* @param epsilon Handle numerical stability after weight initialization.
*/
template<typename MatType>
MatType Generate(size_t numObjectives,
size_t numPoints,
double epsilon)
{
size_t numPartitions = FindNumParitions(numObjectives, numPoints);
size_t validNumPoints = FindNumUniformPoints(numObjectives, numPartitions);

//! The requested number of points is not matching any partition number.
if (numPoints != validNumPoints)
{
size_t nextValidNumPoints = FindNumUniformPoints(numObjectives, numPartitions + 1);
std::ostringstream oss;
jonpsy marked this conversation as resolved.
Show resolved Hide resolved
oss << "DasDennis::Generate(): " << "The requested numPoints " << numPoints
<< " cannot be generated uniformly.\n " << "Either choose numPoints as "
<< validNumPoints << "(numPartition = " << numPartitions << ") or "
<< "numPoints as " << nextValidNumPoints << "(numPartition = "
<< numPartitions + 1 << ").";
jonpsy marked this conversation as resolved.
Show resolved Hide resolved
throw std::logic_error(oss.str());
}

return DasDennis<MatType>(numObjectives, numPoints,
numPartitions, epsilon);
}

private:
/**
* Finds the number of points which can be sampled uniformly from a
* unit simplex given the number of partitions.
*/
size_t FindNumUniformPoints(const size_t numObjectives,
const size_t numPartitions)
{
//! O(N) algorithm to calculate binomial coefficient.
auto BinomialCoefficient =
[](size_t n, size_t k) -> size_t
{
size_t retval = 1;
// Since, C(n, k) = C(n, n - k).
if (k > n - k)
k = n - k;

// [n * (n - 1) * .... * (n - k + 1)] / [k * (k - 1) * .... * 1].
for (size_t i = 0; i < k; ++i)
{
retval *= (n - i);
retval /= (i + 1);
}

return retval;
};
return BinomialCoefficient(numObjectives + numPartitions - 1, numPartitions);
Comment on lines +86 to +103
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to myself, check that part in more depth.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add this to your notes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, thinking about this... The lambda usage is nice, but should we maybe prefer to write this as a standalone function? Similar to the link post? Also, I think the code on the link post is 1-1 with what is written here. Do we need to worry about licensing?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason for making it lambda inside the function was to secure the scope, it won't be used elsewhere. Regarding the heavy correlation, GFG (Geeks for geeks) is kinda open-sourced, and is made to help people(like Stack Overflow). Besides, finding Binomial Coefficient in O(n) isn't a ground breaking research that we're using without citing :) .

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to https://www.geeksforgeeks.org/copyright-information/ this is fine, but let's add a comment which references the source for the code.

}

/**
* Calculates the appropriate number of partitions such that, the binomial
* coefficient value is closest to the number of points requested.
*/
size_t FindNumParitions(size_t numObjectives, size_t numPoints)
{
if (numObjectives == 1) return 0;
// Iteratively increase numPartitions so that the binomial coefficient
// comes near to numPoints;
size_t numPartitions {1};
size_t sampledNumPoints = FindNumUniformPoints(numPartitions,
numObjectives);
while (sampledNumPoints <= numPoints)
{
++numPartitions;
sampledNumPoints = FindNumUniformPoints(numObjectives,
numPartitions);
}

return numPartitions - 1;
}

/**
* A helper function for DasDennis
*/
template<typename AuxInfoStackType,
typename MatType>
void DasDennisHelper(AuxInfoStackType& progressStack,
MatType& weights,
const size_t numObjectives,
const size_t numPoints,
const size_t numPartitions,
const double epsilon)
{
typedef typename MatType::elem_type ElemType;
typedef typename arma::Row<ElemType> RowType;

size_t counter = 0;
const ElemType delta = 1.0 / (ElemType)numPartitions;

while ((counter < numPoints) && !progressStack.empty())
{
MatType point{};
size_t beta{};
std::tie(point, beta) = progressStack.back();
progressStack.pop_back();

if (point.size() + 1 == numObjectives)
{
point.insert_rows(point.n_rows, RowType(1).fill(
delta * static_cast<ElemType>(beta)));
Comment on lines +155 to +156
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not super clean, but I think it works, at least this way we can avoid the std::vector conversion.

weights.col(counter) = point + epsilon;
++counter;
}

else
{
for (size_t i = 0; i <= beta; ++i)
{
MatType pointClone(point);
pointClone.insert_rows(pointClone.n_rows, RowType(1).fill(
delta * static_cast<ElemType>(i)));
progressStack.push_back({pointClone, beta - i});
}
}
}
}

jonpsy marked this conversation as resolved.
Show resolved Hide resolved
/**
* Generates the weight matrix after verifying the
* validity of the parameters.
*/
template <typename MatType>
MatType DasDennis(const size_t numObjectives,
const size_t numPoints,
const size_t numPartitions,
const double epsilon)
{
//! Holds auxillary information required for the helper function.
//! Holds the current point and beta value.
using AuxContainer = std::pair<MatType, size_t>;

std::vector<AuxContainer> progressStack{};
//! Init the progress stack.
progressStack.push_back({{}, numPartitions});
MatType weights(numObjectives, numPoints);
weights.fill(arma::datum::nan);
DasDennisHelper<decltype(progressStack), MatType>(
progressStack,
weights,
numObjectives,
numPoints,
numPartitions,
epsilon);

return weights;
}

};

} // namespace ens

#endif
25 changes: 12 additions & 13 deletions tests/moead_test.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -46,7 +46,7 @@ TEST_CASE("MOEADSchafferN1DoubleTest", "[MOEADTest]")
const double expectedUpperBound = 2.0;

DefaultMOEAD opt(
150, // Population size.
300, // Population size.
300, // Max generations.
1.0, // Crossover probability.
0.9, // Probability of sampling from neighbor.
Expand Down Expand Up @@ -108,7 +108,7 @@ TEST_CASE("MOEADSchafferN1TestVectorDoubleBounds", "[MOEADTest]")
const double expectedUpperBound = 2.0;

DefaultMOEAD opt(
150, // Population size.
300, // Population size.
300, // Max generations.
1.0, // Crossover probability.
0.9, // Probability of sampling from neighbor.
Expand Down Expand Up @@ -168,7 +168,7 @@ TEST_CASE("MOEADFonsecaFlemingDoubleTest", "[MOEADTest]")
const double expectedUpperBound = 1.0 / sqrt(3);

DefaultMOEAD opt(
150, // Population size.
300, // Max generations.
300, // Max generations.
1.0, // Crossover probability.
0.9, // Probability of sampling from neighbor.
Expand Down Expand Up @@ -223,7 +223,7 @@ TEST_CASE("MOEADFonsecaFlemingTestVectorDoubleBounds", "[MOEADTest]")
const double expectedUpperBound = 1.0 / sqrt(3);

DefaultMOEAD opt(
150, // Population size.
300, // Max generations.
300, // Max generations.
1.0, // Crossover probability.
0.9, // Probability of sampling from neighbor.
Expand Down Expand Up @@ -278,7 +278,7 @@ TEST_CASE("MOEADSchafferN1FloatTest", "[MOEADTest]")
const double expectedUpperBound = 2.0;

DefaultMOEAD opt(
150, // Population size.
300, // Population size.
300, // Max generations.
1.0, // Crossover probability.
0.9, // Probability of sampling from neighbor.
Expand Down Expand Up @@ -340,7 +340,7 @@ TEST_CASE("MOEADSchafferN1TestVectorFloatBounds", "[MOEADTest]")
const double expectedUpperBound = 2.0;

DefaultMOEAD opt(
150, // Population size.
300, // Population size.
300, // Max generations.
1.0, // Crossover probability.
0.9, // Probability of sampling from neighbor.
Expand Down Expand Up @@ -400,7 +400,7 @@ TEST_CASE("MOEADFonsecaFlemingFloatTest", "[MOEADTest]")
const float expectedUpperBound = 1.0 / sqrt(3);

DefaultMOEAD opt(
150, // Population size.
300, // Max generations.
300, // Max generations.
1.0, // Crossover probability.
0.9, // Probability of sampling from neighbor.
Expand Down Expand Up @@ -455,7 +455,7 @@ TEST_CASE("MOEADFonsecaFlemingTestVectorFloatBounds", "[MOEADTest]")
const float expectedUpperBound = 1.0 / sqrt(3);

DefaultMOEAD opt(
150, // Population size.
300, // Max generations.
300, // Max generations.
1.0, // Crossover probability.
0.9, // Probability of sampling from neighbor.
Expand Down Expand Up @@ -499,8 +499,8 @@ TEST_CASE("MOEADFonsecaFlemingTestVectorFloatBounds", "[MOEADTest]")

/**
* Test against the first problem of ZDT Test Suite. ZDT-1 is a 30
* variable-2 objective problem with a convex Pareto Front.
*
* variable-2 objective problem with a convex Pareto Front.
*
* NOTE: For the sake of runtime, only ZDT-1 is tested against the
* algorithm. Others have been tested separately.
*/
Expand All @@ -512,8 +512,8 @@ TEST_CASE("MOEADZDTONETest", "[MOEADTest]")
const double upperBound = 1;

DefaultMOEAD opt(
150, // Population size.
300, // Max generations.
300, // Population size.
150, // Max generations.
1.0, // Crossover probability.
0.9, // Probability of sampling from neighbor.
20, // Neighborhood size.
Expand All @@ -538,6 +538,5 @@ TEST_CASE("MOEADZDTONETest", "[MOEADTest]")
size_t numVariables = coords.size();
double sum = arma::accu(coords(arma::span(1, numVariables - 1), 0));
double g = 1. + 9. * sum / (static_cast<double>(numVariables - 1));

REQUIRE(g == Approx(1.0).margin(0.99));
}