This repository serves only as a summary of my work. All the code produced during the coding phase is available at the links below!
- Project proposal: https://www-polsys.lip6.fr/~phuoc/includes/gsoc22.pdf
- Pull request 1: GeomScale/volesti#233
  - Branch: https://github.com/huuphuocle/volesti/tree/feature/sampling_correlation_matrices
- Pull request 2: GeomScale/volesti#240
  - Branch: https://github.com/huuphuocle/volesti/tree/feature/ReHMCWalk
Notations
- $A$: a correlation matrix of size $n \times n$
- $A_{i,j}$: a matrix in the linear matrix inequality of $K$ with $a_{i,j} = a_{j,i} = 1$ and $0$ elsewhere
- $K$: the spectrahedron associated to the set of $n \times n$ correlation matrices
- `VT`: vector type `Eigen::Matrix<NT, Eigen::Dynamic, 1>`
- `MT`: matrix type `Eigen::Matrix<NT, Eigen::Dynamic, Eigen::Dynamic>`
In this section, I summarize the new implementation to be integrated into volesti for sampling correlation matrices.
I implemented two new classes, based on volesti's `Spectrahedron` class, for representing the spectrahedron of correlation matrices in `include/convex_bodies/correlation_matrices`:

- `CorrelationSpectrahedron` (`correlation_spectrahedron.hpp`)
- `CorrelationSpectrahedron_MT` (`correlation_spectrahedron_MT.hpp`)

together with `CorreMatrix` (`corre_matrix.hpp`) for encoding matrices as `Point`.
Some modifications were also made to existing volesti classes to enable BallWalk, RDHRWalk and BilliardWalk on these new classes.
I implemented ReHMC walks for sampling from log-concave distributions in `include/random_walks`:

- `GaussianReHMCWalk` (`gaussian_ReHMC_correlation.hpp`)
- `ExponentialReHMCWalk` (`exponential_ReHMC_correlation.hpp`)
I added several functions in `include/sampling/sample_correlation_matrices.hpp` to facilitate sampling correlation matrices in volesti. These functions are listed below according to the sampling distribution and are templated by:

- `WalkType`: type of the random walk in use, e.g., `BallWalk`, `RDHRWalk`, `BilliardWalk`
- `PointType`: type of points in use (the usual `Point` or `CorreMatrix`)
- `RNGType`: type of random number generator
- `PointList`: type of the list of sample points, e.g., `std::vector<PointType>` or `std::list<PointType>`
Common parameters
- `n`: size of the matrices
- `randPoints`: a list of `Point` to store the output correlation matrices
- `walkL`: walk length of the random walks
- `num_points`: the requested number of output matrices
- `nburns`: the number of burn-in samples
Uniform sampling
- `uniform_correlation_sampling<WalkType, Point, RNGType>(n, randPoints, walkL, num_points, nburns)`: uniform sampling with the `CorrelationSpectrahedron` class
- `uniform_correlation_sampling_MT<WalkType, Point, RNGType>(n, randPoints, walkL, num_points, nburns)`: uniform sampling with the `CorrelationSpectrahedron_MT` class
Gaussian sampling

For a distribution with density proportional to $e^{-a \langle x, x \rangle}$:

- `gaussian_correlation_sampling<WalkType, Point, RNGType>(n, randPoints, walkL, num_points, a, nburns)`
- `gaussian_correlation_sampling_MT<WalkType, PointMT, RNGType>(n, randPoints, walkL, num_points, a, nburns)`
Exponential sampling
For a distribution with density proportional to $e^{-\langle c, x \rangle / T}$:

- `exponential_correlation_sampling<WalkType, Point, RNGType>(n, randPoints, walkL, num_points, c, T, nburns)`
- `exponential_correlation_sampling_MT<WalkType, Point, RNGType>(n, randPoints, walkL, num_points, c, T, nburns)`
Please look at `examples/correlation_matrices/sampler.cpp` for a full example.
```cpp
// Define some types
typedef BoostRandomNumberGenerator<boost::mt19937, double, 3> RNGType;
typedef point<double> Point;

// Declare the output list of points
std::vector<Point> randPoints;

// Set the walk length and the number of samples
int walkL = 1;
unsigned int num_points = 1000;

// Uniform sampling of 1000 3x3 correlation matrices with AcceleratedBilliardWalk:
uniform_correlation_sampling<AcceleratedBilliardWalk, Point, RNGType>(3, randPoints, walkL, num_points, 0);

// Gaussian sampling of 1000 3x3 correlation matrices with GaussianReHMCWalk,
// using CorreMatrix<double> as the point type (a is the Gaussian parameter):
gaussian_correlation_sampling_MT<GaussianReHMCWalk, CorreMatrix<double>, RNGType>(3, randPoints, walkL, num_points, a, 0);
```
The main operations for sampling correlation matrices are the oracles:

- membership: tests whether the current point lies in the spectrahedron $K$
- intersection: computes the intersection of the ray $x + t \cdot v$ with the boundary of $K$
- reflection: computes the direction of reflection at the boundary.
The `CorrelationSpectrahedron` class inherits from the `Spectrahedron` class. Its main attributes are:

- `MT`: a correlation matrix $A = (a_{i,j})_{1 \leq i,j \leq n}$
- `VT`: a vector of the independent coefficients of a correlation matrix
- `unsigned int n`: the size of the matrices
- `unsigned int d`: the dimension of the vectors of coefficients, $d = n(n-1)/2$.
In this class, the correlation matrices are still manipulated through vectors of coefficients, with the help of the `Point` class and its basic operators. Therefore, integrating the new `CorrelationSpectrahedron` class into the random walk algorithms already existing in volesti is straightforward.
The important innovation, however, is that some member functions are modified to use new geometric oracles for efficiency. For instance, the membership oracle `is_in` is changed to accelerate BallWalk algorithms, and the intersection (`line_positive_intersection`, `line_intersect`) and reflection (`compute_reflection`) oracles are changed to accelerate BilliardWalk algorithms.
These new aspects are detailed in the Geometric oracles section below.
This class serves the same purpose as the `CorrelationSpectrahedron` class. The main difference is that it uses a new point type, `CorreMatrix`, which stores a matrix in its attribute `mat`. This avoids switching between matrices and vectors of coefficients during the computation. The implementation of this class includes:

- the `CorreMatrix` class with basic operators (`+`, `-`, `*`, `/`, `+=`, `-=`, $\ldots$);
- the `CorrelationSpectrahedron_MT` class with member functions modified to use `CorreMatrix` as the point type;
- new `GetDirection` functions in `include/sampling/sphere.hpp` for `CorreMatrix` that return a direction stored in matrix form.
Since we operate only on symmetric matrices, and since `Eigen::SelfAdjointEigenSolver` references only the lower triangular part of a symmetric matrix (cf. https://eigen.tuxfamily.org/dox/classEigen_1_1SelfAdjointEigenSolver.html), the `CorreMatrix` class stores only the lower part of the matrices and uses `triangularView<Eigen::Lower>` to operate on them.
I added some functions in `include/matrix_operations/EigenvaluesProblems.h` to accelerate the random walk algorithms:

- Membership: testing $A \in K$ amounts to testing whether $A$ is positive semidefinite. The old function in volesti computes the largest eigenvalue of $A$ with `Eigen::SelfAdjointEigenSolver` or `Spectra::SymEigsSolver`. I changed this to `Eigen::LDLT`, which computes $A$'s Cholesky decomposition when $A$ is positive semidefinite and signals an error otherwise (see the `isPositiveSemidefinite` function).
- Intersection: the intersection oracle needs to solve a generalized eigenvalue problem for symmetric matrices. This principle is unchanged, but I found in experiments that `Eigen::GeneralizedSelfAdjointEigenSolver` is faster than the previously used `Spectra::SymGEigsSolver`. Hence, I added a function `minPosLinearEigenvalue_EigenSymSolver`, which is used by the intersection oracles of the `CorrelationSpectrahedron` and `CorrelationSpectrahedron_MT` classes.
- Reflection: in the project proposal, we derived a formula for the normal vector at the reflection point, from which the reflection direction is computed. However, each matrix $A_{i,j}$ in the linear matrix inequality of $K$ has only two non-zero entries, $a_{i,j} = a_{j,i} = 1$. Substituting these matrices into the normal vector formula gives a much simpler formula: $$n_{\mathrm{normal}} \propto (e_2e_1, e_3e_2, e_3e_1, \ldots)$$ where $e = (e_1, \ldots, e_n)$ is an eigenvector obtained from the intersection oracle. Using this formula is much faster than the naive one and improves the performance of random walks that use reflections intensively, e.g., `BilliardWalk` or the ReHMC walks.
As explained in the project proposal, the Reflective Hamiltonian Monte Carlo (ReHMC) Walk allows sampling correlation matrices from a log-concave distribution over the spectrahedron $K$.
In principle, the only difference between ReHMC walks for different target distributions comes from the log-density (and its gradient) used in the Hamiltonian updates.
I implemented ReHMC walks for two log-concave distributions in `include/random_walks`:

- `GaussianReHMCWalk`: Gaussian distribution with log-density $-a \langle x, x \rangle$
- `ExponentialReHMCWalk`: exponential distribution with log-density $-\langle c, x \rangle / T$.
Some small optimizations are done to skip repetitive computations, but the main speed-up still comes from the faster reflection oracles.
Tests for my new implementations in volesti are written in `test/sampling_correlation_matrices_test.cpp`, which consists of new tests using doctest. I also modified `test/CMakeLists.txt` so that the tests are generated by CMake.
My tests assert that:

- the generated matrices are correlation matrices;
- the potential scale reduction factor (PSRF) of the whole sample set is smaller than $1.1$.
I tested Ball Walk, RDHR Walk and Billiard Walk on `Spectrahedron`, `CorrelationSpectrahedron` and `CorrelationSpectrahedron_MT` for various matrix sizes $n$:

- ✔️ The matrices generated by the random walk algorithms are all valid correlation matrices.
- The PSRF varies with the walk type, $n$ and the number of samples. The experiment section below details how the random walks differ in this respect.
In `examples/correlation_matrices/sampler.cpp`, I implemented the function
`old_uniform_sampling<WalkType, Point>(n, randPoints, walkL, num_points, nburns)`
which uses the general `Spectrahedron` class (with the same LMI as the one underlying `CorrelationSpectrahedron` and `CorrelationSpectrahedron_MT`) as a baseline for comparison.
First, I will explain in detail the speed-up for Ball Walk, RDHR Walk and Billiard Walk. These random walks already exist in volesti, but the new operations in `CorrelationSpectrahedron` and `CorrelationSpectrahedron_MT` bring better performance.
The images below illustrate the generated samples.
Ball Walk is already implemented in volesti. The algorithm depends mainly on two operations:

- Generating a random direction (see the `GetDirection` functions in `include/sampling/sphere.hpp`). New `GetDirection` functions are added that return a direction in matrix form.
- Checking the positive semidefiniteness of a generated matrix. Here, the membership oracle `is_in` of the `Spectrahedron` class and its subclasses is used intensively. As explained above, the old implementation of `is_in` is replaced by one based on `Eigen::LDLT`. The table below illustrates the speed-up obtained with this new `is_in` method.
Timings (in milliseconds) for Ball Walk to sample correlation matrices of size $n$:

| $n$ | Old implementation | CorrelationSpectrahedron | CorrelationSpectrahedron_MT |
|---|---|---|---|
| 5 | 170 | 42 | 51 |
| 10 | 492 | 104 | 145 |
| 15 | 1069 | 206 | 291 |
| 20 | 2129 | 345 | 490 |
| 25 | 3981 | 525 | 743 |
| 30 | 6702 | 740 | 1005 |
| 35 | 11506 | 1003 | 1405 |
| 40 | 28597 | 1327 | 1852 |
| 45 | 51578 | 1675 | 2349 |
| 50 | 73294 | 2116 | 2958 |
| 55 | 113227 | 2554 | 3593 |
| 60 | 143268 | 3094 | 4300 |
| 65 | 186087 | 3658 | 5073 |
| 70 | 245637 | 4306 | 5975 |
| 75 | 317582 | 5017 | 7109 |
| 80 | 411608 | 5949 | 8195 |
| 85 | 510693 | 6806 | 9317 |
In Fig. 1, we use Ball Walk with walk length 1, which results in a sample set where many samples coincide. The walk length is increased to 5 in Fig. 2 (15000 samples actually generated), which provides better samples.
RDHR Walk mainly uses the `line_intersect` functions. In experiments, I observed that the `line_intersect` function of the `Spectrahedron` class is much slower than the ones of `CorrelationSpectrahedron` and `CorrelationSpectrahedron_MT`. This bottleneck comes from the different implementations of `line_intersect` in those classes:

- In `Spectrahedron`, it solves two eigenvalue problems that construct the matrices twice and compute respectively the positive and negative distances. Moreover, I corrected a bug that led to a wrong sign for the negative distances (a missing `-`).
- In the new classes, I use `Spectra` to compute both distances in one function call.

The deeper reasons for the large difference in performance may depend on how Spectra is implemented and are unclear to me at the moment.
Timings (in milliseconds) for RDHR Walk to sample correlation matrices of size $n$:

| $n$ | Old implementation | CorrelationSpectrahedron | CorrelationSpectrahedron_MT |
|---|---|---|---|
| 5 | 89 | 29 | 31 |
| 10 | 190 | 195 | 203 |
| 15 | 467 | 262 | 282 |
| 20 | 790 | 312 | 335 |
| 25 | 1400 | 386 | 415 |
| 30 | 2248 | 448 | 492 |
| 35 | 3500 | 561 | 634 |
| 40 | 5500 | 624 | 706 |
| 45 | 9320 | 761 | 860 |
| 50 | 17150 | 829 | 957 |
| 55 | 33910 | 984 | 1143 |
| 60 | 42530 | 1070 | 1242 |
| 65 | 63800 | 1262 | 1476 |
| 70 | 88000 | 1348 | 1596 |
Billiard Walk mainly uses the intersection and reflection oracles. The new implementations of these oracles in `CorrelationSpectrahedron` and `CorrelationSpectrahedron_MT` accelerate Billiard Walk in volesti for sampling correlation matrices.
The main acceleration comes from the new `compute_reflection` function, which is specialized for correlation matrices. Since Billiard Walk uses reflections intensively, this results in the impressive performance illustrated in the table below.
Timings (in milliseconds) for Billiard Walk to sample correlation matrices of size $n$:

| $n$ | Old implementation | CorrelationSpectrahedron | CorrelationSpectrahedron_MT |
|---|---|---|---|
| 5 | 40 | 0 | 0 |
| 10 | 630 | 70 | 70 |
| 15 | 4110 | 360 | 350 |
| 20 | 12650 | 920 | 900 |
| 25 | 37330 | 2150 | 2280 |
| 30 | 121150 | 4780 | 4640 |
| 35 | 350640 | 8660 | 8790 |
| 40 | 868060 | 14200 | 14690 |
The two ReHMC walks were added in my second pull request. Some generated samples are plotted below.
Fig. 5 shows samples generated by the Gaussian ReHMC Walk, and Fig. 6 shows samples generated by the Exponential ReHMC Walk. We expected the samples to concentrate around the mode of the distribution.
For Ball Walk, RDHR Walk and Billiard Walk, the new classes provide much better performance for generating correlation matrices compared with the `Spectrahedron` class.
From the timing tables above, we see that each step of Ball Walk and RDHR Walk is faster than a step of Billiard Walk (which is expected, since each step of Billiard Walk is more sophisticated).
However, we also need to consider the mixing time, i.e., the number of steps needed for the Markov chain to converge. The mixing time can be measured by the PSRF which, in experiments, is much smaller for Billiard Walk, especially in high dimension.
At the time of writing this summary, the pull request with my implementation of the ReHMC walks is under review. As explained above, the experiments with Gaussian ReHMC Walk give reasonable results; however, the Exponential ReHMC Walk still needs to be verified.
A possible follow-up after the final submission of Google Summer of Code 2022 consists of:

- reimplementing the membership and intersection oracles in `Spectrahedron` (e.g., using `Eigen::LDLT`);
- comparing with other software for generating correlation matrices;
- a general implementation of the ReHMC walk templated by a functor `f(x)`;
- further optimization of the member functions of the `CorrelationSpectrahedron_MT` and `CorreMatrix` classes;
- simplifying the user functions (e.g., merging the `_MT` and non-`MT` user functions).
While building volesti's tests on an Ubuntu 22.04 system, I encountered an error:

```
doctest.h:4189:47: error: size of array 'altStackMem' is not an integral constant-expression
 4189 |         static char altStackMem[4 * SIGSTKSZ];
```

This comes from the doctest.h v1.2.9 used in `test/`. A proposed fix is to use doctest.h v2.4.8, found at https://github.com/doctest/doctest/blob/v2.4.8/doctest/doctest.h.
Logically, starting from the origin, running Billiard Walk with the two new classes should produce the same set of samples (if the random seed is fixed). In my implementation, this is unfortunately not the case. The problem comes from the member function `compute_reflection`:

- In `CorrelationSpectrahedron`, a dot product is computed with Eigen's built-in `dot`.
- In `CorrelationSpectrahedron_MT`, I had to write a `dot` operator for `CorreMatrix` (in a naive way).

Either version gives a set of samples that passes all the tests, meaning that:

- all the samples lie correctly in the spectrahedron,
- the PSRF is smaller than 1.1,

but in some experiments I found that the numerical values output by these two `dot` functions differ by a small amount (about $10^{-5}$). I do not want to replace Eigen's `dot` with the naive implementation (which would solve the problem), as Eigen's `dot` is more efficient.





