Latest code for MWU-MKL project.
This code contains a native C++ implementation and Python bindings aimed at use with numpy and scikit-learn, along with older MATLAB bindings (although I recommend using the newer Python interface).
You can find an arXiv preprint of the relevant paper at:
http://arxiv.org/abs/1206.5580
This algorithm is built by combining
- Arora & Kale’s work on matrix multiplicative weight updates for solving semidefinite programs (SDPs) and
- Lanckriet et al.’s positive linear combination formulation of the multiple kernel learning problem; this can be expressed as a quadratically constrained quadratic program (QCQP), which can be transformed into an SDP.
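For reference, here is a rough sketch of the positive-combination setup in the spirit of Lanckriet et al.; the notation below is illustrative and not transcribed from either paper:

```latex
% Combined kernel: a nonnegative (here simplex-normalized) mixture of m base kernels.
K(\mu) \;=\; \sum_{k=1}^{m} \mu_k K_k, \qquad \mu_k \ge 0, \qquad \sum_{k=1}^{m} \mu_k = 1 .

% Learning \mu jointly with the SVM dual variables \alpha gives a saddle-point
% problem; \mathcal{A} denotes the usual SVM dual feasible set.
\min_{\mu}\; \max_{\alpha \in \mathcal{A}}\;
\sum_i \alpha_i \;-\; \tfrac{1}{2}\,
\alpha^{\top} \operatorname{diag}(y)\, K(\mu)\, \operatorname{diag}(y)\, \alpha .
```

For fixed alpha the inner objective is linear in mu, so minimizing over the simplex reduces to per-kernel constraints; this is roughly how the problem can be written as a QCQP and, via a standard transformation, as an SDP.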
Arora & Kale’s framework requires computing a matrix exponential, which can be expensive unless approximated. The beauty of our algorithm is that we have a closed form for the exponential that can be computed in time linear in the number of input points. This is a huge win because the bottleneck then shifts to updating the scratch space, which is comparatively cheap.
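As a toy illustration (not the paper's actual construction) of why a closed form can turn the exponential from a roughly cubic-cost operation into a linear-time one, compare a general dense matrix exponential with the special case of a diagonal matrix, whose exponential is just the elementwise exponential of its diagonal:

```python
import numpy as np
from scipy.linalg import expm

def dense_exponential(M):
    # General-purpose matrix exponential: roughly cubic in the dimension n.
    return expm(M)

def diagonal_exponential(d):
    # Closed form for M = diag(d): exp(M) = diag(exp(d)); linear in n and
    # never materializes an n x n matrix.
    return np.exp(d)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d = rng.normal(size=500)
    assert np.allclose(np.diag(dense_exponential(np.diag(d))),
                       diagonal_exponential(d))
```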
- [ ] Update for C++11
- [ ] Add parallel implementation (using a BSP framework)
Hopefully the BSP framework is broadcast-capable and the communication graph does not need to be made into a clique; it is also possible to add a dedicated “broadcast node”, but hopefully the underlying implementation does not introduce a “bottlenecking” effect there.
- primal_var: needs to be broadcast to every vertex – this is how the vertices update their weights and get what they need for the oracle
- is_pos, is_neg, entropy: utility functions
- oracle: an aggregation/reduction – it takes the g and y values from each vertex and produces the top g among the +y points and the top g among the -y points, together with the vertices that achieve them (see the sketch after this list)
  - Idea: could probably implement two separate aggregators for this.
  - If the maxima sum to less than -2, all vertices should vote to finish and report failure.
- exponentiateM: another aggregation/reduction – produces the primal variable
- scratch (Galpha/alGal): the complicated part of the parallel version (a sketch follows the oracle sketch below)
  - each vertex computes its elements of each Gram column (+/-, for every kernel)
  - each vertex produces an update for all the alGal values, which gets aggregated and added to the old values
  - each vertex maintains its own row of Galpha
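A minimal sketch of the oracle reduction described above, assuming each vertex contributes a (g, y, vertex_id) triple; the function name and tuple layout are illustrative, not the project's actual API:

```python
def oracle_reduce(contributions):
    """contributions: list of (g, y, vertex_id) tuples with y in {+1, -1}.

    Returns the vertex maximizing g among the +y points, the vertex
    maximizing g among the -y points, and a flag telling every vertex to
    vote to finish and report failure when the two maxima sum to < -2.
    """
    pos = [c for c in contributions if c[1] > 0]
    neg = [c for c in contributions if c[1] < 0]
    g_pos, _, v_pos = max(pos, key=lambda c: c[0])
    g_neg, _, v_neg = max(neg, key=lambda c: c[0])
    fail = (g_pos + g_neg) < -2
    return v_pos, v_neg, fail
```

Splitting this into two max-reductions keyed on the sign of y is essentially the “two separate aggregators” idea above.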
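And a hedged sketch of the per-vertex scratch bookkeeping (Galpha/alGal), again with illustrative names; the rule deciding which alpha coordinates change and by how much comes from the algorithm itself and is simply passed in here:

```python
import numpy as np

class VertexScratch:
    def __init__(self, x_i, alpha_i, kernels):
        self.x_i = x_i            # this vertex's input point
        self.alpha_i = alpha_i    # this vertex's dual weight
        self.kernels = kernels    # list of kernel functions k(x, z)
        # This vertex's row of Galpha: (G_k @ alpha)[i] for every kernel k.
        self.Galpha_row = np.zeros(len(kernels))

    def local_update(self, x_p, x_q, d_p, d_q, d_i):
        """x_p, x_q: the points the oracle picked (+y and -y); d_p, d_q: the
        changes applied to their alpha coordinates; d_i: the change applied
        to *this* vertex's alpha (zero unless this vertex is p or q).

        Returns this vertex's contribution to the alGal update, one value per
        kernel; summing the contributions over all vertices gives the exact
        change in alGal_k = alpha^T G_k alpha, which the aggregator adds to
        the old values."""
        # This vertex's elements of the +/- Gram columns, for every kernel.
        col_p = np.array([k(self.x_i, x_p) for k in self.kernels])
        col_q = np.array([k(self.x_i, x_q) for k in self.kernels])
        d_Galpha = d_p * col_p + d_q * col_q
        contrib = (2.0 * self.alpha_i + d_i) * d_Galpha
        # Fold the change into the locally held state.
        self.Galpha_row += d_Galpha
        self.alpha_i += d_i
        return contrib
```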