Bolt Online Learning Toolbox
============================

:Release: |version|
:Date: |today|
:Author: peter.prettenhofer@gmail.com

Contents:

.. toctree::
   :maxdepth: 1

   install.rst
   overview.rst
   using-cli.rst
   using-api.rst
   whatsnew.rst

Introduction
------------

Bolt features discriminative learning of linear predictors (e.g. SVM or Logistic Regression) using fast online learning algorithms. Bolt is aimed at large-scale, high-dimensional, and sparse machine-learning problems, in particular those encountered in information retrieval and natural language processing.

Bolt considers linear models (:class:`bolt.model.LinearModel`) for binary classification,

.. math::

   f(\mathbf{x}) = \operatorname{sign}(\mathbf{w}^T \mathbf{x} + b) ,

and generalized linear models (:class:`bolt.model.GeneralizedLinearModel`) for multi-class classification,

.. math::

   f(\mathbf{x}) = \operatorname*{arg\,max}_y \mathbf{w}^T \Phi(\mathbf{x},y) + b_y ,

where :math:`\mathbf{w}` and :math:`b` are the model parameters that are learned from training data, and :math:`\Phi(\mathbf{x},y)` is a joint feature representation of the input and a candidate class. In Bolt, the model parameters are learned by minimizing the regularized training error

.. math::

   E(\mathbf{w},b) = \sum_{i=1}^n L(y_i,f(\mathbf{x}_i)) + \lambda R(\mathbf{w}) ,

where :math:`L` is a loss function that measures model fit, :math:`R` is a regularization term that measures model complexity, and :math:`\lambda` weights the regularization term against the loss.
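As a rough illustration of the objective above, the following sketch evaluates the regularized training error for the hinge loss with an L2 regularizer on sparse examples. It is plain Python and does not use Bolt's API. The function names (``dot``, ``predict``, ``regularized_error``), the dict-based sparse representation, and the 1/2 scaling of the L2 term are assumptions made for this example. The loss is evaluated on the raw score rather than on its sign, which is the usual convention.

```python
def dot(w, x):
    """Inner product of dense weights w (list) and a sparse example x (dict: index -> value)."""
    return sum(w[j] * v for j, v in x.items())


def predict(w, b, x):
    """Binary decision rule sign(w^T x + b), returning +1 or -1 (a zero score maps to +1)."""
    return 1 if dot(w, x) + b >= 0 else -1


def hinge_loss(y, score):
    """Hinge loss max(0, 1 - y * score) for a label y in {-1, +1}."""
    return max(0.0, 1.0 - y * score)


def regularized_error(w, b, examples, lam):
    """Sum of hinge losses plus lam * R(w), with R(w) = 0.5 * ||w||^2 (assumed scaling)."""
    data_term = sum(hinge_loss(y, dot(w, x) + b) for x, y in examples)
    penalty = 0.5 * sum(wj * wj for wj in w)
    return data_term + lam * penalty


# Toy data (hypothetical): two sparse examples in a 3-dimensional feature space.
examples = [({0: 1.0, 2: 2.0}, 1), ({1: 1.0}, -1)]
w, b = [0.5, -0.5, 0.0], 0.0
error = regularized_error(w, b, examples, lam=0.1)
```

Both toy examples are classified correctly but sit inside the margin, so each still contributes a hinge loss; the penalty term grows with the squared norm of ``w``.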

Features
--------

Bolt supports the following trainers for binary classification:

* Stochastic Gradient Descent (:class:`bolt.trainer.sgd.SGD`)

  * Supports various loss functions :math:`L`: Hinge, Modified Huber, Log.
  * Supports various regularization terms :math:`R`: L2, L1, and Elastic Net.

* PEGASOS (:class:`bolt.trainer.sgd.PEGASOS`)
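To make the trainer description more concrete, here is a minimal sketch of stochastic gradient descent on a regularized objective, with the three loss functions listed above written as functions of the margin ``z = y * (w^T x + b)``. It is plain Python and is not Bolt's implementation: the function names, the constant learning rate ``eta``, the L2-only penalty, and the dense-weight/sparse-example representation are all assumptions chosen for illustration, and the code is not numerically hardened for very large margins.

```python
import math


def hinge(z):
    """Hinge loss and its (sub)derivative with respect to the margin z."""
    return (1.0 - z, -1.0) if z < 1.0 else (0.0, 0.0)


def modified_huber(z):
    """Modified Huber loss: zero for z >= 1, quadratic for -1 <= z < 1, linear below."""
    if z >= 1.0:
        return 0.0, 0.0
    if z >= -1.0:
        return (1.0 - z) ** 2, -2.0 * (1.0 - z)
    return -4.0 * z, -4.0


def log_loss(z):
    """Logistic loss log(1 + exp(-z)) and its derivative -1 / (1 + exp(z))."""
    return math.log1p(math.exp(-z)), -1.0 / (1.0 + math.exp(z))


def sgd_train(examples, dim, loss=hinge, lam=0.01, eta=0.1, epochs=5):
    """Run SGD over (sparse_x, y) pairs; returns dense weights w and bias b."""
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        for x, y in examples:
            z = y * (sum(w[j] * v for j, v in x.items()) + b)
            _, dloss = loss(z)
            # Gradient of the L2 penalty (lam / 2) * ||w||^2 shrinks every weight.
            w = [wj * (1.0 - eta * lam) for wj in w]
            # Chain rule: d loss / d w_j = dloss * y * x_j and d loss / d b = dloss * y.
            for j, v in x.items():
                w[j] -= eta * dloss * y * v
            b -= eta * dloss * y
    return w, b


# Toy, linearly separable data (hypothetical).
examples = [({0: 1.0}, 1), ({1: 1.0}, -1), ({0: 2.0, 1: 0.5}, 1), ({1: 2.0}, -1)]
w, b = sgd_train(examples, dim=2)
```

Passing ``loss=modified_huber`` or ``loss=log_loss`` swaps the loss without changing the update loop. An L1 or Elastic Net penalty would need a different shrinkage step, which this sketch leaves out.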

For multi-class classification:

* One-versus-all (:class:`bolt.trainer.OVA`)
* Averaged Perceptron (:class:`bolt.trainer.avgperceptron.AveragedPerceptron`)
* Maximum Entropy (:class:`bolt.trainer.maxent.MaxentSGD`)

  * aka Multinomial Logistic Regression
  * Trained via SGD.
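The multi-class decision rule can be sketched as follows, assuming the common block-structured joint feature map under which the joint score reduces to one weight vector and one bias per class. The softmax function shows how multinomial logistic regression turns those scores into class probabilities. This is plain Python rather than Bolt's API; the function names and the toy weights are hypothetical.

```python
import math


def class_scores(W, b, x):
    """Score w_y^T x + b_y for every class y; W holds one dense weight list per class."""
    return [sum(w_y[j] * v for j, v in x.items()) + b_y for w_y, b_y in zip(W, b)]


def predict_class(W, b, x):
    """Arg-max decision rule over the class scores."""
    scores = class_scores(W, b, x)
    return max(range(len(scores)), key=scores.__getitem__)


def class_probabilities(W, b, x):
    """Softmax of the class scores, as used in multinomial logistic regression."""
    scores = class_scores(W, b, x)
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Three classes over a 2-dimensional feature space (hypothetical weights).
W = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
b = [0.0, 0.0, 0.0]
x = {0: 2.0, 1: 0.5}
label = predict_class(W, b, x)
probs = class_probabilities(W, b, x)
```

A one-versus-all scheme would typically apply the same arg-max rule to the scores of independently trained binary models, one per class.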

Benchmark
---------

The following RCV1-CCAT benchmark results show that Bolt is competitive with state-of-the-art linear SVM solvers such as SVM^Perf, liblinear, or sgd. The dataset comprises 781,264 training documents, each represented by a 47,152-dimensional feature vector.

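================  =============  ============
Algorithm         Training time  Accuracy (%)
================  =============  ============
SVM^light         >600.00 sec
SVM^Perf [1]_     11.60 sec      94.79
liblinear [2]_    9.00 sec       94.77
bolt [3]_         2.33 sec       94.79
sgd [4]_          1.09 sec       94.77
================  =============  ============

.. [1] Uses C=1000
.. [2] Uses SVM (Dual), B=1
.. [3] Uses E=5, r=0.00001, l=0, b
.. [4] Uses epochs=5, lambda=0.00001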

Indices and tables
------------------

* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`
