R/iprobit: Binary and multinomial probit regression using I-priors

This is an R package which extends I-prior regression to unordered categorical responses via a probit link function. This allows the user to fit models for classification or inference using fitted probabilities. Estimation is performed using a variational EM algorithm. Visit http://phd.haziqj.ml for details.

Binary classification (toy example)

Model fitting

dat <- gen_spiral(n = 300, seed = 123)  # generate binary toy example data set
mod <- iprobit(y ~ X1 + X2, dat, one.lam = TRUE, kernel = "fbm")
## ==========================================
## Converged after 56 iterations.

Model summary

summary(mod)
## Call:
## iprobit(formula = y ~ X1 + X2, data = dat, kernel = "fbm", one.lam = TRUE)
## 
## Classes: 1, 2 
## 
## RKHS used:
## Fractional Brownian motion with Hurst 0.5 (X1 + X2)
## 
## Hyperparameters:
##             Mean   S.D.    2.5%  97.5%
## Intercept 0.0000 0.0577 -0.1132 0.1132
## lambda    5.6718 0.2320  5.2171 6.1265
## ---
## 
## Closed-form VB-EM algorithm. Iterations: 56/100 
## Converged to within 1e-05 tolerance. Time taken: 3.573205 secs
## Variational lower bound: -140.711 
## Training error: 0%. Brier score: 0.01466541

Boundary plot for two-dimensional covariates

iplot_predict(mod)

Multiclass classification (toy example)

Model fit report and parameter estimates

dat <- gen_mixture(n = 400, m = 4, sd = 1.5, seed = 123)  # generate 4-class 
                                                          # toy example data set
(mod <- iprobit(y ~ X1 + X2, dat, train.samp = sample(1:400, size = 392),
                control = list(maxit = 10)))  # set aside 8 points for testing
## ===========================================================================
## Convergence criterion not met.
## Training error rate: 6.89 %
## Lower bound value: -208.9813 
## 
##            Class = 1 Class = 2 Class = 3 Class = 4
## Intercept    0.32094   0.26087   0.30916   0.46155
## lambda[1,]  -0.21341   0.00000   0.62978   0.00000
## lambda[2,]   0.00000  -0.50221   0.00000  -2.66854

Boundary plot for two-dimensional covariates

iplot_predict(mod, dec.bound = TRUE, plot.test = TRUE, grid.len = 50)

Obtain out-of-sample test error rates, predicted classes and probabilities

predict(mod)
## Training error: 0.000%
## Brier score   : 0.071 
## 
## Predicted classes:
## [1] 1 1 2 2 3 4 4 4
## Levels: 1 2 3 4 
## 
## Predicted probabilities:
##       1     2     3     4
## 1 0.750 0.126 0.007 0.117
## 2 0.852 0.137 0.000 0.010
## 3 0.064 0.831 0.103 0.002
## 4 0.046 0.859 0.095 0.001
## 5 0.000 0.089 0.906 0.005
## 6 0.399 0.011 0.015 0.576
## 7 0.136 0.050 0.229 0.585
## 8 0.146 0.015 0.110 0.729

Fisher's Iris data set

Model fitting (common RKHS scale across classes for each covariate)

mod <- iprobit(Species ~ ., iris, kernel = "fbm", one.lam = TRUE,
               common.RKHS.scale = TRUE, common.intercept = FALSE,
               control = list(alpha0 = 1, theta0 = 1, 
                              stop.crit = 1e-1))
## ==========================
## Converged after 34 iterations.

summary(mod)
## Call:
## iprobit(formula = Species ~ ., data = iris, kernel = "fbm", one.lam = TRUE, 
##     common.intercept = FALSE, common.RKHS.scale = TRUE, control = list(alpha0 = 1, 
##         theta0 = 1, stop.crit = 0.1))
## 
## Classes: setosa, versicolor, virginica 
## 
## RKHS used:
## Fractional Brownian motion with Hurst 0.5 (Sepal.Length + ... + Petal.Width)
## 
## Hyperparameters:
##                Mean   S.D.   2.5%  97.5%
## Intercept[1] 0.8813 0.0816 0.7213 1.0413
## Intercept[2] 1.0581 0.0816 0.8980 1.2181
## Intercept[3] 1.0606 0.0816 0.9006 1.2207
## lambda       0.3589 0.0120 0.3353 0.3824
## ---
## 
## Closed-form VB-EM algorithm. Iterations: 34/100 
## Converged to within 0.1 tolerance. Time taken: 5.10163 secs
## Variational lower bound: -50.39342 
## Training error: 4%. Brier score: 0.02759783

Obtain training error rates, predicted classes and probabilities with posterior quantiles

fitted(mod, quantiles = TRUE)
##                     2.5%   25%   50%   75% 97.5%
## Training error (%) 2.000 2.667 3.333 4.000 5.683
## Brier score        0.025 0.027 0.029 0.032 0.036
## 
## Predicted probabilities for Class = setosa 
##    2.5%   25%   50%   75% 97.5%
## 1 0.966 0.979 0.985 0.989 0.993
## 2 0.937 0.964 0.975 0.982 0.991
## 3 0.957 0.977 0.982 0.988 0.994
## 4 0.938 0.961 0.972 0.981 0.990
## 5 0.967 0.980 0.986 0.990 0.994
## # ... with 145 more rows
## 
## Predicted probabilities for Class = versicolor 
##    2.5%   25%   50%   75% 97.5%
## 1 0.003 0.006 0.008 0.012 0.022
## 2 0.004 0.010 0.014 0.022 0.042
## 3 0.002 0.004 0.008 0.012 0.021
## 4 0.004 0.009 0.014 0.022 0.033
## 5 0.002 0.005 0.007 0.011 0.021
## # ... with 145 more rows
## 
## Predicted probabilities for Class = virginica 
##    2.5%   25%   50%   75% 97.5%
## 1 0.002 0.004 0.006 0.009 0.014
## 2 0.003 0.007 0.010 0.015 0.028
## 3 0.003 0.006 0.009 0.013 0.030
## 4 0.004 0.008 0.012 0.017 0.034
## 5 0.002 0.004 0.006 0.008 0.015
## # ... with 145 more rows

Monitor convergence

iplot_lb(mod)

Plot of training error over time

iplot_error(mod)

Plot of fitted probabilities

iplot_fitted(mod)

Name		Name	Last commit message	Last commit date
Latest commit History 143 Commits
R		R
README_files/figure-markdown_github-ascii_identifiers		README_files/figure-markdown_github-ascii_identifiers
man		man
tests		tests
.Rbuildignore		.Rbuildignore
.gitignore		.gitignore
.travis.yml		.travis.yml
DESCRIPTION		DESCRIPTION
LICENSE		LICENSE
NAMESPACE		NAMESPACE
README.Rmd		README.Rmd
README.md		README.md
appveyor.yml		appveyor.yml
iprobit.Rproj		iprobit.Rproj

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

R/iprobit: Binary and multinomial probit regression using I-priors

Binary classification (toy example)

Model fitting

Model summary

Boundary plot for two-dimensional covariates

Multiclass classification (toy example)

Model fit report and parameter estimates

Boundary plot for two-dimensional covariates

Obtain out-of-sample test error rates, predicted classes and probabilities

Fisher's Iris data set

Model fitting (common RKHS scale across classes for each covariate)

Obtain training error rates, predicted classes and probabilities with posterior quantiles

Monitor convergence

Plot of training error over time

Plot of fitted probabilities

About

Releases 1

Packages

Languages

License

haziqj/iprobit

Folders and files

Latest commit

History

Repository files navigation

R/iprobit: Binary and multinomial probit regression using I-priors

Binary classification (toy example)

Model fitting

Model summary

Boundary plot for two-dimensional covariates

Multiclass classification (toy example)

Model fit report and parameter estimates

Boundary plot for two-dimensional covariates

Obtain out-of-sample test error rates, predicted classes and probabilities

Fisher's Iris data set

Model fitting (common RKHS scale across classes for each covariate)

Obtain training error rates, predicted classes and probabilities with posterior quantiles

Monitor convergence

Plot of training error over time

Plot of fitted probabilities

About

Topics

Resources

License

Stars

Watchers

Forks

Releases 1

Packages 0

Languages

Packages