The goal of equatiomatic is to reduce the pain associated with
writing LaTeX code from a fitted model. In the future, the package aims
to support any model supported by broom; so far it has only been tested
with lm and glm models and, at present, only supports binomial glm
models (i.e., not ordinal or multinomial models).
equatiomatic is not yet on CRAN. Install the development version from GitHub with:
remotes::install_github("datalorax/equatiomatic")
The gif above shows the basic functionality.
To convert a model to LaTeX, pass a model object to extract_eq():
library(equatiomatic)
# Fit a simple model
mod1 <- lm(mpg ~ cyl + disp, mtcars)
# Give the results to extract_eq
extract_eq(mod1)
#> $$
#> \operatorname{mpg} = \alpha + \beta_{1}(\operatorname{cyl}) + \beta_{2}(\operatorname{disp}) + \epsilon
#> $$
The model can be built in any standard way, including with shortcut syntax:
mod2 <- lm(mpg ~ ., mtcars)
extract_eq(mod2)
#> $$
#> \operatorname{mpg} = \alpha + \beta_{1}(\operatorname{cyl}) + \beta_{2}(\operatorname{disp}) + \beta_{3}(\operatorname{hp}) + \beta_{4}(\operatorname{drat}) + \beta_{5}(\operatorname{wt}) + \beta_{6}(\operatorname{qsec}) + \beta_{7}(\operatorname{vs}) + \beta_{8}(\operatorname{am}) + \beta_{9}(\operatorname{gear}) + \beta_{10}(\operatorname{carb}) + \epsilon
#> $$
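To check which predictors the . shortcut stands for, the formula can be expanded against the data with base R's terms(). This is only a sketch for inspecting the expansion; it does not call equatiomatic and makes no claim about how the package handles the formula internally.

```r
# Expand the `.` shortcut against mtcars to list the implied predictors.
# Base R only: this inspects the formula and does not call equatiomatic.
expanded <- terms(mpg ~ ., data = mtcars)
predictors <- attr(expanded, "term.labels")
print(predictors)
```

The ten names returned should line up, in order, with the ten beta terms in the equation above.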
When using categorical variables, it will include the levels of the variables as subscripts:
library(palmerpenguins)
mod3 <- lm(body_mass_g ~ bill_length_mm + species, penguins)
extract_eq(mod3)
#> $$
#> \operatorname{body\_mass\_g} = \alpha + \beta_{1}(\operatorname{bill\_length\_mm}) + \beta_{2}(\operatorname{species}_{\operatorname{Chinstrap}}) + \beta_{3}(\operatorname{species}_{\operatorname{Gentoo}}) + \epsilon
#> $$
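Only Chinstrap and Gentoo appear as subscripts above because, under R's default treatment contrasts, the first factor level (Adelie) is the reference level and gets no term of its own. The sketch below shows the same pattern with the built-in iris data, which is swapped in here purely so the example needs no extra packages.

```r
# iris stands in for penguins (an assumption made for portability).
fit <- lm(Sepal.Length ~ Sepal.Width + Species, data = iris)

# Species has three levels; the first one is the reference level.
species_levels <- levels(iris$Species)

# Coefficient names: one term per non-reference level.
coef_names <- names(coef(fit))

print(species_levels)
print(coef_names)
```

Changing the reference level (for example with relevel()) before fitting would change which levels receive their own terms.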
It preserves the order in which the variables are supplied in the formula:
set.seed(8675309)
d <- data.frame(cat1 = rep(letters[1:3], 100),
                cat2 = rep(LETTERS[1:3], each = 100),
                cont1 = rnorm(300, 100, 1),
                cont2 = rnorm(300, 50, 5),
                out = rnorm(300, 10, 0.5))
mod4 <- lm(out ~ cont1 + cat2 + cont2 + cat1, d)
extract_eq(mod4)
#> $$
#> \operatorname{out} = \alpha + \beta_{1}(\operatorname{cont1}) + \beta_{2}(\operatorname{cat2}_{\operatorname{B}}) + \beta_{3}(\operatorname{cat2}_{\operatorname{C}}) + \beta_{4}(\operatorname{cont2}) + \beta_{5}(\operatorname{cat1}_{\operatorname{b}}) + \beta_{6}(\operatorname{cat1}_{\operatorname{c}}) + \epsilon
#> $$
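The ordering can be checked independently of the package: for a formula made up only of main effects, base R's terms() reports the term labels in the order they were written. A minimal sketch, using the same formula as mod4:

```r
# Term labels come back in formula order. All terms here are main
# effects, so no reordering by interaction degree takes place.
f <- out ~ cont1 + cat2 + cont2 + cat1
term_order <- attr(terms(f), "term.labels")
print(term_order)
```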
You can wrap the equation with wrap = TRUE so that a specified number of
terms appear on each line of the right-hand side. Set that number with
terms_per_line (defaults to 4):
extract_eq(mod2, wrap = TRUE)
#> $$
#> \begin{aligned}
#> \operatorname{mpg} &= \alpha + \beta_{1}(\operatorname{cyl}) + \beta_{2}(\operatorname{disp}) + \beta_{3}(\operatorname{hp})\ + \\
#> &\quad \beta_{4}(\operatorname{drat}) + \beta_{5}(\operatorname{wt}) + \beta_{6}(\operatorname{qsec}) + \beta_{7}(\operatorname{vs})\ + \\
#> &\quad \beta_{8}(\operatorname{am}) + \beta_{9}(\operatorname{gear}) + \beta_{10}(\operatorname{carb}) + \epsilon
#> \end{aligned}
#> $$
extract_eq(mod2, wrap = TRUE, terms_per_line = 6)
#> $$
#> \begin{aligned}
#> \operatorname{mpg} &= \alpha + \beta_{1}(\operatorname{cyl}) + \beta_{2}(\operatorname{disp}) + \beta_{3}(\operatorname{hp}) + \beta_{4}(\operatorname{drat}) + \beta_{5}(\operatorname{wt})\ + \\
#> &\quad \beta_{6}(\operatorname{qsec}) + \beta_{7}(\operatorname{vs}) + \beta_{8}(\operatorname{am}) + \beta_{9}(\operatorname{gear}) + \beta_{10}(\operatorname{carb}) + \epsilon
#> \end{aligned}
#> $$
When wrapping, you can choose whether lines end with trailing math
operators like + (the default) or begin with them, using
operator_location = "end" or operator_location = "start":
extract_eq(mod2, wrap = TRUE, terms_per_line = 4, operator_location = "start")
#> $$
#> \begin{aligned}
#> \operatorname{mpg} &= \alpha + \beta_{1}(\operatorname{cyl}) + \beta_{2}(\operatorname{disp}) + \beta_{3}(\operatorname{hp})\\
#> &\quad + \beta_{4}(\operatorname{drat}) + \beta_{5}(\operatorname{wt}) + \beta_{6}(\operatorname{qsec}) + \beta_{7}(\operatorname{vs})\\
#> &\quad + \beta_{8}(\operatorname{am}) + \beta_{9}(\operatorname{gear}) + \beta_{10}(\operatorname{carb}) + \epsilon
#> \end{aligned}
#> $$
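The interaction of the two wrapping options can be sketched in base R. The helper wrap_terms() below is hypothetical and is not equatiomatic's implementation; it only mimics the grouping visible in the outputs above, using short placeholder strings in place of the LaTeX terms.

```r
# Hypothetical helper, NOT the package's code: split right-hand-side
# terms into lines of `terms_per_line`, then put the joining "+" at the
# end of each line or at the start of the next one.
wrap_terms <- function(terms, terms_per_line = 4,
                       operator_location = c("end", "start")) {
  operator_location <- match.arg(operator_location)
  # Assign each term to a line: 1, 1, 1, 1, 2, 2, 2, 2, ...
  line_id <- ceiling(seq_along(terms) / terms_per_line)
  lines <- vapply(split(terms, line_id), paste, character(1),
                  collapse = " + ")
  lines <- unname(lines)
  n <- length(lines)
  if (n > 1) {
    if (operator_location == "end") {
      lines[-n] <- paste(lines[-n], "+")   # trailing operator
    } else {
      lines[-1] <- paste("+", lines[-1])   # leading operator
    }
  }
  lines
}

# 12 placeholder terms: intercept, ten slopes, error (as in mod2).
rhs <- c("a", paste0("b", 1:10), "e")
wrapped_end   <- wrap_terms(rhs)
wrapped_start <- wrap_terms(rhs, operator_location = "start")
wrapped_six   <- wrap_terms(rhs, terms_per_line = 6)
print(wrapped_end)
print(wrapped_start)
```

With twelve terms, four per line gives three lines and six per line gives two, which matches the line counts in the wrapped equations above.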
By default, all text in the equation is wrapped in \operatorname{}.
You can optionally have the variables themselves be italicized (i.e., not
wrapped in \operatorname{}) with ital_vars = TRUE:
extract_eq(mod2, wrap = TRUE, ital_vars = TRUE)
#> $$
#> \begin{aligned}
#> mpg &= \alpha + \beta_{1}(cyl) + \beta_{2}(disp) + \beta_{3}(hp)\ + \\
#> &\quad \beta_{4}(drat) + \beta_{5}(wt) + \beta_{6}(qsec) + \beta_{7}(vs)\ + \\
#> &\quad \beta_{8}(am) + \beta_{9}(gear) + \beta_{10}(carb) + \epsilon
#> \end{aligned}
#> $$
If you include extract_eq() in an R Markdown chunk with results="asis", knitr will render the equation.
Alternatively, you can run the code interactively, copy/paste the equation to where you want it in your document, and make any edits you’d like.
If you install texPreview, you can use the preview() function to preview
the equation in RStudio:
preview(extract_eq(mod1))
Both extract_eq() and preview() work with magrittr pipes, so you
can do something like this:
library(magrittr) # or library(tidyverse) or any other package that exports %>%
extract_eq(mod1) %>%
preview()
There are several extra options you can enable with additional arguments
to extract_eq().
You can return actual numeric coefficients instead of Greek letters with
use_coefs = TRUE:
extract_eq(mod1, use_coefs = TRUE)
#> $$
#> \operatorname{mpg} = 34.66 - 1.59(\operatorname{cyl}) - 0.02(\operatorname{disp}) + \epsilon
#> $$
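The numbers in that equation can be cross-checked against the fitted model directly. The sketch below rounds the coefficients to two decimals with base R; that the package rounds in exactly this way is an assumption, although the values agree with the output shown.

```r
# Refit mod1 and round its coefficients to two decimal places.
mod1 <- lm(mpg ~ cyl + disp, mtcars)
rounded <- round(coef(mod1), 2)
print(rounded)
```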
By default, it will remove doubled operators like “+ -”, but you can
keep those in (which is often useful for teaching) with fix_signs = FALSE:
extract_eq(mod1, use_coefs = TRUE, fix_signs = FALSE)
#> $$
#> \operatorname{mpg} = 34.66 + -1.59(\operatorname{cyl}) + -0.02(\operatorname{disp}) + \epsilon
#> $$
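The effect of the sign fix can be imitated with a literal string replacement. This is a hypothetical illustration on a simplified string, not the package's own code:

```r
# Collapse each "+ -" pair into a single "- " (literal match, no regex).
raw <- "34.66 + -1.59(cyl) + -0.02(disp)"
fixed <- gsub("+ -", "- ", raw, fixed = TRUE)
print(fixed)
```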
This works in longer wrapped equations:
extract_eq(mod2, wrap = TRUE, terms_per_line = 3,
           use_coefs = TRUE, fix_signs = FALSE)
#> $$
#> \begin{aligned}
#> \operatorname{mpg} &= 12.3 + -0.11(\operatorname{cyl}) + 0.01(\operatorname{disp})\ + \\
#> &\quad -0.02(\operatorname{hp}) + 0.79(\operatorname{drat}) + -3.72(\operatorname{wt})\ + \\
#> &\quad 0.82(\operatorname{qsec}) + 0.32(\operatorname{vs}) + 2.52(\operatorname{am})\ + \\
#> &\quad 0.66(\operatorname{gear}) + -0.2(\operatorname{carb}) + \epsilon
#> \end{aligned}
#> $$
You’re not limited to just lm models! Try out logistic regression
with glm():
set.seed(8675309)
d <- data.frame(out = sample(0:1, 100, replace = TRUE),
                cat1 = rep(letters[1:3], 100),
                cat2 = rep(LETTERS[1:3], each = 100),
                cont1 = rnorm(300, 100, 1),
                cont2 = rnorm(300, 50, 5))
mod5 <- glm(out ~ ., data = d, family = binomial(link = "logit"))
extract_eq(mod5, wrap = TRUE)
#> $$
#> \begin{aligned}
#> \log\left[ \frac { P( \operatorname{out} = \operatorname{1} ) }{ 1 - P( \operatorname{out} = \operatorname{1} ) } \right] &= \alpha + \beta_{1}(\operatorname{cat1}_{\operatorname{b}}) + \beta_{2}(\operatorname{cat1}_{\operatorname{c}}) + \beta_{3}(\operatorname{cat2}_{\operatorname{B}})\ + \\
#> &\quad \beta_{4}(\operatorname{cat2}_{\operatorname{C}}) + \beta_{5}(\operatorname{cont1}) + \beta_{6}(\operatorname{cont2}) + \epsilon
#> \end{aligned}
#> $$
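The left-hand side of that equation is the log-odds (logit) of the outcome. As a small base R check, independent of equatiomatic, the logit link used by glm() computes the same quantity as the fraction written out by hand; the probability 0.8 is an arbitrary value chosen for illustration.

```r
# Log-odds for an illustrative probability of 0.8.
p <- 0.8
log_odds_manual <- log(p / (1 - p))

# The link function carried by the binomial family with a logit link.
log_odds_link <- binomial(link = "logit")$linkfun(p)

# plogis() is the inverse logit and recovers the probability.
p_back <- plogis(log_odds_manual)

print(c(manual = log_odds_manual, link = log_odds_link, p_back = p_back))
```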
This project is brand new. If you would like to contribute, we’d love
your help! We are particularly interested in extending to more models.
At present, we have only tested lm and glm, but hope to support any
model supported by broom in the future.
Please note that the ‘equatiomatic’ project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.