Can I use anova and or emmeans on taxatree_models #120

SilasK · 2023-07-20T14:35:49Z

SilasK
Jul 20, 2023

Hello,

I want to do a statistical analysis with three groups.
I would like to use a anova for this.

I would like to use anova and or emmeans on the taxatree_models.

It more or less works with using anova in the taxatree_models2stats function. but this is not paralellized.

anova_stats <- taxatree_models2stats(lm_models, fun= function(...) tidy(anova(...)) )

I think it is also possilbe to run the anova directly in the taxatree_models and set type= function(...) anova(lm(...))

But I would need the lm models for post-hoc tests..

SilasK · 2023-07-20T14:36:13Z

SilasK
Jul 20, 2023
Author

Maybe its the same as using lm with dummy variables.. https://stats.stackexchange.com/questions/190984/anova-vs-multiple-linear-regression-why-is-anova-so-commonly-used-in-experiment

0 replies

SilasK · 2023-07-27T15:21:34Z

SilasK
Jul 27, 2023
Author

I think it would still be cool to be able to apply a function like anova on all the models. to do correction for multiple testing..

0 replies

david-barnett · 2023-07-31T10:32:55Z

david-barnett
Jul 31, 2023
Maintainer

Hi Silas

In response to your question, I had a look at the taxatree_models functionality again.
It should be possible both to retrieve ANOVA stats from each model in a tidy format as you showed in your example code, and with keep_models = TRUE in taxatree_models2stats you can keep the models for any other purpose.
So it seems like you solved your own problem? Otherwise, let me know if I have misunderstood your objective?

An important second point: I also realised when looking again at taxatree_models examples that the example I gave of applying log2 transformations before running taxatree_models is not appropriate. This code results in the addition of log2 transformed values. Please only use the taxatree_models code examples from microViz version 0.11.0 where I have now provided a corrected procedure and examples, in which the log transformation can be applied after aggregation at each rank. Please do not rely on any results of any previous taxatree_models code you may have written that log transforms abundance data before running taxatree_models, as the results from all ranks except the lowest are incorrect. The associations should generally remain in the same direction, but the magnitude and significance will often change, particularly at higher ranks of aggregation. I am sincerely sorry if this mistake in my code causes you any problems.

Do reach out with any further questions. I appreciate your interest in microViz.

Below is a working example from version 0.11.0

library(microViz)
#> microViz version 0.11.0 - Copyright (C) 2023 David Barnett
#> ! Website: https://david-barnett.github.io/microViz
#> ✔ Useful?  For citation details, run: `citation("microViz")`
#> ✖ Silence? `suppressPackageStartupMessages(library(microViz))`
library(broom)

ps <- shao19 %>%
    ps_filter(infant_age == 4, !is.na(birth_weight)) %>% 
    ps_mutate(
        birth_mode = dplyr::recode(
            .x = c_section_type, Elective_CS = "CS Elective",
            Emergency_CS = "CS Emergency", .missing = "Vaginal"
        ),
        birthweight_z = as.vector(scale(birth_weight))
    ) 

psx <- ps %>% 
    tax_prepend_ranks() %>%
    tax_transform(trans = "compositional", rank = "genus") %>% 
    tax_filter(min_prevalence = 0.1, use_counts = TRUE) %>% 
    taxatree_models(
        type = lm,
        ranks = c("phylum", "class", "order", "family", "genus"),
        trans = "log2", trans_args = list(add = "halfmin"),
        variables = c("birth_mode", "birthweight_z") 
    )  
#> Proportional min_prevalence given: 0.1 --> min 30/292 samples.
#> 2023-07-31 12:32:28.91592 - modelling at rank: phylum
#> 2023-07-31 12:32:28.997311 - modelling at rank: class
#> 2023-07-31 12:32:29.105242 - modelling at rank: order
#> 2023-07-31 12:32:29.23594 - modelling at rank: family
#> 2023-07-31 12:32:29.399877 - modelling at rank: genus

psx_anova <- psx %>% 
    taxatree_models2stats(fun = function(m) tidy(anova(m)), .keep_models = TRUE)

psx_anova %>% taxatree_stats_get() 
#> # A tibble: 159 × 9
#>    term          taxon     rank  formula    df  sumsq meansq statistic   p.value
#>    <fct>         <chr>     <fct> <chr>   <int>  <dbl>  <dbl>     <dbl>     <dbl>
#>  1 birth_mode    p: Prote… phyl… `p: Pr…     2 7.12e2 3.56e2     7.94   4.38e- 4
#>  2 birthweight_z p: Prote… phyl… `p: Pr…     1 1.37e1 1.37e1     0.306  5.80e- 1
#>  3 Residuals     p: Prote… phyl… `p: Pr…   288 1.29e4 4.48e1    NA     NA       
#>  4 birth_mode    p: Bacte… phyl… `p: Ba…     2 6.02e3 3.01e3    83.2    3.13e-29
#>  5 birthweight_z p: Bacte… phyl… `p: Ba…     1 1.91e1 1.91e1     0.527  4.69e- 1
#>  6 Residuals     p: Bacte… phyl… `p: Ba…   288 1.04e4 3.62e1    NA     NA       
#>  7 birth_mode    p: Actin… phyl… `p: Ac…     2 1.64e3 8.21e2    23.6    3.13e-10
#>  8 birthweight_z p: Actin… phyl… `p: Ac…     1 9.21e0 9.21e0     0.265  6.07e- 1
#>  9 Residuals     p: Actin… phyl… `p: Ac…   288 1.00e4 3.47e1    NA     NA       
#> 10 birth_mode    p: Firmi… phyl… `p: Fi…     2 6.04e2 3.02e2    37.0    4.96e-15
#> # ℹ 149 more rows

psx_anova %>% taxatree_models_get() %>% .[["genus"]] %>% head(2)
#> $`g: Escherichia`
#> 
#> Call:
#> `g: Escherichia` ~ birth_mode + birthweight_z
#> 
#> Coefficients:
#>            (Intercept)  birth_modeCS Emergency       birth_modeVaginal  
#>               -17.5307                  1.1743                  8.9510  
#>          birthweight_z  
#>                -0.4156  
#> 
#> 
#> $`g: Bacteroides`
#> 
#> Call:
#> `g: Bacteroides` ~ birth_mode + birthweight_z
#> 
#> Coefficients:
#>            (Intercept)  birth_modeCS Emergency       birth_modeVaginal  
#>               -19.4358                  1.3316                  7.4421  
#>          birthweight_z  
#>                 0.3599

^{Created on 2023-07-31 with reprex v2.0.2}

1 reply

SilasK Aug 4, 2023
Author

Great. I updated to microbiz 0.11.

Yes. I managed to run anova and emmeans trough the taxatree_models2stats stats. And I could plot the resulting stats tree plot.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Can I use anova and or emmeans on taxatree_models #120

{{title}}

Replies: 3 comments 1 reply

{{title}}

{{title}}

{{title}}

{{title}}

Select a reply

Can I use anova and or emmeans on taxatree_models #120

SilasK Jul 20, 2023

Replies: 3 comments · 1 reply

SilasK Jul 20, 2023 Author

SilasK Jul 27, 2023 Author

david-barnett Jul 31, 2023 Maintainer

SilasK Aug 4, 2023 Author

SilasK
Jul 20, 2023

Replies: 3 comments 1 reply

SilasK
Jul 20, 2023
Author

SilasK
Jul 27, 2023
Author

david-barnett
Jul 31, 2023
Maintainer

SilasK Aug 4, 2023
Author