LinDA for longitudinal microbiome analysis #7

EOMAK91 · 2023-02-02T20:07:10Z

Hi,

I am currently using MassLin2 to analyze my longitudinal microbiome dataset; however, I saw your recent paper describing LinDA and its strengths in comparison to MassLin2. Therefore, I would like to compare my current results to those I would get from LinDA. However, after browsing through your page, I do not see an example of how to use LinDA to analyze longitudinal microbiome data. Could you point me in the direction of a tutorial or webpage that demonstrates how to adapt LinDA for longitudinal analysis?

Thanks in advance.

zhouhj1994 · 2023-02-04T08:49:26Z

For longitudinal data, one may use the mixed-effect model to incorporate subject as a random effect, since one subject always has multiple observations causing correlations among data. The use of mixed-effect models in LinDA is the same as in linear regression (lm), such as 'formula = '~Smoke+Sex+(1|SubjectID)' in the example in the readme page.

PeterCx · 2023-03-15T17:33:53Z

Dear @zhouhj1994

I am using LinDA for longitudinal microbiome analysis. I have questions as a few things are not clear to me. This is also related to another issue #5

I have 3 groups (Control, Placebo, Treatment) and 3 timepoints (0, 22, 52). Thus in total there are 18 pairwise comparisons for which I want to know what taxa are differentially abundant.

I have used the following code:

linda.obj <- linda(Species_Filt, Metadata, formula = '~Treatment*Time+(1|Sample)')

alp1 <- linda.obj$output$TreatmentControl$log2FoldChange
se1 <- linda.obj$output$TreatmentControl$lfcSE
df1 <- linda.obj$output$TreatmentControl$df

alp2 <- linda.obj$output$Time22$log2FoldChange
se2 <- linda.obj$output$Time22$lfcSE
df2 <- linda.obj$output$Time22$df

alp3 <- linda.obj$output$`TreatmentControl:Time22`$log2FoldChange
se3 <- linda.obj$output$`TreatmentControl:Time22`$lfcSE
df3 <- linda.obj$output$`TreatmentControl:Time22`$df

cov12 <- linda.obj$covariance$TreatmentControl$Time22
cov13 <- linda.obj$covariance$TreatmentControl$`TreatmentControl:Time22`
cov23 <- linda.obj$covariance$Time22$`TreatmentControl:Time22`

stat <- (alp1+alp3) / (sqrt(se1^2+se3^2+2*cov13))
pvalue <- 2 * pt(-abs(stat), (df1+df3)/2)
padj <- p.adjust(pvalue, method = 'BH')
rownames(linda.obj$output[[1]])[padj <= 0.1]

This code should allow me to see if there is a difference between Control and Treatment at Day 22. linda.object$covariance is NULL so I cant complete the calculation.

However, even with this method its not possible to get all pairwise comparisons. For example - Treatment at Day 0 vs Treatment at Day 22 or to check if there is a difference between Control and Placebo at any time.

How should I proceed to get all pairwise comparisons?

Your help is greatly appreciated. Many thanks

Kind regards,

P

zhouhj1994 · 2023-03-16T08:20:37Z

Thanks for the use of our package.

I am not sure whether your codes will work. If the "Treatment" variable has three levels: Control, Placebo, Treatment, then "Control" will be treated as the reference level (alphabetizing rule), so there shouldn't be such results as "linda.obj$output$TreatmentControl" but "linda.obj$output$TreatmentPlacebo" and "linda.obj$output$TreatmentTreatment" instead.

To get all pairwise comparisons, for example,

Treatment at Day 0 vs Treatment at Day 22: First please make sure that your "Time" variable is categorical (i.e., factor type in R syntax), and note that "0" will be treated as the reference level. Then the following codes would do the task.

alp1 <- linda.obj$output$Time22$log2FoldChange
se1 <- linda.obj$output$Time22$lfcSE
df1 <- linda.obj$output$Time22$df

alp2 <- linda.obj$output$TreatmentTreatment:Time22$log2FoldChange
se2 <- linda.obj$output$TreatmentTreatment:Time22$lfcSE
df2 <- linda.obj$output$TreatmentTreatment:Time22$df

cov12 <- linda.obj$covariance$Time22$TreatmentTreatment:Time22

stat <- (alp1+alp2) / (sqrt(se1^2+se2^2+2*cov12))
pvalue <- 2 * pt(-abs(stat), (df1+df2)/2)
padj <- p.adjust(pvalue, method = 'BH')
rownames(linda.obj$output[[1]])[padj <= 0.1]

Control and Placebo at any time: If you mean Control and Placebo at the same day, then the differential analysis would be

# Time 0
alp1 <- linda.obj$output$TreatmentPlacebo$log2FoldChange
se1 <- linda.obj$output$TreatmentPlacebo$lfcSE
df1 <- linda.obj$output$TreatmentPlacebo$df

stat <- alp1/se1
pvalue <- 2 * pt(-abs(stat), df1)
padj <- p.adjust(pvalue, method = 'BH')
rownames(linda.obj$output[[1]])[padj <= 0.1]

# Time 22
alp2 <- linda.obj$output$TreatmentPlacebo:Time22$log2FoldChange
se2 <- linda.obj$output$TreatmentPlacebo:Time22$lfcSE
df2 <- linda.obj$output$TreatmentPlacebo:Time22$df

cov12 <- linda.obj$covariance$TreatmentPlacebo$TreatmentPlacebo:Time22

stat <- (alp1+alp2) / (sqrt(se1^2+se2^2+2*cov12))
pvalue <- 2 * pt(-abs(stat), (df1+df2)/2)
padj <- p.adjust(pvalue, method = 'BH')
rownames(linda.obj$output[[1]])[padj <= 0.1]

# Time 52: similar to Time 22

For other comparisons, the analyses would be similar, just be clear that there would be the following 8 outputs

TreatmentPlacebo, TreatmentTreatment, Time22, Time52, TreatmentPlacebo:Time22, TreatmentPlacebo:Time52,
TreatmentTreatment:Time22, TreatmentTreatment:Time52

and

TreatmentPlacebo (can be seen as the baseline Placebo effect, i.e., at time 0), TreatmentTreatment (can be seen as the baseline Treatment effect, i.e., at time 0), Time22 (can be seen as the baseline time 22 effect, i.e., in control group) , Time52 (can be seen as the baseline time 52 effect, i.e., in control group)

PeterCx · 2023-03-16T11:22:43Z

Thanks for the response. This is very very helpful. However, I cannot find the covariances in the output.

cov12 <- linda.obj$covariance returns NULL

There is no output called covariance in linda.obj. How are you calculating this or what am I missing?

Thanks very much. P

zhouhj1994 · 2023-03-24T05:31:45Z

Sorry for the late reply.

Are you using the latest version of the package? The old version didn't include the covariance in the output.

Have you tried this example? https://github.com/zhouhj1994/LinDA Can you see the covariances in the output?

Thanks.

PeterCx · 2023-03-27T08:53:58Z

No problem.

Yes I was using wrong version. I was using the linda function implemented in the microbiomeStat package. I have now installed the LinDA package correctly via devtools and its al working perfectly. Thanks again for all your help.

P

zhouhj1994 · 2023-03-27T13:39:30Z

Great.

Ivy-ops · 2023-10-09T19:05:32Z

Thank you for the insightful discussion. I have a question related to this. It would be highly appreciated if you could help me with the explanation of the LinDA results. @zhouhj1994
For example, in @EOMAK91 example, Time has 0,22 and 52. Should we consider the "Time" variable as a continuous variable or as a categorical factor? How does the interpretation differ when "Time" is viewed as continuous versus when it's treated as a factor?

zhouhj1994 · 2023-10-12T08:34:41Z

If you want to understanding the effect of time, in other words, how is the abundance changed by one unit change of time, you may treat it as continuous.
If you are interest in the difference between time 0 and time22/time 52，you may treat it as categorical.

Ivy-ops · 2023-10-16T16:02:00Z

Hi @zhouhj1994, thank you for the answer.

I've executed the following command: fit <- linda(feature.dat=feature.dat, meta.dat=meta.dat, formula = "~ treatment * time"), where "time" includes the values 0, 22, and 52, and "treatment" includes two levels, "yes" and "no." Could you help me check whether my use of linda output is correct? Much appreciated!

If I consider "time" as a continuous variable:
Does "fit$output$time$log2FoldChange" imply that a one-unit change in time results in an x log2FoldChange in each feature?
Does "fit$output$treatmentyes$log2FoldChange" mean that, in comparison to the "no" treatment, the "yes" treatment yields an x log2FoldChange in feature.dat?

If I consider "time" as a categorical variable:
Does "fit$output$time$time22$log2FoldChange" signify that time22 exhibits an x log2FoldChange in each feature compared to time 0?
Does "fit$output$treatmentyes$log2FoldChange" mean that, relative to the "no" treatment, the "yes" treatment results in an x log2FoldChange in feature.dat?

zhouhj1994 · 2023-10-19T03:09:07Z

Your interpretation is not bad. But you may want to say "...while keeping other variables fixed" or something like that and be careful of the interaction term.

Your model is "treatment + time + treatment * time"

Consider "time" as a continuous variable:

The effect of time: for treatment "no", the effect is "fit$output$time$log2FoldChange", while for treatment "yes", the effect is "fit$output$time$log2FoldChange" + "fit$output$treatmentyes:time$log2FoldChange"

The effect of treatment: while fixing time, the treatment "yes" yields "fit$output$treatmentyes$log2FoldChange" + "fit$output$treatmentyes:time$log2FoldChange" change in comparison to the treatment "no".

Consider "time" as a categorical variable:

The effect of time22: for treatment "no", the effect is "fit$output$time22$log2FoldChange", while for treatment "yes", the effect is "fit$output$time22$log2FoldChange" + "fit$output$treatmentyes:time22$log2FoldChange"

The effect of treatment: for fixed time22, the treatment "yes" yields "fit$output$treatmentyes$log2FoldChange" + "fit$output$treatmentyes:time22$log2FoldChange" change in comparison to the treatment "no".

Ivy-ops · 2023-10-23T15:10:00Z

Thank you for the correction and the details! @zhouhj1994

PengfanZhang · 2024-05-15T03:09:54Z

Hi, I have a naive question. What does fit$output$time$log2FoldChange mean when time is a continuous variable? Which two groups are used to calculate the fold change? Thanks!

zhouhj1994 · 2024-05-17T05:10:00Z

When time is a continuous variable, fit$output$time$log2FoldChange means the effect of time on the response variable (if there are no interactions considered). We might say, a one unit increase in time corresponds to change of fit$output$time$log2FoldChange in response.

zhouhj1994 · 2024-05-17T05:12:47Z

Here, the response variable is the log2-transformed microbiome abundance.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

LinDA for longitudinal microbiome analysis #7

LinDA for longitudinal microbiome analysis #7

EOMAK91 commented Feb 2, 2023

zhouhj1994 commented Feb 4, 2023

PeterCx commented Mar 15, 2023 •

edited

Loading

zhouhj1994 commented Mar 16, 2023

PeterCx commented Mar 16, 2023

zhouhj1994 commented Mar 24, 2023

PeterCx commented Mar 27, 2023 •

edited

Loading

zhouhj1994 commented Mar 27, 2023

Ivy-ops commented Oct 9, 2023

zhouhj1994 commented Oct 12, 2023

Ivy-ops commented Oct 16, 2023

zhouhj1994 commented Oct 19, 2023

Ivy-ops commented Oct 23, 2023

PengfanZhang commented May 15, 2024

zhouhj1994 commented May 17, 2024

zhouhj1994 commented May 17, 2024

LinDA for longitudinal microbiome analysis #7

LinDA for longitudinal microbiome analysis #7

Comments

EOMAK91 commented Feb 2, 2023

zhouhj1994 commented Feb 4, 2023

PeterCx commented Mar 15, 2023 • edited Loading

zhouhj1994 commented Mar 16, 2023

PeterCx commented Mar 16, 2023

zhouhj1994 commented Mar 24, 2023

PeterCx commented Mar 27, 2023 • edited Loading

zhouhj1994 commented Mar 27, 2023

Ivy-ops commented Oct 9, 2023

zhouhj1994 commented Oct 12, 2023

Ivy-ops commented Oct 16, 2023

zhouhj1994 commented Oct 19, 2023

Ivy-ops commented Oct 23, 2023

PengfanZhang commented May 15, 2024

zhouhj1994 commented May 17, 2024

zhouhj1994 commented May 17, 2024

PeterCx commented Mar 15, 2023 •

edited

Loading

PeterCx commented Mar 27, 2023 •

edited

Loading