Skip to content

How to implement more complex contrast formulas #305

@CarlottaHS

Description

@CarlottaHS

Hi!

Thanks a lot for the great tool! We already got some interesting results, but now are running into issues using more complex design and contrast matrices.

I tried following your vignette in your “Making comparisons for differential abundance using contrasts” notebook and the limma syntax for specifying contrast and design matrix, but so far no success.

You point to the limma and edgeR bioconductor pages for in-depth discussion on deciding which design is useful. But now are running into problems when trying to use these designs.

We have a design matrix, which includes many different factors.

> head(combined_design)
          sampleName Diagnosis Visit Study.ID
A_009_V1 A_009_V1   Healthy    V1    A_009      
B_004_V4 B_004_V4       SMM    V4    B_004      
B_014_V3 B_014_V3        MM    V3    B_014      
C_001_V2 C_001_V2     NSCLC    V2    C_001      
A_009_V3 A_009_V3   Healthy    V3    A_009      
B_004_V2 B_004_V2       SMM    V2    B_004      

What works are simple contrasts:

contrast.all <- c("DiagnosisHealthy - DiagnosisMM", "DiagnosisHealthy - DiagnosisNSCLC", "DiagnosisHealthy - DiagnosisSMM")
model <- model.matrix(~ 0 + Diagnosis, data = combined_design)
model.contrast <- makeContrasts(contrasts = contrast.all, levels = model)
model.contrast 

> Contrasts
Levels             DiagnosisHealthy - DiagnosisMM DiagnosisHealthy - DiagnosisNSCLC DiagnosisHealthy - DiagnosisSMM
  DiagnosisHealthy                              1                                 1                               1
  DiagnosisMM                                  -1                                 0                               0
  DiagnosisNSCLC                                0                                -1                               0
  DiagnosisSMM                                  0                                 0                              -1

and then running each contrast sequentially. E.g.:

DA_woEry1 <- testNhoods(combined_milo, design = ~ 0 + Diagnosis, design.df = combined_design,
model.contrasts = "DiagnosisHealthy - DiagnosisMM", fdr.weighting = "graph-overlap")

But we further also want to include the timepoint/ Visit into the analysis and not just the diagnosis. Here, we run into issues. The code is running, but the results are not what we would expect.

model3 <- model.matrix(~0+Diagnosis+Visit, data = combined_design)

model.contrast3 <- makeContrasts(Healthy_V1_vs_V2 = DiagnosisHealthy - VisitV2,
    Healthy_V1_vs_V3 = DiagnosisHealthy - VisitV3,
    Healthy_V1_vs_V4 = DiagnosisHealthy - VisitV4,
    MM_V1_vs_V2 = DiagnosisMM - VisitV2,
    MM_V1_vs_V3 = DiagnosisMM - VisitV3,
    MM_V1_vs_V4 = DiagnosisMM - VisitV4,
    NSCLC_V1_vs_V2 = DiagnosisNSCLC - VisitV2,
    NSCLC_V1_vs_V3 = DiagnosisNSCLC - VisitV3,
    NSCLC_V1_vs_V4 = DiagnosisNSCLC - VisitV4,
    SMM_V1_vs_V2 = DiagnosisSMM - VisitV2,
    SMM_V1_vs_V3 = DiagnosisSMM - VisitV3,
    SMM_V1_vs_V4 = DiagnosisSMM - VisitV4,
    levels = model3)

# also testing each contrast sequentially 
DA_test <- testNhoods(combined_milo, design = model3, design.df = combined_design, model.contrasts = "DiagnosisHealthy - VisitV2" , fdr.weighting = "graph-overlap")

image

image

We also tried using some other model matrices such as “~0 + Diagnosis + Diagnosis:Visit” etc. but the results are similar.

We are a bit lost as we are following the limma syntax and are using their tutorial to create our model and contrast matrices and they have the expected format, so we are wondering what we need to do differently for the Milo package.

I hope my explanations are clear and sufficient. If not I am happy to provide more context!

Thanks in advance!

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions