Skip to content

Mixscape classification looks wrong in identifying NP population #5134

Closed
@RuiyuRayWang

Description

@RuiyuRayWang

Hi, Seurat team

I'm trying to analyze a Perturb-seq dataset following the mixscape vignette.

I encountered two issues:

  1. After RunMixscape(), R console show several messages "WARNING! NOT CONVERGENT! ". I know this warning message shows up when normalmixEM() failed at finding optimal parameters. But what I'm curious is that when I do the plots, a large amount of cells were predicted as NP (see plots below).
    Rplot1
    Rplot2
    My first question: is this normal?

  2. To check that mixscape is assigning the correct perturbation status to cells, I used PlotPerturbScore() function as suggested by the mixscape vignette to visualize the results from RunMixscape().
    It seems this visualization does not work for genes with 0 KO. i.e. for Adnp,
    PlotPerturbScore(cells, target.gene.ident = "Adnp", mixscape.class = "mixscape_class", col = "coral2") + labs(fill = "mixscape class")
    returned
    Error in do.call(rbind, prtb_score_list) : second argument must be a list

Additionally, when I visualize one of the genes that do have KOs, i.e. Cul3
PlotPerturbScore(mg, target.gene.ident = "Cul3", mixscape.class = "mixscape_class", col = "coral2") + labs(fill = "mixscape class")
image
It seems that a small percentage of cells on the right tail of the population were also assigned as NP. Why this behavior? Is there a way to adjust the parameters so that I could get better results?

Thanks!
Ray

Activity

epapalexi

epapalexi commented on Oct 4, 2021

@epapalexi
Contributor

Hi Ray,

Regarding your first question: I happen to be familiar with this particular dataset you are trying to analyze. May I ask if you are taking into consideration that there are different cell types in the dataset before running mixscape? If not I would suggest running mixscape and specifying split.by = "cell.type" where cell.type is a metadata column with your cell type classifications. This helps a lot in cases where a perturbation has a cell-type specific effect that can be missed if you run mixscape pooling all cell types together. In the past, when I analyzed this dataset using split.by I was able to get a much higher % of perturbed cells.
Regarding the error on PlotPerturbScore: we do not calculate a perturbation score if mixscape finds that there are no differentially expressed genes between a target gene class and NTs. This is because the score is calculated based on the differentially expressed genes. That's why it throws an error. I will modify the code a bit so that in the feature instead of an error users get a message explaining this to them to avoid confusion.
Finally, about the Cul3 NP cells that fall into the KO cell distribution, I am not sure why this happens. If you can share your code with me I can take a look and figure out what it going on.

I hope this helps.

Best,
Efi

RuiyuRayWang

RuiyuRayWang commented on Oct 6, 2021

@RuiyuRayWang
Author

Hi Efi, Thanks for your response!

Yes, I'm aware that there are different cell types in the dataset, but since I'm only interested in one particular cell type (microglia), I extracted Microglia into a single object and ran mixscape on these subsetted cells. This gave me the low perturbed cell counts above.

I tried the approach you mentioned, which is RunMixscape() on the entire object with split.by="CellType". This indeed gave me slightly increased percentage (from 5.3% to 10.8%) of "KO" cells, compared to running it on stand-alone Microglia object.

Regarding PlotPerturbScore, here's the code I used, which is basically codes from the mixscape vignette. The only difference is that in RunMixscape() calls, I tuned min.de.genes and iter.num slightly since I saw "WARNING! NOT CONVERGENT!" messages from mixtools with default params.

# Data, metadata and tSNE coordinates are downloaded from broad institute single cell portal
cells <- CreateSeuratObject(counts = data, 
                            project = "JIN", 
                            names.field = 2, 
                            names.delim = "_", 
                            meta.data = meta)
cells[["tsne"]] <- CreateDimReducObject(embeddings = as.matrix(tsne_coord), key = "TSNE_", assay = "RNA")


# Subset high quality microglia and calculate Cell Cycle Score
mg <- subset(cells, subset = CellType == "Microglia" & isKey == TRUE)  # isKey: high-quality cells
s.genes <- stringr::str_to_title(tolower(cc.genes$s.genes))
g2m.genes <- stringr::str_to_title(tolower(cc.genes$g2m.genes))
mg <- CellCycleScoring(mg, s.features = s.genes, g2m.features = g2m.genes, set.ident = TRUE)


# Reassign perturbation status: Perturbed vs NT
Idents(mg) <- "Perturbation"
mg <- SetIdent(mg, cells = Cells(mg)[!is.na(Idents(mg))], value = "Perturbed")
mg <- SetIdent(mg, cells = Cells(mg)[is.na(Idents(mg))], value = "NT")
mg[["crispr"]] <- Idents(mg)

Idents(mg) <- "Perturbation"
mg <- SetIdent(mg, cells = Cells(mg)[is.na(Idents(mg))], value = "NT")
mg <- SetIdent(mg, cells = WhichCells(mg, expression = Perturbation == "GFP"), value = "NT")
mg[["gene"]] <- Idents(mg)


# Run Standard Seurat workflow on "RNA" assay
DefaultAssay(mg) <- "RNA"
mg <- mg %>%
  SetAssayData(slot = "data", new.data = GetAssayData(mg, slot = "counts")) %>%
  # NormalizeData() %>%  ## Input data already normalized
  FindVariableFeatures() %>%
  ScaleData() %>%
  RunPCA() %>%
  RunUMAP(dims = 1:40)


# Calculating local perturbation signatures to mitigate confounding effects
mg <- CalcPerturbSig(object = mg, 
                     assay = "RNA", 
                     slot = "data", 
                     gd.class ="gene", 
                     nt.cell.class = "NT", 
                     reduction = "pca", 
                     ndims = 40, 
                     num.neighbors = 20, 
                     split.by = "Batch",
                     new.assay.name = "PRTB")


# Run Seurat standard workflow on "PRTB" assay
# Normalize data, find variable features and center data.
DefaultAssay(object = mg) <- 'PRTB'

# Use variable features from "RNA" assay.
VariableFeatures(object = mg) <- VariableFeatures(object = mg[["RNA"]])
mg <- ScaleData(object = mg, do.scale = F, do.center = T)

# Run PCA to reduce the dimensionality of the data.
mg <- RunPCA(object = mg, reduction.key = 'PCAprtb_', reduction.name = 'prtbpca')


# Identifying perturbations with no detectable effect
mg <- RunMixscape(object = mg, 
                  assay = "PRTB", 
                  slot = "scale.data", 
                  labels = "gene", 
                  nt.class.name = "NT", 
                  min.de.genes = 5, 
                  iter.num = 20, 
                  de.assay = "RNA", 
                  verbose = F,
                  prtb.type = "KO")


# Plotting Cul3 mixscape class identified by normalmixEM
PlotPerturbScore(mg,
                 target.gene.ident = "Cul3",
                 mixscape.class = "mixscape_class",
                 col = "coral2") + labs(fill = "mixscape class")

Best,
Ray

kaanokay

kaanokay commented on Jun 13, 2022

@kaanokay

Hi @epapalexi,

I have an issue with PlotPerturbScore function. I got this error: "Error in "[.data.frame`(prtb_score, , target.gene.class) : undefined columns selected" when I run PlotPerturbScore function with my CRISPR seurat object. Almost everything in my Seurat object is the same with yours (i.e., eccite Seurat object in Mixscape pipeline). Only difference is that my Seurat object have mouse gene names. I'm attaching my Seurat data with rds format (https://drive.google.com/drive/folders/1DJhqlk5hxLffDhGtZ9Gu3XIiGfTowvbN).

Could you help me with this?
Bw.

epapalexi

epapalexi commented on Jun 15, 2022

@epapalexi
Contributor

Hi Kaan,

Thanks for sending your seurat object. Just looking at your object it looks like you only calculated the perturbation signature of your cells. If you want to obtain perturbation scores you will have to run the RunMixscape command and then try using the PlotPerturbScore function to visualize the scores.
In case you already ran RunMixscape function and you are still getting the error above, can you please provide an updated object alongside the script you used to generated it?

Efi

kaanokay

kaanokay commented on Jun 15, 2022

@kaanokay

Hi Efi,

I ran RunMixscape function to obtain perturbation scores:

Mixscape_seurat <- RunMixscape( object = Mixscape_seurat, assay = "PRTB", slot = "scale.data", labels = "gene", nt.class.name = "NT", min.de.genes = 5, iter.num = 10, de.assay = "RNA", verbose = F, prtb.type = "KO", split.by = "replicate")

I used split.by argument because I have two replicates. If I do not use split.by parameter for my replicates, I'm getting a warning :"WARNING! NOT CONVERGENT!".

and then I ran PlotPerturbScore function but I got an error: "Error: Must request at least one colour from a hue palette". This error is different than previous one but I did not understand why I got this error.

Code was like:

PlotPerturbScore(object = Mixscape_seurat, target.gene.ident = "Gata3", mixscape.class = "mixscape_class", col = "coral2") + labs(fill = "mixscape class")

I updated my Seurat object in the same link. I used this updated Seurat object in PlotPerturbScore function.

Thank you for quick reply and developing such good tool for Perturb-seq.

Bw.

epapalexi

epapalexi commented on Jun 23, 2022

@epapalexi
Contributor

Hi Kaan,

I was able to figure out what was giving you the above error. I have merged a fix in the develop branch of Seurat. Here are instructions on how to use the new version to be able to plot the perturbation score:

install.packages('remotes')
remotes::install_github(repo = 'satijalab/seurat', ref = 'develop')
library(Seurat)

Once you do this, you should be able to use PlotPerturbScore.
Efi

kaanokay

kaanokay commented on Jun 24, 2022

@kaanokay

Hi Efi,

Thank you so much for your help! I'll try it.

~Kaan

littleju777

littleju777 commented on Nov 1, 2022

@littleju777

Hi Kaan,

I was able to figure out what was giving you the above error. I have merged a fix in the develop branch of Seurat. Here are instructions on how to use the new version to be able to plot the perturbation score:

install.packages('remotes') remotes::install_github(repo = 'satijalab/seurat', ref = 'develop') library(Seurat)

Once you do this, you should be able to use PlotPerturbScore. Efi

Hi @epapalexi ,

I got the same error in plotperturbsocre, Error in do.call(rbind, prtb_score_list) : second argument must be a list.

I try your solution, but it doesn't work. Could you help me with that?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Participants

    @RuiyuRayWang@epapalexi@kaanokay@littleju777

    Issue actions

      Mixscape classification looks wrong in identifying NP population · Issue #5134 · satijalab/seurat