We read every piece of feedback, and take your input very seriously.
To see all available qualifiers, see our documentation.
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
'FindAllMarkers' and 'FindMarkers' appear to be pulling fold change data from the wrong slot.
# insert reproducible example here > library(dplyr) Attaching package: ‘dplyr’ The following objects are masked from ‘package:stats’: filter, lag The following objects are masked from ‘package:base’: intersect, setdiff, setequal, union > library(Seurat) > pbmc_small <- pbmc_small > View(pbmc_small) > pbmc_small <- SCTransform(pbmc_small) Calculating cell attributes from input UMI matrix: log_umi Variance stabilizing transformation of count matrix of size 220 by 80 Model formula is y ~ log_umi Get Negative Binomial regression parameters per gene Using 220 genes, 80 cells |======================================================================================================================================================| 100% Second step: Get residuals using fitted parameters for 220 genes |======================================================================================================================================================| 100% Computing corrected count matrix for 220 genes |======================================================================================================================================================| 100% Calculating gene attributes Wall clock passed: Time difference of 0.5991514 secs Determine variable features Place corrected count matrix in counts slot Centering data matrix |======================================================================================================================================================| 100% Set default assay to SCT > pbmc_small <- FindVariableFeatures(pbmc_small) > pbmc_small <- RunPCA(pbmc_small, features = pbmc_small@assays$SCT@var.features) Warning in irlba(A = t(x = object), nv = npcs, ...) : You're computing too large a percentage of total singular values, use a standard svd instead. PC_ 1 Positive: NKG7, CCL5, CST7, GZMA, CTSW, PRF1, GNLY, IL32, FGFBP2, GZMM CD7, CD3D, LCK, GZMB, RARRES3, CD247, XBP1, LAMP1, CD3E, GZMH SPON2, KLRD1, ALOX5AP, CCL4, EIF4A2, C12orf75, GYPC, KLRG1, LYAR, AKR1C3 Negative: CST3, LYZ, LST1, S100A9, IFITM3, TYMP, HLA-DRA, AIF1, HLA-DPA1, SERPINA1 S100A8, HLA-DPB1, CFD, FCN1, GRN, HLA-DMA, HLA-DRB5, CTSS, LGALS2, HLA-DRB1 IFI30, LGALS3, CFP, FCGRT, BID, HLA-DQB1, S100A11, CD68, HLA-DQA1, SAT1 PC_ 2 Positive: PPBP, SDPR, CLU, GNG11, PF4, GPX1, SPARC, CD9, HIST1H2AC, GP9 TUBB1, NRGN, NCOA4, TPM4, CA2, NGFRAP1, ITGA2B, TREML1, TMEM40, PGRMC1 RUFY1, ODC1, CCL5, MYL9, FERMT3, PTCRA, ACRBP, PARVB, TSC22D1, TALDO1 Negative: NKG7, IFITM2, CST7, GZMA, FCGR3A, IL32, CTSW, EIF4A2, SSR2, FGFBP2 CD7, GNLY, CD3D, TYROBP, SRSF7, GZMM, PRF1, HSP90AA1, LCK, LGALS1 CXCR4, HNRNPA3, XBP1, GZMB, LTB, S100A11, FCER1G, HNRNPF, CD247, YWHAB PC_ 3 Positive: HLA-DRA, HLA-DQB1, LTB, HLA-DRB1, HLA-DPB1, MS4A1, CD79A, HLA-DPA1, HLA-DQA1, HLA-DRB5 HLA-DMB, CD79B, CXCR4, TCL1A, SP100, LY86, NT5C, SNHG7, HLA-DQA2, HLA-DMA HVCN1, TMEM123, FCER2, LINC00926, PPAPDC1B, FCRLA, BANK1, EAF2, TNFAIP8, RP11-693J15.5 Negative: TYROBP, FCER1G, FCGR3A, LGALS1, LST1, CFD, CD68, AIF1, S100A9, IFITM3 S100A8, SERPINA1, FCN1, GZMB, CST3, SAT1, RHOC, PRF1, S100A11, GNLY TYMP, PSAP, CARD16, FGFBP2, CTSS, GZMA, NKG7, RGS2, SMCO4, LILRA3 PC_ 4 Positive: FGFBP2, GZMB, GNLY, PRF1, NKG7, CST7, HLA-DPB1, HLA-DRB1, HLA-DPA1, CTSW HLA-DQB1, HLA-DRA, HLA-DQA1, CCL4, GZMA, AKR1C3, GZMH, HLA-DMA, IGFBP7, TTC38 HLA-DMB, KLRD1, HLA-DRB5, C12orf75, SPON2, IL2RB, PCMT1, CLIC3, ALOX5AP, GSTP1 Negative: CD3D, LTB, LDHB, IL7R, CD3E, TMEM123, EIF4A2, NOSIP, THYN1, IFITM2 TAF7, TAGAP, COTL1, MPHOSPH6, IL32, ACAP1, SRSF7, PIK3IP1, CD2, MAL DNAJB1, EPC1, SAFB2, AIF1, GYPC, CXCR4, SAT1, ASNSD1, CTSS, CCR7 PC_ 5 Positive: LYZ, LGALS2, S100A9, FCER1A, CLEC10A, GSTP1, MS4A6A, S100A8, HLA-DMA, IL1B RGS1, LGALS3, CD14, ASGR1, CD3D, GRN, FCGRT, LDHB, TYMP, CST3 SRSF7, GPX1, CFP, IL32, IL7R, CD1C, MPEG1, FUOM, RNF130, LCK Negative: FCGR3A, CD79B, RHOC, RP11-290F20.3, TNFRSF1B, MS4A1, HVCN1, LILRA3, SAT1, AIF1 IFITM2, MS4A7, CD79A, SERPINA1, CD68, LST1, CTSS, FCER1G, FAM96A, TCL1A IFITM3, ADAR, EPC1, WARS, C5AR1, RGS2, HCK, LTB, PSAP, LINC00926 > pbmc_small <- FindNeighbors(pbmc_small, dims = 1:30) Computing nearest neighbor graph Computing SNN > pbmc_small <- FindClusters(pbmc_small) Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck Number of nodes: 80 Number of edges: 1876 Running Louvain algorithm... 0% 10 20 30 40 50 60 70 80 90 100% [----|----|----|----|----|----|----|----|----|----| **************************************************| Maximum modularity in 10 random starts: 0.5397 Number of communities: 4 Elapsed time: 0 seconds > FindAllMarkers(pbmc_small, pseudocount.use = 0, assay = 'SCT', slot = 'data') |> head() Calculating cluster 0 |++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=00s Calculating cluster 1 |++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=00s Calculating cluster 2 |++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=00s Calculating cluster 3 |++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=00s p_val avg_log2FC pct.1 pct.2 p_val_adj cluster gene LCK 1.652211e-11 5.946419 0.733 0.02 3.634864e-09 0 LCK CD3D 5.627207e-11 7.451211 0.700 0.02 1.237985e-08 0 CD3D CD7 6.484134e-11 6.380822 0.700 0.02 1.426510e-08 0 CD7 CST7 7.186854e-10 5.886713 0.667 0.04 1.581108e-07 0 CST7 IL32 7.878490e-10 6.544321 0.667 0.04 1.733268e-07 0 IL32 GZMM 9.576371e-10 Inf 0.600 0.00 2.106802e-07 0 GZMM # 'FoldChange' gives a different result than is reported by 'FindAllMarkers' when both use 'assay = "SCT"' and 'slot = "data"' > FoldChange(pbmc_small, pseudocount.use = 0, ident.1 = 0, ident.2 = c(1:3), features = 'LCK', assay = 'SCT', slot = 'data') avg_log2FC pct.1 pct.2 LCK 5.627499 0.733 0.02 #specifying 'slot = "counts"' returns the same fold change reported by 'FindAllMarkers' that was set to 'slot = "data"' > FoldChange(pbmc_small, pseudocount.use = 0, ident.1 = 0, ident.2 = c(1:3), features = 'LCK', assay = 'SCT', slot = 'counts') avg_log2FC pct.1 pct.2 LCK 5.946419 0.733 0.02 #the same is true for 'FindMarkers' > FindMarkers(pbmc_small, pseudocount.use = 0, ident.1 = 0, ident.2 = 1, features = 'LCK', assay = 'SCT', slot = 'data') |++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=00s p_val avg_log2FC pct.1 pct.2 p_val_adj LCK 3.088212e-07 5.003002 0.733 0.038 6.794065e-05 # 'FoldChange' gives a different result than is reported by 'FindMarkers' > FoldChange(pbmc_small, pseudocount.use = 0, ident.1 = 0, ident.2 = 1, features = 'LCK', assay = 'SCT', slot = 'data') avg_log2FC pct.1 pct.2 LCK 4.684082 0.733 0.038 #again specifying 'slot = "counts"' returns the same result that 'FindMarkers' reports when 'slot = "data"' > FoldChange(pbmc_small, pseudocount.use = 0, ident.1 = 0, ident.2 = 1, features = 'LCK', assay = 'SCT', slot = 'counts') avg_log2FC pct.1 pct.2 LCK 5.003002 0.733 0.038 #pulling the values out manually reveals 'FoldChange' is reporting the correct values while 'FindAllMarkers' and 'FindMarkers' #are pulling fold change data from the wrong slot. > pbmc_small@assays$SCT@data['LCK', (row.names(filter(pbmc_small@meta.data, seurat_clusters == 0)))] |> mean() [1] 0.6853326 > pbmc_small@assays$SCT@data['LCK', (row.names(filter(pbmc_small@meta.data, seurat_clusters == 1)))] |> mean() [1] 0.02665951 > pbmc_small@assays$SCT@data['LCK', (row.names(filter(pbmc_small@meta.data, seurat_clusters != 0)))] |> mean() [1] 0.01386294 > log2(0.6853326/0.02665951) [1] 4.684082 > log2(0.6853326/0.01386294) [1] 5.627499 > pbmc_small@assays$SCT@counts['LCK', (row.names(filter(pbmc_small@meta.data, seurat_clusters == 0)))] |> mean() [1] 1.233333 > pbmc_small@assays$SCT@counts['LCK', (row.names(filter(pbmc_small@meta.data, seurat_clusters == 1)))] |> mean() [1] 0.03846154 > pbmc_small@assays$SCT@counts['LCK', (row.names(filter(pbmc_small@meta.data, seurat_clusters != 0)))] |> mean() [1] 0.02 > log2(1.233333/0.03846154) [1] 5.003002 > log2(1.233333/0.02) [1] 5.946419 > sessionInfo() R version 4.2.2 Patched (2022-11-10 r83330) Platform: x86_64-pc-linux-gnu (64-bit) Running under: Ubuntu 22.04.2 LTS Matrix products: default BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.10.0 LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0 locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8 [6] LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] Seurat_4.3.0 dplyr_1.1.0 SeuratObject_4.1.3 sp_1.6-0 loaded via a namespace (and not attached): [1] nlme_3.1-162 matrixStats_0.63.0 spatstat.sparse_3.0-0 RcppAnnoy_0.0.20 RColorBrewer_1.1-3 httr_1.4.4 [7] backports_1.4.1 sctransform_0.3.5 tools_4.2.2 utf8_1.2.3 R6_2.5.1 irlba_2.3.5.1 [13] KernSmooth_2.23-20 uwot_0.1.14 lazyeval_0.2.2 colorspace_2.1-0 withr_2.5.0 tidyselect_1.2.0 [19] gridExtra_2.3 compiler_4.2.2 progressr_0.13.0 cli_3.6.0 spatstat.explore_3.0-6 plotly_4.10.1 [25] scales_1.2.1 lmtest_0.9-40 spatstat.data_3.0-0 ggridges_0.5.4 pbapply_1.7-0 readODS_1.8.0 [31] goftest_1.2-3 stringr_1.5.0 digest_0.6.30 spatstat.utils_3.0-1 pkgconfig_2.0.3 htmltools_0.5.4 [37] parallelly_1.34.0 limma_3.52.4 fastmap_1.1.0 htmlwidgets_1.6.1 rlang_1.0.6 rstudioapi_0.14 [43] shiny_1.7.4 generics_0.1.3 zoo_1.8-11 jsonlite_1.8.3 spatstat.random_3.1-3 ica_1.0-3 [49] car_3.1-1 magrittr_2.0.3 patchwork_1.1.2 Matrix_1.5-3 Rcpp_1.0.10 munsell_0.5.0 [55] fansi_1.0.4 abind_1.4-5 reticulate_1.28 lifecycle_1.0.3 stringi_1.7.12 carData_3.0-5 [61] MASS_7.3-58.2 Rtsne_0.16 plyr_1.8.8 grid_4.2.2 parallel_4.2.2 listenv_0.8.0 [67] promises_1.2.0.1 ggrepel_0.9.3 deldir_1.0-6 miniUI_0.1.1.1 lattice_0.20-45 cowplot_1.1.1 [73] splines_4.2.2 tensor_1.5 pillar_1.8.1 igraph_1.3.5 spatstat.geom_3.0-6 future.apply_1.10.0 [79] reshape2_1.4.4 codetools_0.2-19 leiden_0.4.3 gprofiler2_0.2.1 glue_1.6.2 data.table_1.14.6 [85] png_0.1-8 vctrs_0.5.2 httpuv_1.6.9 polyclip_1.10-4 gtable_0.3.1 RANN_2.6.1 [91] purrr_1.0.1 tidyr_1.3.0 scattermore_0.8 future_1.31.0 ggplot2_3.4.1 mime_0.12 [97] broom_1.0.3 xtable_1.8-4 rstatix_0.7.2 later_1.3.0 survival_3.5-3 viridisLite_0.4.1 [103] tibble_3.1.8 cluster_2.1.4 globals_0.16.2 fitdistrplus_1.1-8 ellipsis_0.3.2 ROCR_1.0-11
The text was updated successfully, but these errors were encountered:
Thanks for the bug report and reproducible example @jbeanphd! This is now fixed in the develop branc e08e108
You can check out the latest updates by installing the develop branch:
remotes::install_github("satijalab/seurat", ref="develop")
If you are still facing issues, please open a new issue.
Sorry, something went wrong.
No branches or pull requests
'FindAllMarkers' and 'FindMarkers' appear to be pulling fold change data from the wrong slot.
The text was updated successfully, but these errors were encountered: