Skip to content

Integration doesnt work in Signac_1.0.9003 #292

@gouthamatla

Description

@gouthamatla

Hi, I am trying to integrate data with 3 conditions. Here is my code that I used for all three conditions to create the Seurat objects: I quantified all the three conditions using custom peak set, such that the features are consistent across,

library(Signac)
library(Seurat)
library(GenomicRanges)
library(future)

set.seed(1234)

regulome_peaks = read.table("scATAC/regulome_clustering_V7.txt")
regulome_peaks = StringToGRanges(regulome_peaks$V1, sep = c(":", "-"))

Cond1_metadata <- read.csv(
  file = "scATAC/Cond1/outs/singlecell.csv", eader = TRUE, ow.names = 1
)

Cond1_fragment.path <- 'scATAC/Cond1/outs/fragments.tsv.gz'

Cond1_fragment <- CreateFragmentObject(
  path = Cond1_fragment.path, alidate.fragments = FALSE
)

Cond1_drug_FM <- FeatureMatrix(
  fragments = Cond1_fragment,features = regulome_peaks,
)

rowcounts <- rowSums(Cond1_drug_FM)
Cond1_drug_FM <- Cond1_drug_FM[rowcounts>30,] #To eliminate features with less than 30 counts

Cond1_drug_assay <- CreateChromatinAssay(
  counts = Cond1_drug_FM, ep = c("-", "-"), enome = 'hg19', in.features = 5000, in.cells = 0
)

Cond1_drug <- CreateSeuratObject(
  counts = Cond1_drug_assay, ssay = "peaks",meta.data = Cond1_metadata
)

Fragments(Cond1_drug) <- CreateFragmentObject(
  path = Cond1_fragment.path, cells = colnames(Cond1_drug), tolerance = 0.5
)

Cond1_drug <- NucleosomeSignal(object = Cond1_drug)
Cond1_drug$nucleosome_group <- ifelse(Cond1_drug$nucleosome_signal > 10, 'NS > 10', 'NS < 10')
Cond1_drug$pct_reads_in_peaks <- Cond1_drug$peak_region_fragments / Cond1_drug$passed_filters * 100
Cond1_drug$blacklist_ratio <- Cond1_drug$blacklist_region_fragments / Cond1_drug$peak_region_fragments
Cond1_drug <- subset(Cond1_drug, subset = peak_region_fragments > 3000 & peak_region_fragments < 25000 & pct_reads_in_peaks > 25 & blacklist_ratio < 0.05 & nucleosome_signal < 5)

Cond1_drug <- RunTFIDF(Cond1_drug)
Cond1_drug <- FindTopFeatures(Cond1_drug)
Cond1_drug <- RunSVD(object = Cond1_drug,assay = 'peaks', irlba.work=500
)

Cond1_drug <- RunUMAP(object = Cond1_drug, reduction = 'lsi', dims = 2:30)
Cond1_drug <- FindNeighbors(object = Cond1_drug, reduction = 'lsi', dims = 2:30)
Cond1_drug <- FindClusters(object = Cond1_drug, verbose = FALSE, algorithm = 3)

This was done for 3 conditions, which gives beautiful clusters that are cell-type specific. First I subset peaks that are common across all the conditions and then used them to subset my objects and tried integration.

Then I am trying to integrate the three data sets:

Cond1_peaks = StringToGRanges(rownames(Cond1_drug), sep=c("-","-"))
Cond2_peaks = StringToGRanges(rownames(Cond2_drug), sep=c("-","-"))
Cond3_peaks = StringToGRanges(rownames(Cond3_drug), sep=c("-","-"))

peaks.use_common_all = GRangesToString(Reduce(intersect, list(Cond1_peaks, Cond2_peaks, Cond3_peaks)), sep = c("-", "-"))

Cond1_drug_comm <- subset(Cond1_drug, features=peaks.use_common_all)
Cond2_drug_comm <- subset(Cond2_drug, features=peaks.use_common_all)
Cond3_drug_comm <- subset(Cond3_drug, features=peaks.use_common_all)

Cond3_drug_comm

An object of class Seurat
169722 features across 4025 samples within 1 assay
Active assay: peaks (169722 features, 166876 variable features)
2 dimensional reductions calculated: lsi, umap

Cond1_drug_comm

An object of class Seurat
169722 features across 2657 samples within 1 assay
Active assay: peaks (169722 features, 164343 variable features)
2 dimensional reductions calculated: lsi, umap

Cond2_drug_comm

An object of class Seurat
169722 features across 2699 samples within 1 assay
Active assay: peaks (169722 features, 165553 variable features)
2 dimensional reductions calculated: lsi, umap

Cond1_drug_comm$condition = "Cond1"
Cond2_drug_comm$condition = "Cond2"
Cond3_drug_comm$condition = "Cond3"

HI.anchors <- FindIntegrationAnchors(object.list = list(Cond2_drug_comm, Cond1_drug_comm,Cond3_drug_comm),
                                     anchor.features = peaks.use_common_all,
                                    reduction="cca",
                                    reference = 1)
This throws the following error:

Warning message in CheckDuplicateCellNames(object.list = object.list):
"Some cell names are duplicated across objects provided. Renaming to enforce unique cell names."Scaling features for provided objects
Warning message:
"UNRELIABLE VALUE: Future ('future_lapply-1') unexpectedly generated random numbers without specifying argument '[future.]seed'. There is a risk that those random numbers are not statistically sound and the overall results might be invalid. To fix this, specify argument '[future.]seed', e.g. 'seed=TRUE'. This ensures that proper, parallel-safe random numbers are produced via the L'Ecuyer-CMRG method. To disable this check, use [future].seed=NULL, or set option 'future.rng.onMisuse' to "ignore"."Warning message:
"UNRELIABLE VALUE: Future ('future_lapply-2') unexpectedly generated random numbers without specifying argument '[future.]seed'. There is a risk that those random numbers are not statistically sound and the overall results might be invalid. To fix this, specify argument '[future.]seed', e.g. 'seed=TRUE'. This ensures that proper, parallel-safe random numbers are produced via the L'Ecuyer-CMRG method. To disable this check, use [future].seed=NULL, or set option 'future.rng.onMisuse' to "ignore"."Warning message:
"UNRELIABLE VALUE: Future ('future_lapply-3') unexpectedly generated random numbers without specifying argument '[future.]seed'. There is a risk that those random numbers are not statistically sound and the overall results might be invalid. To fix this, specify argument '[future.]seed', e.g. 'seed=TRUE'. This ensures that proper, parallel-safe random numbers are produced via the L'Ecuyer-CMRG method. To disable this check, use [future].seed=NULL, or set option 'future.rng.onMisuse' to "ignore"."Finding anchors between all query and reference datasets
Running CCA
Merging objects

Error in CreateChromatinAssay(data = merged.data, min.cells = 0, min.features = 0, : Length of supplied genomic ranges does not match number
           of rows in matrix
Traceback:

1. FindIntegrationAnchors(object.list = list(Cond2_drug_comm, 
 .     Cond1_drug_comm,Cond3_drug_comm), anchor.features = peaks.use_common_all, 
 .     reference = 1)
2. my.lapply(X = 1:nrow(x = combinations), FUN = function(row) {
 .     i <- combinations[row, 1]
 .     j <- combinations[row, 2]
 .     object.1 <- DietSeurat(object = object.list[[i]], assays = assay[i], 
 .         features = anchor.features, counts = FALSE, scale.data = TRUE, 
 .         dimreducs = reduction)
 .     object.2 <- DietSeurat(object = object.list[[j]], assays = assay[j], 
 .         features = anchor.features, counts = FALSE, scale.data = TRUE, 
 .         dimreducs = reduction)
 .     suppressWarnings(object.1[["ToIntegrate"]] <- object.1[[assay[i]]])
 .     DefaultAssay(object = object.1) <- "ToIntegrate"
 .     if (reduction %in% Reductions(object = object.1)) {
 .         slot(object = object.1[[reduction]], name = "assay.used") <- "ToIntegrate"
 .     }
 .     object.1 <- DietSeurat(object = object.1, assays = "ToIntegrate", 
 .         scale.data = TRUE, dimreducs = reduction)
 .     suppressWarnings(object.2[["ToIntegrate"]] <- object.2[[assay[j]]])
 .     DefaultAssay(object = object.2) <- "ToIntegrate"
 .     if (reduction %in% Reductions(object = object.2)) {
 .         slot(object = object.2[[reduction]], name = "assay.used") <- "ToIntegrate"
 .     }
 .     object.2 <- DietSeurat(object = object.2, assays = "ToIntegrate", 
 .         scale.data = TRUE, dimreducs = reduction)
 .     object.pair <- switch(EXPR = reduction, cca = {
 .         object.pair <- RunCCA(object1 = object.1, object2 = object.2, 
 .             assay1 = "ToIntegrate", assay2 = "ToIntegrate", features = anchor.features, 
 .             num.cc = max(dims), renormalize = FALSE, rescale = FALSE, 
 .             verbose = verbose)
 .         if (l2.norm) {
 .             object.pair <- L2Dim(object = object.pair, reduction = reduction)
 .             reduction <- paste0(reduction, ".l2")
 .             nn.reduction <- reduction
 .         }
 .         reduction.2 <- character()
 .         object.pair
 .     }, pca = {
 .         common.features <- intersect(x = rownames(x = Loadings(object = object.1[["pca"]])), 
 .             y = rownames(x = Loadings(object = object.2[["pca"]])))
 .         object.pair <- merge(x = object.1, y = object.2, merge.data = TRUE)
 .         projected.embeddings.1 <- t(x = GetAssayData(object = object.1, 
 .             slot = "scale.data")[common.features, ]) %*% Loadings(object = object.2[["pca"]])[common.features, 
 .             ]
 .         object.pair[["projectedpca.1"]] <- CreateDimReducObject(embeddings = rbind(projected.embeddings.1, 
 .             Embeddings(object = object.2[["pca"]])), assay = DefaultAssay(object = object.1), 
 .             key = "projectedpca1_")
 .         projected.embeddings.2 <- t(x = GetAssayData(object = object.2, 
 .             slot = "scale.data")[common.features, ]) %*% Loadings(object = object.1[["pca"]])[common.features, 
 .             ]
 .         object.pair[["projectedpca.2"]] <- CreateDimReducObject(embeddings = rbind(projected.embeddings.2, 
 .             Embeddings(object = object.1[["pca"]])), assay = DefaultAssay(object = object.2), 
 .             key = "projectedpca2_")
 .         object.pair[["pca"]] <- CreateDimReducObject(embeddings = rbind(Embeddings(object = object.1[["pca"]]), 
 .             Embeddings(object = object.2[["pca"]])), assay = DefaultAssay(object = object.1), 
 .             key = "pca_")
 .         reduction <- "projectedpca.1"
 .         reduction.2 <- "projectedpca.2"
 .         if (l2.norm) {
 .             slot(object = object.pair[["projectedpca.1"]], name = "cell.embeddings") <- Sweep(x = Embeddings(object = object.pair[["projectedpca.1"]]), 
 .                 MARGIN = 2, STATS = apply(X = Embeddings(object = object.pair[["projectedpca.1"]]), 
 .                   MARGIN = 2, FUN = sd), FUN = "/")
 .             slot(object = object.pair[["projectedpca.2"]], name = "cell.embeddings") <- Sweep(x = Embeddings(object = object.pair[["projectedpca.2"]]), 
 .                 MARGIN = 2, STATS = apply(X = Embeddings(object = object.pair[["projectedpca.2"]]), 
 .                   MARGIN = 2, FUN = sd), FUN = "/")
 .             object.pair <- L2Dim(object = object.pair, reduction = "projectedpca.1")
 .             object.pair <- L2Dim(object = object.pair, reduction = "projectedpca.2")
 .             reduction <- paste0(reduction, ".l2")
 .             reduction.2 <- paste0(reduction.2, ".l2")
 .         }
 .         object.pair
 .     }, stop("Invalid reduction parameter. Please choose either cca or rpca"))
 .     internal.neighbors <- internal.neighbors[c(i, j)]
 .     anchors <- FindAnchors(object.pair = object.pair, assay = c("ToIntegrate", 
 .         "ToIntegrate"), slot = slot, cells1 = colnames(x = object.1), 
 .         cells2 = colnames(x = object.2), internal.neighbors = internal.neighbors, 
 .         reduction = reduction, reduction.2 = reduction.2, nn.reduction = nn.reduction, 
 .         dims = dims, k.anchor = k.anchor, k.filter = k.filter, 
 .         k.score = k.score, max.features = max.features, nn.method = nn.method, 
 .         eps = eps, verbose = verbose)
 .     anchors[, 1] <- anchors[, 1] + offsets[i]
 .     anchors[, 2] <- anchors[, 2] + offsets[j]
 .     return(anchors)
 . })
3. future_xapply(FUN = FUN, nX = nX, chunk_args = X, args = list(...), 
 .     get_chunk = `[`, expr = expr, envir = envir, future.globals = future.globals, 
 .     future.packages = future.packages, future.scheduling = future.scheduling, 
 .     future.chunk.size = future.chunk.size, future.stdout = future.stdout, 
 .     future.conditions = future.conditions, future.seed = future.seed, 
 .     future.lazy = future.lazy, future.label = future.label, fcn_name = fcn_name, 
 .     args_name = args_name, debug = debug)
4. value(fs)
5. value.list(fs)
6. resolve(y, result = TRUE, stdout = stdout, signal = signal, force = TRUE)
7. resolve.list(y, result = TRUE, stdout = stdout, signal = signal, 
 .     force = TRUE)
8. signalConditionsASAP(obj, resignal = FALSE, pos = ii)
9. signalConditions(obj, exclude = getOption("future.relay.immediate", 
 .     "immediateCondition"), resignal = resignal, ...)

This worked with my previous installation with 2 conditions, and now it doesnt work even with 2 conditions. I am trying to use the merge protocol, but I think integration gives better results.

sessionInfo()

R version 3.6.1 (2019-07-05)
Platform: x86_64-conda_cos6-linux-gnu (64-bit)
Running under: CentOS Linux 8 (Core)

Matrix products: default
BLAS/LAPACK: /rds/general/user/gatla/home/anaconda3/envs/r361_signac/lib/R/lib/libRblas.so

locale:
[1] C

attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets
[8] methods base

other attached packages:
[1] future_1.19.1 GenomicRanges_1.38.0 GenomeInfoDb_1.22.1
[4] IRanges_2.20.2 S4Vectors_0.24.4 BiocGenerics_0.32.0
[7] Seurat_3.2.2 Signac_1.0.9003

loaded via a namespace (and not attached):
[1] reticulate_1.16 tidyselect_1.1.0
[3] RSQLite_2.2.1 AnnotationDbi_1.48.0
[5] htmlwidgets_1.3 grid_3.6.1
[7] BiocParallel_1.20.1 Rtsne_0.15
[9] munsell_0.5.0 codetools_0.2-16
[11] ica_1.0-2 pbdZMQ_0.3-3
[13] miniUI_0.1.1.1 colorspace_1.4-1
[15] Biobase_2.46.0 OrganismDbi_1.28.0
[17] knitr_1.22 uuid_0.1-2
[19] rstudioapi_0.11 ROCR_1.0-11
[21] tensor_1.5 listenv_0.8.0
[23] labeling_0.3 repr_0.19.2
[25] GenomeInfoDbData_1.2.2 polyclip_1.10-0
[27] farver_2.0.3 bit64_4.0.5
[29] vctrs_0.3.4 generics_0.0.2
[31] xfun_0.6 biovizBase_1.34.1
[33] BiocFileCache_1.10.2 lsa_0.73.2
[35] ggseqlogo_0.1 R6_2.4.0
[37] rsvd_1.0.3 AnnotationFilter_1.10.0
[39] bitops_1.0-6 spatstat.utils_1.17-0
[41] reshape_0.8.8 DelayedArray_0.12.3
[43] assertthat_0.2.1 promises_1.1.1
[45] scales_1.1.1 nnet_7.3-12
[47] gtable_0.3.0 globals_0.13.1
[49] goftest_1.2-2 ggbio_1.34.0
[51] ensembldb_2.10.2 rlang_0.4.8
[53] RcppRoll_0.3.0 splines_3.6.1
[55] rtracklayer_1.46.0 lazyeval_0.2.2
[57] acepack_1.4.1 dichromat_2.0-0
[59] checkmate_1.9.1 BiocManager_1.30.10
[61] reshape2_1.4.3 abind_1.4-5
[63] GenomicFeatures_1.38.2 backports_1.1.4
[65] httpuv_1.5.4 Hmisc_4.2-0
[67] RBGL_1.62.1 tools_3.6.1
[69] ggplot2_3.3.2 ellipsis_0.3.1
[71] RColorBrewer_1.1-2 ggridges_0.5.2
[73] Rcpp_1.0.1 plyr_1.8.4
[75] base64enc_0.1-3 progress_1.2.2
[77] zlibbioc_1.32.0 purrr_0.3.4
[79] RCurl_1.98-1.2 prettyunits_1.1.1
[81] rpart_4.1-15 openssl_1.4.3
[83] deldir_0.1-29 pbapply_1.4-3
[85] cowplot_1.1.0 zoo_1.8-8
[87] SummarizedExperiment_1.16.1 ggrepel_0.8.2
[89] cluster_2.0.8 magrittr_1.5
[91] RSpectra_0.16-0 data.table_1.12.2
[93] lmtest_0.9-38 RANN_2.6.1
[95] SnowballC_0.7.0 ProtGenerics_1.18.0
[97] fitdistrplus_1.1-1 matrixStats_0.57.0
[99] hms_0.5.3 patchwork_1.0.1
[101] mime_0.6 evaluate_0.13
[103] xtable_1.8-4 XML_3.99-0.3
[105] shape_1.4.5 gridExtra_2.3
[107] compiler_3.6.1 biomaRt_2.42.1
[109] tibble_3.0.4 KernSmooth_2.23-17
[111] crayon_1.3.4 htmltools_0.5.0
[113] mgcv_1.8-28 later_1.1.0.1
[115] Formula_1.2-3 tidyr_1.1.2
[117] DBI_1.1.0 dbplyr_1.4.4
[119] MASS_7.3-51.3 rappdirs_0.3.1
[121] Matrix_1.2-17 igraph_1.2.6
[123] pkgconfig_2.0.2 GenomicAlignments_1.22.1
[125] foreign_0.8-71 IRdisplay_0.7.0
[127] plotly_4.9.2.1 foreach_1.5.1
[129] XVector_0.26.0 stringr_1.4.0
[131] VariantAnnotation_1.32.0 digest_0.6.18
[133] sctransform_0.3.1 RcppAnnoy_0.0.16
[135] graph_1.64.0 spatstat.data_1.4-3
[137] Biostrings_2.54.0 leiden_0.3.3
[139] fastmatch_1.1-0 htmlTable_1.13.1
[141] uwot_0.1.8 curl_4.3
[143] shiny_1.5.0 Rsamtools_2.2.3
[145] lifecycle_0.2.0 nlme_3.1-139
[147] jsonlite_1.7.1 viridisLite_0.3.0
[149] askpass_1.1 BSgenome_1.54.0
[151] pillar_1.4.6 lattice_0.20-38
[153] GGally_2.0.0 fastmap_1.0.1
[155] httr_1.4.2 survival_2.44-1.1
[157] glue_1.4.2 spatstat_1.64-1
[159] iterators_1.0.13 png_0.1-7
[161] glmnet_4.0-2 bit_4.0.4
[163] stringi_1.4.3 blob_1.2.1
[165] latticeExtra_0.6-28 memoise_1.1.0
[167] IRkernel_0.8.15 dplyr_1.0.2
[169] irlba_2.3.3 future.apply_1.6.0

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions