Issue with GeneActivity function #865

@sloanlewis

Hello,
I am trying to run the GeneActivity function on my 10X scATAC dataset, following this vignette (https://satijalab.org/signac/articles/pbmc_vignette.html). My data is from macaques rather than human, so I suspect the issue is with the annotation file.

## Here is how I am adding the annotation:
#make a reference object
>library(ensembldb)
#load the GTF file
>gtffile <- "Macaca_mulatta.Mmul_8.0.1.97.filtered.gtf"
#generate the SQLite database file
>DB <- ensDbFromGtf(gtf = gtffile)
#load the DB file directly
>EDB <- EnsDb(DB)
>annotations <- GetGRangesFromEnsDb(EDB)
>Annotation(pbmc) <- annotations

## It runs properly and adds the annotation:
> pbmc[['peaks']]

ChromatinAssay data with 205828 features for 7588 cells
Variable features: 205828 
Genome: Mmul_8.0.1 
Annotation present: TRUE 
Motifs present: FALSE 
Fragment files: 1

> Annotation(pbmc)

GRanges object with 1414714 ranges and 5 metadata columns:
                     seqnames      ranges strand |              tx_id   gene_name            gene_id   gene_biotype     type
                        <Rle>   <IRanges>  <Rle> |        <character> <character>        <character>    <character> <factor>
  ENSMMUE00000311984        1 25432-25503      + | ENSMMUT00000008326      SAMD11 ENSMMUG00000005947 protein_coding     exon
  ENSMMUE00000311984        1 25432-25503      + | ENSMMUT00000015569      SAMD11 ENSMMUG00000005947 protein_coding     exon
  ENSMMUE00000311984        1 25432-25503      + | ENSMMUT00000047681      SAMD11 ENSMMUG00000005947 protein_coding     exon
  ENSMMUE00000311984        1 25432-25503      + | ENSMMUT00000054447      SAMD11 ENSMMUG00000005947 protein_coding     exon
  ENSMMUE00000311984        1 25432-25503      + | ENSMMUT00000063154      SAMD11 ENSMMUG00000005947 protein_coding     exon
                 ...      ...         ...    ... .                ...         ...                ...            ...      ...
  ENSMMUT00000038269       MT   8357-8563      + | ENSMMUT00000038269     MT-ATP8 ENSMMUG00000028684 protein_coding      cds
  ENSMMUT00000038271       MT   7532-8215      + | ENSMMUT00000038271      MT-CO2 ENSMMUG00000028686 protein_coding      cds
  ENSMMUT00000038274       MT   5850-7418      + | ENSMMUT00000038274      MT-CO1 ENSMMUG00000028689 protein_coding      cds
  ENSMMUT00000038280       MT   4421-5462      + | ENSMMUT00000038280      MT-ND2 ENSMMUG00000028695 protein_coding      cds
  ENSMMUT00000038284       MT   3259-4213      + | ENSMMUT00000038284      MT-ND1 ENSMMUG00000028699 protein_coding      cds
  seqinfo: 23 sequences from Mmul_8.0.1 genome

## Everything shown below runs fine, but when I try to run the GeneActivity function, I get the error at the bottom:
>pbmc <- NucleosomeSignal(object = pbmc)
>pbmc <- TSSEnrichment(object = pbmc, fast = FALSE)
>pbmc$pct_reads_in_peaks <- pbmc$peak_region_fragments / pbmc$passed_filters * 100
>pbmc$high.tss <- ifelse(pbmc$TSS.enrichment > 2, 'High', 'Low')
>pbmc$nucleosome_group <- ifelse(pbmc$nucleosome_signal > 4, 'NS > 4', 'NS < 4')
>pbmc <- subset(
  x = pbmc,
  subset = peak_region_fragments > 1000 &
    peak_region_fragments < 50000 &
    pct_reads_in_peaks > 15 &
    nucleosome_signal < 4 &
    TSS.enrichment > 2
)
>pbmc <- RunTFIDF(pbmc)
>pbmc <- FindTopFeatures(pbmc, min.cutoff = 'q0')
>pbmc <- RunSVD(pbmc)
>pbmc <- RunUMAP(object = pbmc, reduction = 'lsi', dims = 2:30)
>pbmc <- FindNeighbors(object = pbmc, reduction = 'lsi', dims = 2:30)
>pbmc <- FindClusters(object = pbmc, verbose = FALSE, algorithm = 3)

## GeneActivity error
> gene.activities <- GeneActivity(pbmc)
Extracting gene coordinates
Extracting reads overlapping genomic regions
  |++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=13m 15s

Error in intI(i, n = x@Dim[1], dn[[1]], give.dn = FALSE) : 
  'NA' indices are not (yet?) supported for sparse Matrices
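
One thing that may be worth checking, given the suspicion about the annotation: whether some entries have a missing or empty `gene_name`, since an `NA` used as an index would be consistent with the message above. This is only a hypothesis, not a confirmed cause. The sketch below uses a small toy GRanges object in place of `Annotation(pbmc)`; the two `ENSMMUG_HYPOTHETICAL_*` IDs are made-up placeholders, and falling back to `gene_id` is one possible workaround, not something the Signac documentation prescribes.

```r
library(GenomicRanges)

# Toy annotation standing in for Annotation(pbmc); values are illustrative only.
toy <- GRanges(
  seqnames = c("1", "1", "MT"),
  ranges = IRanges(start = c(100, 500, 900), width = 50),
  gene_name = c("SAMD11", NA, ""),
  gene_id = c("ENSMMUG00000005947",
              "ENSMMUG_HYPOTHETICAL_1",   # placeholder ID, not a real gene
              "ENSMMUG_HYPOTHETICAL_2")   # placeholder ID, not a real gene
)

# Flag entries whose gene name is NA or an empty string.
missing <- is.na(toy$gene_name) | toy$gene_name == ""
n_missing <- sum(missing)

# Possible workaround (assumption): fall back to the Ensembl gene ID.
toy$gene_name[missing] <- toy$gene_id[missing]
```

On the real object, the same check would be run on `annotations` before assigning it with `Annotation(pbmc) <- annotations`.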

## Any help solving this would be appreciated! Thanks.

> sessionInfo()
R version 4.0.1 (2020-06-06)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)
Matrix products: default
BLAS/LAPACK: /usr/lib64/libopenblas-r0.3.3.so
locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
attached base packages:
[1] stats4    parallel  stats     graphics  utils     datasets  grDevices methods   base     
other attached packages:
 [1] ensembldb_2.12.1        AnnotationFilter_1.12.0 GenomicFeatures_1.40.1  AnnotationDbi_1.50.3    Biobase_2.48.0          future_1.21.0           GenomicRanges_1.40.0    patchwork_1.1.1         ggplot2_3.3.3          
[10] GenomeInfoDb_1.24.2     IRanges_2.22.2          S4Vectors_0.26.1        BiocGenerics_0.34.0     Signac_1.4.0.9002       SeuratObject_4.0.2      Seurat_4.0.5            colorout_1.2-2            
